
Add media file support for LLM prompter#580

Merged
sal-uva merged 13 commits into master from copilot/add-media-support-to-llm-prompter
Mar 12, 2026

Conversation

Contributor

Copilot AI commented Mar 9, 2026

Extends llm-prompter to work with parent datasets that are media archives (zip files from image downloaders or media imports), not just text-based CSV/NDJSON datasets.

common/lib/llm.py

  • create_multimodal_content() now accepts media_files (local paths, base64-encoded) alongside existing media_urls
  • _format_media_block() — new helper for provider-specific content blocks:
    • Anthropic: image blocks for images, document blocks for video/audio
    • OpenAI: data URIs for images/video, input_audio format for audio
    • Google/others: data URI with image_url wrapper
  • generate_text() gains media_files parameter to pass local file paths
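The provider-specific content blocks above can be sketched as follows. This is a hypothetical, simplified stand-in for the PR's `_format_media_block()` helper: the function name, signature, and exact dict shapes are illustrative and assume the common base64 content-block conventions of each provider, not the merged 4CAT code.

```python
import base64
from pathlib import Path

def format_media_block(provider: str, path: str, mime_type: str) -> dict:
    """Build a provider-specific content block for one local media file.

    Illustrative sketch of the PR's _format_media_block() helper;
    block shapes follow common provider conventions, not 4CAT's code.
    """
    encoded = base64.b64encode(Path(path).read_bytes()).decode("ascii")
    kind = mime_type.split("/", 1)[0]  # "image", "video", or "audio"

    if provider == "anthropic":
        source = {"type": "base64", "media_type": mime_type, "data": encoded}
        if kind == "image":
            return {"type": "image", "source": source}
        # video/audio are passed as document blocks
        return {"type": "document", "source": source}

    if provider == "openai":
        if kind == "audio":
            return {
                "type": "input_audio",
                "input_audio": {
                    "data": encoded,
                    "format": mime_type.split("/", 1)[1],
                },
            }
        # images and video travel as data URIs
        return {
            "type": "image_url",
            "image_url": {"url": f"data:{mime_type};base64,{encoded}"},
        }

    # Google and other providers: data URI inside an image_url wrapper
    return {
        "type": "image_url",
        "image_url": {"url": f"data:{mime_type};base64,{encoded}"},
    }
```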

processors/machine_learning/llm_prompter.py

  • is_compatible_with() — accepts zip datasets with media_type in (image, video, audio)
  • get_options() — when parent is a media archive:
    • Shows media info panel instead of column bracket instructions
    • Hides text-only options (column selection, batching, truncation, media URL toggle)
  • process() — new media archive code path: iterates zip contents, skips metadata files, base64-encodes each media file, sends to LLM via media_files param. Catches model incompatibility errors (e.g. non-vision model receiving images) with clear user-facing messages.
  • validate_query() — relaxes column bracket requirement for media archives; allows empty user prompt when system prompt is provided
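The media-archive code path in `process()` can be sketched roughly as below. This is a minimal, hypothetical illustration of the iterate-skip-encode loop described above; the function name, the metadata-file skip-list, and the lack of 4CAT's actual dataset iteration helpers are all assumptions made for the example.

```python
import base64
import zipfile
from pathlib import Path

# Assumed skip-list; 4CAT's real metadata naming may differ
METADATA_FILES = {".metadata.json", "metadata.json"}

def iter_media_payloads(archive_path: str):
    """Yield (filename, base64 payload) for each media file in a ZIP
    archive, skipping metadata entries and directories. Sketch of the
    media-archive loop in process(); not the merged implementation."""
    with zipfile.ZipFile(archive_path) as archive:
        for name in archive.namelist():
            if name.endswith("/") or Path(name).name in METADATA_FILES:
                continue
            with archive.open(name) as media_file:
                yield name, base64.b64encode(media_file.read()).decode("ascii")
```

Each yielded payload would then be handed to the LLM via the new `media_files` parameter, with model-incompatibility errors (e.g. a non-vision model receiving images) caught around the generation call rather than checked upfront.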

All existing text-based processing behavior is preserved in the else branch. All models and custom model IDs remain available — incompatibility is caught at generation time rather than upfront.

Original prompt

This section details the original issue to resolve.

<issue_title>Add media support for LLM prompter</issue_title>
<issue_description>Update llm-prompter.py so that it can work with parent datasets that are media files, e.g. as part of zip datasets generated through the download-images.py processor or as a media import.

  • This mostly works by checking the type of the parent dataset and changing the LLM prompter options and functionality accordingly.
  • If the parent dataset is a media file (e.g. png) or a zip with media items, llm-prompter.py is still available but option fields for text-based datasets (e.g. column selection) should not be shown. For instance, information on the column brackets is redundant here, since no 'column' data is inserted. Instead, an info UserInput field clearly indicates that media files are attached to the prompt. Also show other crucial information if needed.
  • llm.py should be updated accordingly to support video, image, and audio prompts.
  • Clear evaluation and alerts, both in the back-end and front-end (validate_query), should tell the user if there's any issues with the dataset media files.
  • In principle, the flexible setup of llm-prompter.py should be retained. All current models as well as custom model IDs should be allowed, and it should be caught whether the model can't process this data type on generating results after the fact (e.g. some don't support vision/audio) (unless LangChain has an easy way of checking this, then it can be called in validate_query).</issue_description>

Comments on the Issue (you are @copilot in this section)



Copilot AI and others added 2 commits March 9, 2026 17:23
…ideo/audio

Co-authored-by: sal-uva <10960315+sal-uva@users.noreply.github.com>
…defaults, and error detection comments

Co-authored-by: sal-uva <10960315+sal-uva@users.noreply.github.com>
Copilot AI changed the title [WIP] Add media support for LLM prompter Add media file support for LLM prompter Mar 9, 2026
@sal-uva
Member

sal-uva commented Mar 9, 2026

This should be mergeable. Not all vendors support video and audio, but that's acceptable in my opinion.

LLM prompter is now quite bulky and should be refactored, but maybe that's for another day.

@sal-uva sal-uva marked this pull request as ready for review March 11, 2026 10:29
Copilot AI review requested due to automatic review settings March 11, 2026 10:29

Copilot AI left a comment


Pull request overview

This PR extends the llm-prompter processor to support media-archive parent datasets (ZIPs containing image/video/audio), enabling multimodal prompting using locally extracted media files in addition to existing URL-based media inputs.

Changes:

  • Add ZIP media-archive compatibility and a dedicated processing path in LLMPrompter (options/UI + iteration + annotation mapping).
  • Extend LLMAdapter multimodal support to accept local media file paths (base64-encoded) alongside media URLs, with provider-specific formatting.
  • Minor UI/UX tweaks for annotation rendering and link wrapping, plus marking AudioExtractor ZIP outputs as media_type="audio".

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 9 comments.

Show a summary per file
File Description
common/lib/llm.py Adds local media_files support for multimodal prompts and provider-specific content block formatting.
processors/machine_learning/llm_prompter.py Adds media-archive dataset compatibility, media-specific options, and a ZIP iteration + LLM prompting path.
processors/audio/audio_extractor.py Marks resulting ZIP datasets as audio media type.
common/assets/llms.json Updates/renames several predefined model IDs and model card links.
webtool/templates/explorer/item-annotations.html Uses item_id consistently in DOM ids/classes and avoids variable shadowing.
webtool/static/css/explorer-annotation-generic.css Improves wrapping behavior for long annotation label/link text.


@sal-uva sal-uva merged commit 388687b into master Mar 12, 2026
2 checks passed

Successfully merging this pull request may close these issues.

Add media support for LLM prompter