I have almost 12,000 screenshots. Phone, laptop, years of them.
The pattern is always the same: see something interesting, screenshot it, think "I'll come back to this." Never do.
Recipes I meant to cook. Error messages I needed to debug. Articles I wanted to read properly. Code snippets. Memes. Random thoughts someone posted that resonated.
All sitting in folders, unsearchable, effectively lost.
This is a perfect task to outsource to AI. Each image takes a human 10-30 seconds to process mentally. A vision model does it in under a second, for fractions of a cent.
vex (vision extraction) batch-processes screenshots through Claude. It looks at each image, figures out what it is, and extracts the useful information. A recipe becomes searchable text. Technical content gets categorised. Pure images get described.
The output is a JSON dump of everything - searchable, sortable, finally usable.
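The per-image call itself is simple. Here's a minimal sketch of what it can look like with the Anthropic Python SDK - the prompt and output fields mirror the results format shown later, but this is illustrative, not vex's actual internals:

```python
# Minimal single-image extraction sketch (illustrative, not vex's internals).
# Assumes the Anthropic Python SDK and ANTHROPIC_API_KEY in the environment.
import base64
from pathlib import Path

import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY

def extract(image_path: Path) -> str:
    data = base64.standard_b64encode(image_path.read_bytes()).decode()
    response = client.messages.create(
        model="claude-haiku-4-5-20251001",  # vex's default model
        max_tokens=1024,
        messages=[{
            "role": "user",
            "content": [
                {"type": "image",
                 "source": {"type": "base64",
                            "media_type": "image/png",  # assumes PNG input
                            "data": data}},
                {"type": "text",
                 "text": "Identify what this screenshot shows and extract the "
                         "useful content as JSON with type, category, summary, "
                         "extracted_content, and tags fields."},
            ],
        }],
    )
    return response.content[0].text

print(extract(Path("IMG_4521.png")))
```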
Screenshots are write-only storage. Easy to capture, impossible to retrieve.
Search doesn't work on images. You can't grep a photo of a recipe. You can't find that error message from six months ago unless you remember exactly when you saw it.
This tool turns screenshots into structured data. Now they're searchable.
```bash
git clone https://github.com/teejayen/vex.git
cd vex
uv sync

# Add your API key
echo "ANTHROPIC_API_KEY=sk-ant-..." > .env
```

```bash
# Test with a few images first
vex process /path/to/screenshots --limit 10

# Process a directory (realtime, one at a time)
vex process /path/to/screenshots

# Use batch API for large sets (50% cheaper, results in <24hr)
vex process /path/to/screenshots --batch

# Check batch progress
vex status msgbatch_xxx

# Download batch results
vex results msgbatch_xxx

# Sort processed files into category folders
vex organise results.jsonl --target /path/to/sorted/
```

The extraction adapts to what it sees:
| Type | What You Get |
|---|---|
| Recipe | Title, full ingredients, method steps, servings, cook time |
| Code/Technical | Language, explanation, the actual code verbatim |
| Error Message | Exact error text, platform, stack trace, likely cause |
| Article/Text | Key points, quotes, author, source |
| Chat/Social | Who said what, platform, context, links shared |
| Document | Type, key content, dates, reference numbers |
| Meme/Image | Description, all text transcribed, context |
| UI/App | App name, screen shown, settings or data displayed |
| Shopping/Product | Item, price, store, specs, link |
| Map/Location | Place name, full address, directions |
| Receipt/Transaction | Merchant, amount, date, order number, items |
| Booking/Event | Event, date, time, venue, confirmation number |
| Contact Info | Name, phone, email, company, address |
| Music/Media | Song, artist, album, playlist, platform |
| Quote/Inspiration | The quote, attribution, source |
| Health/Fitness | Metrics, values, dates, what's tracked |
| Settings/Config | App, what settings, current values |
Everything gets a category and tags for filtering later.
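Those tags make ad-hoc filtering a few lines of Python over the results file (format shown below). A sketch, with a hypothetical `by_tag` helper:

```python
# Filter vex results by tag. by_tag is a hypothetical helper, not part
# of vex; the JSONL record format is shown below.
import json

def by_tag(results_path: str, tag: str):
    with open(results_path) as f:
        for line in f:
            record = json.loads(line)
            if tag in record.get("tags", []):
                yield record["path"], record["summary"]

for path, summary in by_tag("vex-results.jsonl", "recipe"):
    print(path, "-", summary)
```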
Results land in a JSONL file (one JSON object per line):
```json
{
  "path": "/screenshots/IMG_4521.png",
  "type": "article",
  "category": "tech",
  "summary": "Blog post about building Arc, a thinking partner that remembers context between sessions",
  "extracted_content": "Building Arc: A Thinking Partner That Remembers\n\nArc is a personal AI system built on Claude that maintains context across sessions. Key features: persistent state via markdown files, journal for pattern detection over time, decision capture with reasoning, weekly reviews surfacing insights.\n\nThe difference between a stateless chatbot and a genuine thinking partner is memory.",
  "tags": ["ai", "claude", "productivity", "arc", "thinking-partner"],
  "source": "tim.neilen.com.au",
  "processed_at": "2025-01-07T12:00:00",
  "tokens_used": 1847
}
```

Using Haiku 4.5 for 12,000 screenshots: roughly $10-15 USD.
Batch mode cuts that in half.
| Model | Realtime | Batch Mode |
|---|---|---|
| Haiku 4.5 | ~$12 | ~$6 |
| Sonnet 4.5 | ~$36 | ~$18 |
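Actual spend depends on image sizes and output length. Since every record stores tokens_used, you can sanity-check an estimate against real usage; a minimal sketch (the per-token rate is an assumed placeholder, substitute current published pricing):

```python
# Estimate spend from a vex results file. RATE_PER_MTOK is an assumed
# blended USD rate per million tokens; substitute current pricing.
import json

RATE_PER_MTOK = 1.00  # placeholder, not real pricing

total = 0
with open("vex-results.jsonl") as f:
    for line in f:
        total += json.loads(line).get("tokens_used", 0)

print(f"{total:,} tokens ~= ${total / 1_000_000 * RATE_PER_MTOK:.2f}")
```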
```
vex process <directory>    Extract content from images
vex status <batch_id>      Check batch processing status
vex results <batch_id>     Download completed batch results
vex organise <jsonl>       Sort files into category folders
```

Options for `vex process`:

```
-o, --output FILE    Output file (default: vex-results.jsonl)
-m, --model MODEL    Model to use (default: claude-haiku-4-5-20251001)
-r, --rate SECONDS   Delay between requests (default: 0.5)
-l, --limit N        Process only N images
--batch              Use batch API (50% off, <24hr processing)
--no-resume          Start fresh, ignore previous progress
--dry-run            List files without processing
```

Options for `vex organise`:

```
--target DIR    Target directory for sorted files (required)
--copy          Copy files instead of moving them
```
Processing saves progress as it goes. If it stops, run the same command again - it picks up where it left off.
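The shape of the idea, sketched in a few lines (not necessarily vex's exact mechanism): treat the output file as the ledger and skip anything already in it.

```python
# Sketch of JSONL-based resume: yield only images not yet recorded in the
# results file. Illustrative; PNG-only glob is an assumption.
import json
from pathlib import Path

def pending(images_dir: str, results_path: str):
    done = set()
    results = Path(results_path)
    if results.exists():
        with results.open() as f:
            done = {json.loads(line)["path"] for line in f}
    for image in sorted(Path(images_dir).glob("*.png")):
        if str(image) not in done:
            yield image
```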
For batch mode, use vex status and vex results to check progress and download when ready.
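If you'd rather poll from your own scripts, the underlying Anthropic Batches API exposes the same information through the SDK; a sketch, with a placeholder batch ID:

```python
# Poll a message batch and stream its results via the Anthropic SDK.
# "msgbatch_xxx" is a placeholder ID.
import anthropic

client = anthropic.Anthropic()

batch = client.messages.batches.retrieve("msgbatch_xxx")
print(batch.processing_status)  # e.g. "in_progress" or "ended"

if batch.processing_status == "ended":
    for entry in client.messages.batches.results("msgbatch_xxx"):
        if entry.result.type == "succeeded":
            print(entry.custom_id, entry.result.message.content[0].text[:80])
```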
vex is also available as an AgentSkill - a portable format that lets AI agents process screenshots directly using their own vision capabilities, no script required.
The SKILL.md file contains instructions for agentic extraction. Any compatible agent can:
- Find images in a directory
- Analyse each image using vision
- Output structured JSONL
This means you can use vex's extraction logic in Claude Code, GitHub Copilot, or any agent that supports the AgentSkills format.
The extracted data becomes useful beyond search. Feed it to a personal knowledge system, surface patterns in what you capture, auto-route content to the right places.
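For instance, surfacing which categories dominate your captures takes a few lines over the results file (a sketch):

```python
# Count captures per category to surface what you screenshot most.
import json
from collections import Counter

counts = Counter()
with open("vex-results.jsonl") as f:
    for line in f:
        counts[json.loads(line).get("category", "unknown")] += 1

for category, n in counts.most_common(10):
    print(f"{category:20} {n}")
```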
I built Arc - a thinking partner that holds context between sessions. Screenshot data is perfect input: attention signals, things that mattered enough to capture, patterns over time.
Built by Tim Neilen because I was tired of screenshots being a black hole.
Written with AI. I provided the problem and direction; Claude wrote the code. More on how I use AI.
MIT License.