A Python application that uses Vision Language Models (VLMs) to iteratively create original artwork in the style of famous artists. The system starts with a blank canvas and progressively adds strokes suggested by VLMs, building unique images that embody specific artistic styles.
The project includes a Next.js viewer application for interactively exploring generated artworks, viewing stroke-by-stroke creation timelines, and examining metadata and evaluation scores. See output here: https://www.liamlaverty.com/paint-by-language-model/
- Python: 3.12.2 or higher
- Conda: Anaconda or Miniconda for environment management
- A VLM provider (one of):
- Mistral API: Get an API key at https://console.mistral.ai/
- LMStudio (local): Download from https://lmstudio.ai/ and run with server enabled on port 1234
- Anthropic API: Get an API key at https://console.anthropic.com/
- Node.js: 18.0.0 or higher
- pnpm: 9.15.0 or higher (install with
npm install -g pnpm)
conda create -n paint-by-language-model python=3.12.2
conda activate paint-by-language-model
pip install uvFrom the project root, navigate to the package directory and install in editable mode:
cd src/paint_by_language_model
uv pip install -e .This will install the required dependencies.
Copy .env.example to .env and set your API key:
cp .env.example .envEdit .env:
# VLM Provider ("mistral", "lmstudio", or "anthropic")
PROVIDER=mistral
# Mistral API key (required when PROVIDER=mistral)
MISTRAL_API_KEY=your_api_key_here
# Anthropic API key (required when PROVIDER=anthropic)
# ANTHROPIC_API_KEY=your_api_key_here| Provider | Use Case | Auth Required |
|---|---|---|
mistral |
Remote API, production use | Yes (MISTRAL_API_KEY) |
lmstudio |
Local development, no API costs | No |
anthropic |
Remote API, Claude models | Yes (ANTHROPIC_API_KEY) |
The provider can also be overridden via the --provider CLI flag.
For linting, type checking, and testing:
cd src/paint_by_language_model
uv pip install -e ".[dev]"To run the interactive web viewer:
pnpm -C src/viewer installGenerate artwork using the command-line interface:
conda activate paint-by-language-model
cd src/paint_by_language_model
python main.py \
--artist "Vincent van Gogh" \
--subject "Starry Night Landscape" \
--output-id vangogh-001| Argument | Description |
|---|---|
--artist |
Target artist name (e.g., "Claude Monet", "Vincent van Gogh") |
--subject |
Short one-sentence subject description |
--output-id |
Unique identifier for this artwork (used for output directory) |
| Argument | Short | Default | Description |
|---|---|---|---|
--expanded-subject |
None | Detailed multi-sentence description for richer planning | |
--planner-model |
from config | Override planner model (e.g., mistral-large-latest) |
|
--max-iterations |
-i |
10000 | Maximum iterations before stopping |
--target-score |
-t |
75.0 | Target style score (0-100) to stop generation early |
--strokes-per-query |
-n |
5 | Number of strokes per VLM query (1-20) |
--stroke-types |
all | Comma-separated stroke types to use | |
--provider |
from .env |
VLM provider: mistral, lmstudio, or anthropic |
|
--api-key |
from .env |
API key (overrides MISTRAL_API_KEY env var) |
|
--gif-frame-duration |
150 | GIF frame duration in milliseconds | |
--log-level |
INFO | Logging level (DEBUG, INFO, WARNING, ERROR) |
Basic generation:
python main.py \
--artist "Claude Monet" \
--subject "Water Lilies" \
--output-id monet-001With detailed planning (recommended):
python main.py \
--artist "Vincent van Gogh" \
--subject "Starry night over a quiet village" \
--expanded-subject "A swirling night sky filled with dynamic spiral patterns and bright stars dominates the upper canvas. Below, a peaceful village with a prominent church steeple nestles in rolling hills. The sky should feature Van Gogh's characteristic thick, energetic brushwork with deep blues transitioning to lighter tones near the horizon." \
--output-id vangogh-starry-001 \
--planner-model "mistral-large-latest"Custom iteration limits and target score:
python main.py \
--artist "Wassily Kandinsky" \
--subject "Abstract Composition" \
--output-id kandinsky-001 \
--max-iterations 50 \
--target-score 80Debug mode with verbose logging:
python main.py \
--artist "Frida Kahlo" \
--subject "Self Portrait with Flowers" \
--output-id frida-001 \
--log-level DEBUGGenerate with LMStudio (local):
python main.py \
--artist "Claude Monet" \
--subject "Water Lilies" \
--output-id monet-001 \
--provider lmstudioOverride API key at runtime:
python main.py \
--artist "Pablo Picasso" \
--subject "Abstract Portrait" \
--output-id picasso-001 \
--api-key sk-your-key-hereInstead of running individual commands, you can queue up multiple runs in src/datafiles/queue.json and execute them in sequence using run_queue.sh.
Edit src/datafiles/queue.json to define your runs. Each object maps directly to CLI arguments:
[
{
"artist": "Vincent van Gogh",
"subject": "Starry Night Landscape",
"output-id": "vangogh-001-claude-sonnet-4-6",
"isComplete": false,
"expanded-subject": "A swirling night sky over a quiet village...",
"provider": "anthropic",
"planner-model": "claude-sonnet-4-6",
"max-iterations": 500,
"target-score": 80,
"strokes-per-query": 5,
"stroke-types": "line,arc,polyline",
"log-level": "INFO"
}
]| Field | Required | Description |
|---|---|---|
artist |
Yes | Target artist name |
subject |
Yes | Short subject description |
output-id |
Yes | Unique identifier for this run |
isComplete |
Yes | Set to false; automatically updated to true on success |
expanded-subject |
No | Detailed subject description |
provider |
No | VLM provider (mistral, lmstudio, anthropic) |
planner-model |
No | Override planner model |
max-iterations |
No | Maximum iterations |
target-score |
No | Target style score (0-100) |
strokes-per-query |
No | Strokes per VLM query |
stroke-types |
No | Comma-separated stroke types |
api-key |
No | API key override |
gif-frame-duration |
No | GIF frame duration in ms |
log-level |
No | Logging level |
From the project root:
# Run all incomplete entries
./run_queue.sh
# Preview commands without executing
./run_queue.sh --dry-runThe script will:
- Skip any entries where
"isComplete": true - Execute each incomplete run in order
- Automatically set
"isComplete": truein the JSON file after each successful run - Halt immediately if any run fails
Requires
jq: install withbrew install jq
The application will:
- Planning Phase: Generate a structured multi-layer painting plan using a planner LLM
- Initialize a blank canvas (800×600 pixels)
- Query the VLM for stroke suggestions guided by the current layer's plan
- Apply suggested strokes to the canvas
- Evaluate the current canvas against the artist's style and layer objectives
- Advance to the next layer when the stroke-applying VLM indicates layer completion
- Update strategy based on evaluation feedback
- Repeat until target score is reached or max iterations exceeded
- Generate a final report with metrics and layer breakdown
- Create an animated GIF timelapse of the creation process
Resumable: If interrupted, re-running with the same --output-id will resume from the last completed iteration. The painting plan is saved and reused on resume.
All outputs are saved to src/output/{output-id}/:
src/output/vangogh-001/
├── painting_plan.json # Multi-layer painting plan (NEW in Phase 8)
├── timelapse.gif # Animated creation timelapse
├── final_artwork.png # Final completed artwork
├── final_artwork.jpeg # Final artwork (JPEG format)
├── metadata.json # Generation metadata
├── generation_report.md # Human-readable summary
├── viewer_data.json # Aggregated data for web viewer (auto-generated)
├── snapshots/ # Canvas images per iteration
│ ├── iteration-001.png
│ ├── iteration-002.png
│ └── ...
├── strokes/ # Stroke data per iteration (includes layer info)
│ ├── iteration-001.json
│ └── ...
├── evaluations/ # VLM evaluation results (includes layer completion)
│ ├── iteration-001.json
│ └── ...
└── strategies/ # Strategy updates
├── strategy-001.md
└── ...
Note: viewer_data.json is automatically generated after each successful generation. This aggregated file contains all iteration data needed by the web viewer.
The Next.js viewer provides an interactive web interface for exploring generated artworks:
pnpm -C src/viewer devView the gallery at: http://localhost:3000
- Gallery View: Browse all generated artworks with preview cards
- Inspector: Step through artwork creation stroke-by-stroke
- Timeline Playback: Animate the creation process with play/pause controls
- Metadata Display: View artist name, subject, scores, and generation statistics
- Evaluation Insights: See VLM feedback for each iteration (strengths, weaknesses, suggestions)
- Stroke Details: Examine individual stroke parameters, colors, and reasoning
Automatic Export: viewer_data.json is automatically generated after each successful artwork generation.
Manual Export: To re-export data for existing artworks:
conda activate paint-by-language-model
python -c "from src.paint_by_language_model.services.viewer_data_export import export_viewer_data; export_viewer_data('vangogh-001')"Copy to Viewer: Copy artwork directories to the viewer's public data folder:
# Copy (for production builds)
cp -r src/output/vangogh-001 src/viewer/public/data/vangogh-001Minify for Production: Reduce file sizes before deployment:
conda activate paint-by-language-model
python scripts/minify_viewer_data.pypnpm -C src/viewer build
pnpm -C src/viewer start # Serves optimized production buildOr export static HTML:
pnpm -C src/viewer build
# Static files are in src/viewer/out/================================================================================
GENERATION COMPLETE
================================================================================
Artwork ID: vangogh-001
Total Iterations: 45
Final Score: 78.5/100
Total Strokes: 45
Output Directory: src/output/vangogh-001
================================================================================
{
"artwork_id": "vangogh-001",
"artist_name": "Vincent van Gogh",
"subject": "Starry night over a quiet village",
"expanded_subject": "A swirling night sky filled with dynamic spiral patterns...",
"canvas_width": 800,
"canvas_height": 600,
"started_at": "2026-02-05T10:30:45.123456",
"vlm_model": "pixtral-large-latest",
"planner_model": "mistral-large-latest",
"provider": "mistral",
"painting_plan": {
"total_layers": 4,
"layers": [
{"layer_number": 1, "name": "Sky background", "..." : "..."}
]
},
"layer_progression": {
"1": 12,
"2": 15,
"3": 10,
"4": 8
}
}{
"type": "line",
"start_x": 40,
"start_y": 25,
"end_x": 75,
"end_y": 80,
"color_hex": "#0069AB",
"thickness": 8,
"opacity": 0.7,
"reasoning": "VLM's explanation of why this stroke was chosen..."
}{
"style_score": 45.5,
"strengths": ["Bold brushwork", "Good color palette"],
"weaknesses": ["Lacks texture variation"],
"suggestions": "Add more swirling patterns typical of Van Gogh",
"layer_complete": false,
"layer_number": 1
}"Millet style drawing of a person at a computer desk"
conda activate paint-by-language-model
cd src/paint_by_language_model
# Check for issues
ruff check .
# Auto-fix issues
ruff check --fix .
# Format code
ruff format .Configuration: Line length 100, includes pycodestyle, pyflakes, isort, pep8-naming, pyupgrade, bugbear checks.
conda activate paint-by-language-model
cd src/paint_by_language_model
# Check types
mypy .Configuration: Strict mode with required type annotations on all function definitions.
conda activate paint-by-language-model
cd src/paint_by_language_model
# Run all tests
pytest
# Run with coverage report
pytest --cov-report=term-missing
# Skip slow tests
pytest -m "not slow"
# Generate HTML coverage report
pytest --cov-report=htmlTest locations: tests/ directory at project root.
Edit src/paint_by_language_model/config.py to customize:
Canvas Settings:
CANVAS_WIDTH/CANVAS_HEIGHT- Canvas dimensions (default: 800×600)CANVAS_BACKGROUND_COLOR- Background color (default:#FFFFFF)
Stroke Sample Images:
STROKE_SAMPLE_WIDTH/STROKE_SAMPLE_HEIGHT- Sample canvas dimensions (default: 200×100)STROKE_SAMPLE_BACKGROUND- Sample background colour (default:#F5F5F5)STROKES_PER_SAMPLE- Number of example strokes per sample image (default: 5)STROKE_SAMPLE_DIR- Directory for persisted sample PNGs (default:src/datafiles/stroke_samples/)
Stroke Constraints:
MAX_STROKE_THICKNESS/MIN_STROKE_THICKNESS- Thickness range (default: 1-10 pixels)MAX_STROKE_OPACITY/MIN_STROKE_OPACITY- Opacity range (default: 0.1-1.0)SUPPORTED_STROKE_TYPES-["line", "arc", "polyline", "circle", "splatter"]
Provider Settings:
PROVIDER- VLM provider:mistral(default),lmstudio, oranthropicMISTRAL_API_KEY- API key for Mistral (loaded from.env)ANTHROPIC_API_KEY- API key for Anthropic (loaded from.env)VLM_MODEL- Stroke generation model (Mistral:pixtral-large-latest, LMStudio:lmistralai/ministral-3-3b)EVALUATION_VLM_MODEL- Style evaluation modelPLANNER_MODEL- Planning LLM model (Mistral:mistral-large-latest, LMStudio:local-model)VLM_TIMEOUT- Request timeout (default: 180 seconds)PLANNER_TIMEOUT- Planning request timeout (default: 180 seconds)STROKE_PROMPT_TEMPERATURE- Creativity setting (default: 0.7)EVALUATION_PROMPT_TEMPERATURE- Consistency setting (default: 0.3)PLANNER_PROMPT_TEMPERATURE- Planning creativity (default: 0.4)
Generation Loop:
MAX_ITERATIONS- Safety limit (default: 10000)MIN_ITERATIONS- Minimum before early stop (default: 20)
GIF Generation:
GIF_FRAME_DURATION_MS- Frame duration in milliseconds (default: 150)GIF_FINAL_FRAME_HOLD_MS- Final frame hold duration (default: 1500)GIF_MAX_DIMENSION- Max frame width/height for file size optimization (default: 400px)GIF_LOOP_COUNT- Animation loop count, 0 = infinite (default: 0)TARGET_STYLE_SCORE- Target score 0-100 (default: 75.0)
Strategy Management:
STRATEGY_CONTEXT_WINDOW- Recent strategies to include (default: 5)STROKE_PROMPT_INCLUDE_STRATEGY- Include strategy context (default: True)
-
Generation Orchestrator (generation_orchestrator.py)
- Main entry point for generation workflow
- Runs planning phase before generation begins
- Coordinates all components (canvas, VLMs, strategy)
- Handles layer-aware iteration loop and stopping conditions
- Tracks layer progression and advancement
- Supports resumable generation with plan persistence
- Auto-exports viewer data on completion
-
Canvas Manager (services/canvas_manager.py)
- Manages image canvas with PIL
- Applies strokes (lines, arcs, polylines, circles, splatters)
- Validates stroke parameters
- Saves snapshots
- Delegates rendering to
StrokeRendererFactory
-
Planner LLM Client (services/planner_llm_client.py)
- Generates structured multi-layer painting plans before generation
- Uses separate (potentially more capable) text-only model
- Produces layer-by-layer guidance with palettes, techniques, and objectives
- Validates and parses plan responses into structured data
- Plans are cached to disk for resume support
-
Stroke VLM Client (services/stroke_vlm_client.py)
- Queries VLMs with canvas image plus 5 stroke sample images (one per stroke type)
- Builds layer-aware prompts with artist context, plan, and strategy
- Prompt descriptions reference the attached visual sample for each stroke type
- References current layer's palette, techniques, and objectives
- Parses JSON responses robustly (handles malformed VLM output)
- Supports batch stroke generation
- Tracks interaction history for debugging
-
Stroke Sample Generator (services/stroke_sample_generator.py)
- Generates a 200×100 PNG sample image for each of the 5 stroke types
- Each sample contains 5 varied example strokes (differing thickness, opacity, colour, position)
- Uses the existing
CanvasManagerand renderer pipeline for pixel-accurate samples - Persists each PNG to
src/datafiles/stroke_samples/on first generation; subsequent runs load from disk - In-memory cache prevents redundant disk reads within a single run
- Initialised eagerly at
StrokeVLMClientstartup; inspect the saved PNGs to see exactly what the VLM was given
-
Evaluation VLM Client (services/evaluation_vlm_client.py)
- Evaluates canvas against target artist style and layer objectives
- Returns style scores (0-100) and layer completion status
- Provides strengths, weaknesses, and suggestions
- Triggers layer advancement when objectives are met
- Guides strategy updates
-
Strategy Manager (strategy_manager.py)
- Manages multi-iteration context
- Saves and loads strategy files
- Prepends current layer context to strategy guidance
- Provides recent strategy window for prompts
- Tracks strategic evolution over iterations and layers
-
VLM Client (vlm_client.py)
- Provider-aware client supporting Mistral, LMStudio, and Anthropic APIs
- Mistral and LMStudio use OpenAI-compatible format; Anthropic uses a non-OpenAI-compatible Messages API
- Supports single-image multimodal queries (
query_multimodal()) and multi-image queries (query_multimodal_multi_image()) - Multi-image method accepts a list of
(image_bytes, label)tuples; each image is preceded by its label text block - Bearer token authentication (Mistral),
x-api-keyheader (Anthropic), or no auth (LMStudio) - Rate-limit retry with exponential backoff (HTTP 429)
- Configurable temperature per request
-
Viewer Data Export (services/viewer_data_export.py)
- Aggregates iteration data into
viewer_data.json - Embeds base64-encoded snapshot images
- Auto-exports after successful generation
- Optimized format for web viewer performance
- Aggregates iteration data into
-
GIF Generator (services/gif_generator.py)
- Creates animated timelapses from iteration snapshots
- Resizes frames for manageable file sizes
- Configurable frame duration and looping
- Auto-generates
timelapse.gifon completion
-
Stroke Renderers (services/renderers/)
- Modular rendering system for different stroke types
- Implementations: line, arc, polyline, circle, splatter
- Factory pattern for extensibility
- Consistent parameter validation
-
Gallery (src/viewer/src/app/page.tsx)
- Homepage displaying all generated artworks
- Responsive grid of artwork cards
- Static generation at build time
-
Inspector (src/viewer/src/app/inspect/[artworkId]/page.tsx)
- Interactive artwork viewer with timeline
- Stroke-by-stroke playback controls
- Side panel with metadata, evaluation scores, and stroke details
- Canvas overlay rendering
-
Components:
StrokeCanvas: HTML5 canvas for rendering strokesTimeline: Interactive timeline with play/pause controlsSidePanel: Metadata, evaluations, and stroke informationToolbar: Playback controls and display optionsArtworkCard: Preview cards in gallery viewGallery: Responsive grid layout
- Generation: Python backend creates artwork → saves iteration files
- Export:
viewer_data.jsongenerated with aggregated data + base64 images - Deployment: Link/copy artwork folders to
src/viewer/public/data/ - Build: Next.js discovers artworks → generates static pages
- Runtime: Client-side React components render interactive UI
- PaintingPlan (models/painting_plan.py) - Multi-layer plan structure with layers, palettes, and techniques
- PlanLayer (models/painting_plan.py) - Individual layer specification within a plan
- Stroke (models/stroke.py) - Drawing operation parameters
- StrokeVLMResponse (models/stroke_vlm_response.py) - VLM response structure
- EvaluationResult (models/evaluation_result.py) - Style evaluation data with layer completion status
- CanvasState (models/canvas_state.py) - Canvas state snapshot