
Visual capture

Colin Greenstreet edited this page Feb 8, 2026 · 5 revisions
Metadata

Author: Claude Opus 4.5 in Claude Ottoman Turkish Project, prompted and edited by Colin Greenstreet | Wiki entry created: Wednesday, February 4th 2026 | Wiki entry modified: Sunday, February 8th 2026

Version: v1.1

Version history:

  • v1.0 (4 February 2026): Initial draft.
  • v1.1 (8 February 2026): Minor formatting edits; added V3-S-Siyakat details; added collapsing metadata feature.

Validation: This wiki entry requires validation by Ottoman Turkish scholars.


Stage one: Visual capture skill files (V3-S family)

These skill files operate at Stage 1 of the two-stage pipeline. They run on Google Gemini 3 Pro Preview with settings chosen for strict visual grounding (Temperature 0.2, Media Resolution High, Thinking Low, Top-P 0.15). Their purpose is pure script-to-Unicode conversion with zero semantic interpretation.
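
As a minimal sketch, the four settings above could be kept in one immutable record so that every Stage 1 run uses the same values. The class and field names here are illustrative assumptions, not the parameter names of any Gemini client library; only the values come from this entry.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class CaptureSettings:
    """Stage 1 settings as listed in this entry (field names are hypothetical)."""
    model: str = "Google Gemini 3 Pro Preview"
    temperature: float = 0.2
    media_resolution: str = "High"
    thinking: str = "Low"
    top_p: float = 0.15


# One shared instance; frozen=True prevents accidental changes between experiments.
STAGE1_SETTINGS = CaptureSettings()
print(asdict(STAGE1_SETTINGS))
```

Freezing the record is a design choice for this sketch: it makes it harder for a setting to drift between experiments without being noticed.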


V3-S-Minimal

File: V3-S-Minimal
Name: V3-S-Minimal
Date created: [ADD DATA]
Created by: Colin Greenstreet (with Claude Opus 4.5)
Tested: Multiple scripts, genres and document types

What the skill file does:

  • Why developed: Need for a visual capture file covering printed and handwritten Ottoman scripts.
  • Scope: Single-column/single-page Perso-Arabic script capture.
  • Design features: [ADD DATA]

Relationship to other skill files:

  • Parent: No parent
  • Supersedes: First in new line of visual capture files
  • Downstream pair: All V3-S-Newspaper versions; V3-S-Siyakat
  • Key lesson embodied: N/A

V3-S-Newspaper

File: V3_S_Newspaper_Skill_File_with_User_Notes.md
Name: V3-S-Newspaper v1.0
Date created: December 30, 2025
Created by: Colin Greenstreet (with Claude Opus 4.5)
Tested: Experiment 153, on a page of the newspaper Peyam (c. 1919, Second Constitutional Period era). Source: Internet Archive. Result: 5.6s processing time, but the column order was reversed.

What the skill file does:
Extends V3-S-Minimal for multi-column Ottoman newspaper pages. Introduces zone-based processing: a masthead zone (horizontal bands) followed by a column zone (right-to-left, top-to-bottom). Handles non-text elements (photographs, illustrations, advertisements, tables) via flag-and-skip. Outputs nested JSON with per-zone line arrays and visual anomaly flags. Includes a comprehensive 260-line "User Notes" section documenting design decisions, alternatives considered, and known limitations — not passed to Gemini.
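
The shape of the nested output described above might look something like the following sketch. All key names (`masthead_zone`, `column_zone`, `skipped_elements`, `anomaly_flags`) and the helper functions are assumptions made for illustration; the actual schema is defined in the skill file and is not reproduced in this entry.

```python
import json


def new_page_capture(column_count):
    """Return an empty capture record: a masthead zone, then a column zone."""
    return {
        "masthead_zone": {"bands": []},  # horizontal bands, top to bottom
        "column_zone": {
            # position 1 is the first column read (the one at the right edge)
            "columns": [{"position": i, "lines": []} for i in range(1, column_count + 1)]
        },
        "skipped_elements": [],  # non-text elements handled by flag-and-skip
        "anomaly_flags": [],     # visual anomalies noted during capture
    }


def flag_and_skip(capture, kind, zone):
    """Record a non-text element without attempting to transcribe it."""
    capture["skipped_elements"].append({"kind": kind, "zone": zone})


page = new_page_capture(column_count=3)
page["masthead_zone"]["bands"].append({"lines": ["<masthead line 1>"]})
page["column_zone"]["columns"][0]["lines"].append("<first line of first column>")
flag_and_skip(page, kind="photograph", zone="column_zone")
print(json.dumps(page, ensure_ascii=False, indent=2))
```

`ensure_ascii=False` keeps Perso-Arabic characters readable in the serialized output rather than escaping them.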

Relationship to other skill files:

  • Parent: V3-S-Minimal
  • Downstream pair: V3-T-Newspaper (planned at time of writing; later created)
  • Superseded by: v1.1, then v1.2

V3-S-Newspaper v1.1

File: V3_S_Newspaper_Skill_File_v1_1_with_User_Notes.md
Name: V3-S-Newspaper v1.1
Date created: December 30, 2025
Created by: Colin Greenstreet (with Claude Opus 4.5)
Tested: Experiment 154, on the same Peyam page. Failed catastrophically: processing time exploded from 5.6s to 236s, roughly 42 times slower. Gemini's thought trace showed massive semantic reasoning about article boundaries, date discrepancies, and content flow, which is precisely what V3-S-Minimal forbids. The column order was still reversed.

What the skill file does:
Attempted to improve v1.0 by adding: operator pre-specification of visual elements ("Visual Element Inventory"), a verification checkpoint for column count, position labels ("rightmost," "center-left"), and enhanced anomaly flagging. All additions proved counterproductive.

Relationship to other skill files:

  • Parent: V3-S-Newspaper v1.0
  • Status: Deprecated. A critical negative result — the empirical demonstration of the Skill File Paradox. The language "LAYOUT SPECIFICATION (operator-verified)," "inventory," and "VERIFICATION CHECKPOINT" triggered catastrophic semantic processing.
  • Replaced by: V3-S-Newspaper v1.2

V3-S-Newspaper v1.2

File: V3_S_Newspaper_Skill_File_v1_2_with_User_Notes.md
Name: V3-S-Newspaper v1.2
Date created: December 30, 2025
Created by: Colin Greenstreet (with Claude Opus 4.5)
Tested: Designed post-Experiment 154 to restore V3-S-Minimal's processing efficiency. Target: <10 seconds processing, zero content words in thought trace.
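
The two targets above can be expressed as a simple check, sketched here under stated assumptions. The function name and the idea of a caller-supplied list of "content words" are hypothetical; this entry does not say how content words were identified in the thought trace.

```python
import re


def meets_v1_2_targets(processing_seconds, thought_trace, content_words):
    """Check the two v1.2 targets: under 10 seconds, no content words in the trace.

    Returns (passed, leaked_words). `content_words` is supplied by the operator
    (an assumption of this sketch).
    """
    trace_words = set(re.findall(r"\w+", thought_trace.lower()))
    leaked = sorted(trace_words & {w.lower() for w in content_words})
    return processing_seconds < 10.0 and not leaked, leaked


# Illustrative inputs only; the traces below are invented for the example.
print(meets_v1_2_targets(5.6, "scan right edge down", ["article", "date"]))
print(meets_v1_2_targets(236.0, "The article date differs", ["article", "date"]))
```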

What the skill file does:
Radical simplification responding to v1.1's failure. Returns to V3-S-Minimal's "camera" philosophy with one key innovation: edge-anchored column instructions using the "typewriter model" (start at right edge → scan down → step left → repeat). Bans all relational/semantic language ("rightmost" → "at right edge"; removes "verify," "inventory," "compare," "determine," "article," "headline"). Simplifies non-text elements to generic [IMAGE], [RULE], [END]. Removes position labels, verification checkpoints, and the visual element inventory. The Gemini-facing skill file is roughly half the length of v1.1's.
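
The "typewriter model" is an instruction given to Gemini in prose, not code, but the reading order it describes can be illustrated with a short sketch. The `Line` type, its coordinate fields, and the fixed `column_width` parameter are all assumptions introduced for this example.

```python
from typing import NamedTuple


class Line(NamedTuple):
    text: str
    x_right: float  # right edge of the line, measured from the page's left edge
    y_top: float    # top of the line, measured from the top of the page


def typewriter_order(lines, column_width):
    """Order lines: start at right edge, scan down, step left, repeat."""
    if not lines:
        return []
    page_right = max(line.x_right for line in lines)

    def key(line):
        # Column 0 is anchored at the right edge; each step left adds 1.
        column = int((page_right - line.x_right) // column_width)
        return (column, line.y_top)

    return sorted(lines, key=key)


lines = [
    Line("C", x_right=490, y_top=10),
    Line("B", x_right=1000, y_top=50),
    Line("D", x_right=495, y_top=50),
    Line("A", x_right=1000, y_top=10),
]
print([line.text for line in typewriter_order(lines, column_width=500)])
```

Note that the ordering depends only on distance from the right edge and vertical position, never on labels such as "rightmost" or on what the text says, which mirrors the edge-anchored wording of v1.2.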

Relationship to other skill files:

  • Parent: V3-S-Newspaper v1.0 (conceptually returns to v1.0's simplicity with edge-anchoring fix)
  • Supersedes: v1.0 and v1.1
  • Downstream pair: V3-T-Newspaper
  • Key lesson embodied: The Skill File Paradox — complex instructions degrade visual capture by triggering semantic processing.
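
One way to guard against the banned vocabulary creeping back in is to lint the Gemini-facing text before use. The sketch below is an assumption about how such a check could work, not part of the project's tooling; the term list is taken from the words v1.2 is described as removing or replacing.

```python
import re

# Terms that v1.2 removes or replaces, per the description above.
BANNED_TERMS = ("rightmost", "verify", "inventory", "compare",
                "determine", "article", "headline")


def find_banned_terms(skill_text):
    """Return the banned terms that appear in the text (case-insensitive)."""
    hits = []
    for term in BANNED_TERMS:
        # \w* also catches simple suffixed forms such as "articles".
        if re.search(r"\b" + re.escape(term) + r"\w*", skill_text, flags=re.IGNORECASE):
            hits.append(term)
    return hits


print(find_banned_terms("Verify the inventory of each article."))
print(find_banned_terms("Start at right edge. Scan down. Step left."))
```

This prefix match will not catch derived forms with a changed stem, such as "verification"; a fuller check would need a stem list.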

V3-S-Siyakat

File: XXX
Name: [ADD DATA]
Date created: [ADD DATA]
Created by: Colin Greenstreet (with Claude Opus 4.5)
Tested: Untested

What the skill file does:

  • Why developed: To handle the uniquely challenging visual characteristics of Siyakat script, which was used for specialized Ottoman financial documents.
  • Design features: [ADD DATA]

Relationship to other skill files:

  • Parent: [ADD DATA]
  • Supersedes: First in new line of visual capture files
  • Downstream pair: N/A
  • Key lesson embodied: N/A

V3-S-Naskh

File: XXX
Name: [ADD DATA]
Date created: [ADD DATA]
Created by: Colin Greenstreet (with Claude Opus 4.5)
Tested: Untested

What the skill file does:

  • Why developed: Visual capture variant for Arabic naskh script, developed for a one-off exploration of a complex image sourced from a LinkedIn post by the British Library.
  • Design features: [ADD DATA]

Relationship to other skill files:

  • Parent: [ADD DATA]
  • Supersedes: First in new line of visual capture files
  • Downstream pair: N/A
  • Key lesson embodied: N/A

Last updated: 8 February 2026 · v1.1
