Visual capture
Metadata
Author: Claude Opus 4.5 in Claude Ottoman Turkish Project, prompted and edited by Colin Greenstreet | Wiki entry created: Wednesday, February 4th 2026 | Wiki entry modified: Sunday, February 8th 2026
Version: v1.1
Version history:
- v1.0 (4 February 2026): Initial draft.
- v1.1 (8 February 2026): Minor formatting edits; added V3-S-Siyakat details; added collapsing metadata feature.
Validation: This wiki entry requires validation by Ottoman Turkish scholars
The visual capture skill files described on this page operate at Stage 1 of the two-stage pipeline. They run on Google Gemini 3 Pro Preview at settings tuned for extreme visual grounding (Temperature 0.2, Media Resolution High, Thinking Low, Top-P 0.15). Their purpose is pure script-to-Unicode conversion with zero semantic interpretation.
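As a minimal sketch, the Stage 1 settings listed above could be kept in one place so that every visual capture run uses the same values. The class and field names here are hypothetical, not taken from the project or from any Gemini SDK; only the values come from this entry.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class CaptureSettings:
    """Stage 1 settings as stated in this entry (field names are hypothetical)."""
    model: str = "Gemini 3 Pro Preview"
    temperature: float = 0.2
    top_p: float = 0.15
    media_resolution: str = "high"
    thinking: str = "low"


# One frozen instance shared by all Stage 1 runs, so values cannot drift.
STAGE1_SETTINGS = CaptureSettings()


def describe(settings: CaptureSettings) -> str:
    """Render the settings as a single log-friendly line."""
    return ", ".join(f"{key}={value}" for key, value in asdict(settings).items())


print(describe(STAGE1_SETTINGS))
```

Freezing the dataclass is a design choice for this sketch: it makes accidental per-experiment changes raise an error instead of silently altering a run.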
File: V3-S-Minimal
Name: V3-S-Minimal
Date created: [ADD DATA]
Created by: Colin Greenstreet (with Claude Opus 4.5)
Tested: Multiple scripts, genres and document types
What the skill file does:
- Why developed: Need for a visual capture file for printed and handwritten Ottoman scripts.
- Scope: Single-column/single-page Perso-Arabic script capture.
- Design features: [ADD DATA]
Relationship to other skill files:
- Parent: No parent
- Supersedes: First in new line of visual capture files
- Downstream pair: All V3-S-Newspaper versions; V3-S-Siyakat
- Key lesson embodied: N/A
File: V3_S_Newspaper_Skill_File_with_User_Notes.md
Name: V3-S-Newspaper v1.0
Date created: December 30, 2025
Created by: Colin Greenstreet (with Claude Opus 4.5)
Tested: Experiment 153: Peyam (c. 1919), a newspaper of the Second Constitutional Period. Source: Internet Archive. Result: processing took 5.6s, but the columns were output in reversed order.
What the skill file does:
Extends V3-S-Minimal for multi-column Ottoman newspaper pages. Introduces zone-based processing: a masthead zone (horizontal bands) followed by a column zone (right-to-left, top-to-bottom). Handles non-text elements (photographs, illustrations, advertisements, tables) via flag-and-skip. Outputs nested JSON with per-zone line arrays and visual anomaly flags. Includes a comprehensive 260-line "User Notes" section documenting design decisions, alternatives considered, and known limitations — not passed to Gemini.
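The nested output described above might look roughly like the following. This is an illustrative sketch only: the key names (`masthead_zone`, `column_zone`, `visual_anomalies`, `band`, `column`, `lines`) and the anomaly record are assumptions made for this example, not the schema defined in the skill file.

```python
import json


def build_page_output(masthead_bands, columns, anomalies):
    """Assemble a nested, per-zone structure (all key names are hypothetical).

    masthead_bands: list of line lists, one per horizontal band, top to bottom.
    columns: list of line lists, one per column, already in reading order.
    anomalies: list of dicts flagging skipped non-text elements.
    """
    return {
        "masthead_zone": [
            {"band": index, "lines": lines}
            for index, lines in enumerate(masthead_bands, start=1)
        ],
        "column_zone": [
            {"column": index, "lines": lines}
            for index, lines in enumerate(columns, start=1)
        ],
        "visual_anomalies": anomalies,
    }


page = build_page_output(
    masthead_bands=[["masthead line 1"]],
    columns=[["col 1 line 1", "col 1 line 2"], ["col 2 line 1"]],
    anomalies=[{"zone": "column_zone", "column": 2, "flag": "photograph skipped"}],
)
print(json.dumps(page, ensure_ascii=False, indent=2))
```

Passing `ensure_ascii=False` keeps Perso-Arabic characters readable in the serialized output instead of escaping them.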
Relationship to other skill files:
- Parent: V3-S-Minimal
- Downstream pair: V3-T-Newspaper (planned at time of writing; later created)
- Superseded by: v1.1, then v1.2
File: V3_S_Newspaper_Skill_File_v1_1_with_User_Notes.md
Name: V3-S-Newspaper v1.1
Date created: December 30, 2025
Created by: Colin Greenstreet (with Claude Opus 4.5)
Tested: Experiment 154: same Peyam page. Failed catastrophically. Processing time rose from 5.6s to 236s, roughly 42 times slower. Gemini's thought trace showed extensive semantic reasoning about article boundaries, date discrepancies, and content flow, which is precisely what V3-S-Minimal forbids. Column order was still reversed.
What the skill file does:
Attempted to improve v1.0 by adding: operator pre-specification of visual elements ("Visual Element Inventory"), a verification checkpoint for column count, position labels ("rightmost," "center-left"), and enhanced anomaly flagging. All additions proved counterproductive.
Relationship to other skill files:
- Parent: V3-S-Newspaper v1.0
- Status: Deprecated. A critical negative result — the empirical demonstration of the Skill File Paradox. The language "LAYOUT SPECIFICATION (operator-verified)," "inventory," and "VERIFICATION CHECKPOINT" triggered catastrophic semantic processing.
- Replaced by: V3-S-Newspaper v1.2
File: V3_S_Newspaper_Skill_File_v1_2_with_User_Notes.md
Name: V3-S-Newspaper v1.2
Date created: December 30, 2025
Created by: Colin Greenstreet (with Claude Opus 4.5)
Tested: Designed post-Experiment 154 to restore V3-S-Minimal's processing efficiency. Target: <10 seconds processing, zero content words in thought trace.
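A small sketch of how the timing target could be checked against the figures reported on this page. The function and variable names are hypothetical; the only data used are the two processing times from Experiments 153 and 154 and the 10-second target.

```python
TARGET_SECONDS = 10.0  # v1.2 design target: under 10 seconds per page

# Processing times reported on this page for the same Peyam page.
reported_seconds = {
    "v1.0 (Experiment 153)": 5.6,
    "v1.1 (Experiment 154)": 236.0,
}


def meets_time_target(seconds: float, target: float = TARGET_SECONDS) -> bool:
    """Return True when a run finishes strictly under the target."""
    return seconds < target


# Ratio of the v1.1 time to the v1.0 time.
slowdown = reported_seconds["v1.1 (Experiment 154)"] / reported_seconds["v1.0 (Experiment 153)"]

for name, seconds in reported_seconds.items():
    print(f"{name}: {seconds}s, within target: {meets_time_target(seconds)}")
print(f"v1.1 slowdown relative to v1.0: about {slowdown:.0f}x")
```

The second target, zero content words in the thought trace, is not covered here because it depends on how "content word" is defined for a given trace.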
What the skill file does:
Radical simplification responding to v1.1's failure. Returns to V3-S-Minimal's "camera" philosophy with one key innovation: edge-anchored column instructions using the "typewriter model" (start at right edge → scan down → step left → repeat). Bans all relational/semantic language ("rightmost" → "at right edge"; removes "verify," "inventory," "compare," "determine," "article," "headline"). Simplifies non-text elements to generic [IMAGE], [RULE], [END]. Removes position labels, verification checkpoints, and the visual element inventory. The Gemini-facing skill file is roughly half the length of v1.1's.
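The reading order that the "typewriter model" is meant to produce can be illustrated in code. In the skill file this is a prose instruction to Gemini, not a program; the sketch below only shows the intended ordering, assuming each column is known by the pixel position of its right edge and each line by its top position (both assumptions made for this example).

```python
def typewriter_order(columns):
    """Return line texts in edge-anchored reading order.

    columns: list of (x_right, lines) pairs, where lines is a list of
    (y_top, text) pairs. Both coordinates are hypothetical pixel positions.
    """
    ordered = []
    # Start at the right edge of the page, then step left one column at a time.
    for _x_right, lines in sorted(columns, key=lambda col: col[0], reverse=True):
        # Within a column, scan down from the top.
        for _y_top, text in sorted(lines, key=lambda line: line[0]):
            ordered.append(text)
    return ordered


page_columns = [
    (300, [(50, "left 2"), (10, "left 1")]),
    (900, [(10, "right 1"), (50, "right 2")]),
    (600, [(10, "centre 1")]),
]
print(typewriter_order(page_columns))
```

Note that nothing in the ordering refers to one column relative to another by name ("rightmost", "center-left"); each step is anchored only to an edge position, which mirrors the change v1.2 makes to the instruction wording.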
Relationship to other skill files:
- Parent: V3-S-Newspaper v1.0 (conceptually returns to v1.0's simplicity with edge-anchoring fix)
- Supersedes: v1.0 and v1.1
- Downstream pair: V3-T-Newspaper
- Key lesson embodied: The Skill File Paradox — complex instructions degrade visual capture by triggering semantic processing.
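The vocabulary ban introduced in v1.2 lends itself to a mechanical check before a skill file is sent to Gemini. The following is a hypothetical lint sketch, not part of the project: the term list is copied from the words named in the v1.2 description, and the function name is an assumption.

```python
import re

# Terms named as banned or replaced in the v1.2 description on this page.
BANNED_TERMS = (
    "rightmost", "verify", "inventory", "compare",
    "determine", "article", "headline",
)


def find_banned_terms(skill_text: str):
    """Return (line_number, term) pairs for each banned term found.

    Matching is case-insensitive and whole-word only, so inflected forms
    such as "verification" would need their own entries in BANNED_TERMS.
    """
    hits = []
    for line_number, line in enumerate(skill_text.splitlines(), start=1):
        for term in BANNED_TERMS:
            if re.search(rf"\b{re.escape(term)}\b", line, flags=re.IGNORECASE):
                hits.append((line_number, term))
    return hits


draft = (
    "Start at right edge.\n"
    "Verify the column count.\n"
    "List the inventory of images."
)
for line_number, term in find_banned_terms(draft):
    print(f"line {line_number}: banned term '{term}'")
```

A check like this only catches the listed words; it cannot tell whether other phrasing would also trigger semantic processing, which still has to be tested empirically.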
File: XXX
Name: [ADD DATA]
Date created: [ADD DATA]
Created by: Colin Greenstreet (with Claude Opus 4.5)
Tested: Untested
What the skill file does:
- Why developed: Uniquely challenging visual characteristics of Siyakat script used for specialized Ottoman financial documents.
- Design features: [ADD DATA]
Relationship to other skill files:
- Parent: [ADD DATA]
- Supersedes: First in new line of visual capture files
- Downstream pair: N/A
- Key lesson embodied: N/A
File: XXX
Name: [ADD DATA]
Date created: [ADD DATA]
Created by: Colin Greenstreet (with Claude Opus 4.5)
Tested: Untested
What the skill file does:
- Why developed: Visual capture variant for Arabic naskh script. Developed for a one-off exploration of a complex image sourced from a LinkedIn post by the British Library.
- Design features: [ADD DATA]
Relationship to other skill files:
- Parent: [ADD DATA]
- Supersedes: First in new line of visual capture files
- Downstream pair: N/A
- Key lesson embodied: N/A
Last updated: 8 February 2026 · v1.1
ottoman-archive wiki