Turn prompt-heavy workflows into recoverable, validator-first harness projects with loop presets, task-family presets, and upgrade-friendly doctrine.
Why it matters · Before vs After · How it works · What you get · Project Preset Gallery · Examples · Quick Start · Decision Model · Contributing · Roadmap · Releasing
Most agent failures are not model failures. They are harness failures.
What actually breaks in practice:
- the execution contract is vague
- state lives only in chat memory
- a loop tries to do too much in one pass
- validation is weak or missing
- the scaffold is too generic for the real task
harness-engineer exists to fix that. It helps Codex design the harness before it improvises one.
The shift is the whole point of the project:
- Before: one giant prompt, hidden state, fuzzy boundaries, weak or missing validators
- After: explicit docs, file-based state, bounded loop passes, validator-first progression
1. Freeze the contract: clarify inputs, outputs, success criteria, the smallest verifiable unit, and what counts as failure before scaffolding anything.
2. Choose the shape: pick a loop preset for runtime behavior and a project preset for task-family structure.
3. Generate and verify: scaffold files, externalize state, run validators, and leave behind a harness that survives fresh-context restarts.
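As a rough illustration of step 1, the contract can be held in a small record that refuses to count as frozen until every part is filled in. This is a minimal sketch, not part of the skill: the `HarnessContract` class, its field names, and the `unfrozen_fields` helper are all hypothetical and simply mirror the items listed in step 1.

```python
from dataclasses import dataclass, field, fields


@dataclass
class HarnessContract:
    """Hypothetical contract record; fields mirror the items in step 1."""
    inputs: list[str] = field(default_factory=list)
    outputs: list[str] = field(default_factory=list)
    success_criteria: list[str] = field(default_factory=list)
    smallest_verifiable_unit: str = ""
    failure_conditions: list[str] = field(default_factory=list)


def unfrozen_fields(contract: HarnessContract) -> list[str]:
    """Return the names of contract fields that are still empty."""
    return [f.name for f in fields(contract) if not getattr(contract, f.name)]


# A draft that names inputs, outputs, and the unit of work,
# but has not yet said what success or failure looks like.
draft = HarnessContract(
    inputs=["scans/*.pdf"],
    outputs=["output/*.json"],
    smallest_verifiable_unit="one scanned page",
)
print(unfrozen_fields(draft))  # ['success_criteria', 'failure_conditions']
```

A scaffold step could decline to run while `unfrozen_fields` returns anything, which keeps the "before scaffolding anything" rule mechanical rather than aspirational.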
- Doctrine Layer: practical harness engineering guidance distilled from OpenAI, Anthropic, Ralph, OpenHarness, and hands-on local practice.
- Loop Presets: control how the harness runs, with baseline for general scaffolds and ralph-loop for resumable multi-pass execution.
- Project Presets: control the work shape, whether batch processing, repo coding, research collection, or UI validation.
- Scaffold Engine: a modular Python generator that emits docs, progress state, manifests, validators, and runner placeholders.
The visual language mirrors the skill itself:
- deep blue for structure and systems
- green for validated forward motion
- violet for loop orchestration and preset logic
- amber for controlled evolution and caution points
Example 1: Batch OCR and enrichment
Use:
Use $harness-engineer to scaffold a Ralph Loop project for OCR and post-processing on a folder of scanned documents.
Suggested shape:
--preset ralph-loop --project-preset batch-processing
What you get:
- bounded batch progression
- tasks.json for mutable unit state
- input/output/artifact directories
- archive-friendly structure for final outputs
Example 2: Long-running code remediation
Use:
Use $harness-engineer to design a recoverable harness for fixing one codebase issue per pass.
Suggested shape:
--preset ralph-loop --project-preset repo-coding
What you get:
- feature or task state
- codebase pattern memory
- scoped feature-plan docs
- runner + validator flow that supports incremental repair
Example 3: Research collection and synthesis
Use:
Use $harness-engineer to scaffold a research harness that gathers sources, stores evidence, and synthesizes findings over multiple passes.
Suggested shape:
--preset baseline or --preset ralph-loop, depending on loop needs, plus --project-preset research-collection
What you get:
- source manifest
- evidence and findings separation
- explicit research protocol
- structure that discourages mixing raw notes with validated output
Example 4: UI work with browser evidence
Use:
Use $harness-engineer to scaffold a harness for browser-visible feature work with screenshot-based validation.
Suggested shape:
--preset ralph-loop --project-preset ui-validation
What you get:
- screenshot, trace, and verdict directories
- UI validation reference doc
- stronger prompt guardrails around rendered-state evidence
- Current public state: main
- Stability: validated across loop presets, runner variants, and all current project presets
- Scope: one current skill, one historical snapshot, one modular scaffold engine
- Evolution model: doctrine first, scaffold second, trigger text last
Windows PowerShell

Copy-Item -LiteralPath .\skills\harness-engineer -Destination "$HOME\.codex\skills\harness-engineer" -Recurse -Force

macOS / Linux

mkdir -p ~/.codex/skills
cp -R ./skills/harness-engineer ~/.codex/skills/harness-engineer

Use $harness-engineer to clarify requirements and scaffold a robust harness project.
Typical prompts:
- Use $harness-engineer to design a harness for a batch document-processing pipeline.
- Use $harness-engineer to refactor this prompt-only workflow into a recoverable harness.
- Use $harness-engineer to scaffold a Ralph Loop project for a multi-pass remediation task.
The skill has two independent control surfaces.
This answers: How should the harness run?
| Loop preset | Use it when | Typical result |
|---|---|---|
| baseline | one scaffolded harness is enough and no explicit repeated loop policy is needed yet | simple runner, validator, docs, progress file |
| ralph-loop | work advances in repeated passes and must survive fresh-context restarts | PROMPT.md, tasks.json, batch plan, Ralph runner, loop contract |
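To make the ralph-loop row concrete, here is a minimal sketch of one bounded pass that keeps all state in tasks.json so a fresh process can pick up where the last one stopped. It is an illustration under assumptions, not the generated runner: the `run_pass` function and the `{"tasks": [{"id": ..., "status": ...}]}` layout are hypothetical, since the actual tasks.json schema is not described here.

```python
import json
from pathlib import Path
from typing import Callable


def run_pass(
    tasks_path: Path,
    batch_size: int,
    process: Callable[[dict], object],
    validate: Callable[[dict, object], bool],
) -> int:
    """Run one bounded pass; return how many tasks were attempted."""
    # All state comes from the file, never from an earlier pass's memory.
    state = json.loads(tasks_path.read_text(encoding="utf-8"))
    pending = [t for t in state["tasks"] if t["status"] == "pending"]
    batch = pending[:batch_size]  # bounded: never more than batch_size units

    for task in batch:
        result = process(task)
        # The validator, not the worker, decides whether the unit is done.
        task["status"] = "done" if validate(task, result) else "failed"

    # Persist after the pass so a fresh process can resume from the file.
    tasks_path.write_text(json.dumps(state, indent=2), encoding="utf-8")
    return len(batch)
```

Calling `run_pass` repeatedly until it returns 0 gives the multi-pass behavior; each call could just as well be a brand-new process or a fresh agent context, because nothing is carried between calls except the file.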
This answers: What shape should this work take?
| Project preset | Best for | Adds |
|---|---|---|
| generic | task-agnostic scaffolds | no extra overlays |
| batch-processing | OCR, conversion, enrichment, bulk transforms | input/, output/, artifacts/, batch manifest, batch contract |
| repo-coding | incremental codebase work | features.json, codebase patterns, current feature plan |
| research-collection | source gathering and evidence synthesis | sources/, notes/, findings/, evidence/, source manifest |
| ui-validation | browser-visible work | screenshots/, traces/, verdicts/, UI verdict template |
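Because the two surfaces are independent, choosing them can be treated as two separate decisions that are combined only at the end. The sketch below shows that idea; the `choose_flags` helper and its `needs_repeated_passes` parameter are hypothetical, while the preset names and the `--preset` / `--project-preset` flags are the ones listed in the tables above.

```python
# Preset names as listed in the two tables above.
LOOP_PRESETS = ("baseline", "ralph-loop")
PROJECT_PRESETS = (
    "generic",
    "batch-processing",
    "repo-coding",
    "research-collection",
    "ui-validation",
)


def choose_flags(needs_repeated_passes: bool, project_preset: str = "generic") -> list[str]:
    """Pick the loop preset and project preset independently, then combine them."""
    if project_preset not in PROJECT_PRESETS:
        raise ValueError(f"unknown project preset: {project_preset!r}")
    # How should the harness run?
    loop_preset = "ralph-loop" if needs_repeated_passes else "baseline"
    # What shape should the work take? (passed through unchanged)
    return ["--preset", loop_preset, "--project-preset", project_preset]


print(choose_flags(True, "repo-coding"))
# ['--preset', 'ralph-loop', '--project-preset', 'repo-coding']
```

Neither decision constrains the other: any project preset can be paired with either loop preset.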
The skill ships with a modular scaffold engine:
skills/harness-engineer/scripts/init_harness_project.py
python .\skills\harness-engineer\scripts\init_harness_project.py .\output --project-name "Example Harness"

python .\skills\harness-engineer\scripts\init_harness_project.py .\output --project-name "Example Ralph Batch" --preset ralph-loop --project-preset batch-processing --batch-size 5

Supported options:
- --preset baseline|ralph-loop
- --project-preset generic|batch-processing|repo-coding|research-collection|ui-validation
- --topology
- --runner
- --batch-size
- --with-features-file
- --with-failure-log
- --with-archives
Baseline output:
- AGENTS.md
- config.yaml
- progress.txt
- docs/
- scripts/
- validator placeholder
- summary placeholder

Ralph Loop output:
- baseline scaffold
- PROMPT.md
- tasks.json
- docs/exec-plans/current-batch-plan.md
- logs/failure-log.jsonl
- archives/
- Ralph-style runner placeholder

Project preset overlays:
- batch-processing: batch manifest, pipeline dirs, archive bias
- repo-coding: feature state, codebase patterns, current feature plan
- research-collection: source manifest, evidence dirs, findings docs
- ui-validation: verdict template, screenshot and trace dirs
harness-engineer-skill/
├── assets/                  # landing-page visuals and icon system
├── skills/
│   └── harness-engineer/
│       ├── SKILL.md
│       ├── agents/openai.yaml
│       ├── references/      # doctrine and decision rules
│       └── scripts/         # modular scaffold generator
├── snapshots/               # rollback and historical comparison
├── README.md
├── README.zh-CN.md
├── CONTRIBUTING.md
├── ROADMAP.md
├── RELEASING.md
└── versions.json
| Version | Path | Notes |
|---|---|---|
| Current | skills/harness-engineer/ | Active release with Ralph Loop and project presets |
| Snapshot | snapshots/harness-engineer-backup-20260408-161519/ | Backup from before the Ralph preset upgrade |
This repository is an original synthesis shaped by:
- OpenAI harness engineering ideas
- Anthropic articles on long-running harnesses
- snarktank/ralph
- HKUDS/OpenHarness
- distilled practitioner notes from real local use
It is not an official upstream release of any of those projects.
Better prompts help. Better harnesses survive.
The skill assumes:
- state should live in files, not chat memory
- validators matter more than optimistic self-reporting
- topology should stay as small as possible
- scaffolding should stay replaceable as models improve
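The second assumption, that validators outrank self-reporting, can be sketched as a check that looks only at what is on disk. This is a minimal example under assumptions, not the generated validator: the `validate_outputs` function and the one-JSON-file-per-task layout are hypothetical.

```python
import json
from pathlib import Path


def validate_outputs(output_dir: Path, expected_ids: list[str]) -> list[str]:
    """Return a list of problems; an empty list means the pass is valid."""
    problems = []
    for task_id in expected_ids:
        path = output_dir / f"{task_id}.json"
        # Evidence must exist on disk, regardless of what the agent reported.
        if not path.is_file():
            problems.append(f"{task_id}: missing output file")
            continue
        try:
            json.loads(path.read_text(encoding="utf-8"))
        except json.JSONDecodeError:
            problems.append(f"{task_id}: output is not valid JSON")
    return problems
```

A runner could treat a non-empty result as a failed pass and record it in the state file, so progress is only ever advanced by evidence the validator has accepted.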
The current skill has been validated with:
- quick_validate.py against the skill itself
- Python compile checks for every scaffold module
- smoke tests for:
  - baseline scaffold generation
  - Ralph Loop scaffold generation
  - generated validator execution
  - generated Python, PowerShell, and Bash runners
  - all current project preset overlays
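For readers who want to reproduce something like the compile check on their own scaffold modules, one possible approach is to compile every .py file under a directory without executing it. This is a sketch of the general technique using Python's built-in `compile`, not the repository's own check; the `compile_check` function name is hypothetical.

```python
from pathlib import Path


def compile_check(root: Path) -> dict[str, str]:
    """Map each .py file under root that fails to compile to its error message."""
    failures = {}
    for path in sorted(root.rglob("*.py")):
        try:
            # Compiles the source to bytecode in memory; nothing is executed
            # and no .pyc file is written.
            compile(path.read_text(encoding="utf-8"), str(path), "exec")
        except SyntaxError as exc:
            failures[str(path)] = exc.msg
    return failures
```

An empty dictionary means every module at least parses; it says nothing about runtime behavior, which is why the smoke tests above are listed separately.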
- Human project owner and curator: repository maintainer
- AI implementation and packaging support: OpenAI Codex
This repository uses explicit README attribution for Codex. If you also want Codex-like attribution inside commit metadata, use a dedicated co-author trailer or bot/account identity in future commits.
- Contribution guide: CONTRIBUTING.md
- Roadmap: ROADMAP.md
- Release process: RELEASING.md
MIT. See LICENSE.