axiom /ˈak.si.əm/ — from Greek axioma ("that which is thought worthy"), from axios ("worthy"). A statement accepted as true without proof, serving as a starting point for further reasoning.
In mathematics, axioms are the foundational truths from which all theorems are derived. You don't prove an axiom — you build on it. Everything that follows must be consistent with it, or the system collapses.
That's what this pipeline does. You state the premise — the feature you want. Axiom treats it as a foundational constraint and derives everything from it: the design, the contracts, the implementation, the tests, the review. Every stage must be consistent with the premise, or it doesn't pass the gate. No hand-waving. No "it looks about right." If the proof contradicts the axiom, the proof is wrong.
The most rigorous autonomous development pipeline for Claude Code. Contract-enforced, test-first, self-correcting.
One command. Eight stages. A pull request built from constraints — not vibes.
Opus derives the plan, challenges its own design choices, and reviews the result through three independent specialist agents. Sonnet implements in parallel against machine-enforceable contracts. Tests are generated before implementation. Failures are classified and corrected with surgical precision. The pipeline learns from every run.
```shell
/axiom "add stripe webhook handler"
```

Most AI coding tools generate code and hope it works. Axiom enforces correctness at every boundary.
| Capability | Axiom | Cursor / Copilot / Devin / Others |
|---|---|---|
| Contract enforcement | Per-unit contracts with exact signatures, return shapes, side effects, blast radius restrictions. Machine-enforced, not advisory. | No formal contracts. Plans are natural language guidance. |
| Tests before implementation | ATDD: executable acceptance tests generated from EARS-formatted requirements before agents write code. Agents implement against failing tests. | Tests generated after implementation (biased toward confirming what was built). |
| Failure classification | Failures typed as SYNTAX/LOGIC/ARCH/AMBIGUOUS with per-type correction strategies and budgets. SYNTAX gets cheap self-correction; ARCH triggers contract rewriting. | All failures treated the same: retry with error context. |
| Independent review | Three parallel specialist agents (Security, Code Quality, Acceptance) that cannot see the design intent — prevents anchoring bias. | Single reviewer with full context, or no independent review. |
| Structured deliberation | RALPLAN-DR: mandatory antithesis (steelman the alternative), tradeoff tensions, Architecture Decision Records. Architect and Critic agents challenge every design choice. | Plan-then-execute without structured challenge. |
| Cross-run learning | Project memory records codebase patterns, successful corrections, and reviewer feedback. Run N+1 is smarter than run N. | Sessions are ephemeral, or cloud-persisted without feeding back into future runs. |
| Stage rollback | Git tags at every stage boundary. --rollback-to-stage 2 to try a different decomposition without restarting. | Start from scratch on failure, or hope the retry works. |
| Requirements notation | EARS (Easy Approach to Requirements Syntax) — structured patterns that force completeness and are directly testable. | Free-text requirements prone to ambiguity. |
| Reusable playbooks | .axiom/playbooks/ encode domain knowledge (procedures, constraints, forbidden actions) for repeatable task types. | Start from scratch each time, or rely on prompt engineering. |
| Worktree isolation | Per-feature worktree pools with per-unit sub-worktrees. Atomic slot claiming. Concurrent features without conflicts. | Most tools share a single working directory. |
Prerequisites: Claude Code installed, git, and gh CLI authenticated.
```shell
# Install the skill
mkdir -p ~/.claude/skills/axiom
cp SKILL.md ~/.claude/skills/axiom/SKILL.md

# Or clone and symlink (edits stay in sync)
git clone https://github.com/Guipetris/axiom.git
ln -s "$(pwd)/axiom/SKILL.md" ~/.claude/skills/axiom/SKILL.md
```

Then in any project:

```shell
/axiom "add user authentication with JWT"
```

What happens next:
- Brainstorming explores your idea — clarifying questions, approaches, design spec with automated review
- Opus scores the spec using EARS notation, asks targeted refinement questions, extracts structured design
- Haiku inspects the repo state (clean tree, branch protection, conflicts, worktree claim)
- Opus decomposes into units, authors per-unit contracts, runs RALPLAN-DR deliberation (Architect challenges, Critic validates, ADR recorded)
- Acceptance tests generated from EARS criteria — tests exist before a single line of implementation
- Parallel agents implement each unit against its contract + failing tests, with per-unit model routing (Haiku/Sonnet/Opus by complexity)
- Iterative validation — up to 5 fix-recheck cycles before escalating to the classified correction system
- Three independent reviewers (Security, Code Quality, Acceptance) evaluate without seeing the design conversation
- Haiku commits and opens a PR with the Architecture Decision Record in the body
You interact during the design phase. Everything after that is autonomous.
Axiom is a single Claude Code skill (SKILL.md) that orchestrates an end-to-end feature pipeline. It replaces the manual workflow of: brainstorm → plan → implement → test → review → commit — with a gated, self-correcting autonomous process governed by constraints and contracts.
| Stage | What happens | Model |
|---|---|---|
| Pre-stage | Establish the Premise. Explore context, clarify requirements, propose approaches, write & review a design spec | Opus |
| Stage 0 | Refine the Premise. Score clarity, EARS-format requirements, extract structured design with project memory context | Opus |
| Stage 1 | Inspect the System. Check git state, branch protection, detect conflicts, claim worktree. Optional MCP context engine for large codebases | Haiku |
| Stage 2 | Derive the Axioms. RALPLAN-DR deliberation (Architect + Critic challenge design), decompose into units, author per-unit contracts with path-scoped instructions, generate PRD, record ADR | Opus |
| Stage 2.5 | Generate the Tests. ATDD: create executable acceptance tests from EARS criteria. Tests must fail (RED). Agents implement against them | Sonnet |
| Stage 3 | Execute the Inference. Parallel agents with per-unit model routing, each implementing against one contract + failing tests in an isolated worktree. Inter-agent signal protocol for ambiguities | Haiku/Sonnet/Opus |
| Stage 4 | Verify the Proof. Iterative validation cycling (max 5 fix-recheck rounds), update PRD per-story status, optional browser-based verification via Playwright | Haiku |
| Stage 5 | Test for Contradictions. Three parallel independent specialist reviewers (Security, Code Quality, Acceptance) who cannot see the design intent | Opus x 3 |
| Stage 6 | Commit the Result. PRD completion gate, conventional commit, PR with ADR in body, stage snapshot cleanup | Haiku |
Each stage must pass its gate before the next begins. Failures enter a classified correction chain with per-type budgets and cross-run learning.
```shell
# Describe what you want — full brainstorming + pipeline
/axiom "add stripe webhook handler"

# From a GitHub issue — ticket context seeds brainstorming
/axiom --ticket GH-142

# From an existing design spec — validates and refines before pipeline
/axiom --spec docs/specs/2026-03-15-webhooks-design.md

# With a reusable playbook — domain knowledge seeds the design
/axiom --playbook api-endpoint "add user preferences endpoint"

# Plan only — design + contracts, no implementation
/axiom --plan-only "add billing integration"

# Review only — validate an existing branch against its contracts
/axiom --review-only --branch feature/billing

# Roll back to a stage and re-run with different approach
/axiom --rollback-to-stage 2

# Resume an interrupted run
/axiom --resume ax-a3f2c1d9
```

Use Axiom when:

- You want a complete feature implemented end-to-end from a brief description
- The feature touches multiple files and needs coordinated design
- You want Opus reviewing every architectural decision, not just the final PR
- You want contract-enforced implementation — agents can't deviate from the plan
- You want parallel development with git worktree isolation
- You have a GitHub issue and want it turned into a PR automatically
Skip Axiom when:

- You need a quick single-file fix — just ask Claude directly
- You want to explore options without committing to implementation — use /brainstorming alone
- You only want a plan without execution — ask for a plan
- The task is fully specified and trivial — direct implementation is faster
- Your project has no git repository and you don't want one
Most AI coding workflows generate code and hope for the best. Even sophisticated tools like Cursor, Copilot Agent, and Devin lack formal contracts between their planning and implementation phases. The result: implementation drift, unverified assumptions, and "tests that pass because they test what was built, not what was specified."
Axiom applies five constraints that no other tool enforces together:
Contract enforcement. Each implementation unit gets an Opus-authored contract specifying exact function signatures, return shapes, side effects, and restricted files. Agents implement against these contracts — not a vague description. Violations are caught at the contract gate.
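As a sketch of what such a contract can look like (the field names here are illustrative assumptions, not Axiom's actual schema):

```json
{
  "unit_id": "u1-webhook-route",
  "signature": "handleStripeWebhook(req: Request): Promise<Response>",
  "return_shape": { "status": 200, "body": { "received": true } },
  "side_effects": ["writes to payments table", "emits payment.received event"],
  "allowed_paths": ["src/webhooks/**"],
  "forbidden_paths": ["src/auth/**", "migrations/**"]
}
```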
Tests before code. Acceptance tests are generated from EARS-formatted requirements before implementation begins. Agents implement against failing tests, not just contract text. This eliminates the "tests confirm what was built" bias.
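For illustration, EARS constrains each requirement to a fixed sentence pattern, which makes it directly checkable by a test. A hypothetical requirement for the webhook example:

```text
WHEN a POST to /webhooks/stripe carries an invalid signature,
THE SYSTEM SHALL respond with HTTP 400 and persist no payment record.
```

Each clause maps onto a test: the trigger becomes the test setup, the response becomes the assertion.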
Separation of design and review. The model that designed the architecture (Opus) is not the same agent reviewing its implementation. Three independent specialist reviewers (Security, Quality, Acceptance) evaluate the code without seeing the design conversation — preventing anchoring bias.
Structured deliberation. Every design choice goes through a RALPLAN-DR challenge: an independent Architect steelmans the alternative, a Critic validates principles, and an Architecture Decision Record documents why the chosen approach was selected over alternatives.
Classified self-correction. Failures are typed (SYNTAX, LOGIC, ARCH, AMBIGUOUS) and routed to the appropriate fix strategy with per-unit budgets. The pipeline learns from corrections across runs via project memory.
```
User: "add stripe webhook handler"
  │
  ▼
Brainstorming (Opus)
  │  Explore codebase, clarify requirements, propose approaches
  │  Write design spec → automated review → user approval
  │
  ▼
Stage 0 — Refine the Premise (Opus)
  │  Load project memory (cross-run learnings)
  │  EARS-format requirements, score clarity, extract design
  │  [snapshot]
  │
  ▼
Stage 1 — Inspect the System (Haiku)
  │  Git status, branch protection, conflict scan
  │  Claim worktree slot, optional MCP context engine
  │  [snapshot]
  │
  ▼
Stage 2 — Derive the Axioms (Opus)
  │  RALPLAN-DR deliberation (Architect challenges, Critic validates)
  │  Decompose → per-unit contracts + path-scoped instructions
  │  Generate PRD, record ADR
  │  [snapshot]
  │
  ▼
Stage 2.5 — Generate Tests (Sonnet)
  │  ATDD: create acceptance tests from EARS criteria
  │  Tests MUST fail (RED phase)
  │
  ▼
Stage 3 — Execute the Inference (Haiku/Sonnet/Opus × N)
  │  Per-unit model routing by complexity
  │  Each agent: contract + failing tests + isolated worktree
  │  Inter-agent signal protocol for ambiguities
  │  [snapshot]
  │
  ▼
Stage 4 — Verify the Proof (Haiku)
  │  Iterative validation cycling (max 5 rounds)
  │  Optional browser-based verification (Playwright)
  │  Update PRD per-story status
  │  [snapshot]
  │
  ▼
Stage 5 — Test for Contradictions (Opus × 3)
  │  Three parallel independent reviewers:
  │    Security │ Code Quality │ Acceptance
  │  Reviewers cannot see design intent
  │  [snapshot]
  │
  ▼
Stage 6 — Commit the Result (Haiku)
  │  PRD completion gate, conventional commit
  │  PR with ADR in body, worktree cleanup
  │  Update project memory with learnings
  │
  ▼
Pull request.
```
| Failure Type | Trigger | Resolution |
|---|---|---|
| SYNTAX | Parse/compile error | Sonnet self-corrects (max 2 retries per unit) |
| LOGIC | Test failure | Haiku diagnoses → Sonnet re-implements |
| ARCH | Contract violation | Opus revises contract → Sonnet retries |
| AMBIGUOUS | Unclear failure | Opus arbitrates contract vs implementation |
Per-unit budgets prevent infinite loops. When budgets exhaust, the pipeline produces a clear escalation report.
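As an illustrative sketch (field names assumed, not the real corrections.json schema), per-unit budget tracking might look like:

```json
{
  "unit": "u2-signature-verify",
  "attempts": [
    { "type": "SYNTAX", "used": 1, "budget": 2, "status": "fixed" },
    { "type": "LOGIC",  "used": 3, "budget": 3, "status": "escalated" }
  ]
}
```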
All state lives under .axiom/ in your project root:
```
.axiom/
├── config.json               # Delivery config (PR vs direct merge)
├── project-memory.json       # Cross-run learnings (committed to repo)
├── playbooks/                # Reusable task templates
│   └── api-endpoint.md
├── instructions/             # Path-scoped conventions
│   └── api-conventions.md
├── worktrees/
│   ├── pool.json             # Slot pool for concurrent features
│   └── {slot}/               # Git worktrees, one per active feature
└── state/
    ├── {run-id}/
    │   ├── design.json              # Goals, constraints, EARS criteria
    │   ├── adr.json                 # Architecture Decision Record
    │   ├── interview-spec.md        # Crystallized feature spec
    │   ├── feature-plan.json        # Unit decomposition
    │   ├── contracts/{unit-id}.json # Per-unit contracts with conventions
    │   ├── prd.json                 # PRD with per-story pass/fail tracking
    │   ├── corrections.json         # Retry budget tracking
    │   ├── correction-loop.json     # Session-persistent correction state
    │   └── validation-report.json   # Stage 4 → Stage 5 handoff
    └── {run-id}-snapshots/          # Stage boundary snapshots for rollback
        ├── stage-0/
        ├── stage-2/
        └── ...
```
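To illustrate the kind of cross-run learning project-memory.json can carry (keys here are hypothetical, not the actual schema):

```json
{
  "codebase_patterns": ["API routes live in src/routes, one file per resource"],
  "successful_corrections": [
    { "symptom": "ESM import error in generated tests", "fix": "add vitest path alias" }
  ],
  "reviewer_feedback": ["Security: verify webhook signatures before parsing the body"]
}
```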
No external tools, no MCP servers, no plugin dependencies required. Only git and gh CLI. Optional enhancements: MCP context engine for large codebases, OMC notifications, Playwright for browser verification.
On first run, Axiom asks three setup questions and writes answers to .axiom/config.json:
| Setting | Options | Default |
|---|---|---|
| Delivery mode | PR to target branch, direct merge | PR |
| Target branch | Auto-detected (main/master) | main |
| Checkpoint mode | Fully autonomous, pause between stages | Autonomous |
Config is project-local. .axiom/state/ and .axiom/worktrees/ are added to .gitignore automatically.
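A resulting .axiom/config.json might look like this (key names are illustrative; the actual schema may differ):

```json
{
  "delivery_mode": "pr",
  "target_branch": "main",
  "checkpoint_mode": "autonomous"
}
```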
Enable checkpoint_mode to pause between stages for review:
Stage N completes → summary shown → you choose: proceed / pause / cancel
Useful for inspecting contracts before implementation, or reviewing the proof before the contradiction check.
To cancel a run mid-flight:

```shell
touch .axiom/CANCEL
```

The pipeline checks for this sentinel at each stage boundary, releases the worktree slot, archives the branch, and cleans up.
- Context window pressure. Large codebases can push context limits. Mitigated by optional MCP context engine integration (Augment, Zilliz) and targeted exploration, but very large monorepos may still need scope constraints.
- Test infrastructure shapes verification quality. Stage 2.5 generates acceptance tests from contracts, and Stage 4 can generate baseline tests if none exist, but projects with mature test suites get stronger verification.
- Single-session pipeline. Each run lives in one Claude Code session. If the session crashes, --resume picks up from the last completed stage. Correction loops persist across session boundaries via correction-loop.json.
- No CI integration. The pipeline runs locally. It creates the PR — your CI takes over from there.
- Monolithic architecture. The pipeline is a single SKILL.md. Partial usage is supported via --plan-only and --review-only, but individual stages cannot be mixed with other tools' outputs (yet).
Want to see what the pipeline actually produces? Check out the example artifacts from a Stripe webhook handler run — includes design.json, per-unit contracts, and an explanation of what each stage outputs.
```
axiom/
├── SKILL.md        # The skill definition — this IS the product
├── README.md       # You are here
├── LICENSE         # Apache 2.0
└── docs/
    ├── architecture.md
    ├── faq.md
    └── examples/   # Example pipeline artifacts
```
See CONTRIBUTING.md for guidelines on reporting bugs, suggesting features, and submitting improvements.
Apache 2.0 — see LICENSE.
From premise to pull request.
If this project helped you, consider giving it a star. It helps others find it.

