Contract testing for coding agents.
Turn task intent into a structured contract. Get a clause-by-clause proof of done.
Coding agents guess at scope, skip constraints, and produce diffs you have to manually reconstruct. Accord fixes that:
- Intent becomes a spec. A YAML contract captures goal, scope, constraints, acceptance criteria, and required evidence.
- Evaluation becomes deterministic. `accord prove` checks every clause and tells you what passed, what failed, and what needs a human eye.
Install and run your first contract:

    pip install -e .
    accord init --title "Add rate limiting" --id "rate-limit"
    # edit contracts/contract.yaml
    accord lint contracts/contract.yaml
    accord prove contracts/contract.yaml

Run the bundled demos:

    pip install -e .
    ./scripts/demo_all.sh

This runs three scenarios against a bundled fixture project:
| Script | Outcome | What it shows |
|---|---|---|
| `demo_pass.sh` | pass | All deterministic checks succeed |
| `demo_fail.sh` | fail | Budget violation + missing evidence |
| `demo_needs_human.sh` | needs_human | Deterministic pass, semantic clauses deferred |
Each writes artifacts to `.accord/demo/`. Inspect the verdict:

    cat .accord/demo/fail/verdict.json | python3 -m json.tool
    accord render .accord/demo/pass/verdict.json

Pre-generated output for all three scenarios is in `examples/demo-output/`.
| Status | Meaning | Default CI behavior |
|---|---|---|
| pass | All deterministic checks succeeded | exit 0 |
| fail | At least one check failed | exit 1 |
| needs_human | Deterministic checks passed; semantic clauses need review | exit 0 (configurable) |
Use --fail-on needs_human if you want semantic clauses to block CI.
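The status table and the `--fail-on` threshold can be read as a severity ordering. The sketch below is one way to express that mapping; the function and constant names are hypothetical and are not taken from Accord's source.

```python
# Hypothetical sketch of the status-to-exit-code rule described in the table.
# SEVERITY and exit_code are illustrative names, not Accord's API.
SEVERITY = {"pass": 0, "needs_human": 1, "fail": 2}


def exit_code(status: str, fail_on: str = "fail") -> int:
    """Return 1 when `status` is at least as severe as the `fail_on` threshold."""
    if fail_on not in ("fail", "needs_human"):
        raise ValueError(f"unsupported threshold: {fail_on}")
    return 1 if SEVERITY[status] >= SEVERITY[fail_on] else 0


if __name__ == "__main__":
    for status in ("pass", "needs_human", "fail"):
        print(status, exit_code(status), exit_code(status, fail_on="needs_human"))
```

With the default threshold only `fail` blocks CI. With `fail_on="needs_human"`, both `needs_human` and `fail` return a non-zero code.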
Copy the workflow template into your repo:

    cp templates/accord-pr.yml .github/workflows/accord-pr.yml
    # edit CONTRACT_PATH to point at your contract

Or add this to an existing workflow:

    - run: pip install "accord @ git+https://github.com/JohnODowdAI/accord.git@v0.1.0-beta.1"
    - name: Accord — Proof of Done
      run: |
        accord prove contracts/my-task.yaml \
          --github \
          --base-ref ${{ github.event.pull_request.base.sha }} \
          --head-ref ${{ github.event.pull_request.head.sha }}
    - uses: actions/upload-artifact@v4
      if: always()
      with:
        name: accord-verdict
        path: .accord/runs/

`--github` writes a formatted summary to the GitHub Actions job summary and emits file-level annotations for scope/forbidden-path failures. Outside CI, it falls back to writing `github_summary.md` in the run directory.
The included workflow also posts a sticky PR comment with the verdict (updates on re-push, no duplicates).
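For readers unfamiliar with how a tool reaches the job summary: GitHub Actions exposes the summary file through the `GITHUB_STEP_SUMMARY` environment variable and reads annotations from `::error file=...::` lines on stdout. The sketch below illustrates the summary-with-fallback behavior under those assumptions. `write_summary` and `annotation` are hypothetical helpers, not Accord's implementation, and message escaping is omitted.

```python
import os
from pathlib import Path


def write_summary(markdown: str, run_dir: Path) -> Path:
    """Append to the Actions job summary, or fall back to a local file."""
    target = os.environ.get("GITHUB_STEP_SUMMARY")
    if target:
        # Inside GitHub Actions: append to the file the runner provides.
        path = Path(target)
        with path.open("a", encoding="utf-8") as fh:
            fh.write(markdown + "\n")
        return path
    # Outside CI: write github_summary.md into the run directory.
    run_dir.mkdir(parents=True, exist_ok=True)
    path = run_dir / "github_summary.md"
    path.write_text(markdown + "\n", encoding="utf-8")
    return path


def annotation(path: str, message: str) -> str:
    """Format a file-level error annotation as a workflow command."""
    return f"::error file={path}::{message}"


if __name__ == "__main__":
    print(annotation("infra/main.tf", "forbidden path changed"))
```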
A minimal contract:

    version: "0.1"
    id: my-task
    title: Add input validation to /api/submit
    acceptance:
      deterministic:
        - name: tests-pass
          run: pytest tests/ -q

That's a working contract. Add scope, budgets, evidence, and semantic clauses as needed.
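A deterministic `run` clause passes when its shell command exits 0. The sketch below shows roughly what evaluating one such clause involves. `run_check` is a hypothetical helper, and the shape of the returned dict is modeled on the clause entries in the example verdict rather than on Accord's code.

```python
import subprocess
import sys


def run_check(name: str, command: str) -> dict:
    """Run one shell command; exit code 0 means the clause passes."""
    proc = subprocess.run(command, shell=True, capture_output=True, text=True)
    status = "pass" if proc.returncode == 0 else "fail"
    return {
        "name": name,
        "kind": "deterministic",
        "status": status,
        "reason": f"exit {proc.returncode}",
    }


if __name__ == "__main__":
    # Stand-in command so the sketch runs without a test suite present.
    ok = f'"{sys.executable}" -c "raise SystemExit(0)"'
    print(run_check("tests-pass", ok))
```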
Full annotated template: `templates/contract.full.yaml`

    version: "0.1"
    id: login-rate-limit
    title: Add IP-based rate limiting to POST /login
    goal: >
      Limit repeated failed login attempts by IP without changing the
      public API except allowing HTTP 429 responses.
    scope:
      include: [services/auth/**, tests/auth/**]
      exclude: [frontend/**]
    constraints:
      forbidden_paths: [infra/**]
      budgets:
        max_files_changed: 8
        max_added_lines: 250
    acceptance:
      deterministic:
        - name: auth-tests
          run: pytest tests/auth/ -q
      semantic:
        - name: compatibility
          rubric: >
            Existing clients remain compatible. Only new behavior is
            HTTP 429 on repeated failed logins.
          evidence_from: [git_diff, test_logs]
    evidence:
      require: [plan.md]
    handoff:
      require_human_if:
        - semantic.compatibility != pass

A run against this contract produces a verdict like:

    {
      "status": "needs_human",
      "clauses": [
        {"name": "auth-tests", "kind": "deterministic", "status": "pass", "reason": "exit 0"},
        {"name": "max-files-changed", "kind": "deterministic", "status": "pass", "reason": "3/8 files changed"},
        {"name": "compatibility", "kind": "semantic", "status": "needs_human",
         "reason": "Needs human review: Existing clients remain compatible..."}
      ],
      "summary": {"pass": 2, "fail": 0, "needs_human": 1, "skipped": 0}
    }

| File | Use |
|---|---|
| `templates/contract.min.yaml` | Narrowest useful contract |
| `templates/contract.full.yaml` | All fields with annotations |
| `templates/accord-pr.yml` | GitHub Actions workflow for PRs |
See docs/adopt.md for a step-by-step adoption guide.
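Because the verdict is plain JSON, it is easy to consume from scripts. The sketch below recomputes the `summary` counts from the `clauses` list and picks out the clauses that still need attention. The field names come from the example verdict shown earlier; `summarize` and `needs_attention` are hypothetical helpers, and the sample is a trimmed copy of that example.

```python
import json
from collections import Counter

# Trimmed copy of the example verdict from this README.
SAMPLE_VERDICT = """
{
  "status": "needs_human",
  "clauses": [
    {"name": "auth-tests", "kind": "deterministic", "status": "pass", "reason": "exit 0"},
    {"name": "max-files-changed", "kind": "deterministic", "status": "pass", "reason": "3/8 files changed"},
    {"name": "compatibility", "kind": "semantic", "status": "needs_human", "reason": "Needs human review"}
  ],
  "summary": {"pass": 2, "fail": 0, "needs_human": 1, "skipped": 0}
}
"""


def summarize(verdict: dict) -> dict:
    """Count clauses per status, including statuses with zero clauses."""
    counts = Counter(clause["status"] for clause in verdict["clauses"])
    return {s: counts.get(s, 0) for s in ("pass", "fail", "needs_human", "skipped")}


def needs_attention(verdict: dict) -> list:
    """Return the clauses that failed or were deferred to a human."""
    return [c for c in verdict["clauses"] if c["status"] in ("fail", "needs_human")]


if __name__ == "__main__":
    verdict = json.loads(SAMPLE_VERDICT)
    print(summarize(verdict))
    for clause in needs_attention(verdict):
        print(f'{clause["name"]}: {clause["reason"]}')
```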
| Command | What it does |
|---|---|
| `accord init` | Scaffold a starter contract |
| `accord lint <contract>` | Validate a contract |
| `accord prove <contract>` | Run the prove loop, emit a verdict |
| `accord render <verdict.json>` | Re-render a verdict |
| Option | Description |
|---|---|
| `--agent-cmd "..."` | Run a shell command before verification |
| `--evidence-dir <path>` | Base directory for evidence file checks |
| `--base-ref <ref>` | Git base ref for diff (auto-detects from CI) |
| `--head-ref <ref>` | Git head ref for diff (default: HEAD) |
| `--github` | Emit GitHub Actions summary and annotations |
| `--fail-on {fail,needs_human}` | Exit non-zero threshold (default: fail) |
| `--output-dir <path>` | Write artifacts to a specific directory |

`--format` accepts `terminal` (default), `summary`, `github-summary`, or `pr-comment`.
| Check | What it verifies |
|---|---|
| `run` | Shell command exits 0 |
| Scope | Changed files match scope.include patterns |
| Forbidden paths | No changes to constraints.forbidden_paths |
| File budget | <= max_files_changed files touched |
| Line budget | <= max_added_lines lines added |
| Evidence exists | Required files/directories are present |
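The path and budget checks in this table are simple enough to sketch. The example below uses `fnmatch` from the standard library as a stand-in for glob matching. That choice, the treatment of `scope.exclude`, and the helper names are all assumptions for illustration; Accord's actual pattern semantics may differ.

```python
from fnmatch import fnmatch


def matches_any(path: str, patterns: list) -> bool:
    """True if the path matches at least one glob pattern."""
    return any(fnmatch(path, pattern) for pattern in patterns)


def check_paths(changed: list, include: list, exclude: list, forbidden: list) -> tuple:
    """Return (out_of_scope, forbidden_hits) for a list of changed paths."""
    # Assumption: a file is in scope if it matches include and not exclude.
    out_of_scope = [
        p for p in changed
        if not matches_any(p, include) or matches_any(p, exclude)
    ]
    forbidden_hits = [p for p in changed if matches_any(p, forbidden)]
    return out_of_scope, forbidden_hits


def within_budget(actual: int, limit: int) -> bool:
    """A budget passes when the measured value does not exceed the limit."""
    return actual <= limit


if __name__ == "__main__":
    changed = ["services/auth/limiter.py", "tests/auth/test_limiter.py", "infra/main.tf"]
    print(check_paths(
        changed,
        include=["services/auth/**", "tests/auth/**"],
        exclude=["frontend/**"],
        forbidden=["infra/**"],
    ))
    print(within_budget(len(changed), 8))
```

Note that `fnmatch` lets `*` match across `/`, so `services/auth/**` matches nested files here. A stricter glob implementation would treat `*` and `**` differently.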
- LLM semantic grading. The `SemanticGrader` interface exists but defaults to `needs_human`. LLM grading is the next milestone.
- GitHub App / Checks API. Current integration uses workflow summaries and annotations.
- Web UI / dashboard. Accord is local-first and CLI-only.
- Plugin system. Custom checks require code changes for now.
- Local CLI with deterministic checks
- GitHub Actions integration (summaries, annotations, PR comments)
- Reproducible demos and adoption templates
- LLM semantic grader (pluggable, model-agnostic)
- Verdict diff (compare runs over time)
- Contract composition (shared constraints)
- GitHub App with Checks API
For development:

    pip install -e ".[dev]"
    pytest -v
    ruff check src/ tests/
    ./scripts/demo_all.sh

License: MIT