Give it a task. Get back a tested, reviewed PR. Six AI agents handle the rest.
Antfarm is a multi-agent development pipeline built on OpenClaw that takes a task description and autonomously plans, implements, tests, reviews, and creates pull requests. Each agent runs in an isolated session with a defined role, communicating through structured KEY: VALUE pairs and a shared progress log.
You describe what you want built. Antfarm decomposes the work into user stories, then six specialized agents execute a pipeline: planning the stories, preparing the environment, implementing each story with tests, verifying correctness, running integration tests, creating a PR, and reviewing the code. The entire process runs autonomously -- you get a tested, reviewed pull request at the end.

    Task Description
           |
           v
    +-------------+
    |   Planner   |  Decomposes task into ordered user stories (max 20)
    +-------------+
           |
           v
    +-------------+
    |    Setup    |  Creates branch, discovers build/test commands, establishes baseline
    +-------------+
           |
           v
    +------------------------------------------+
    |           Story Execution Loop           |
    |                                          |
    |  +-------------+     +-------------+     |
    |  |  Developer  | --> |  Verifier   |     |
    |  +-------------+     +-------------+     |
    |         ^                   |            |
    |         |   STATUS: retry   |            |
    |         +-------------------+            |
    |                                          |
    |  For each story:                         |
    |    Developer implements + writes tests   |
    |    Verifier checks acceptance criteria   |
    |    Pass -> next story                    |
    |    Fail -> Developer retries (max 2)     |
    +------------------------------------------+
           |
           v
    +-------------+
    |   Tester    |  Integration/E2E testing across all stories
    +-------------+
           |
           v
    +-------------+
    |     PR      |  Creates pull request with summary
    +-------------+
           |
           v
    +-------------+
    |  Reviewer   |  Reviews PR, approves or requests changes
    +-------------+
           |
           v
         Done

The Developer and Verifier form a tight loop: the Developer implements a single story and writes tests, then the Verifier checks each acceptance criterion, runs the test suite, and either approves or sends specific feedback back for a retry. Each story gets a fresh agent session -- no accumulated context drift.
| Agent | Role Type | Description |
|---|---|---|
| Planner | analysis | Decomposes tasks into ordered user stories with verifiable acceptance criteria |
| Setup | coding | Creates branch, discovers build/test commands, establishes baseline |
| Developer | coding | Implements stories one at a time, writes tests, commits with structured messages |
| Verifier | verification | Checks each story against its acceptance criteria, runs tests, performs security checks |
| Tester | testing | Integration and E2E testing after all stories are implemented |
| Reviewer | analysis | Reviews the PR, approves or requests changes with actionable feedback |
Each agent has its own workspace with three identity files:
- AGENTS.md -- operational instructions and process
- SOUL.md -- personality and decision-making style
- IDENTITY.md -- name and role declaration
See docs/agent-roles.md for full documentation of each agent.
This pipeline is not theoretical. It is running right now on three Solana ecosystem repos as part of the Solana Graveyard Hackathon:
| Repository | Branch | Stories | What Antfarm Did |
|---|---|---|---|
| ExpertVagabond/tribeca-dao | revival/graveyard-hack | 9 | Anchor 0.30 migration, IDL regeneration, TS SDK migration, demo lifecycle |
| ExpertVagabond/grape-art | revival/graveyard-hack | 11 | Full dependency modernization, Parcel build fix, marketplace demo |
| ExpertVagabond/port-lending | revival/graveyard-hack | 13 | Rust dependency updates, cargo build-sbf, test restoration, TS SDK |
33 stories total across 3 repositories, all autonomously planned and implemented by Antfarm agents over 48 hours of continuous execution starting February 16, 2026.
See docs/evidence.md for full details and verification instructions.
# Prerequisites: OpenClaw installed and running
# 1. Clone the repo
git clone https://github.com/ExpertVagabond/antfarm-devpipe.git
cd antfarm-devpipe
# 2. Configure environment
cp .env.example .env
# Edit .env with your ANTHROPIC_API_KEY and GITHUB_TOKEN
# 3. Install the workflow
antfarm workflow install feature-dev
# 4. Run on any task
antfarm workflow run feature-dev "Add user authentication with JWT tokens"
# 5. Monitor progress
antfarm workflow status

The Planner agent explores the target codebase, understands the stack and conventions, and decomposes the task into ordered user stories (max 20). Each story is sized to fit in a single agent context window. Stories are ordered by dependency: schema/DB first, then backend, then frontend, then integration. Every acceptance criterion is mechanically verifiable.
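The ordering rule described above can be sketched in a few lines of Python. This is an illustration only, not Antfarm's planner code: the `Story` class, the `layer` field, and the layer names in `LAYER_ORDER` are hypothetical, and the only values taken from the text are the layer sequence and the cap of 20 stories.

```python
from dataclasses import dataclass

# Hypothetical layer ranking that mirrors the order given in the text:
# schema/DB first, then backend, then frontend, then integration.
LAYER_ORDER = {"schema": 0, "backend": 1, "frontend": 2, "integration": 3}
MAX_STORIES = 20  # cap stated in the pipeline description


@dataclass
class Story:
    title: str
    layer: str  # one of the LAYER_ORDER keys (assumed field)


def order_stories(stories: list[Story]) -> list[Story]:
    """Sort stories by layer and enforce the story cap."""
    if len(stories) > MAX_STORIES:
        raise ValueError(f"plan has {len(stories)} stories; max is {MAX_STORIES}")
    # sorted() is stable, so stories inside one layer keep their original order.
    return sorted(stories, key=lambda story: LAYER_ORDER[story.layer])
```

A plan listing a frontend story before the schema story it depends on would come back with the schema story first.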
The Setup agent creates a feature branch, reads package.json / Cargo.toml / pyproject.toml and CI configs to discover build and test commands, ensures .gitignore exists, and runs the build and test suite to establish a baseline. Downstream agents receive the discovered BUILD_CMD and TEST_CMD.
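As a rough sketch of this step, the Python below maps a manifest file to a pair of commands and serializes the result as KEY: VALUE lines. The command defaults, the function names, and the one-pair-per-line wire format are assumptions made for illustration; the actual Setup agent also reads CI configs and may pick different commands.

```python
import re
from pathlib import Path

# Assumed default commands per manifest; a real repo may define its own scripts.
DEFAULT_COMMANDS = {
    "package.json": ("npm run build", "npm test"),
    "Cargo.toml": ("cargo build", "cargo test"),
    "pyproject.toml": ("python -m build", "python -m pytest"),
}

KEY_RE = re.compile(r"[A-Z][A-Z0-9_]*")


def discover_commands(repo: Path) -> dict[str, str]:
    """Return BUILD_CMD and TEST_CMD for the first manifest found in repo."""
    for manifest, (build, test) in DEFAULT_COMMANDS.items():
        if (repo / manifest).is_file():
            return {"BUILD_CMD": build, "TEST_CMD": test}
    return {}


def to_key_value(fields: dict[str, str]) -> str:
    """Serialize fields as one KEY: VALUE pair per line (assumed format)."""
    return "\n".join(f"{key}: {value}" for key, value in fields.items())


def parse_key_value(text: str) -> dict[str, str]:
    """Parse KEY: VALUE lines, ignoring lines that do not start with a key."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")  # split at the first colon only
        if sep and KEY_RE.fullmatch(key.strip()):
            fields[key.strip()] = value.strip()
    return fields
```

Splitting at the first colon keeps values such as `npm run test:unit` intact.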
For each story in order, the pipeline spawns a fresh Developer session. The Developer reads progress.txt for codebase patterns discovered by previous stories, implements the story, writes tests, runs the build and test suite, commits, and appends to the progress log. Then the Verifier checks every acceptance criterion, runs tests, performs security checks, and either approves or sends specific retry feedback. Max 2 retries per story before escalating to a human.
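The control flow of this loop might look like the sketch below. It reads "max 2 retries" as one initial attempt plus up to two more, which is an interpretation rather than something the text spells out. The `implement` and `verify` callables are hypothetical stand-ins for the Developer and Verifier sessions.

```python
from typing import Callable, Optional

MAX_RETRIES = 2  # retries allowed after the first attempt (assumed reading)


def run_story(
    implement: Callable[[Optional[str]], None],
    verify: Callable[[], tuple[bool, str]],
) -> str:
    """Run one story through the Developer/Verifier loop.

    implement(feedback) stands in for a fresh Developer session and receives
    the Verifier's feedback from the previous attempt (None on the first).
    verify() stands in for the Verifier and returns (passed, feedback).
    """
    feedback = None
    for _attempt in range(1 + MAX_RETRIES):
        implement(feedback)
        passed, feedback = verify()
        if passed:
            return "done"
    return "escalate"  # hand the story to a human
```

Passing the feedback string into the next attempt is what lets a retry target the specific failure instead of starting blind.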
After all stories pass verification, the Tester agent runs the full test suite and checks for integration issues between stories, cross-cutting concerns, and E2E flows.
The Developer agent creates a pull request with a clear title, description of changes, and test results.
The Reviewer agent reads the PR diff, checks for code quality, bugs, test coverage, and convention adherence. It posts its review directly to GitHub -- either approving or requesting changes with specific, actionable feedback.
Each story gets a fresh agent session. No accumulated context drift. No 200K-token conversations that lose the plot. One story, one session, one commit.
The Verifier checks every acceptance criterion mechanically. If something fails, it sends specific feedback: "The test asserts on the wrong field -- it checks name but the requirement was about displayName." The Developer retries with that feedback. Max 2 retries before escalating.
Agents share knowledge through progress.txt. The Codebase Patterns section at the top captures reusable discoveries: "This project uses node:sqlite DatabaseSync, not async." Each new Developer session reads this before starting, so story 8 benefits from patterns discovered in story 1.
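To make the idea concrete, here is a small Python sketch of reading and extending the patterns section. The layout of progress.txt is not specified in this README, so the `## Codebase Patterns` heading, the `- ` bullet format, and both function names are assumptions.

```python
PATTERNS_HEADER = "## Codebase Patterns"  # assumed heading text


def read_patterns(log_text: str) -> list[str]:
    """Return the bullet items listed under the patterns heading."""
    patterns = []
    in_section = False
    for line in log_text.splitlines():
        if line.strip() == PATTERNS_HEADER:
            in_section = True
        elif in_section and line.startswith("## "):
            break  # the next section has started
        elif in_section and line.startswith("- "):
            patterns.append(line[2:].strip())
    return patterns


def add_pattern(log_text: str, pattern: str) -> str:
    """Insert a pattern directly under the heading, skipping duplicates."""
    if pattern in read_patterns(log_text):
        return log_text
    lines = log_text.splitlines()
    stripped = [line.strip() for line in lines]
    if PATTERNS_HEADER not in stripped:
        # No patterns section yet: create one at the top of the log.
        lines = [PATTERNS_HEADER, ""] + lines
        stripped = [line.strip() for line in lines]
    lines.insert(stripped.index(PATTERNS_HEADER) + 1, f"- {pattern}")
    return "\n".join(lines) + "\n"
```

Keeping the section at the top means a new session can stop reading after the first heading boundary.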
A medic watchdog runs on a 5-minute cron interval, checking for stalled steps. If an agent is stuck (no progress for too long), the medic resets the step and alerts the main agent. No silent failures.
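One pass of such a watchdog could be sketched as follows. The 15-minute threshold, the `Step` fields, and the status names are all assumptions; the text only says that a step with no progress for too long is reset and reported.

```python
from dataclasses import dataclass

STALL_AFTER_SECONDS = 15 * 60  # assumed threshold; the source does not give one


@dataclass
class Step:
    name: str
    status: str           # "running", "pending", or "done" (hypothetical states)
    last_progress: float  # Unix timestamp of the last recorded progress


def medic_pass(steps: list[Step], now: float) -> list[str]:
    """Reset stalled running steps and return their names for alerting."""
    stalled = []
    for step in steps:
        idle = now - step.last_progress
        if step.status == "running" and idle > STALL_AFTER_SECONDS:
            step.status = "pending"   # reset so the step can be picked up again
            step.last_progress = now  # avoid re-reporting on the next pass
            stalled.append(step.name)
    return stalled
```

The returned names are what an alert to the main agent would carry.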
Agents fire in sequence on staggered intervals (0s / 60s / 120s / 180s / 240s / 300s) within a 5-minute cron cycle. This prevents resource contention and ensures orderly pipeline progression.
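The offsets listed above follow a simple pattern, shown in the Python sketch below. Which agent gets which slot is an assumption here (pipeline order); the text gives only the six offsets.

```python
# Agent names in pipeline order; the slot assignment is assumed, not documented.
AGENTS = ["planner", "setup", "developer", "verifier", "tester", "reviewer"]
STAGGER_SECONDS = 60


def stagger_offsets(agents: list[str], step: int = STAGGER_SECONDS) -> dict[str, int]:
    """Give each agent a start offset, in seconds, from the top of the cycle."""
    return {agent: index * step for index, agent in enumerate(agents)}
```

With six agents and a 60-second step, the last offset is 300s, which matches the length of the cron cycle. How that slot relates to the start of the next cycle is covered, if anywhere, in docs/architecture.md.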
The Verifier blocks sensitive files (.env, *.key, *.pem, credentials), checks .gitignore exists, and scans diffs for hardcoded credentials. Security failures are always a rejection, regardless of whether the code works.
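A minimal version of these checks might look like the Python below. The file patterns come from the list above, with `credentials*` as an assumed spelling of the last entry. The regular expression is a deliberately small heuristic and should not be read as the Verifier's actual rule set.

```python
import fnmatch
import re
from pathlib import PurePosixPath

# Patterns from the text; "credentials*" is an assumed spelling.
BLOCKED_PATTERNS = [".env", "*.key", "*.pem", "credentials*"]

# Illustrative heuristic only: a secret-like name assigned a quoted literal.
SECRET_RE = re.compile(
    r"(api[_-]?key|secret|token|password)\s*[:=]\s*['\"][^'\"]{8,}['\"]",
    re.IGNORECASE,
)


def blocked_files(paths: list[str]) -> list[str]:
    """Return changed paths whose file name matches a blocked pattern."""
    return [
        path for path in paths
        if any(fnmatch.fnmatch(PurePosixPath(path).name, pattern)
               for pattern in BLOCKED_PATTERNS)
    ]


def leaked_lines(diff: str) -> list[str]:
    """Return added diff lines that look like hardcoded credentials."""
    return [
        line for line in diff.splitlines()
        if line.startswith("+") and not line.startswith("+++")
        and SECRET_RE.search(line)
    ]


def security_verdict(paths: list[str], diff: str) -> str:
    """Reject on any finding, regardless of whether the code works."""
    return "reject" if blocked_files(paths) or leaked_lines(diff) else "pass"
```

Only added lines are scanned, so removing an old secret from the codebase does not itself trigger a rejection.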
Antfarm workflows are defined in YAML. You can create custom workflows with different agent configurations, step sequences, and loop structures.
See docs/workflow-authoring.md for a complete guide to writing custom workflows.
See docs/architecture.md for deep technical details on the pipeline, cron system, and communication protocol.
- OpenClaw -- Multi-agent orchestration and cron scheduling
- Antfarm -- Workflow engine and agent runtime
- Claude (Anthropic) -- AI model powering all agents
- GitHub CLI -- PR creation and code review
Track 3: Developer Infrastructure
Antfarm DevPipe demonstrates how multi-agent pipelines can autonomously handle the full software development lifecycle -- from task decomposition through code review -- using OpenClaw's orchestration primitives. The pipeline is not a demo: it is actively processing real codebases with real PRs.
Matthew Karsten -- Purple Squirrel Media
- Twitter: @expertvagabond
- GitHub: ExpertVagabond
MIT -- see LICENSE for details.