AgentFlow Architecture

How AgentFlow turns your Kanban board into an autonomous AI software development pipeline with full observability, deterministic quality gates, and built-in cost controls.

Core Principle

Your project management tool IS the orchestration layer.

AgentFlow doesn't build a separate database, message queue, or custom infrastructure. It reads and writes pipeline state directly to your Kanban board (Asana, GitHub Projects, Linear, Jira). This gives you:

Crash recovery for free: State survives agent crashes because it lives in your PM tool
Phone-accessible observability: Monitor the entire pipeline from any device
Human override at any point: Drag a card to "Needs Human" to intervene
Audit trail built-in: Every agent decision is a comment on the task card

v2 Architecture: Dual-Mode

AgentFlow v2 supports two execution modes from the same codebase:

Standalone Mode (v1)

The original architecture. Workers are separate terminal sessions, orchestrator runs via crontab, all communication flows through Asana comments.

Plugin Mode (v2)

Workers spawn as a named agent team inside Claude Code. The orchestrator creates the team, dispatches via SendMessage for instant handoffs, and hooks enforce quality gates at the tool level.

graph TB
    subgraph standalone["Standalone Mode (v1)"]
        CR["Crontab<br/>(sweep)"] -->|reads/writes Asana| KB1["Kanban Board"]
        CR -->|dispatches| T2a["T2 (terminal)"]
        CR -->|dispatches| T3a["T3 (terminal)"]
        CR -->|dispatches| T4a["T4 (terminal)"]
    end
    
    subgraph plugin["Plugin Mode (v2)"]
        OR["Orchestrator<br/>(agent)"] -->|"TeamCreate + SendMessage"| KB2["Kanban Board"]
        OR -->|spawns| T2b["T2 (agent)"]
        OR -->|spawns| T3b["T3 (agent)"]
        OR -->|spawns| T4b["T4 (agent)"]
        T2b <-->|SendMessage| T4b
    end

What Changes in Plugin Mode

Aspect	Standalone	Plugin
Worker spawning	Manual (iTerm tabs)	Automatic (TeamCreate)
Handoff latency	15 min (sweep cycle)	<30 sec (SendMessage)
Quality gates	Prompt-enforced	Hook-enforced (lint-gate, coverage-gate, scope-guard)
Progress tracking	[HEARTBEAT] comments	Real-time telemetry via SendMessage
Communication	Asana only	SendMessage + Asana (dual channel)
Shutdown	Manual crontab edit	TeamDelete (clean teardown)

Hooks: Infrastructure-Level Gates

Plugin mode adds 3 hooks that enforce quality gates at the tool level:

Hook	Event	What it does
lint-gate	PreToolUse on Bash	Blocks commit without tsc/lint/test
coverage-gate	Stop on tester	Blocks merge without 80% coverage
scope-guard	PreToolUse on Edit/Write	Warns then blocks unpredicted files

System Components

1. The Orchestrator (`/sdlc-orchestrate`)

A stateless, one-shot sweep that runs via real crontab — not a daemon, not a session-based scheduler.

*/15 * * * * ~/.claude/sdlc/agentflow-cron.sh >> /tmp/agentflow-orchestrate.log 2>&1

Each sweep:

Discovers all pipeline projects
Checks for spec drift (SHA-256 hash comparison)
Detects dead workers (heartbeat timeout > 10 min)
Processes stage transitions based on comment tags
Triggers feedback loops for rejected tasks
Dispatches ready tasks to available worker slots
Runs system-level retrospective (every 10 completions)
Updates the status dashboard
Checks for graceful shutdown signals

Why stateless? Session-based schedulers die with the terminal. A real crontab entry survives reboots, terminal crashes, and network interruptions. The orchestrator reads all state from the Kanban board on every sweep — it has no memory between invocations.

2. Workers (`/sdlc-worker --slot T<N>`)

Each worker is a Claude Code session bound to a slot identifier (T2, T3, T4, T5). Workers:

Query the Kanban board for tasks assigned to their slot
Determine the current stage from task metadata
Execute the appropriate stage prompt (research, build, review, test)
Post results as structured comments with machine-readable tags
Update task metadata (stage, cost, retry count)

Workers are stateless between tasks. When a worker finishes one task, it checks for the next assigned task. If none, it reports idle.

3. The Kanban Board (State Machine)

The board has 8 columns (sections):

graph LR
    S0["0 - Needs Human"] --- S1["1 - Backlog"] --- S2["2 - Research"] --- S3["3 - Build"] --- S4["4 - Review"] --- S5["5 - Test"] --- S6["6 - Integrate"] --- S7["7 - Done"]

State is stored in two places:

Task position (which column) — the current pipeline stage
Task description header — metadata: [SLOT:T2] [STAGE:Build] [RETRY:1] [COST:~$2.50]
Task comments — structured event log with machine-readable tags

4. Deterministic Quality Gates

Before any AI review happens, deterministic checks run:

graph LR
    TSC["tsc --noEmit"] --> ESLINT["eslint"] --> TEST["npm test"]

This catches ~60% of issues (type errors, lint violations, failing tests) at near-zero cost. Only code that passes all three gates reaches the AI reviewer.

After review, a coverage gate runs:

npm test -- --coverage

New files must have ≥80% test coverage to proceed to the Test stage.

5. Feedback Loops

When a task fails (review reject, test fail, integration fail):

Retry counter increments
Accumulated context is posted: what was tried, what failed, what to do differently
Worker slot is cleared (on retry 2+, a different worker is assigned)
Task moves back to Build stage
Cost is updated and checked against guardrails

This creates a learning loop where each retry carries the full history of previous attempts.

6. System-Level Learning

Every 10 completed tasks, the orchestrator runs a retrospective:

Reads all reject/fail comments from completed tasks
Identifies common failure patterns (same error type appearing 3+ times)
Writes patterns to LEARNINGS.md in the project root
Future builders and reviewers read LEARNINGS.md before starting work

This means the system gets better over time — mistakes made in task 5 are avoided in task 50.

Data Flow

graph TB
    SPEC["SPEC.md"] --> STA["/spec-to-asana<br/>(decompose + validate + create)"]
    STA --> KB["Kanban Board (Asana)"]
    
    KB --> ORCH["Orchestrator<br/>(every 15 min)"]
    ORCH --> |"detects transitions"| KB
    ORCH --> |"assigns slots"| KB
    
    KB --> W2["Worker T2<br/>(Build)"]
    W2 --> |"[BUILD:COMPLETE]<br/>+ PR link"| KB
    
    KB --> W4["Worker T4<br/>(Review)"]
    W4 --> |"[REVIEW:PASS]<br/>or [REVIEW:REJECT]"| KB
    
    KB --> W5["Worker T5<br/>(Test + Merge)"]
    W5 --> |"[TEST:PASS]<br/>merge PR"| KB
    
    KB --> HUMAN["Human<br/>(phone/web)"]
    HUMAN --> |"drag card to intervene"| KB

Superpowers Integration Layer

AgentFlow can optionally integrate with Superpowers (or similar methodology-as-prompt tools) to enhance build quality. The integration follows a strict two-layer architecture:

Two-Layer Model

graph TB
    subgraph outer["OUTER LOOP — AgentFlow (Lifecycle Owner)"]
        OL["dispatch • heartbeat • tags • transitions • cost • retry"]
        subgraph inner["INNER LOOP — Superpowers (Methodology Owner)"]
            IL["brainstorm → plan → sub-agents → TDD → verification"]
        end
    end

AgentFlow controls WHEN things happen (start, complete, heartbeat, retry, cost check). Superpowers controls HOW things happen (planning approach, sub-agent strategy, TDD flow).

Complexity Gating

Not every task benefits from full Superpowers overhead. Tasks are gated by complexity:

Complexity	Brainstorm	Plan	Sub-Agents	Estimated Overhead
Simple (S)	Skip	Skip	No	~$0 extra
Medium (M)	Skip	Yes	Optional	~$0.20-0.40 extra
Large (L)	Yes	Yes	Yes	~$0.50-1.00 extra

Sub-Agent Management

When Superpowers dispatches sub-agents within a build stage:

Heartbeat continuity: Parent posts [HEARTBEAT] before dispatching and between sub-agent completions. Sub-agents do not post their own heartbeats.
File conflict prevention: Parent assigns non-overlapping file sets to each sub-agent based on the task's predicted files. If files cannot be cleanly partitioned, sub-agents run sequentially.
Output aggregation: Parent aggregates all sub-agent outputs into a single structured comment before posting [BUILD:COMPLETE] or failure tags. This preserves context for retries.

Safety & Sanitization

Input Sanitization

Every stage execution begins with an input sanitization check. The worker scans task descriptions, research results, and external inputs for:

Instruction override patterns ("ignore all previous instructions", "disregard above")
Base64-encoded command sequences
Suspicious URLs or redirect chains
Attempts to read environment variables or secrets

If detected: [SECURITY:WARNING] is posted and the task moves to Needs Human.

Verification Command Allowlist

Verification commands in task descriptions are restricted to known-safe patterns:

npm test, npm run <script>, npx <tool>
pytest, python -m pytest
go test, cargo test, mix test
curl localhost:<port> (local only)
Custom commands explicitly allowlisted in project configuration

Commands containing pipes to sh, eval, exec, or network calls to external hosts are rejected.

LEARNINGS.md as Injection Vector

LEARNINGS.md is written by the system (retrospective step) and read by all workers. A compromised or manipulated LEARNINGS.md could inject instructions into every subsequent build. Mitigations:

LEARNINGS.md is capped at 50 lines (limits blast radius)
Only the orchestrator's retrospective step writes to LEARNINGS.md
Workers read LEARNINGS.md as reference data, not as executable instructions
Patterns follow a strict format; anything not matching the format is ignored

Cost Model

AgentFlow tracks costs per task using dual cost profiles:

Sonnet Profile (default, recommended)

Stage	Without Superpowers	With Superpowers (M)	With Superpowers (L)
Research	~$0.10	~$0.10	~$0.10
Build	~$0.40	~$0.80	~$1.20
Review	~$0.10	~$0.10	~$0.10
Test	~$0.05	~$0.05	~$0.05
Integrate	~$0.03	~$0.03	~$0.03

Guardrails: Warning at $3, Hard stop at $10

Opus Profile (for complex projects)

Stage	Without Superpowers	With Superpowers (M)	With Superpowers (L)
Research	~$1.00	~$1.00	~$1.50
Build	~$3.00	~$5.00	~$8.00
Review	~$0.50	~$0.50	~$0.50
Test	~$1.00	~$1.00	~$1.00
Integrate	~$0.25	~$0.25	~$0.25

Guardrails: Warning at $8, Hard stop at $20

Orchestrator cost (crontab):

Default (*/15): ~$48/day with Opus, ~$10/day with Sonnet (recommended)
Sprint mode (*/5): ~$144/day with Opus, ~$30/day with Sonnet
Idle optimization: consecutive idle sweeps double the interval (15 -> 30 -> 60 min)

Expected cost per sprint (14 tasks):

Sonnet profile, no Superpowers: ~$44-60
Sonnet profile, with Superpowers: ~$60-100
Opus profile, with Superpowers: ~$120-200

Adapter Architecture

AgentFlow uses adapters to abstract the PM tool interface:

graph TB
    CORE["AgentFlow Core<br/>(skills + prompts + conventions)"] --> AI["Adapter Interface"]
    AI --> ASANA["Asana Adapter (MCP)<br/>✅ available"]
    AI --> GH["GitHub Projects Adapter<br/>📋 planned"]
    AI --> LIN["Linear Adapter<br/>📋 planned"]
    AI --> JIRA["Jira Adapter<br/>📋 planned"]

Each adapter maps these operations to the specific PM tool's API. The core skills and prompts never reference a specific tool — they use the adapter interface.

Security Model

No secrets in code: Mock values in tests, process.env.X in implementation
No force-push: Integration failures create revert commits
No unreviewed merges: Every PR goes through deterministic gates + adversarial AI review
Cost containment: Automatic hard stops prevent runaway spending
Scope containment: PR file changes compared against predicted files
Worker isolation: Each worker operates in its own git worktree

Failure Modes and Recovery

Failure	Detection	Recovery
Worker crashes mid-build	No heartbeat for 10 min	Orchestrator reassigns to different slot
Integration breaks main	Tests fail after merge	Auto-revert via `git revert` (new commit)
Task is impossible	2+ build failures	`[BUILD:BLOCKED]` → Needs Human
Spec changes mid-sprint	SHA-256 hash mismatch	All tasks flagged `[NEEDS:REVALIDATION]`
Cost runaway	Per-task tracking	Warning at threshold, hard stop at ceiling
All slots busy	Orchestrator checks availability	Tasks wait in Backlog until slot frees
Circular dependencies	Topological sort at decomposition	Blocked before tasks are created
Shared file conflicts	Predicted files comparison	Parallel tasks serialized
Concurrent merges	`[MERGE_LOCK]` on Status task	Second merge waits, retries after lock release
Dual sweep collision	`[SWEEP:RUNNING]` timestamp check	Second sweep exits immediately if recent sweep active
Git revert fails	`git revert` exits non-zero	`[INTEGRATE:REVERT_FAILED]` → Needs Human (manual fix required)
Crontab dies silently	`[LAST_SWEEP]` timestamp >30 min old	External watchdog sends notification
Stale worktrees accumulate	Task moves to Done	Worktree cleaned up on Done transition
Review ping-pong	Minor-only issues on retry 2+	`[REVIEW:PASS_WITH_NOTES]` allows proceed with suggestions
Prompt injection in inputs	Sanitization check at stage entry	`[SECURITY:WARNING]` → Needs Human
LEARNINGS.md overflow	Line count check on write	Oldest patterns rotated out, cap at 50 lines

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

AgentFlow Architecture

Core Principle

v2 Architecture: Dual-Mode

Standalone Mode (v1)

Plugin Mode (v2)

What Changes in Plugin Mode

Hooks: Infrastructure-Level Gates

System Components

1. The Orchestrator (`/sdlc-orchestrate`)

2. Workers (`/sdlc-worker --slot T<N>`)

3. The Kanban Board (State Machine)

4. Deterministic Quality Gates

5. Feedback Loops

6. System-Level Learning

Data Flow

Superpowers Integration Layer

Two-Layer Model

Complexity Gating

Sub-Agent Management

Safety & Sanitization

Input Sanitization

Verification Command Allowlist

LEARNINGS.md as Injection Vector

Cost Model

Sonnet Profile (default, recommended)

Opus Profile (for complex projects)

Adapter Architecture

Security Model

Failure Modes and Recovery

FilesExpand file tree

architecture.md

Latest commit

History

architecture.md

File metadata and controls

AgentFlow Architecture

Core Principle

v2 Architecture: Dual-Mode

Standalone Mode (v1)

Plugin Mode (v2)

What Changes in Plugin Mode

Hooks: Infrastructure-Level Gates

System Components

1. The Orchestrator (/sdlc-orchestrate)

2. Workers (/sdlc-worker --slot T<N>)

3. The Kanban Board (State Machine)

4. Deterministic Quality Gates

5. Feedback Loops

6. System-Level Learning

Data Flow

Superpowers Integration Layer

Two-Layer Model

Complexity Gating

Sub-Agent Management

Safety & Sanitization

Input Sanitization

Verification Command Allowlist

LEARNINGS.md as Injection Vector

Cost Model

Sonnet Profile (default, recommended)

Opus Profile (for complex projects)

Adapter Architecture

Security Model

Failure Modes and Recovery

1. The Orchestrator (`/sdlc-orchestrate`)

2. Workers (`/sdlc-worker --slot T<N>`)