A 6-phase pipeline that coordinates multiple AI agents with deep reasoning, parallel execution, and automated quality gates.
Don't just spawn agents. Orchestrate them — with planning, verification, and quality control.
Most multi-agent frameworks just "spawn N agents and hope for the best":
- No planning phase — agents duplicate work or miss requirements
- No quality control — garbage in, garbage out, at scale
- Fixed agent count — overkill for simple tasks, insufficient for complex ones
- No synthesis — results from parallel agents are dumped, not integrated
- No error recovery — if one agent fails, everything fails
A 6-phase pipeline, preceded by an initial analysis step (Phase 0), that adapts to task complexity:
| Phase 0 | Phase 1 | Phase 2 | Phase 3 | Phase 4 | Phase 5 | Phase 6 |
|---|---|---|---|---|---|---|
| ANALYZE → | RESEARCH → | ARCHITECT → | EXECUTE → | SYNTHESIZE → | QUALITY → | DELIVER |
| (deep) | (parallel) | (plan) | (parallel) | (merge) | (gate) | (fix) |
The orchestrator classifies your task, scales the agent count, coordinates parallel execution, merges results, and runs a fresh-eyes quality gate before delivery.
| Feature | Description |
|---|---|
| Adaptive Complexity | Auto-classifies tasks as LIGHT (1 agent) / MEDIUM (2-3) / HEAVY (3-5) / WRITING (specialized) |
| Deep Reasoning | Extended thinking (ultrathink) at every critical decision point |
| Parallel Execution | Independent subtasks run simultaneously — no wasted time |
| Quality Gate | Fresh-eyes critic agent reviews all output before delivery |
| Auto-Recovery | Failed agents retry once; complexity auto-adjusts up or down |
| Domain Aware | Specialized patterns for CODE, WRITING, ANALYSIS, RESEARCH, DEBUG |
| Skill Integration | Can invoke other skills within worker agents |
# Copy SKILL.md to your Claude Code skills directory
cp SKILL.md /path/to/your/skills/deep-work/SKILL.md

Add to your settings.json or project configuration:
{
"skills": {
"deep-work": {
"path": "/path/to/your/skills/deep-work/SKILL.md"
}
}
}

> deep work: implement a REST API with authentication, rate limiting, and database migrations
> deep mode: analyze this codebase for security vulnerabilities
> max mode: write a comprehensive literature review on transformer architectures
Trigger phrases: deep work, deep mode, max mode
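To illustrate how a prompt might be split into a trigger and a task, here is a minimal Python sketch. It assumes the `phrase: task` shape used in the examples above. The helper name `split_trigger` is hypothetical, and the skill's actual matching logic is not documented here.

```python
# Trigger phrases as listed above; matching is case-insensitive in this sketch.
TRIGGER_PHRASES = ("deep work", "deep mode", "max mode")


def split_trigger(prompt: str):
    """Return (trigger, task) if the prompt starts with '<trigger>:', else None."""
    text = prompt.strip()
    lowered = text.lower()
    for phrase in TRIGGER_PHRASES:
        prefix = phrase + ":"
        if lowered.startswith(prefix):
            # Everything after the colon is treated as the task description.
            return phrase, text[len(prefix):].strip()
    return None
```

For example, `split_trigger("deep mode: analyze this codebase")` yields the pair `("deep mode", "analyze this codebase")`, while a prompt without a trigger yields `None`.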
Phase 0: ANALYZE
Input: Your task description
Output: Task classification + complexity assessment + execution plan
- Uses extended thinking for thorough analysis
- Classifies: CODE | WRITING | ANALYSIS | RESEARCH | DEBUG
- Assesses: LIGHT | MEDIUM | HEAVY | WRITING
- LIGHT tasks skip directly to Phase 3 (no overhead)
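One way to picture the Phase 0 output is as a small typed record. The sketch below is an assumption about shape only: the names `Domain`, `Complexity`, `Analysis`, and `from_labels` are hypothetical and do not come from `SKILL.md`.

```python
from dataclasses import dataclass
from enum import Enum


class Domain(Enum):
    CODE = "CODE"
    WRITING = "WRITING"
    ANALYSIS = "ANALYSIS"
    RESEARCH = "RESEARCH"
    DEBUG = "DEBUG"


class Complexity(Enum):
    LIGHT = "LIGHT"
    MEDIUM = "MEDIUM"
    HEAVY = "HEAVY"
    WRITING = "WRITING"


@dataclass(frozen=True)
class Analysis:
    """Hypothetical container for the three Phase 0 outputs."""
    domain: Domain
    complexity: Complexity
    plan: str

    @classmethod
    def from_labels(cls, domain: str, complexity: str, plan: str) -> "Analysis":
        # Enum lookup by value raises ValueError for labels outside the lists above.
        return cls(Domain(domain.upper()), Complexity(complexity.upper()), plan)
```

Using enums here means an unexpected label fails loudly at the start of the pipeline instead of being passed on to later phases.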
Phase 1: RESEARCH
Agents: 1-4 Explore subagents (fast model)
Purpose: Gather context before execution
CODE → scan codebase, find patterns, read docs
WRITING → search literature, find sources, check style guides
ANALYSIS → gather data, find precedents, check methodologies
DEBUG → reproduce bug, check logs, trace call stack
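The domain-to-focus mapping above can be sketched as a lookup table driving a parallel fan-out. This is an illustration, not the skill's implementation: `explore` is a hypothetical stand-in for an Explore subagent, and a thread pool is only one way to model the parallelism.

```python
from concurrent.futures import ThreadPoolExecutor

# Research focus per domain, copied from the list above.
RESEARCH_FOCUS = {
    "CODE": ["scan codebase", "find patterns", "read docs"],
    "WRITING": ["search literature", "find sources", "check style guides"],
    "ANALYSIS": ["gather data", "find precedents", "check methodologies"],
    "DEBUG": ["reproduce bug", "check logs", "trace call stack"],
}


def explore(focus: str) -> str:
    # Hypothetical stand-in for one Explore subagent.
    return f"notes on: {focus}"


def run_research(domain: str, max_agents: int = 4) -> list[str]:
    """Run one explorer per focus area in parallel, capped at max_agents."""
    focuses = RESEARCH_FOCUS[domain][:max_agents]
    if not focuses:
        return []
    with ThreadPoolExecutor(max_workers=len(focuses)) as pool:
        # pool.map returns results in the same order as the inputs.
        return list(pool.map(explore, focuses))
```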
Phase 2: ARCHITECT
Agents: 1 Planning agent (strongest model)
Purpose: Design the execution strategy
Output:
- Subtask definitions with clear boundaries
- Dependency map (what must run sequentially vs. parallel)
- Risk assessment
- Verification criteria
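A dependency map like the one described above can be turned into parallel batches with a topological sort. The sketch below uses Python's standard `graphlib` module. The subtask names are invented for illustration, and nothing here is taken from the planning agent's actual output format.

```python
from graphlib import TopologicalSorter


def parallel_batches(dependencies: dict[str, set[str]]) -> list[list[str]]:
    """Group subtasks into batches; every task in a batch can run in parallel.

    `dependencies` maps each subtask to the set of subtasks it must wait for.
    """
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted only for stable output
        batches.append(ready)
        sorter.done(*ready)
    return batches


# Hypothetical subtasks for the REST API example.
deps = {
    "schema": set(),
    "auth": {"schema"},
    "rate_limit": {"schema"},
    "integration_tests": {"auth", "rate_limit"},
}
print(parallel_batches(deps))
# [['schema'], ['auth', 'rate_limit'], ['integration_tests']]
```

Each inner list is one round of Phase 3 workers; a later round starts only after the previous one finishes.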
Phase 3: EXECUTE
Agents: 1-5 Worker agents (strongest model + extended thinking)
Purpose: Do the actual work
Each worker gets:
- Subtask scope and boundaries
- Context from Phase 1 research
- Inputs from Phase 2 architecture
- Expected output format
- Success criteria
- Rules: stay in scope, don't duplicate, flag blockers
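The worker inputs listed above map naturally onto a small record, and the "retry once" recovery rule onto a thin wrapper. Both `WorkerBrief` and `run_with_retry` are hypothetical names, and this sketch assumes a worker signals failure by raising an exception.

```python
from dataclasses import dataclass


@dataclass
class WorkerBrief:
    """Hypothetical bundle of everything one worker receives."""
    scope: str
    research_context: str        # from Phase 1
    architecture_inputs: str     # from Phase 2
    output_format: str
    success_criteria: list[str]
    rules: tuple[str, ...] = ("stay in scope", "don't duplicate", "flag blockers")


def run_with_retry(worker, brief: WorkerBrief, retries: int = 1):
    """Call worker(brief); on failure retry up to `retries` times, then re-raise."""
    attempt = 0
    while True:
        try:
            return worker(brief)
        except Exception:
            if attempt >= retries:
                raise
            attempt += 1
```

With the default `retries=1`, a worker is attempted at most twice, which matches the single retry described in the feature table.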
Phase 4: SYNTHESIZE
Purpose: Merge all worker outputs into a coherent result
- Collect all outputs
- Check consistency across agents
- Merge non-overlapping work
- Resolve conflicts (if agents disagree)
- Fill gaps
- Verify completeness
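The merge steps above can be sketched as a function that separates agreed-upon artifacts from conflicts and gaps. This is a simplification under stated assumptions: worker outputs are modeled as `{artifact_name: content}` dictionaries, and "conflict" means two workers produced different content for the same artifact. Actual conflict resolution and gap filling would need a further agent step.

```python
from collections import defaultdict


def synthesize(outputs: dict[str, dict[str, str]], required: set[str]):
    """Return (merged, conflicts, gaps) for a set of worker outputs.

    outputs:  worker name -> {artifact name: content}
    required: artifact names the plan expects to exist
    """
    by_artifact = defaultdict(dict)
    for worker, artifacts in outputs.items():
        for name, content in artifacts.items():
            by_artifact[name][worker] = content

    merged, conflicts = {}, {}
    for name, versions in by_artifact.items():
        if len(set(versions.values())) == 1:
            merged[name] = next(iter(versions.values()))  # all workers agree
        else:
            conflicts[name] = versions                    # needs resolution

    gaps = required - set(by_artifact)                    # nobody produced these
    return merged, conflicts, gaps
```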
Phase 5: QUALITY
Agents: 1 Critic agent (strongest model, fresh context)
Purpose: Fresh-eyes review of everything
Universal checks:
✓ Completeness — all requirements addressed?
✓ Correctness — no errors, bugs, or inaccuracies?
✓ Quality — meets professional standards?
✓ Expertise — would a domain expert approve?
✓ Elegance — is there unnecessary complexity?
Domain-specific checks:
CODE → tests pass, no security issues, clean architecture
WRITING → coherent, well-cited, proper structure
ANALYSIS → data-backed, methodology sound, conclusions supported
Returns: PASS or NEEDS_FIX (with severity-rated issues)
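A critic result with severity-rated issues could be modeled as below. The severity scale (`high`, `medium`, `low`) and the class names are assumptions made for this sketch; the skill does not specify them here.

```python
from dataclasses import dataclass, field

# Assumed severity scale; lower rank sorts first.
SEVERITY_RANK = {"high": 0, "medium": 1, "low": 2}


@dataclass
class Issue:
    check: str        # e.g. "Completeness" or "Correctness"
    severity: str     # one of the SEVERITY_RANK keys
    description: str


@dataclass
class Review:
    issues: list[Issue] = field(default_factory=list)

    @property
    def verdict(self) -> str:
        # Any open issue turns the verdict into NEEDS_FIX.
        return "NEEDS_FIX" if self.issues else "PASS"

    def most_severe_first(self) -> list[Issue]:
        return sorted(self.issues, key=lambda issue: SEVERITY_RANK[issue.severity])
```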
Phase 6: DELIVER
- If PASS: deliver final output with structured summary
- If NEEDS_FIX: fix issues (max 2 iterations), then deliver
Delivery includes:
- Task summary
- Complexity classification
- Number of agents used
- Quality verdict
- What was done
- Key decisions made
- Verification evidence
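The PASS / NEEDS_FIX handling can be sketched as a bounded loop. This assumes the critic re-reviews after each fix, which the description above does not state explicitly. `review` and `fix` are hypothetical callables standing in for the critic and fixer agents.

```python
def quality_loop(draft, review, fix, max_fix_iterations: int = 2) -> dict:
    """Review a draft, fixing at most `max_fix_iterations` times before delivery.

    review(draft) -> list of issues (an empty list means PASS)
    fix(draft, issues) -> revised draft
    """
    iterations = 0
    issues = review(draft)
    while issues and iterations < max_fix_iterations:
        draft = fix(draft, issues)
        iterations += 1
        issues = review(draft)
    return {
        "output": draft,
        "quality_verdict": "PASS" if not issues else "NEEDS_FIX",
        "fix_iterations": iterations,
        "open_issues": issues,
    }
```

The loop always terminates and always delivers: if issues remain after two fix rounds, they are reported in the summary instead of blocking the output.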
| Complexity | Research Agents | Architect | Workers | Critic | Example |
|---|---|---|---|---|---|
| LIGHT | 0 | No | 1 | 1 | Fix a typo, add a function |
| MEDIUM | 2 | No | 2-3 | 1 | Implement a feature, write a report |
| HEAVY | 3-4 | Yes | 3-5 | 1 | System refactor, research paper |
| WRITING | Specialized | No | Writers | Reviewer | Literature review, grant proposal |
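The numeric rows of the table above can be encoded as a lookup to estimate how many agents a run may use. The WRITING row is left out because the table gives no counts for it. The structure and the function name `agent_range` are assumptions for illustration.

```python
# (min, max) agent counts per role, taken from the table above.
AGENT_PLAN = {
    "LIGHT":  {"research": (0, 0), "architect": (0, 0), "workers": (1, 1), "critic": (1, 1)},
    "MEDIUM": {"research": (2, 2), "architect": (0, 0), "workers": (2, 3), "critic": (1, 1)},
    "HEAVY":  {"research": (3, 4), "architect": (1, 1), "workers": (3, 5), "critic": (1, 1)},
}


def agent_range(complexity: str) -> tuple[int, int]:
    """Return the (minimum, maximum) total number of agents for a complexity level."""
    roles = AGENT_PLAN[complexity].values()
    return sum(low for low, _ in roles), sum(high for _, high in roles)


print(agent_range("HEAVY"))  # (8, 11)
```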
| Feature | This Tool | Simple Agent Spawn | CrewAI | AutoGen | LangGraph |
|---|---|---|---|---|---|
| Adaptive agent count | ✅ Auto | ❌ Fixed | ❌ Fixed | ❌ Fixed | ❌ Manual |
| Deep reasoning phase | ✅ | ❌ | ❌ | ❌ | ❌ |
| Quality gate | ✅ Fresh-eyes | ❌ | ❌ | ❌ | |
| Auto-recovery | ✅ | ❌ | |||
| Parallel execution | ✅ | ✅ | ✅ | ✅ | ✅ |
| Synthesis step | ✅ | ❌ | ❌ | ❌ | |
| Task classification | ✅ 5 types | ❌ | ❌ | ❌ | ❌ |
| Zero config | ✅ | ✅ | ❌ | ❌ | ❌ |
graph TD
A[Task Input] --> B[Phase 0: Deep Analysis]
B --> C{Complexity?}
C -->|LIGHT| F[Phase 3: Single Worker]
C -->|MEDIUM| D[Phase 1: 2 Research Agents]
C -->|HEAVY| D2[Phase 1: 3-4 Research Agents]
C -->|WRITING| D3[Phase 1: Literature Search]
D --> F2[Phase 3: 2-3 Workers]
D2 --> E[Phase 2: Architect]
D3 --> F3[Phase 3: Specialized Writers]
E --> F4[Phase 3: 3-5 Workers]
F --> G[Phase 4: Synthesis]
F2 --> G
F3 --> G
F4 --> G
G --> H[Phase 5: Quality Gate]
H -->|PASS| I[Phase 6: Deliver]
H -->|NEEDS_FIX| J[Fix Issues]
J --> H
I --> K[Structured Output]
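
Read as a routing table, the graph above determines which phases run for each complexity level. The following Python sketch encodes those routes; the names `PHASE_NAMES`, `ROUTES`, and `route` are hypothetical.

```python
PHASE_NAMES = ["ANALYZE", "RESEARCH", "ARCHITECT", "EXECUTE",
               "SYNTHESIZE", "QUALITY", "DELIVER"]

# Phase numbers visited per complexity level, following the graph edges.
ROUTES = {
    "LIGHT":   [0, 3, 4, 5, 6],           # skips research and architecture
    "MEDIUM":  [0, 1, 3, 4, 5, 6],        # research, but no architect
    "HEAVY":   [0, 1, 2, 3, 4, 5, 6],     # every phase
    "WRITING": [0, 1, 3, 4, 5, 6],        # literature search, then writers
}


def route(complexity: str) -> list[str]:
    """Return the ordered, labeled phases for a complexity level."""
    return [f"Phase {n}: {PHASE_NAMES[n]}" for n in ROUTES[complexity]]
```

Only HEAVY tasks pass through Phase 2, which matches the Architect column of the scaling table.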
This tool requires no API keys or external services. It runs entirely within your Claude Code environment.
To customize behavior:
- Modify complexity thresholds in `SKILL.md`
- Add domain-specific worker templates
- Integrate additional skills (e.g., `humanizer`, `analytical-depth`, `research-intern`)
- Complex Feature Implementation — breaks down the feature, implements in parallel, verifies
- Research & Writing — parallel literature search → structured writing → quality review
- Codebase Refactoring — analyzes dependencies, plans changes, executes safely
- Bug Investigation — reproduces, traces, identifies root cause, fixes, verifies
- Data Analysis — gathers data from multiple sources, analyzes in parallel, synthesizes
Contributions welcome! Open an issue or submit a PR.
MIT License — see LICENSE for details.