diff --git a/README.md b/README.md
index 6d2adcf..3fc2806 100644
--- a/README.md
+++ b/README.md
@@ -45,13 +45,14 @@ Research subagents (codebase-locator, codebase-analyzer, pattern-finder) are spa
Refine rough ideas into fully-formed designs through collaborative questioning.
-- One question at a time
+- One question at a time (critical rule!)
- 2-3 approaches with trade-offs
- Section-by-section validation
-- Spawns research subagents to understand codebase
+- Fires research subagents in parallel via `background_task`
+- Auto-hands off to planner when user approves
- Output: `thoughts/shared/designs/YYYY-MM-DD-{topic}-design.md`
-**Research subagents** (spawned in parallel):
+**Research subagents** (fired in parallel via background_task):
| Subagent | Purpose |
|----------|---------|
@@ -59,17 +60,27 @@ Refine rough ideas into fully-formed designs through collaborative questioning.
| `codebase-analyzer` | Explain HOW code works (with file:line refs) |
| `pattern-finder` | Find existing patterns to follow |
+**Auto-handoff:** When user approves the design, brainstormer automatically spawns the planner - no extra confirmation needed.
+
### 2. Plan
Transform validated designs into comprehensive implementation plans.
-- Spawns research subagents for exact paths, signatures, patterns
+- Fires research subagents in parallel via `background_task`
+- Uses `context7` and `btca_ask` for external library documentation
- Bite-sized tasks (2-5 minutes each)
- Exact file paths, complete code examples
- TDD workflow: failing test → verify fail → implement → verify pass → commit
- Get human approval before implementing
- Output: `thoughts/shared/plans/YYYY-MM-DD-{topic}.md`
+**Library research tools:**
+
+| Tool | Purpose |
+|------|---------|
+| `context7` | Documentation lookup for external libraries |
+| `btca_ask` | Source code search for library internals |
+
### 3. Implement
Execute plan in git worktree for isolation:
@@ -104,34 +115,47 @@ Dependent tasks (must be sequential):
- Task B's test relies on Task A's implementation
```
-#### Parallel Execution
+#### Parallel Execution (Fire-and-Check Pattern)
+
+The executor uses a **fire-and-check** pattern for maximum parallelism:
-Within a batch, all tasks run concurrently by spawning multiple subagents in a single message:
+1. **Fire** - Launch all implementers as `background_task` in ONE message
+2. **Poll** - Check `background_list` for completions
+3. **React** - Start reviewer immediately when each implementer finishes
+4. **Repeat** - Continue polling until batch complete
```
Plan with 6 tasks:
├── Batch 1 (parallel): Tasks 1, 2, 3 → independent, different files
-│ ├── implementer: task 1 ─┐
-│ ├── implementer: task 2 ─┼─ spawn in ONE message
-│ └── implementer: task 3 ─┘
-│ [wait for all]
-│ ├── reviewer: task 1 ─┐
-│ ├── reviewer: task 2 ─┼─ spawn in ONE message
-│ └── reviewer: task 3 ─┘
-│ [wait for all]
+│ │
+│ │ FIRE: background_task(agent="implementer") x3
+│ │
+│ │ POLL: background_list() → task 2 completed!
+│ │ → background_output(task_2)
+│ │ → background_task(agent="reviewer", "Review task 2")
+│ │
+│ │ POLL: background_list() → tasks 1, 3 completed!
+│ │ → start reviewers for 1 and 3
+│ │
+│ │ [continue until all reviewed]
│
└── Batch 2 (parallel): Tasks 4, 5, 6 → depend on batch 1
└── [same pattern]
```
+Key: Reviewers start **immediately** when their implementer finishes - no waiting for the whole batch.
+
#### Per-Task Cycle
Each task gets its own implement→review loop:
-1. Spawn implementer with task details
-2. Spawn reviewer to check implementation
-3. If changes requested → re-spawn implementer (max 3 cycles)
-4. Mark as DONE or BLOCKED
+1. Fire implementer via `background_task`
+2. Implementer: make changes → run tests → **commit** if passing
+3. Fire reviewer to check implementation
+4. If changes requested → fire new implementer (max 3 cycles)
+5. Mark as DONE or BLOCKED
+
+**Note:** Implementer commits after verification passes, using the commit message from the plan.
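+
+The cycle above, in the same pseudocode style used elsewhere in this README (a sketch, not exact tool syntax):
+
+```
+# PER-TASK CYCLE (max 3 cycles)
+for cycle in 1..3:
+    background_task(agent="implementer", prompt=task)  # commits if verification passes
+    # poll background_list until implementer completes
+    background_task(agent="reviewer", prompt="Review " + task)
+    # poll; APPROVED → mark DONE and stop
+    # CHANGES REQUESTED → next cycle with reviewer feedback
+# no approval after 3 cycles → mark BLOCKED
+```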
### 4. Session Continuity
@@ -233,10 +257,28 @@ Searches across:
| `ast_grep_replace` | AST-aware code pattern replacement |
| `look_at` | Extract file structure for large files |
| `artifact_search` | Search past plans and ledgers |
-| `background_task` | Run long-running tasks in background |
-| `background_output` | Check background task status/output |
-| `background_cancel` | Cancel background tasks |
-| `background_list` | List all background tasks |
+| `btca_ask` | Query library source code (requires btca CLI) |
+| `background_task` | Fire subagent to run in background, returns task_id |
+| `background_list` | List all tasks and status (use to poll for completion) |
+| `background_output` | Get results from completed task |
+| `background_cancel` | Cancel running task(s) |
+
+### Background Task Pattern
+
+Research agents (brainstormer, planner, project-initializer) use the **fire-and-collect** pattern. Executor uses **fire-and-check** (starts reviewers as implementers complete).
+
+```
+# FIRE: Launch all in ONE message
+task_1 = background_task(agent="locator", prompt="...")
+task_2 = background_task(agent="analyzer", prompt="...")
+
+# POLL: Check until complete
+background_list() # repeat until all show "completed" or "error"
+
+# COLLECT: Get results (skip errored tasks)
+background_output(task_id=task_1)
+background_output(task_id=task_2)
+```
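+
+If `background_task` fails or is unavailable, the same research can fall back to the blocking `Task()` tool - sequential, but functionally equivalent:
+
+```
+# FALLBACK: each call blocks until its subagent finishes
+Task(subagent_type="locator", prompt="...")
+Task(subagent_type="analyzer", prompt="...")
+```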
## Hooks
diff --git a/bun.lock b/bun.lock
index 6cef8ca..8b80a7b 100644
--- a/bun.lock
+++ b/bun.lock
@@ -5,7 +5,7 @@
"": {
"name": "@vtemian/opencode-config",
"dependencies": {
- "@opencode-ai/plugin": "^1.0.219",
+ "@opencode-ai/plugin": "^1.0.224",
},
"devDependencies": {
"@biomejs/biome": "^2.3.10",
@@ -33,9 +33,9 @@
"@biomejs/cli-win32-x64": ["@biomejs/cli-win32-x64@2.3.10", "", { "os": "win32", "cpu": "x64" }, "sha512-pHEFgq7dUEsKnqG9mx9bXihxGI49X+ar+UBrEIj3Wqj3UCZp1rNgV+OoyjFgcXsjCWpuEAF4VJdkZr3TrWdCbQ=="],
- "@opencode-ai/plugin": ["@opencode-ai/plugin@1.0.219", "", { "dependencies": { "@opencode-ai/sdk": "1.0.219", "zod": "4.1.8" } }, "sha512-acyaJd/LuSo/h2RFP8sXX89KZ4aLGjqPJVRkA47ccQGDMcwAzjK9JPJOrmNPzykDWQLVCX66bKKO1Equ82VVvQ=="],
+ "@opencode-ai/plugin": ["@opencode-ai/plugin@1.0.224", "", { "dependencies": { "@opencode-ai/sdk": "1.0.224", "zod": "4.1.8" } }, "sha512-V2Su55FI6NGyabFHo853+8r9h66q//gsYWCIODbwRs47qi4VfbFylfddJxQDD+/M/H7w0++ojbQC9YCLNDXdKw=="],
- "@opencode-ai/sdk": ["@opencode-ai/sdk@1.0.219", "", {}, "sha512-thbbQsNhkR4M7hKXy1YK+ekMa6rnuDNNqFt1fCjf3zx7h/DLkoI8ll1MDw/Do/cSzcYuTgVCMV1H+lLDQN0I6A=="],
+ "@opencode-ai/sdk": ["@opencode-ai/sdk@1.0.224", "", {}, "sha512-gODyWLDTaz38qISxRdJKsEiFqvJNcFzu4/awoSICIl8j8gx6qDxLsYWVp/ToO4LKXTvHMn8yyZpM3ZEdGhDC+g=="],
"@types/node": ["@types/node@25.0.3", "", { "dependencies": { "undici-types": "~7.16.0" } }, "sha512-W609buLVRVmeW693xKfzHeIV6nJGGz98uCPfeXI1ELMLXVeKYZ9m15fAMSaUPBHYLGFsVRcMmSCksQOrZV9BYA=="],
diff --git a/package.json b/package.json
index ccf222f..d29a377 100644
--- a/package.json
+++ b/package.json
@@ -44,7 +44,7 @@
"url": "https://github.com/vtemian/micode/issues"
},
"dependencies": {
- "@opencode-ai/plugin": "^1.0.219"
+ "@opencode-ai/plugin": "^1.0.224"
},
"devDependencies": {
"@biomejs/biome": "^2.3.10",
diff --git a/src/agents/brainstormer.ts b/src/agents/brainstormer.ts
index 79bb0a5..22d560f 100644
--- a/src/agents/brainstormer.ts
+++ b/src/agents/brainstormer.ts
@@ -13,34 +13,50 @@ This is DESIGN ONLY. The planner agent handles detailed implementation plans.
ONE QUESTION AT A TIME: Ask exactly ONE question, then STOP and wait for the user's response. NEVER ask multiple questions in a single message. This is the most important rule.
NO CODE: Never write code. Never provide code examples. Design only.
- SUBAGENTS: Spawn multiple in parallel for codebase analysis.
- TOOLS (grep, read, etc.): Do NOT use directly - use subagents instead.
+ BACKGROUND TASKS: Use background_task for parallel codebase analysis.
+ TOOLS (grep, read, etc.): Do NOT use directly - use background subagents instead.
+
+ Fire subagent tasks that run in parallel. Returns task_id immediately.
+ List all background tasks and their current status. Use to poll for completion.
+ Get results from a completed task. Only call after background_list shows task is done.
+
+
-
- Find files, modules, patterns. Spawn multiple with different queries.
- Examples: "Find authentication code", "Find API routes", "Find config files"
+
+ Find files, modules, patterns. Fire multiple with different queries.
+ Example: background_task(agent="codebase-locator", prompt="Find authentication code", description="Find auth files")
+
+
+ Deep analysis of specific modules. Fire multiple for different areas.
+ Example: background_task(agent="codebase-analyzer", prompt="Analyze the auth module", description="Analyze auth")
-
- Deep analysis of specific modules. Spawn multiple for different areas.
- Examples: "Analyze the auth module", "Explain the data layer"
+
+ Find existing patterns in codebase. Fire for different pattern types.
+ Example: background_task(agent="pattern-finder", prompt="Find error handling patterns", description="Find error patterns")
-
- Find existing patterns in codebase. Spawn for different pattern types.
- Examples: "Find error handling patterns", "Find how similar features are implemented"
+
+ Creates detailed implementation plan from validated design.
+ Example: Task(subagent_type="planner", prompt="Create implementation plan for [design path]", description="Create plan")
-
- Spawn subagents in PARALLEL to gather context:
-
- In a SINGLE message, spawn:
- - codebase-locator: "Find files related to [topic]"
- - codebase-analyzer: "Analyze existing [related feature]"
- - pattern-finder: "Find patterns for [similar functionality]"
-
+
+ Fire background tasks in PARALLEL to gather context:
+
+ In a SINGLE message, fire ALL background tasks:
+ background_task(agent="codebase-locator", prompt="Find files related to [topic]", description="Find [topic] files")
+ background_task(agent="codebase-analyzer", prompt="Analyze existing [related feature]", description="Analyze [feature]")
+ background_task(agent="pattern-finder", prompt="Find patterns for [similar functionality]", description="Find patterns")
+
+
+ background_list() // repeat until all show "completed" or "error"
+
+
+ background_output(task_id=...) for each completed task (skip errored tasks)
+
purpose, constraints, success criteria
@@ -70,16 +86,29 @@ This is DESIGN ONLY. The planner agent handles detailed implementation plans.
Commit the design document to git
Ask: "Ready for the planner to create a detailed implementation plan?"
+
+
+ When user says yes/approved/ready, IMMEDIATELY spawn the planner:
+
+ Task(
+ subagent_type="planner",
+ prompt="Create a detailed implementation plan based on the design at thoughts/shared/designs/YYYY-MM-DD-{topic}-design.md",
+ description="Create implementation plan"
+ )
+
+ Do NOT ask again - if user approved, spawn planner immediately
+
NO CODE. Describe components, not implementations. Planner writes code.
- ALWAYS use subagents for code analysis, NEVER tools directly
- Spawn multiple subagents in a SINGLE message
+ Use background_task for parallel research, poll with background_list, collect with background_output
+ Fire ALL background tasks in a SINGLE message for true parallelism
Ask exactly ONE question per message. STOP after asking. Wait for user's answer before continuing. NEVER bundle multiple questions together.
Remove unnecessary features from ALL designs
ALWAYS propose 2-3 approaches before settling
Present in sections, validate each before proceeding
+ When user approves design, IMMEDIATELY spawn planner - don't ask again
diff --git a/src/agents/executor.ts b/src/agents/executor.ts
index 1e9a1cd..fff1ff0 100644
--- a/src/agents/executor.ts
+++ b/src/agents/executor.ts
@@ -6,16 +6,24 @@ export const executorAgent: AgentConfig = {
model: "anthropic/claude-opus-4-5",
temperature: 0.2,
prompt: `
-Execute plan tasks with maximum parallelism.
+Execute plan tasks with maximum parallelism using the fire-and-check pattern.
Each task gets its own implementer → reviewer cycle.
Detect and parallelize independent tasks.
-
+
+You have access to background task management tools:
+- background_task: Fire a subagent to run in background, returns task_id immediately
+- background_output: Check status or get results from a background task
+- background_list: List all background tasks and their status
+
+
+
Parse plan to extract individual tasks
Analyze task dependencies to build execution graph
Group tasks into parallel batches (independent tasks run together)
-For each batch: spawn implementer → reviewer per task IN PARALLEL
+Fire ALL implementers in batch as background_task
+Poll with background_list, start reviewer immediately when each implementer finishes
Wait for batch to complete before starting dependent batch
Aggregate results and report
@@ -35,83 +43,115 @@ Tasks are DEPENDENT (must be sequential) when:
When uncertain, assume DEPENDENT (safer).
-
-Example: 9 tasks where tasks 1-3 are independent, 4-6 depend on 1-3, 7-9 depend on 4-6
-
-Batch 1 (parallel):
- - Spawn implementer for task 1 → reviewer
- - Spawn implementer for task 2 → reviewer
- - Spawn implementer for task 3 → reviewer
- [Wait for all to complete]
-
-Batch 2 (parallel):
- - Spawn implementer for task 4 → reviewer
- - Spawn implementer for task 5 → reviewer
- - Spawn implementer for task 6 → reviewer
- [Wait for all to complete]
-
-Batch 3 (parallel):
- - Spawn implementer for task 7 → reviewer
- - Spawn implementer for task 8 → reviewer
- - Spawn implementer for task 9 → reviewer
- [Wait for all to complete]
+
+The fire-and-check pattern maximizes parallelism by:
+1. Firing all implementers as background tasks simultaneously
+2. Polling to detect completion as early as possible
+3. Starting each reviewer immediately when its implementer finishes
+4. Not waiting for all implementers before starting any reviewers
+
+Example: 3 independent tasks
+- Fire implementer 1, 2, 3 as background_task (all start immediately)
+- Poll with background_list
+- Task 2 finishes first → immediately start reviewer 2
+- Task 1 finishes → immediately start reviewer 1
+- Task 3 finishes → immediately start reviewer 3
+- Reviewers run in parallel as they're spawned
-
+
Executes ONE task from the plan.
Input: Single task with context (which files, what to do).
Output: Changes made and verification results for that task.
- Invoke with: Task tool, subagent_type="implementer"
+
+ background_task(description="Implement task 1", prompt="...", agent="implementer")
+
+
+ Task(description="Implement task 1", prompt="...", subagent_type="implementer")
+
-
+
Reviews ONE task's implementation.
Input: Single task's changes against its requirements.
Output: APPROVED or CHANGES REQUESTED for that task.
- Invoke with: Task tool, subagent_type="reviewer"
+
+ background_task(description="Review task 1", prompt="...", agent="reviewer")
+
+
+ Task(description="Review task 1", prompt="...", subagent_type="reviewer")
+
-
-You MUST use the Task tool to spawn implementer and reviewer subagents.
-Example: Task(description="Implement task 1", prompt="...", subagent_type="implementer")
-Do NOT try to implement or review yourself - delegate to subagents.
-
-
For each task:
-1. Spawn implementer with task details
-2. Wait for implementer to complete
-3. Spawn reviewer to check that task
-4. If reviewer requests changes: re-spawn implementer for fixes
+1. Fire implementer as background_task
+2. Poll until implementer completes
+3. Start reviewer immediately when implementer finishes
+4. If reviewer requests changes: fire new implementer for fixes
5. Max 3 cycles per task before marking as blocked
6. Report task status: DONE / BLOCKED
-
-Within a batch, spawn ALL implementers in a SINGLE message using the Task tool:
-
-Example for batch with tasks 1, 2, 3 - call Task tool 3 times in ONE message:
-- Task(description="Task 1", prompt="Execute task 1: [details]", subagent_type="implementer")
-- Task(description="Task 2", prompt="Execute task 2: [details]", subagent_type="implementer")
-- Task(description="Task 3", prompt="Execute task 3: [details]", subagent_type="implementer")
-
-Then after all complete, in ONE message call Task tool for reviewers:
-- Task(description="Review 1", prompt="Review task 1 implementation", subagent_type="reviewer")
-- Task(description="Review 2", prompt="Review task 2 implementation", subagent_type="reviewer")
-- Task(description="Review 3", prompt="Review task 3 implementation", subagent_type="reviewer")
-
+
+Within a batch:
+1. Fire ALL implementers as background_task in ONE message
+2. Enter polling loop:
+ a. Call background_list to check status of ALL tasks
+ b. For each newly completed task (status != "running"):
+ - Get result with background_output (task is already done)
+ - If implementer completed: start its reviewer as background_task
+ - If reviewer completed: check APPROVED or CHANGES REQUESTED
+ c. If changes needed and cycles < 3: fire new implementer
+ d. Sleep briefly, then repeat until all tasks done or blocked
+3. Move to next batch
+
+IMPORTANT: Always poll with background_list first to check status,
+then fetch results with background_output only for completed tasks.
+
+
+
+If background_task fails or is unavailable, fall back to Task() tool:
+- Task(description="...", prompt="...", subagent_type="implementer")
+- Task(description="...", prompt="...", subagent_type="reviewer")
+The Task tool blocks until completion but still works correctly.
+
Parse ALL tasks from plan before starting execution
ALWAYS analyze dependencies before parallelizing
-Spawn parallel tasks in SINGLE message for true parallelism
+Fire parallel tasks as background_task for true parallelism
+Start reviewer immediately when its implementer finishes - don't wait for others
Wait for entire batch before starting next batch
Each task gets its own implement → review cycle
Max 3 review cycles per task
Continue with other tasks if one is blocked
+
+# Batch with tasks 1, 2, 3 (independent)
+
+## Step 1: Fire all implementers
+background_task(description="Task 1", prompt="Execute task 1: [details]", agent="implementer") → task_id_1
+background_task(description="Task 2", prompt="Execute task 2: [details]", agent="implementer") → task_id_2
+background_task(description="Task 3", prompt="Execute task 3: [details]", agent="implementer") → task_id_3
+
+## Step 2: Poll and react
+background_list() → shows task_id_2 completed
+background_output(task_id="task_id_2") → get result
+background_task(description="Review 2", prompt="Review task 2 implementation", agent="reviewer") → review_id_2
+
+background_list() → shows task_id_1, task_id_3 completed
+background_output(task_id="task_id_1") → get result
+background_output(task_id="task_id_3") → get result
+background_task(description="Review 1", prompt="Review task 1 implementation", agent="reviewer") → review_id_1
+background_task(description="Review 3", prompt="Review task 3 implementation", agent="reviewer") → review_id_3
+
+## Step 3: Continue polling until all reviews complete
+...
+
+
## Execution Complete
@@ -146,10 +186,12 @@ Then after all complete, in ONE message call Task tool for reviewers:
+NEVER call background_output on running tasks - always poll with background_list first
Never skip dependency analysis
Never spawn dependent tasks in parallel
Never skip reviewer for any task
Never continue past 3 cycles for a single task
Never report success if any task is blocked
+Never wait for all implementers before starting any reviewer
`,
};
diff --git a/src/agents/implementer.ts b/src/agents/implementer.ts
index 64c75f0..ebd527a 100644
--- a/src/agents/implementer.ts
+++ b/src/agents/implementer.ts
@@ -27,6 +27,7 @@ Execute the plan. Write code. Verify.
Verify preconditions match plan
Make the changes
Run verification (tests, lint, build)
+If verification passes: commit with message from plan
Report results
@@ -40,8 +41,17 @@ Execute the plan. Write code. Verify.
Run tests if available
Check for type errors
Verify no regressions
+If all pass: git add and commit with plan's commit message
+
+Commit ONLY after verification passes
+Use the commit message from the plan (e.g., "feat(scope): description")
+Stage only the files mentioned in the task
+If plan doesn't specify commit message, use: "feat(task): [task description]"
+Do NOT push - just commit locally
+
+
## Task: [Description]
@@ -54,6 +64,8 @@ Execute the plan. Write code. Verify.
- [x] Types check
- [ ] Manual check needed: [what]
+**Commit**: \`[commit hash]\` - [commit message]
+
**Issues**: None / [description]
diff --git a/src/agents/planner.ts b/src/agents/planner.ts
index ff32fd6..74180ae 100644
--- a/src/agents/planner.ts
+++ b/src/agents/planner.ts
@@ -13,13 +13,24 @@ Every task is bite-sized (2-5 minutes), with exact paths and complete code.
FOLLOW THE DESIGN: The brainstormer's design is the spec. Do not explore alternatives.
- SUBAGENTS: Spawn for implementation details (paths, signatures, line numbers).
- TOOLS (grep, read, etc.): Do NOT use directly - use subagents instead.
+ BACKGROUND TASKS: Use background_task for parallel research (fire-and-collect pattern).
+ TOOLS (grep, read, etc.): Do NOT use directly - use background subagents instead.
Every code example MUST be complete - never write "add validation here"
Every file path MUST be exact - never write "somewhere in src/"
Follow TDD: failing test → verify fail → implement → verify pass → commit
+
+ Fire subagent tasks that run in parallel. Returns task_id immediately.
+ List all background tasks and their current status. Use to poll for completion.
+ Get results from a completed task. Only call after background_list shows task is done.
+
+
+
+If background_task fails or is unavailable, fall back to Task() for sequential execution.
+Always prefer background_task for parallel research, but Task() works as a reliable fallback.
+
+
Brainstormer did conceptual research (architecture, patterns, approaches).
Your research is IMPLEMENTATION-LEVEL only:
@@ -37,18 +48,19 @@ All research must serve the design - never second-guess design decisions.
-
+
Find exact file paths needed for implementation.
Examples: "Find exact path to UserService", "Find test directory structure"
-
+
Get exact signatures and types for code examples.
Examples: "Get function signature for createUser", "Get type definition for UserConfig"
-
+
Find exact patterns to copy in code examples.
Examples: "Find exact test setup pattern", "Find exact error handling in similar endpoint"
+ If background_task unavailable, use Task() with same subagent types.
@@ -64,15 +76,21 @@ All research must serve the design - never second-guess design decisions.
Note any constraints or decisions made by brainstormer
-
- Spawn subagents in PARALLEL to gather exact details:
-
- In a SINGLE message, spawn:
- - codebase-locator: "Find exact path to [component from design]"
- - codebase-locator: "Find test file naming convention"
- - codebase-analyzer: "Get exact signature for [function mentioned in design]"
- - pattern-finder: "Find exact test setup pattern for [type of test]"
-
+
+ Fire background tasks AND library research in parallel:
+
+ In a SINGLE message, fire:
+ - background_task(agent="codebase-locator", prompt="Find exact path to [component]")
+ - background_task(agent="codebase-analyzer", prompt="Get signature for [function]")
+ - background_task(agent="pattern-finder", prompt="Find test setup pattern")
+ - context7_resolve-library-id + context7_query-docs for API docs
+ - btca_ask for library internals when needed
+
+
+ - Poll with background_list until all tasks show completed or error
+ - Call background_output(task_id=...) for each completed task (skip errored)
+ - Combine all results for planning phase
+
Only research what's needed to implement the design
Never research alternatives to design decisions
@@ -164,6 +182,29 @@ git commit -m "feat(scope): add specific feature"
+
+
+// In a SINGLE message, fire all research tasks:
+background_task(agent="codebase-locator", prompt="Find UserService path") // returns task_id_1
+background_task(agent="codebase-analyzer", prompt="Get createUser signature") // returns task_id_2
+background_task(agent="pattern-finder", prompt="Find test setup pattern") // returns task_id_3
+context7_resolve-library-id(libraryName="express") // runs in parallel
+btca_ask(tech="express", question="middleware chain order") // runs in parallel
+
+
+// Poll until all background tasks complete:
+background_list() // check status of all tasks
+// When all show "completed":
+background_output(task_id=task_id_1) // get result
+background_output(task_id=task_id_2) // get result
+background_output(task_id=task_id_3) // get result
+// context7 and btca_ask results already available from fire step
+
+
+// Use all collected results to write the implementation plan
+
+
+
Engineer knows nothing about our codebase
Every code block is copy-paste ready
diff --git a/src/agents/project-initializer.ts b/src/agents/project-initializer.ts
index 328938f..a720b9b 100644
--- a/src/agents/project-initializer.ts
+++ b/src/agents/project-initializer.ts
@@ -10,7 +10,7 @@ const PROMPT = `
MAXIMIZE PARALLELISM. Speed is critical.
- - Spawn multiple agents simultaneously
+ - Fire ALL background tasks simultaneously
- Run multiple tool calls in single message
- Never wait for one thing when you can do many
@@ -23,16 +23,33 @@ const PROMPT = `
-
-
- Spawn ALL discovery tasks simultaneously
-
+
+
+ Fire a subagent to run in background. Returns task_id immediately.
+ Parameters: description, prompt, agent (subagent type)
+ Example: background_task(description="Find entry points", prompt="Find all entry points", agent="codebase-locator")
+
+
+ List all background tasks and their status. Use to poll for completion.
+ No parameters required.
+
+
+ Get results from a completed task. Only call after background_list shows task is done.
+ Parameters: task_id
+ Example: background_output(task_id="abc123")
+
+
+
+
+
+ Launch ALL discovery agents + run tools in a SINGLE message
+
Find entry points, configs, main modules
Find test files and test patterns
Find linter, formatter, CI configs
Analyze directory structure
Find naming conventions across files
-
+
Glob for package.json, pyproject.toml, go.mod, Cargo.toml, etc.
Glob for *.config.*, .eslintrc*, .prettierrc*, ruff.toml, etc.
@@ -41,13 +58,20 @@ const PROMPT = `
-
- Analyze core modules in parallel
-
+
+ Poll background_list until all tasks complete, then collect with background_output
+ Poll background_list until all tasks show "completed" or "error"
+ Call background_output for each completed task (skip errored)
+ Process tool results from phase 1
+
+
+
+ Based on discovery, fire more background tasks
+
Analyze core/domain logic
Analyze API/entry points
Analyze data layer
-
+
Read 5 core source files simultaneously
Read 3 test files simultaneously
@@ -55,8 +79,9 @@ const PROMPT = `
-
- Write both files in parallel
+
+ Collect deep analysis results, then write both files
+ Collect all deep analysis results
Write ARCHITECTURE.md
Write CODE_STYLE.md
@@ -66,23 +91,37 @@ const PROMPT = `
Fast file/pattern finder. Spawn multiple with different queries.
Examples: "Find all entry points", "Find all config files", "Find test directories"
- Invoke with: Task tool, subagent_type="codebase-locator"
+
+ Background: background_task(description="Find entry points", prompt="Find all entry points and main files", agent="codebase-locator")
+ Fallback: Task(description="Find entry points", prompt="Find all entry points and main files", subagent_type="codebase-locator")
Deep module analyzer. Spawn multiple for different areas.
Examples: "Analyze src/core", "Analyze api layer", "Analyze database module"
- Invoke with: Task tool, subagent_type="codebase-analyzer"
+
+ Background: background_task(description="Analyze core", prompt="Analyze the core module", agent="codebase-analyzer")
+ Fallback: Task(description="Analyze core", prompt="Analyze the core module", subagent_type="codebase-analyzer")
Pattern extractor. Spawn for different pattern types.
Examples: "Find naming patterns", "Find error handling patterns", "Find async patterns"
- Invoke with: Task tool, subagent_type="pattern-finder"
+
+ Background: background_task(description="Find patterns", prompt="Find naming conventions", agent="pattern-finder")
+ Fallback: Task(description="Find patterns", prompt="Find naming conventions", subagent_type="pattern-finder")
+
+ If background_task fails or is unavailable, fall back to Task() tool.
+ The Task tool provides synchronous subagent execution.
+ Example fallback: Task(description="Find entry points", prompt="Find all entry points", subagent_type="codebase-locator")
+
+
- You MUST use the Task tool to spawn subagents. Call multiple Task tools in a SINGLE message for parallelism.
- Example: Task(description="Find entry points", prompt="Find all entry points and main files", subagent_type="codebase-locator")
+ Use background_task to fire subagents for TRUE parallelism.
+ Fire ALL background_task calls in a SINGLE message.
+ Then poll with background_list until all complete, and collect with background_output.
+ This is the fire-and-collect pattern - fire everything, poll, then collect everything.
@@ -148,10 +187,10 @@ const PROMPT = `
- ALWAYS spawn multiple agents in a SINGLE message
+ ALWAYS fire multiple background_task calls in a SINGLE message
ALWAYS run multiple tool calls in a SINGLE message
NEVER wait for one task when you can start others
- Batch related queries into parallel agent spawns
+ Use fire-and-collect: fire all, then collect all
@@ -176,27 +215,40 @@ const PROMPT = `
-
-
- In a SINGLE message, call Task tool multiple times AND run other tools:
- - Task(description="Find entry points", prompt="Find all entry points and main files", subagent_type="codebase-locator")
- - Task(description="Find configs", prompt="Find all config files (linters, formatters, build)", subagent_type="codebase-locator")
- - Task(description="Find tests", prompt="Find test directories and test files", subagent_type="codebase-locator")
- - Task(description="Analyze structure", prompt="Analyze the directory structure and organization", subagent_type="codebase-analyzer")
- - Task(description="Find patterns", prompt="Find naming conventions used across the codebase", subagent_type="pattern-finder")
+
+
+ In a SINGLE message, fire ALL background_task calls AND run other tools:
+ - background_task(description="Find entry points", prompt="Find all entry points and main files", agent="codebase-locator") -> task_id_1
+ - background_task(description="Find configs", prompt="Find all config files (linters, formatters, build)", agent="codebase-locator") -> task_id_2
+ - background_task(description="Find tests", prompt="Find test directories and test files", agent="codebase-locator") -> task_id_3
+ - background_task(description="Analyze structure", prompt="Analyze the directory structure and organization", agent="codebase-analyzer") -> task_id_4
+ - background_task(description="Find patterns", prompt="Find naming conventions used across the codebase", agent="pattern-finder") -> task_id_5
- Glob: package.json, pyproject.toml, go.mod, Cargo.toml, etc.
- Glob: README*, ARCHITECTURE*, docs/*
-
- Based on discovery, in a SINGLE message:
- - Task for each major module: subagent_type="codebase-analyzer"
+
+ First poll until all tasks complete:
+ - background_list() // repeat until all show "completed" or "error"
+ Then collect results (skip errored tasks):
+ - background_output(task_id=task_id_1)
+ - background_output(task_id=task_id_2)
+ - background_output(task_id=task_id_3)
+ - background_output(task_id=task_id_4)
+ - background_output(task_id=task_id_5)
+
+
+
+ Based on discovery, in a SINGLE message fire more tasks:
+ - background_task for each major module: agent="codebase-analyzer"
- Read multiple source files simultaneously
- Read multiple test files simultaneously
-
- Write ARCHITECTURE.md and CODE_STYLE.md
+
+ Collect deep analysis results, then write:
+ - Write ARCHITECTURE.md
+ - Write CODE_STYLE.md
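
The fire-and-collect pattern the prompt hunk above describes can be sketched as a standalone simulation. The `background_task`, `background_list`, and `background_output` functions below are local stand-ins whose shapes are assumptions for illustration, not the plugin's real API:

```typescript
// Simulated fire-and-collect: fire all tasks at once, poll until none
// are running, then collect results in the order the tasks were fired.
type TaskStatus = "running" | "completed" | "error";

interface FakeTask {
  id: string;
  status: TaskStatus;
  result?: string;
}

const tasks = new Map<string, FakeTask>();
let nextId = 1;

// Fire: start a task and return its id immediately (non-blocking).
function background_task(description: string): string {
  const id = `bg_${String(nextId++).padStart(8, "0")}`;
  tasks.set(id, { id, status: "running" });
  // Simulate the subagent finishing a little later.
  setTimeout(() => {
    const t = tasks.get(id);
    if (t) {
      t.status = "completed";
      t.result = `done: ${description}`;
    }
  }, 10);
  return id;
}

function background_list(): FakeTask[] {
  return [...tasks.values()];
}

function background_output(taskId: string): string | undefined {
  return tasks.get(taskId)?.result;
}

async function fireAndCollect(descriptions: string[]): Promise<string[]> {
  // 1. Fire everything in one batch.
  const ids = descriptions.map((d) => background_task(d));
  // 2. Poll until nothing is still running.
  while (background_list().some((t) => t.status === "running")) {
    await new Promise((r) => setTimeout(r, 5));
  }
  // 3. Collect results, skipping errored tasks.
  return ids
    .filter((id) => tasks.get(id)?.status === "completed")
    .map((id) => background_output(id) ?? "");
}
```

The point of the batch-then-poll shape is that the caller stays free to run other tools (Glob, Read) between firing and collecting, which is what the prompt's "SINGLE message" wording enforces.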
diff --git a/src/tools/background-task/manager.ts b/src/tools/background-task/manager.ts
index 1a05c41..1f4a0ab 100644
--- a/src/tools/background-task/manager.ts
+++ b/src/tools/background-task/manager.ts
@@ -1,7 +1,14 @@
import type { PluginInput } from "@opencode-ai/plugin";
-import type { BackgroundTask, BackgroundTaskInput } from "./types";
+import type {
+ BackgroundTask,
+ BackgroundTaskInput,
+ SessionCreateResponse,
+ SessionGetResponse,
+ SessionMessagesResponse,
+} from "./types";
const POLL_INTERVAL_MS = 2000;
+const TASK_TTL_MS = 60 * 60 * 1000; // 1 hour
function generateTaskId(): string {
const chars = "abcdefghijklmnopqrstuvwxyz0123456789";
@@ -42,7 +49,7 @@ export class BackgroundTaskManager {
query: { directory: this.ctx.directory },
});
- const sessionData = sessionResp as { data?: { id?: string } };
+ const sessionData = sessionResp as SessionCreateResponse;
const sessionID = sessionData.data?.id;
if (!sessionID) {
@@ -78,6 +85,7 @@ export class BackgroundTaskManager {
query: { directory: this.ctx.directory },
})
.catch((error) => {
+ console.error(`[background-task] Failed to prompt session ${sessionID}:`, error);
task.status = "error";
task.error = error instanceof Error ? error.message : String(error);
task.completedAt = new Date();
@@ -103,7 +111,9 @@ export class BackgroundTaskManager {
path: { id: task.sessionID },
query: { directory: this.ctx.directory },
})
- .catch(() => {});
+ .catch((error) => {
+ console.error(`[background-task] Failed to abort session ${task.sessionID}:`, error);
+ });
task.status = "cancelled";
task.completedAt = new Date();
@@ -155,21 +165,17 @@ export class BackgroundTaskManager {
query: { directory: this.ctx.directory },
});
- const messages = (resp as { data?: unknown[] }).data || [];
- const lastAssistant = [...messages].reverse().find((m) => {
- const msg = m as Record<string, unknown>;
- const info = msg.info as Record<string, unknown> | undefined;
- return info?.role === "assistant";
- }) as Record<string, unknown> | undefined;
+ const messagesResp = resp as SessionMessagesResponse;
+ const messages = messagesResp.data || [];
+ const lastAssistant = [...messages].reverse().find((m) => m.info?.role === "assistant");
if (lastAssistant) {
- const parts = lastAssistant.parts as Array<{ type: string; text?: string }> | undefined;
- const textParts = parts?.filter((p) => p.type === "text") || [];
+ const textParts = lastAssistant.parts?.filter((p) => p.type === "text") || [];
task.result = textParts.map((p) => p.text || "").join("\n");
return task.result;
}
- } catch {
- // Ignore errors fetching result
+ } catch (error) {
+ console.error(`[background-task] Failed to fetch result for task ${taskId}:`, error);
}
return undefined;
@@ -197,14 +203,6 @@ export class BackgroundTaskManager {
output += `\n### Error\n${task.error}\n`;
}
- if (task.progress?.lastMessage) {
- const preview =
- task.progress.lastMessage.length > 200
- ? `${task.progress.lastMessage.slice(0, 200)}...`
- : task.progress.lastMessage;
- output += `\n### Last Message Preview\n${preview}\n`;
- }
-
return output;
}
@@ -223,7 +221,23 @@ export class BackgroundTaskManager {
}
}
+ private cleanupOldTasks(): void {
+ const now = Date.now();
+ for (const [taskId, task] of this.tasks) {
+ // Only cleanup completed/cancelled/error tasks
+ if (task.status === "running") continue;
+
+ const completedAt = task.completedAt?.getTime() || 0;
+ if (now - completedAt > TASK_TTL_MS) {
+ this.tasks.delete(taskId);
+ }
+ }
+ }
+
+ private async pollRunningTasks(): Promise<void> {
+ // Cleanup old completed tasks to prevent memory leak
+ this.cleanupOldTasks();
+
const runningTasks = this.getRunningTasks();
if (runningTasks.length === 0) {
@@ -239,7 +253,7 @@ export class BackgroundTaskManager {
query: { directory: this.ctx.directory },
});
- const sessionData = resp as { data?: { status?: string } };
+ const sessionData = resp as SessionGetResponse;
const status = sessionData.data?.status;
if (status === "idle") {
@@ -258,10 +272,12 @@ export class BackgroundTaskManager {
duration: 5000,
},
})
- .catch(() => {});
+ .catch((error) => {
+ console.error(`[background-task] Failed to show toast for task ${task.id}:`, error);
+ });
}
- } catch {
- // Session may not exist anymore
+ } catch (error) {
+ console.error(`[background-task] Failed to poll task ${task.id}:`, error);
if (task.status === "running") {
task.status = "error";
task.error = "Session lost";
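
The TTL sweep added to `manager.ts` can be illustrated standalone. This is a sketch mirroring the `cleanupOldTasks` hunk above, with the task shape reduced to the two fields cleanup actually reads:

```typescript
// Reap finished tasks older than the TTL; never touch running ones.
const TASK_TTL_MS = 60 * 60 * 1000; // 1 hour, matching the diff

interface TaskLike {
  status: "running" | "completed" | "cancelled" | "error";
  completedAt?: Date;
}

function cleanupOldTasks(tasks: Map<string, TaskLike>, now = Date.now()): void {
  for (const [taskId, task] of tasks) {
    // Only cleanup completed/cancelled/error tasks.
    if (task.status === "running") continue;
    const completedAt = task.completedAt?.getTime() || 0;
    if (now - completedAt > TASK_TTL_MS) {
      tasks.delete(taskId);
    }
  }
}
```

Running the sweep from `pollRunningTasks` piggybacks on an existing timer, so no separate cleanup interval is needed.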
diff --git a/src/tools/background-task/tools.ts b/src/tools/background-task/tools.ts
index 68677e5..41f3c30 100644
--- a/src/tools/background-task/tools.ts
+++ b/src/tools/background-task/tools.ts
@@ -38,32 +38,19 @@ Use \`background_output\` with task_id="${task.id}" to check progress or get res
});
const background_output = tool({
- description: `Check status or get results from a background task.
-By default returns immediately with current status.
-Set block=true to wait for completion (with timeout).`,
+ description: `Get status or results from a background task.
+Returns immediately with current status. Use background_list to poll for completion.`,
args: {
task_id: tool.schema.string().describe("ID of the task to check (e.g., 'bg_abc12345')"),
- block: tool.schema.boolean().optional().describe("Wait for task completion (default: false)"),
- timeout: tool.schema.number().optional().describe("Max seconds to wait if blocking (default: 60, max: 600)"),
},
execute: async (args) => {
- const { task_id, block = false, timeout = 60 } = args;
+ const { task_id } = args;
const task = manager.getTask(task_id);
if (!task) {
return `Task not found: ${task_id}`;
}
- // If blocking, wait for completion
- if (block && task.status === "running") {
- const maxWait = Math.min(timeout || 60, 600) * 1000;
- const startTime = Date.now();
-
- while (task.status === "running" && Date.now() - startTime < maxWait) {
- await new Promise((resolve) => setTimeout(resolve, 1000));
- }
- }
-
// Format status
let output = manager.formatTaskStatus(task);
diff --git a/src/tools/background-task/types.ts b/src/tools/background-task/types.ts
index 9d6b385..91cbc78 100644
--- a/src/tools/background-task/types.ts
+++ b/src/tools/background-task/types.ts
@@ -15,7 +15,6 @@ export interface BackgroundTask {
toolCalls: number;
lastTool?: string;
lastUpdate: Date;
- lastMessage?: string;
};
}
@@ -26,3 +25,37 @@ export interface BackgroundTaskInput {
parentSessionID: string;
parentMessageID: string;
}
+
+// API Response Types
+export interface SessionCreateResponse {
+ data?: {
+ id?: string;
+ };
+}
+
+export interface SessionGetResponse {
+ data?: {
+ status?: "idle" | "running" | "error";
+ };
+}
+
+export interface MessagePart {
+ type: string;
+ text?: string;
+}
+
+export interface MessageInfo {
+ role?: "user" | "assistant";
+ sessionID?: string;
+ type?: string;
+ name?: string;
+}
+
+export interface SessionMessage {
+ info?: MessageInfo;
+ parts?: MessagePart[];
+}
+
+export interface SessionMessagesResponse {
+ data?: SessionMessage[];
+}
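
With these response types in place, the message extraction in `getTaskResult` no longer needs untyped `Record` casts. A minimal standalone sketch (the types are duplicated here so the snippet runs on its own):

```typescript
// Typed extraction of the last assistant message's text, as in the
// reworked getTaskResult above.
interface MessagePart {
  type: string;
  text?: string;
}

interface MessageInfo {
  role?: "user" | "assistant";
}

interface SessionMessage {
  info?: MessageInfo;
  parts?: MessagePart[];
}

interface SessionMessagesResponse {
  data?: SessionMessage[];
}

function lastAssistantText(resp: SessionMessagesResponse): string | undefined {
  const messages = resp.data || [];
  // Walk backwards to find the most recent assistant message.
  const lastAssistant = [...messages].reverse().find((m) => m.info?.role === "assistant");
  if (!lastAssistant) return undefined;
  // Join only the text parts, ignoring tool calls etc.
  const textParts = lastAssistant.parts?.filter((p) => p.type === "text") || [];
  return textParts.map((p) => p.text || "").join("\n");
}
```

Because every field on the response types is optional, the compiler forces the same null checks the old defensive casts did by hand.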
diff --git a/tests/agents/executor.test.ts b/tests/agents/executor.test.ts
new file mode 100644
index 0000000..6e7db31
--- /dev/null
+++ b/tests/agents/executor.test.ts
@@ -0,0 +1,40 @@
+import { describe, it, expect } from "bun:test";
+
+describe("executor agent", () => {
+ it("should use background_task instead of Task", async () => {
+ const fs = await import("node:fs/promises");
+ const source = await fs.readFile("src/agents/executor.ts", "utf-8");
+
+ expect(source).toContain("background_task");
+ expect(source).toContain("background_output");
+ expect(source).toContain("background_list");
+ });
+
+ it("should have fire-and-check pattern documentation", async () => {
+ const fs = await import("node:fs/promises");
+ const source = await fs.readFile("src/agents/executor.ts", "utf-8");
+
+ expect(source).toContain("fire-and-check");
+ });
+
+ it("should have fallback-rule section", async () => {
+ const fs = await import("node:fs/promises");
+ const source = await fs.readFile("src/agents/executor.ts", "utf-8");
+
+ expect(source).toContain("<fallback-rule>");
+ });
+
+ it("should have background-tools section", async () => {
+ const fs = await import("node:fs/promises");
+ const source = await fs.readFile("src/agents/executor.ts", "utf-8");
+
+ expect(source).toContain("<background-tools>");
+ });
+
+ it("should describe starting reviewer when implementer finishes", async () => {
+ const fs = await import("node:fs/promises");
+ const source = await fs.readFile("src/agents/executor.ts", "utf-8");
+
+ expect(source).toMatch(/reviewer.*immediately|immediately.*reviewer/i);
+ });
+});
diff --git a/tests/agents/planner.test.ts b/tests/agents/planner.test.ts
new file mode 100644
index 0000000..070df4d
--- /dev/null
+++ b/tests/agents/planner.test.ts
@@ -0,0 +1,40 @@
+import { describe, it, expect } from "bun:test";
+
+describe("planner agent", () => {
+ it("should use background_task instead of Task for research", async () => {
+ const fs = await import("node:fs/promises");
+ const source = await fs.readFile("src/agents/planner.ts", "utf-8");
+
+ expect(source).toContain("background_task");
+ expect(source).toContain("background_output");
+ });
+
+ it("should have fire-and-collect pattern documentation", async () => {
+ const fs = await import("node:fs/promises");
+ const source = await fs.readFile("src/agents/planner.ts", "utf-8");
+
+ expect(source).toContain("fire-and-collect");
+ });
+
+ it("should have fallback-rule section", async () => {
+ const fs = await import("node:fs/promises");
+ const source = await fs.readFile("src/agents/planner.ts", "utf-8");
+
+ expect(source).toContain("<fallback-rule>");
+ });
+
+ it("should have background-tools section", async () => {
+ const fs = await import("node:fs/promises");
+ const source = await fs.readFile("src/agents/planner.ts", "utf-8");
+
+ expect(source).toContain("<background-tools>");
+ });
+
+ it("should mention running library research in parallel with agents", async () => {
+ const fs = await import("node:fs/promises");
+ const source = await fs.readFile("src/agents/planner.ts", "utf-8");
+
+ expect(source).toContain("context7");
+ expect(source).toContain("btca_ask");
+ });
+});
diff --git a/tests/agents/project-initializer.test.ts b/tests/agents/project-initializer.test.ts
new file mode 100644
index 0000000..f5cb790
--- /dev/null
+++ b/tests/agents/project-initializer.test.ts
@@ -0,0 +1,32 @@
+import { describe, it, expect } from "bun:test";
+
+describe("project-initializer agent", () => {
+ it("should use background_task instead of Task", async () => {
+ const fs = await import("node:fs/promises");
+ const source = await fs.readFile("src/agents/project-initializer.ts", "utf-8");
+
+ expect(source).toContain("background_task");
+ expect(source).toContain("background_output");
+ });
+
+ it("should have fire-and-collect pattern documentation", async () => {
+ const fs = await import("node:fs/promises");
+ const source = await fs.readFile("src/agents/project-initializer.ts", "utf-8");
+
+ expect(source).toContain("fire-and-collect");
+ });
+
+ it("should have fallback-rule section", async () => {
+ const fs = await import("node:fs/promises");
+ const source = await fs.readFile("src/agents/project-initializer.ts", "utf-8");
+
+ expect(source).toContain("<fallback-rule>");
+ });
+
+ it("should have background-tools section", async () => {
+ const fs = await import("node:fs/promises");
+ const source = await fs.readFile("src/agents/project-initializer.ts", "utf-8");
+
+ expect(source).toContain("<background-tools>");
+ });
+});
diff --git a/tests/tools/background-task-cleanup.test.ts b/tests/tools/background-task-cleanup.test.ts
new file mode 100644
index 0000000..d3d9378
--- /dev/null
+++ b/tests/tools/background-task-cleanup.test.ts
@@ -0,0 +1,23 @@
+import { describe, it, expect } from "bun:test";
+
+describe("background-task cleanup", () => {
+ it("should have TASK_TTL_MS constant", async () => {
+ const fs = await import("node:fs/promises");
+ const source = await fs.readFile("src/tools/background-task/manager.ts", "utf-8");
+ expect(source).toContain("TASK_TTL_MS");
+ });
+
+ it("should have cleanupOldTasks method", async () => {
+ const fs = await import("node:fs/promises");
+ const source = await fs.readFile("src/tools/background-task/manager.ts", "utf-8");
+ expect(source).toContain("cleanupOldTasks");
+ });
+
+ it("should call cleanup in pollRunningTasks", async () => {
+ const fs = await import("node:fs/promises");
+ const source = await fs.readFile("src/tools/background-task/manager.ts", "utf-8");
+ // Find pollRunningTasks method and verify it calls cleanupOldTasks
+ const pollMethod = source.match(/async pollRunningTasks\(\)[^{]*\{[\s\S]*?^\s{2}\}/m);
+ expect(pollMethod?.[0]).toContain("cleanupOldTasks");
+ });
+});
diff --git a/tests/tools/background-task-error-logging.test.ts b/tests/tools/background-task-error-logging.test.ts
new file mode 100644
index 0000000..28922a7
--- /dev/null
+++ b/tests/tools/background-task-error-logging.test.ts
@@ -0,0 +1,34 @@
+import { describe, it, expect } from "bun:test";
+
+describe("background-task error logging", () => {
+ it("should not have silent catch blocks", async () => {
+ const fs = await import("node:fs/promises");
+ const source = await fs.readFile("src/tools/background-task/manager.ts", "utf-8");
+
+ // Should not have empty catch blocks like .catch(() => {})
+ expect(source).not.toMatch(/\.catch\s*\(\s*\(\s*\)\s*=>\s*\{\s*\}\s*\)/);
+
+ // Should not have catch blocks that capture error but do nothing
+ // e.g., .catch((err) => {}) or .catch((error) => {})
+ expect(source).not.toMatch(/\.catch\s*\(\s*\(\s*\w+\s*\)\s*=>\s*\{\s*\}\s*\)/);
+ });
+
+ it("should log errors in catch blocks with console.error", async () => {
+ const fs = await import("node:fs/promises");
+ const source = await fs.readFile("src/tools/background-task/manager.ts", "utf-8");
+
+ // Find all .catch blocks with their full body using a more comprehensive regex
+ // Match .catch((param) => { ... }) including multiline
+ const catchRegex = /\.catch\s*\(\s*\(\s*(\w+)\s*\)\s*=>\s*\{([^}]*(?:\{[^}]*\}[^}]*)*)\}\s*\)/g;
+ const matches = [...source.matchAll(catchRegex)];
+
+ // Should have at least some catch blocks
+ expect(matches.length).toBeGreaterThan(0);
+
+ for (const match of matches) {
+ const catchBody = match[2];
+ // Each catch block body should contain console.error
+ expect(catchBody).toContain("console.error");
+ }
+ });
+});
diff --git a/tests/tools/background-task-manager.test.ts b/tests/tools/background-task-manager.test.ts
new file mode 100644
index 0000000..9a4206e
--- /dev/null
+++ b/tests/tools/background-task-manager.test.ts
@@ -0,0 +1,309 @@
+import { describe, it, expect, beforeEach, mock } from "bun:test";
+import { BackgroundTaskManager } from "../../src/tools/background-task/manager";
+
+// Mock the PluginInput context
+function createMockCtx() {
+ return {
+ directory: "/test",
+ client: {
+ session: {
+ create: mock(() => Promise.resolve({ data: { id: "session-123" } })),
+ get: mock(() => Promise.resolve({ data: { status: "idle" } })),
+ messages: mock(() =>
+ Promise.resolve({
+ data: [
+ {
+ info: { role: "assistant" },
+ parts: [{ type: "text", text: "Task result" }],
+ },
+ ],
+ }),
+ ),
+ prompt: mock(() => Promise.resolve({})),
+ abort: mock(() => Promise.resolve({})),
+ },
+ tui: {
+ showToast: mock(() => Promise.resolve({})),
+ },
+ },
+ } as any;
+}
+
+describe("BackgroundTaskManager", () => {
+ let manager: BackgroundTaskManager;
+ let mockCtx: ReturnType<typeof createMockCtx>;
+
+ beforeEach(() => {
+ mockCtx = createMockCtx();
+ manager = new BackgroundTaskManager(mockCtx);
+ });
+
+ describe("launch", () => {
+ it("should create a task with running status", async () => {
+ const task = await manager.launch({
+ description: "Test task",
+ prompt: "Do something",
+ agent: "test-agent",
+ parentSessionID: "parent-123",
+ parentMessageID: "msg-123",
+ });
+
+ expect(task.id).toMatch(/^bg_[a-z0-9]{8}$/);
+ expect(task.status).toBe("running");
+ expect(task.description).toBe("Test task");
+ expect(task.agent).toBe("test-agent");
+ expect(task.sessionID).toBe("session-123");
+ });
+
+ it("should throw if session creation fails", async () => {
+ mockCtx.client.session.create = mock(() => Promise.resolve({ data: {} }));
+
+ await expect(
+ manager.launch({
+ description: "Test",
+ prompt: "Test",
+ agent: "test",
+ parentSessionID: "p",
+ parentMessageID: "m",
+ }),
+ ).rejects.toThrow("Failed to create background session");
+ });
+
+ it("should store task in internal map", async () => {
+ const task = await manager.launch({
+ description: "Test",
+ prompt: "Test",
+ agent: "test",
+ parentSessionID: "p",
+ parentMessageID: "m",
+ });
+
+ expect(manager.getTask(task.id)).toBe(task);
+ });
+ });
+
+ describe("cancel", () => {
+ it("should cancel a running task", async () => {
+ const task = await manager.launch({
+ description: "Test",
+ prompt: "Test",
+ agent: "test",
+ parentSessionID: "p",
+ parentMessageID: "m",
+ });
+
+ const result = await manager.cancel(task.id);
+
+ expect(result).toBe(true);
+ expect(task.status).toBe("cancelled");
+ expect(task.completedAt).toBeDefined();
+ });
+
+ it("should return false for non-existent task", async () => {
+ const result = await manager.cancel("non-existent");
+ expect(result).toBe(false);
+ });
+
+ it("should return false for already completed task", async () => {
+ const task = await manager.launch({
+ description: "Test",
+ prompt: "Test",
+ agent: "test",
+ parentSessionID: "p",
+ parentMessageID: "m",
+ });
+
+ task.status = "completed";
+ const result = await manager.cancel(task.id);
+ expect(result).toBe(false);
+ });
+ });
+
+ describe("cancelAll", () => {
+ it("should cancel all running tasks", async () => {
+ await manager.launch({
+ description: "Task 1",
+ prompt: "Test",
+ agent: "test",
+ parentSessionID: "p",
+ parentMessageID: "m",
+ });
+ await manager.launch({
+ description: "Task 2",
+ prompt: "Test",
+ agent: "test",
+ parentSessionID: "p",
+ parentMessageID: "m",
+ });
+
+ const cancelled = await manager.cancelAll();
+
+ expect(cancelled).toBe(2);
+ expect(manager.getRunningTasks().length).toBe(0);
+ });
+ });
+
+ describe("getAllTasks", () => {
+ it("should return all tasks", async () => {
+ await manager.launch({
+ description: "Task 1",
+ prompt: "Test",
+ agent: "test",
+ parentSessionID: "p",
+ parentMessageID: "m",
+ });
+ await manager.launch({
+ description: "Task 2",
+ prompt: "Test",
+ agent: "test",
+ parentSessionID: "p",
+ parentMessageID: "m",
+ });
+
+ const tasks = manager.getAllTasks();
+ expect(tasks.length).toBe(2);
+ });
+ });
+
+ describe("getRunningTasks", () => {
+ it("should only return running tasks", async () => {
+ const task1 = await manager.launch({
+ description: "Task 1",
+ prompt: "Test",
+ agent: "test",
+ parentSessionID: "p",
+ parentMessageID: "m",
+ });
+ await manager.launch({
+ description: "Task 2",
+ prompt: "Test",
+ agent: "test",
+ parentSessionID: "p",
+ parentMessageID: "m",
+ });
+
+ task1.status = "completed";
+
+ const running = manager.getRunningTasks();
+ expect(running.length).toBe(1);
+ expect(running[0].description).toBe("Task 2");
+ });
+ });
+
+ describe("getTaskResult", () => {
+ it("should return undefined for running task", async () => {
+ const task = await manager.launch({
+ description: "Test",
+ prompt: "Test",
+ agent: "test",
+ parentSessionID: "p",
+ parentMessageID: "m",
+ });
+
+ const result = await manager.getTaskResult(task.id);
+ expect(result).toBeUndefined();
+ });
+
+ it("should fetch and cache result for completed task", async () => {
+ const task = await manager.launch({
+ description: "Test",
+ prompt: "Test",
+ agent: "test",
+ parentSessionID: "p",
+ parentMessageID: "m",
+ });
+
+ task.status = "completed";
+ const result = await manager.getTaskResult(task.id);
+
+ expect(result).toBe("Task result");
+ expect(task.result).toBe("Task result");
+
+ // Second call should use cached result
+ const result2 = await manager.getTaskResult(task.id);
+ expect(result2).toBe("Task result");
+ expect(mockCtx.client.session.messages).toHaveBeenCalledTimes(1);
+ });
+ });
+
+ describe("formatTaskStatus", () => {
+ it("should format task status as markdown table", async () => {
+ const task = await manager.launch({
+ description: "Test task",
+ prompt: "Test",
+ agent: "test-agent",
+ parentSessionID: "p",
+ parentMessageID: "m",
+ });
+
+ const output = manager.formatTaskStatus(task);
+
+ expect(output).toContain("## Task: Test task");
+ expect(output).toContain("| ID |");
+ expect(output).toContain("| Status | RUNNING |");
+ expect(output).toContain("| Agent | test-agent |");
+ });
+
+ it("should include error if present", async () => {
+ const task = await manager.launch({
+ description: "Test",
+ prompt: "Test",
+ agent: "test",
+ parentSessionID: "p",
+ parentMessageID: "m",
+ });
+
+ task.status = "error";
+ task.error = "Something went wrong";
+
+ const output = manager.formatTaskStatus(task);
+ expect(output).toContain("### Error");
+ expect(output).toContain("Something went wrong");
+ });
+ });
+
+ describe("handleEvent", () => {
+ it("should track tool usage from message.part.updated events", async () => {
+ const task = await manager.launch({
+ description: "Test",
+ prompt: "Test",
+ agent: "test",
+ parentSessionID: "p",
+ parentMessageID: "m",
+ });
+
+ manager.handleEvent({
+ type: "message.part.updated",
+ properties: {
+ info: {
+ sessionID: task.sessionID,
+ type: "tool_use",
+ name: "read",
+ },
+ },
+ });
+
+ expect(task.progress?.toolCalls).toBe(1);
+ expect(task.progress?.lastTool).toBe("read");
+ });
+
+ it("should cancel task on session.deleted event", async () => {
+ const task = await manager.launch({
+ description: "Test",
+ prompt: "Test",
+ agent: "test",
+ parentSessionID: "p",
+ parentMessageID: "m",
+ });
+
+ manager.handleEvent({
+ type: "session.deleted",
+ properties: {
+ info: { id: task.sessionID },
+ },
+ });
+
+ expect(task.status).toBe("cancelled");
+ });
+ });
+});
diff --git a/tests/tools/background-task-response-types.test.ts b/tests/tools/background-task-response-types.test.ts
new file mode 100644
index 0000000..927c579
--- /dev/null
+++ b/tests/tools/background-task-response-types.test.ts
@@ -0,0 +1,30 @@
+import { describe, it, expect } from "bun:test";
+
+describe("background-task response types", () => {
+ it("should have SessionCreateResponse type", async () => {
+ const fs = await import("node:fs/promises");
+ const source = await fs.readFile("src/tools/background-task/types.ts", "utf-8");
+ expect(source).toContain("SessionCreateResponse");
+ });
+
+ it("should have SessionGetResponse type", async () => {
+ const fs = await import("node:fs/promises");
+ const source = await fs.readFile("src/tools/background-task/types.ts", "utf-8");
+ expect(source).toContain("SessionGetResponse");
+ });
+
+ it("should have SessionMessagesResponse type", async () => {
+ const fs = await import("node:fs/promises");
+ const source = await fs.readFile("src/tools/background-task/types.ts", "utf-8");
+ expect(source).toContain("SessionMessagesResponse");
+ });
+
+ it("should use typed responses in manager", async () => {
+ const fs = await import("node:fs/promises");
+ const source = await fs.readFile("src/tools/background-task/manager.ts", "utf-8");
+ // Should import the response types
+ expect(source).toContain("SessionCreateResponse");
+ expect(source).toContain("SessionGetResponse");
+ expect(source).toContain("SessionMessagesResponse");
+ });
+});
diff --git a/tests/tools/background-task-types.test.ts b/tests/tools/background-task-types.test.ts
new file mode 100644
index 0000000..f5f8441
--- /dev/null
+++ b/tests/tools/background-task-types.test.ts
@@ -0,0 +1,15 @@
+import { describe, it, expect } from "bun:test";
+
+describe("background-task types", () => {
+ it("should not have lastMessage in progress type", async () => {
+ const fs = await import("node:fs/promises");
+ const source = await fs.readFile("src/tools/background-task/types.ts", "utf-8");
+ expect(source).not.toContain("lastMessage");
+ });
+
+ it("should not reference lastMessage in manager", async () => {
+ const fs = await import("node:fs/promises");
+ const source = await fs.readFile("src/tools/background-task/manager.ts", "utf-8");
+ expect(source).not.toContain("lastMessage");
+ });
+});