From 1ec953aac12e32b0a7272ff3606b47c102ff6732 Mon Sep 17 00:00:00 2001 From: "yemyat (github)" Date: Fri, 23 Jan 2026 16:07:05 +0800 Subject: [PATCH 1/8] feat: add exploration-tracer planning with potentialChangeLocations and points --- .ralph-wiggum/PROMPT_plan.md | 210 ++++++++++++++++++++++-------- src/domain/spec.ts | 34 ++++- src/domain/task.ts | 30 +++++ src/services/builder.ts | 11 +- src/templates/prompts.ts | 239 +++++++++++++++++++++++++---------- src/types.ts | 10 ++ 6 files changed, 413 insertions(+), 121 deletions(-) diff --git a/.ralph-wiggum/PROMPT_plan.md b/.ralph-wiggum/PROMPT_plan.md index 2a96812..35e46fb 100644 --- a/.ralph-wiggum/PROMPT_plan.md +++ b/.ralph-wiggum/PROMPT_plan.md @@ -1,85 +1,187 @@ # Plan Mode -You are an autonomous planning agent. Analyze specs and create a structured implementation plan. +Create implementation plans from specs by thoroughly exploring the codebase first. Use sub-agents as "exploration tracers" to discover all potential change locations before breaking work into tasks. + +## Philosophy + +**The problem with naive planning**: Specs describe WHAT to build, but implementation plans need to describe WHAT TO CHANGE. The gap between these is where agents fail—they burn context discovering change locations instead of implementing. + +**Solution**: Do expensive exploration ONCE during planning, so execution agents work from concrete checklists. ## Context (Read First) -1. Read @.ralph-wiggum/GUARDRAILS.md — understand project compliance rules -2. Read all specs in `.ralph-wiggum/specs/*` — understand what needs to be built -3. Read @.ralph-wiggum/implementation.json (if exists) — current progress state -4. Read @.ralph-wiggum/PROGRESS.md — learnings from previous runs -5. Reference source code thoroughly to understand current state +1. Read `.ralph-wiggum/GUARDRAILS.md` — project compliance rules +2. The spec file being planned +3. 
`.ralph-wiggum/implementation.json` — current progress (use `jq` to query) -## Rules -- Plan only — do NOT implement anything -- Do NOT assume functionality is missing — confirm with code search first -- Each spec should have clear tasks and acceptance criteria -- Treat `src/lib` as shared utilities — prefer consolidation over duplication +## Process + +### Phase 1: Parse Spec + +Read the spec file and extract: +- Requirements (what needs to change) +- Acceptance criteria (how to verify) +- Mentioned files/modules (hints for exploration) +- Dependencies on other specs + +### Phase 2: Identify Exploration Targets + +From each requirement, derive exploration questions: +- "Where is X defined?" +- "What files use Y?" +- "What tests cover Z?" +- "What patterns exist for similar functionality?" + +### Phase 3: Spawn Exploration Sub-Agents (Parallel) + +For each major exploration target, spawn a sub-agent. -## Workflow +``` +Example for "Remove transactionIds from artifacts": + +Agent A: "Find all definitions and usages of transactionIds in artifacts" + → Returns: schema.ts:42, types.ts:89, queries.ts:67/120, mutations.ts:30 + +Agent B: "Find all tests that reference artifact transactionIds" + → Returns: artifacts.test.ts:200/340, scrapbook.test.ts:89 + +Agent C: "Trace what depends on artifact.transactionIds for data" + → Returns: scrapbook.ts:89 uses it for timeline, nowhere else +``` -### 1. Audit Specs -- Read all specs in `.ralph-wiggum/specs/*` -- For each spec, verify tasks and acceptance criteria are clear and complete -- If a spec is missing details, update it with specific tasks and acceptance criteria +Each sub-agent should return: +- File paths with line numbers +- Patterns observed (e.g., "uses compound index", "has auth check") +- Estimated lines of code to change +- Any concerns or complexity notes -### 2. 
Audit Codebase -- Use up to 500 parallel subagents to search the source code -- Compare implementation against specs -- Look for: TODOs, placeholders, skipped tests, incomplete features, inconsistent patterns +**No arbitrary cap on sub-agents**—use as many as needed for thorough exploration. -### 3. Output implementation.json +### Phase 4: Synthesize Into Tasks -Create or update @.ralph-wiggum/implementation.json with this structure: +Group exploration findings into tasks using these heuristics: + +| Signal | Action | +|--------|--------| +| Same file, routine changes (< 30 LOC) | Merge into one task | +| Different files, same pattern | Merge if total LOC < 50 | +| Creates something others depend on | Separate task (first in chain) | +| Complex logic requiring reasoning | Separate task | +| Natural verification boundary | Separate task | +| Tests for a feature | Include with feature if small, separate if > 30 LOC | + +**LOC → Points guidance** (use internally, don't store LOC): +- 1-20 LOC routine changes → 1-2 points +- 20-50 LOC or moderate complexity → 3 points +- 50-100 LOC or high complexity → 5 points +- > 100 LOC → must split into multiple tasks + +### Phase 5: Output implementation.json + +Update `.ralph-wiggum/implementation.json` with the new task structure: ```json { "version": 1, - "updatedAt": "2026-01-17T10:30:00Z", + "updatedAt": "2026-01-23T10:30:00Z", "updatedBy": "plan-mode", "specs": [ { - "id": "spec-id-kebab-case", - "file": "specs/spec-file.md", + "id": "060-feat-something", + "file": ".ralph-wiggum/specs/060-feat-something.md", "name": "Human Readable Name", - "priority": 1, + "priority": 160, "status": "pending", - "context": "Brief context for this spec. 
Reference existing code locations.", + "dependsOn": [], + "pointsBudget": 15, "tasks": [ { - "id": "spec-id-1", - "description": "First task description", + "id": "060-feat-something-1", + "description": "Schema + type changes for new feature", + "potentialChangeLocations": [ + "convex/schema.ts:142 - add newField to someTable", + "convex/schema.ts:156 - add by_userId_newField index", + "convex/types.ts:89 - update SomeTableDoc type", + "convex/validators.ts:34 - may need validator update" + ], + "points": 2, "status": "pending", - "acceptanceCriteria": ["Criteria 1", "Criteria 2"] + "acceptanceCriteria": [ + "newField exists in schema with correct type", + "Index by_userId_newField exists", + "bun run typecheck passes" + ] }, { - "id": "spec-id-2", - "description": "Second task description", - "status": "pending" + "id": "060-feat-something-2", + "description": "Core queries and mutations for new feature", + "potentialChangeLocations": [ + "convex/core/someTable/queries.ts:45 - add getByNewField query", + "convex/core/someTable/mutations.ts:67 - update create to accept newField", + "convex/core/someTable/internal.ts:23 - add helper for newField lookup" + ], + "points": 3, + "status": "pending", + "dependsOn": ["060-feat-something-1"], + "acceptanceCriteria": [ + "getByNewField query works with auth check", + "create mutation accepts and persists newField", + "bun run typecheck passes" + ] } ], - "acceptanceCriteria": ["Spec-level AC 1", "Spec-level AC 2"] + "acceptanceCriteria": ["Spec-level AC copied from spec file"] } ] } ``` -**Important:** -- Each spec gets an `id` (kebab-case, derived from spec filename) -- Tasks get sequential IDs like `{spec-id}-1`, `{spec-id}-2`, etc. -- `priority`: Lower number = higher priority (1 = first to implement) -- `status`: "pending" for unstarted, "in_progress" for active, "completed" for done -- `context`: Include relevant code paths, dependencies, or notes for the build agent - -### 4. 
Create Missing Specs -If functionality is needed but no spec exists: -1. Search codebase to confirm it's actually missing -2. Create spec at `.ralph-wiggum/specs/FILENAME.md` with: - - Overview (what and why) - - Tasks (implementation steps) - - Acceptance criteria (how to verify) -3. Add to implementation.json with appropriate priority - -### 5. Update Guardrails (if needed) -If you discover project-specific rules that should be enforced, add them to the "Project-Specific Rules" section of @.ralph-wiggum/GUARDRAILS.md. - -COMPLETION: When all specs are audited, have clear tasks/acceptance criteria, and implementation.json is created/updated, output exactly: DONE \ No newline at end of file +## Task Schema + +Each task MUST have: +- `id`: `{spec-id}-{number}` format +- `description`: What to do (action-oriented) +- `potentialChangeLocations`: Array of `"file:line - what to change"` strings +- `points`: 1, 2, 3, or 5 (never 8—split instead) +- `status`: "pending" +- `acceptanceCriteria`: How to verify completion + +Optional: +- `dependsOn`: Task IDs that must complete first + +## Rules + +1. **Plan only** — do NOT implement anything +2. **Explore thoroughly** — use sub-agents liberally to trace all usages +3. **Be concrete** — `potentialChangeLocations` should have file:line where possible +4. **Estimate with LOC** — points should reflect actual code volume discovered during exploration +5. **Merge small changes** — don't create tasks for < 15 LOC unless there's a hard dependency +6. **Split large changes** — no task should exceed ~100 LOC or 5 points + +## Example Sub-Agent Prompts + +**For tracing usages:** +``` +Find all usages of `transactionIds` field on artifacts table. +Return: file paths with line numbers, whether it's a read or write, +and any patterns you observe (e.g., always used with userId filter). +``` + +**For tracing dependencies:** +``` +What code depends on the return value of `getArtifactById`? +Trace callers and report what fields they access. 
+``` + +**For finding tests:** +``` +Find all tests that would need updating if we remove `transactionIds` from artifacts. +Return: test file paths, line numbers, and what the test is asserting. +``` + +**For pattern discovery:** +``` +How do existing core modules (e.g., core/transactions, core/users) structure +their queries.ts and mutations.ts? Report the patterns so we can follow them. +``` + +COMPLETION: When the spec has been fully explored and tasks created, output exactly: DONE \ No newline at end of file diff --git a/src/domain/spec.ts b/src/domain/spec.ts index 9b15be9..fe0d421 100644 --- a/src/domain/spec.ts +++ b/src/domain/spec.ts @@ -7,6 +7,8 @@ export class Spec { private readonly _name: string; private readonly _priority: number; private readonly _context?: string; + private readonly _pointsBudget?: number; + private readonly _dependsOn: string[]; private readonly _tasks: Task[]; private readonly _acceptanceCriteria: string[]; @@ -16,6 +18,8 @@ export class Spec { this._name = entry.name; this._priority = entry.priority; this._context = entry.context; + this._pointsBudget = entry.pointsBudget; + this._dependsOn = entry.dependsOn ?? []; this._tasks = entry.tasks.map((t) => Task.fromEntry(t)); this._acceptanceCriteria = entry.acceptanceCriteria ?? 
[];
   }
@@ -40,6 +44,14 @@ export class Spec {
     return this._context;
   }

+  get pointsBudget(): number | undefined {
+    return this._pointsBudget;
+  }
+
+  get dependsOn(): string[] {
+    return this._dependsOn;
+  }
+
   get tasks(): Task[] {
     return this._tasks;
   }
@@ -56,7 +68,19 @@ export class Spec {
   }

   get nextPendingTask(): Task | undefined {
-    return this._tasks.find((t) => t.status === "pending");
+    const completedTaskIds = new Set(
+      this._tasks.filter((t) => t.status === "completed").map((t) => t.id)
+    );
+
+    return this._tasks.find((t) => {
+      if (t.status !== "pending") {
+        return false;
+      }
+      if (t.dependsOn.length === 0) {
+        return true;
+      }
+      return t.dependsOn.every((depId) => completedTaskIds.has(depId));
+    });
   }

   get completedTasks(): Task[] {
@@ -106,6 +130,14 @@ export class Spec {
       entry.context = this._context;
     }

+    if (this._pointsBudget !== undefined) {
+      entry.pointsBudget = this._pointsBudget;
+    }
+
+    if (this._dependsOn.length > 0) {
+      entry.dependsOn = this._dependsOn;
+    }
+
     if (this._acceptanceCriteria.length > 0) {
       entry.acceptanceCriteria = this._acceptanceCriteria;
     }
diff --git a/src/domain/task.ts b/src/domain/task.ts
index 773f7b2..82da5df 100644
--- a/src/domain/task.ts
+++ b/src/domain/task.ts
@@ -5,6 +5,9 @@ export class Task {
   private readonly _description: string;
   private _status: TaskStatusType;
   private readonly _acceptanceCriteria: string[];
+  private readonly _potentialChangeLocations: string[];
+  private readonly _points?: number;
+  private readonly _dependsOn: string[];
   private _blockedReason?: string;
   private _retryCount: number;
   private _completedAt?: string;
@@ -14,6 +17,9 @@ export class Task {
     this._description = entry.description;
     this._status = entry.status;
     this._acceptanceCriteria = entry.acceptanceCriteria ?? [];
+    this._potentialChangeLocations = entry.potentialChangeLocations ?? [];
+    this._points = entry.points;
+    this._dependsOn = entry.dependsOn ?? [];
     this._blockedReason = entry.blockedReason;
     this._retryCount = entry.retryCount ?? 0;
     this._completedAt = entry.completedAt;
@@ -35,6 +41,18 @@ export class Task {
     return this._acceptanceCriteria;
   }

+  get potentialChangeLocations(): string[] {
+    return this._potentialChangeLocations;
+  }
+
+  get points(): number | undefined {
+    return this._points;
+  }
+
+  get dependsOn(): string[] {
+    return this._dependsOn;
+  }
+
   get blockedReason(): string | undefined {
     return this._blockedReason;
   }
@@ -84,6 +102,18 @@ export class Task {
       entry.acceptanceCriteria = this._acceptanceCriteria;
     }

+    if (this._potentialChangeLocations.length > 0) {
+      entry.potentialChangeLocations = this._potentialChangeLocations;
+    }
+
+    if (this._points !== undefined) {
+      entry.points = this._points;
+    }
+
+    if (this._dependsOn.length > 0) {
+      entry.dependsOn = this._dependsOn;
+    }
+
     if (this._blockedReason !== undefined) {
       entry.blockedReason = this._blockedReason;
     }
diff --git a/src/services/builder.ts b/src/services/builder.ts
index 4237798..7934bde 100644
--- a/src/services/builder.ts
+++ b/src/services/builder.ts
@@ -20,10 +20,19 @@ function generateTaskPrompt(spec: Spec, task: Task): string {
     ? task.acceptanceCriteria.map((ac) => `- [ ] ${ac}`).join("\n")
     : "_No specific acceptance criteria._";

+  const potentialChangeLocations = task.potentialChangeLocations?.length
+    ? task.potentialChangeLocations.map((loc) => `- ${loc}`).join("\n")
+    : "_No specific locations identified during planning._";
+
+  const dependsOn = task.dependsOn?.length ? task.dependsOn.join(", ") : "";
+
   return Mustache.render(PROMPT_BUILD, {
     spec_name: spec.name,
     full_specs_file: spec.file,
-    task_context: task.description,
+    task_description: task.description,
+    task_points: task.points ??
"unestimated", + task_depends_on: dependsOn || undefined, + potential_change_locations: potentialChangeLocations, acceptance_criteria: acceptanceCriteria, }); } diff --git a/src/templates/prompts.ts b/src/templates/prompts.ts index a48de7c..67a71b4 100644 --- a/src/templates/prompts.ts +++ b/src/templates/prompts.ts @@ -1,109 +1,221 @@ export const PROMPT_PLAN = `# Plan Mode -You are an autonomous planning agent. Analyze specs and create a structured implementation plan. +Create implementation plans from specs by thoroughly exploring the codebase first. Use sub-agents as "exploration tracers" to discover all potential change locations before breaking work into tasks. + +## Philosophy + +**The problem with naive planning**: Specs describe WHAT to build, but implementation plans need to describe WHAT TO CHANGE. The gap between these is where agents fail—they burn context discovering change locations instead of implementing. + +**Solution**: Do expensive exploration ONCE during planning, so execution agents work from concrete checklists. ## Context (Read First) -1. Read @.ralph-wiggum/GUARDRAILS.md — understand project compliance rules -2. Read all specs in \`.ralph-wiggum/specs/*\` — understand what needs to be built -3. Read @.ralph-wiggum/implementation.json (if exists) — current progress state -4. Read @.ralph-wiggum/PROGRESS.md — learnings from previous runs -5. Reference source code thoroughly to understand current state +1. Read \`.ralph-wiggum/GUARDRAILS.md\` — project compliance rules +2. The spec file being planned +3. 
\`.ralph-wiggum/implementation.json\` — current progress (use \`jq\` to query) -## Rules -- Plan only — do NOT implement anything -- Do NOT assume functionality is missing — confirm with code search first -- Each spec should have clear tasks and acceptance criteria -- Treat \`src/lib\` as shared utilities — prefer consolidation over duplication +## Process -## Workflow +### Phase 1: Parse Spec + +Read the spec file and extract: +- Requirements (what needs to change) +- Acceptance criteria (how to verify) +- Mentioned files/modules (hints for exploration) +- Dependencies on other specs + +### Phase 2: Identify Exploration Targets + +From each requirement, derive exploration questions: +- "Where is X defined?" +- "What files use Y?" +- "What tests cover Z?" +- "What patterns exist for similar functionality?" + +### Phase 3: Spawn Exploration Sub-Agents (Parallel) + +For each major exploration target, spawn a sub-agent. + +\`\`\` +Example for "Remove transactionIds from artifacts": + +Agent A: "Find all definitions and usages of transactionIds in artifacts" + → Returns: schema.ts:42, types.ts:89, queries.ts:67/120, mutations.ts:30 + +Agent B: "Find all tests that reference artifact transactionIds" + → Returns: artifacts.test.ts:200/340, scrapbook.test.ts:89 + +Agent C: "Trace what depends on artifact.transactionIds for data" + → Returns: scrapbook.ts:89 uses it for timeline, nowhere else +\`\`\` + +Each sub-agent should return: +- File paths with line numbers +- Patterns observed (e.g., "uses compound index", "has auth check") +- Estimated lines of code to change +- Any concerns or complexity notes + +**No arbitrary cap on sub-agents**—use as many as needed for thorough exploration. + +### Phase 4: Synthesize Into Tasks + +Group exploration findings into tasks using these heuristics: -### 1. 
Audit Specs -- Read all specs in \`.ralph-wiggum/specs/*\` -- For each spec, verify tasks and acceptance criteria are clear and complete -- If a spec is missing details, update it with specific tasks and acceptance criteria +| Signal | Action | +|--------|--------| +| Same file, routine changes (< 30 LOC) | Merge into one task | +| Different files, same pattern | Merge if total LOC < 50 | +| Creates something others depend on | Separate task (first in chain) | +| Complex logic requiring reasoning | Separate task | +| Natural verification boundary | Separate task | +| Tests for a feature | Include with feature if small, separate if > 30 LOC | -### 2. Audit Codebase -- Use up to 500 parallel subagents to search the source code -- Compare implementation against specs -- Look for: TODOs, placeholders, skipped tests, incomplete features, inconsistent patterns +**LOC → Points guidance** (use internally, don't store LOC): +- 1-20 LOC routine changes → 1-2 points +- 20-50 LOC or moderate complexity → 3 points +- 50-100 LOC or high complexity → 5 points +- > 100 LOC → must split into multiple tasks -### 3. Output implementation.json +### Phase 5: Output implementation.json -Create or update @.ralph-wiggum/implementation.json with this structure: +Update \`.ralph-wiggum/implementation.json\` with the new task structure: \`\`\`json { "version": 1, - "updatedAt": "2026-01-17T10:30:00Z", + "updatedAt": "2026-01-23T10:30:00Z", "updatedBy": "plan-mode", "specs": [ { - "id": "spec-id-kebab-case", - "file": ".ralph-wiggum/specs/spec-file.md", + "id": "060-feat-something", + "file": ".ralph-wiggum/specs/060-feat-something.md", "name": "Human Readable Name", - "priority": 1, + "priority": 160, "status": "pending", - "context": "Brief context for this spec. 
Reference existing code locations.", + "dependsOn": [], + "pointsBudget": 15, "tasks": [ { - "id": "spec-id-1", - "description": "First task description", + "id": "060-feat-something-1", + "description": "Schema + type changes for new feature", + "potentialChangeLocations": [ + "convex/schema.ts:142 - add newField to someTable", + "convex/schema.ts:156 - add by_userId_newField index", + "convex/types.ts:89 - update SomeTableDoc type", + "convex/validators.ts:34 - may need validator update" + ], + "points": 2, "status": "pending", - "acceptanceCriteria": ["Criteria 1", "Criteria 2"] + "acceptanceCriteria": [ + "newField exists in schema with correct type", + "Index by_userId_newField exists", + "bun run typecheck passes" + ] }, { - "id": "spec-id-2", - "description": "Second task description", - "status": "pending" + "id": "060-feat-something-2", + "description": "Core queries and mutations for new feature", + "potentialChangeLocations": [ + "convex/core/someTable/queries.ts:45 - add getByNewField query", + "convex/core/someTable/mutations.ts:67 - update create to accept newField", + "convex/core/someTable/internal.ts:23 - add helper for newField lookup" + ], + "points": 3, + "status": "pending", + "dependsOn": ["060-feat-something-1"], + "acceptanceCriteria": [ + "getByNewField query works with auth check", + "create mutation accepts and persists newField", + "bun run typecheck passes" + ] } ], - "acceptanceCriteria": ["Spec-level AC 1", "Spec-level AC 2"] + "acceptanceCriteria": ["Spec-level AC copied from spec file"] } ] } \`\`\` -**Important:** -- Each spec gets an \`id\` (kebab-case, derived from spec filename) -- Tasks get sequential IDs like \`{spec-id}-1\`, \`{spec-id}-2\`, etc. 
-- \`priority\`: Lower number = higher priority (1 = first to implement) -- \`status\`: "pending" for unstarted, "in_progress" for active, "completed" for done -- \`context\`: Include relevant code paths, dependencies, or notes for the build agent -- Make sure to copy / paste relevant acceptance criteria from the spec file for each task. Some tasks may share the same acceptance criteria. +## Task Schema + +Each task MUST have: +- \`id\`: \`{spec-id}-{number}\` format +- \`description\`: What to do (action-oriented) +- \`potentialChangeLocations\`: Array of \`"file:line - what to change"\` strings +- \`points\`: 1, 2, 3, or 5 (never 8—split instead) +- \`status\`: "pending" +- \`acceptanceCriteria\`: How to verify completion + +Optional: +- \`dependsOn\`: Task IDs that must complete first + +## Rules + +1. **Plan only** — do NOT implement anything +2. **Explore thoroughly** — use sub-agents liberally to trace all usages +3. **Be concrete** — \`potentialChangeLocations\` should have file:line where possible +4. **Estimate with LOC** — points should reflect actual code volume discovered during exploration +5. **Merge small changes** — don't create tasks for < 15 LOC unless there's a hard dependency +6. **Split large changes** — no task should exceed ~100 LOC or 5 points + +## Example Sub-Agent Prompts + +**For tracing usages:** +\`\`\` +Find all usages of \`transactionIds\` field on artifacts table. +Return: file paths with line numbers, whether it's a read or write, +and any patterns you observe (e.g., always used with userId filter). +\`\`\` + +**For tracing dependencies:** +\`\`\` +What code depends on the return value of \`getArtifactById\`? +Trace callers and report what fields they access. +\`\`\` -### 4. Create Missing Specs -If functionality is needed but no spec exists: -1. Search codebase to confirm it's actually missing -2. 
Create spec at \`.ralph-wiggum/specs/FILENAME.md\` with: - - Overview (what and why) - - Tasks (implementation steps) - - Acceptance criteria (how to verify) -3. Add to implementation.json with appropriate priority +**For finding tests:** +\`\`\` +Find all tests that would need updating if we remove \`transactionIds\` from artifacts. +Return: test file paths, line numbers, and what the test is asserting. +\`\`\` -### 5. Update Guardrails (if needed) -If you discover project-specific rules that should be enforced, add them to the "Project-Specific Rules" section of @.ralph-wiggum/GUARDRAILS.md. +**For pattern discovery:** +\`\`\` +How do existing core modules (e.g., core/transactions, core/users) structure +their queries.ts and mutations.ts? Report the patterns so we can follow them. +\`\`\` -COMPLETION: When all specs are audited, have clear tasks/acceptance criteria, and implementation.json is created/updated, output exactly: DONE`; +COMPLETION: When the spec has been fully explored and tasks created, output exactly: DONE`; export const PROMPT_BUILD = `# Build Mode ## Context (Read First) -You are working on a specific task that is mentioned below. The task is part of a large spec. You have been iterating step by step on tasks from within that spec. +You are working on a specific task that is mentioned below. The task is part of a larger spec. You have been iterating step by step on tasks from within that spec. -### Larger Spec Context -{{spec_name}} +### Spec Context +**Spec:** {{spec_name}} +**Spec File:** {{full_specs_file}} -You can find the full specs in the file: {{full_specs_file}} (This is under ".ralph-wiggum" folder. 
You'll find ".ralph-wiggum/specs") +### Task +**Description:** {{task_description}} +**Points:** {{task_points}} +{{#task_depends_on}} +**Depends On:** {{task_depends_on}} +{{/task_depends_on}} -### Task Description -{{task_context}} +### Potential Change Locations +These locations were identified during planning—use them as your starting points: +{{potential_change_locations}} -- Read \`.ralph-wiggum/PROGRESS.md\` — context from previous runs -- Read @.ralph-wiggum/GUARDRAILS.md for compliance rules. +### Acceptance Criteria +{{acceptance_criteria}} + +- Read \`.ralph-wiggum/PROGRESS.md\` — context and learnings from previous runs +- Read \`.ralph-wiggum/GUARDRAILS.md\` for compliance rules. ## Rules -- Do NOT assume code is missing — search first using subagents (up to 500 for reads, 1 for builds) +- Start with the potential change locations above—they were discovered during planning +- Do NOT assume code is missing — search first using subagents - No placeholders or stubs — implement completely - Single sources of truth — no migrations or adapters - If unrelated tests fail, fix them as part of your work @@ -113,10 +225,7 @@ You can find the full specs in the file: {{full_specs_file}} (This is under ".ra ### 1. Pre-Flight (Guardrails Check) - Read \`.ralph-wiggum/GUARDRAILS.md\` completely - Verify you understand the "Before Making Changes" rules -- Create an implementation plan around the task according to the following acceptance criteria - -{{acceptance_criteria}} - +- Review the potential change locations and acceptance criteria above ### 2. 
Understand Current State
 - Search codebase before making changes
diff --git a/src/types.ts b/src/types.ts
index fa271c7..acb9a22 100644
--- a/src/types.ts
+++ b/src/types.ts
@@ -80,6 +80,12 @@ export interface TaskEntry {
   description: string;
   status: TaskStatusType;
   acceptanceCriteria?: string[];
+  /** File paths with line numbers and what to change, e.g., "src/file.ts:42 - add newField" */
+  potentialChangeLocations?: string[];
+  /** Story points: 1, 2, 3, or 5 (based on LOC and complexity) */
+  points?: number;
+  /** Task IDs that must complete before this task */
+  dependsOn?: string[];
   blockedReason?: string;
   retryCount?: number;
   completedAt?: string;
@@ -92,6 +98,10 @@ export interface SpecEntry {
   priority: number;
   status: TaskStatusType;
   context?: string;
+  /** Total points budget for this spec */
+  pointsBudget?: number;
+  /** Spec IDs that must complete before this spec */
+  dependsOn?: string[];
   tasks: TaskEntry[];
   acceptanceCriteria?: string[];
 }

From 0080998b3bf8da821add9ab0ae913b976ec6f609 Mon Sep 17 00:00:00 2001
From: "yemyat (github)"
Date: Fri, 23 Jan 2026 16:07:43 +0800
Subject: [PATCH 2/8] fix: reorder build workflow to log progress before commit

---
 src/templates/prompts.ts | 18 ++++++++----------
 1 file changed, 8 insertions(+), 10 deletions(-)

diff --git a/src/templates/prompts.ts b/src/templates/prompts.ts
index 67a71b4..b928df4 100644
--- a/src/templates/prompts.ts
+++ b/src/templates/prompts.ts
@@ -265,19 +265,10 @@ A backend spec is NOT complete until all relevant test suites pass.
 - Check off the task in the spec: \`- [x] AC\`
 - Add any discovered issues as new specs if needed

-### 8. Commit & Push
-\`\`\`bash
-git add -A
-git commit -m "feat: "
-git push
-\`\`\`
-
-### 9. Log Progress (Append to \`.ralph-wiggum/PROGRESS.md\`). Example below:
+### 8. Log Progress (Append to \`.ralph-wiggum/PROGRESS.md\`). Example below:
 \`\`\`markdown
 ## [YYYY-MM-DD HH:MM] - 

-**Commit:** \`\`
-
 **Guardrails:**
 - Pre-flight: ✓
 - Post-flight: ✓
@@ -298,6 +289,13 @@ git push
 ---
 \`\`\`

+### 9. Commit & Push
+\`\`\`bash
+git add -A
+git commit -m "feat: "
+git push
+\`\`\`
+
 ### 10. Signal Completion
 When the task is done, output exactly:

From 660ef661739fd8f936ac3131dd144186ed9e7d60 Mon Sep 17 00:00:00 2001
From: "yemyat (github)"
Date: Fri, 23 Jan 2026 16:08:57 +0800
Subject: [PATCH 3/8] chore: remove redundant guardrail since spec context is injected

---
 src/templates/prompts.ts | 1 -
 1 file changed, 1 deletion(-)

diff --git a/src/templates/prompts.ts b/src/templates/prompts.ts
index b928df4..6416c5d 100644
--- a/src/templates/prompts.ts
+++ b/src/templates/prompts.ts
@@ -322,7 +322,6 @@ export const GUARDRAILS_TEMPLATE = `# Guardrails
 Compliance rules to verify before and after making changes.

 ## Before Making Changes
-- [ ] Read the relevant spec file completely
 - [ ] Understand acceptance criteria before coding
 - [ ] Search codebase to confirm current state (don't assume)
 - [ ] Check for existing patterns to follow

From 7c9776291b65d6313e06c0e5e7c9aa2ea1d2ce06 Mon Sep 17 00:00:00 2001
From: "yemyat (github)"
Date: Fri, 23 Jan 2026 16:09:35 +0800
Subject: [PATCH 4/8] chore: remove redundant AC guardrail since it's injected in prompt

---
 src/templates/prompts.ts | 1 -
 1 file changed, 1 deletion(-)

diff --git a/src/templates/prompts.ts b/src/templates/prompts.ts
index 6416c5d..0b975ee 100644
--- a/src/templates/prompts.ts
+++ b/src/templates/prompts.ts
@@ -322,7 +322,6 @@ export const GUARDRAILS_TEMPLATE = `# Guardrails
 Compliance rules to verify before and after making changes.

 ## Before Making Changes
-- [ ] Understand acceptance criteria before coding
 - [ ] Search codebase to confirm current state (don't assume)
 - [ ] Check for existing patterns to follow

From eb0f108cadea61f000a42457d24b7304c70638fd Mon Sep 17 00:00:00 2001
From: "yemyat (github)"
Date: Fri, 23 Jan 2026 16:15:15 +0800
Subject: [PATCH 5/8] feat: add fully autonomous rule to prevent plan mode from asking questions

---
 .ralph-wiggum/PROMPT_plan.md | 13 +++++++------
 src/templates/prompts.ts     | 13 +++++++------
 2 files changed, 14 insertions(+), 12 deletions(-)

diff --git a/.ralph-wiggum/PROMPT_plan.md b/.ralph-wiggum/PROMPT_plan.md
index 35e46fb..3c541ae 100644
--- a/.ralph-wiggum/PROMPT_plan.md
+++ b/.ralph-wiggum/PROMPT_plan.md
@@ -150,12 +150,13 @@ Optional:

 ## Rules

-1. **Plan only** — do NOT implement anything
-2. **Explore thoroughly** — use sub-agents liberally to trace all usages
-3. **Be concrete** — `potentialChangeLocations` should have file:line where possible
-4. **Estimate with LOC** — points should reflect actual code volume discovered during exploration
-5. **Merge small changes** — don't create tasks for < 15 LOC unless there's a hard dependency
-6. **Split large changes** — no task should exceed ~100 LOC or 5 points
+1. **Fully autonomous** — do NOT ask the user questions; make reasonable assumptions and proceed
+2. **Plan only** — do NOT implement anything
+3. **Explore thoroughly** — use sub-agents liberally to trace all usages
+4. **Be concrete** — `potentialChangeLocations` should have file:line where possible
+5. **Estimate with LOC** — points should reflect actual code volume discovered during exploration
+6. **Merge small changes** — don't create tasks for < 15 LOC unless there's a hard dependency
+7. **Split large changes** — no task should exceed ~100 LOC or 5 points

 ## Example Sub-Agent Prompts

diff --git a/src/templates/prompts.ts b/src/templates/prompts.ts
index 0b975ee..f39369f 100644
--- a/src/templates/prompts.ts
+++ b/src/templates/prompts.ts
@@ -150,12 +150,13 @@ Optional:

 ## Rules

-1. **Plan only** — do NOT implement anything
-2. **Explore thoroughly** — use sub-agents liberally to trace all usages
-3. **Be concrete** — \`potentialChangeLocations\` should have file:line where possible
-4. **Estimate with LOC** — points should reflect actual code volume discovered during exploration
-5. **Merge small changes** — don't create tasks for < 15 LOC unless there's a hard dependency
-6. **Split large changes** — no task should exceed ~100 LOC or 5 points
+1. **Fully autonomous** — do NOT ask the user questions; make reasonable assumptions and proceed
+2. **Plan only** — do NOT implement anything
+3. **Explore thoroughly** — use sub-agents liberally to trace all usages
+4. **Be concrete** — \`potentialChangeLocations\` should have file:line where possible
+5. **Estimate with LOC** — points should reflect actual code volume discovered during exploration
+6. **Merge small changes** — don't create tasks for < 15 LOC unless there's a hard dependency
+7. **Split large changes** — no task should exceed ~100 LOC or 5 points

 ## Example Sub-Agent Prompts

From 2e27214329f3bc19eda167ea865bcdb9b1689578 Mon Sep 17 00:00:00 2001
From: "yemyat (github)"
Date: Fri, 23 Jan 2026 16:18:26 +0800
Subject: [PATCH 6/8] fix: make plan mode auto-select next pending spec without asking

---
 .ralph-wiggum/PROMPT_plan.md | 13 ++++++++++++-
 src/templates/prompts.ts     | 13 ++++++++++++-
 2 files changed, 24 insertions(+), 2 deletions(-)

diff --git a/.ralph-wiggum/PROMPT_plan.md b/.ralph-wiggum/PROMPT_plan.md
index 3c541ae..ff10ab0 100644
--- a/.ralph-wiggum/PROMPT_plan.md
+++ b/.ralph-wiggum/PROMPT_plan.md
@@ -1,6 +1,17 @@
 # Plan Mode

-Create implementation plans from specs by thoroughly exploring the codebase first. Use sub-agents as "exploration tracers" to discover all potential change locations before breaking work into tasks.
+You are an autonomous planning agent. Your job is to create implementation plans from specs.
+
+**IMPORTANT: Do NOT ask questions. Do NOT wait for user input. Start working immediately.**
+
+1. Read `.ralph-wiggum/specs/` to find all specs
+2. Read `.ralph-wiggum/implementation.json` to see which specs are already planned
+3. Pick the highest priority spec that hasn't been planned yet
+4. Create the implementation plan for that spec
+
+If no spec file is explicitly provided, automatically select and plan the next pending spec.
+
+---

 ## Philosophy

diff --git a/src/templates/prompts.ts b/src/templates/prompts.ts
index f39369f..1578ea7 100644
--- a/src/templates/prompts.ts
+++ b/src/templates/prompts.ts
@@ -1,6 +1,17 @@
 export const PROMPT_PLAN = `# Plan Mode

-Create implementation plans from specs by thoroughly exploring the codebase first. Use sub-agents as "exploration tracers" to discover all potential change locations before breaking work into tasks.
+You are an autonomous planning agent. Your job is to create implementation plans from specs.
+
+**IMPORTANT: Do NOT ask questions. Do NOT wait for user input. Start working immediately.**
+
+1. Read \`.ralph-wiggum/specs/\` to find all specs
+2. Read \`.ralph-wiggum/implementation.json\` to see which specs are already planned
+3. Pick the highest priority spec that hasn't been planned yet
+4. Create the implementation plan for that spec
+
+If no spec file is explicitly provided, automatically select and plan the next pending spec.
+
+---

 ## Philosophy

From ab9646e84bcf94e96c186b7c0af500b0f0a03334 Mon Sep 17 00:00:00 2001
From: "yemyat (github)"
Date: Fri, 23 Jan 2026 16:18:50 +0800
Subject: [PATCH 7/8] chore: update plan prompt

---
 src/templates/prompts.ts | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/src/templates/prompts.ts b/src/templates/prompts.ts
index 1578ea7..3359987 100644
--- a/src/templates/prompts.ts
+++ b/src/templates/prompts.ts
@@ -1,5 +1,4 @@
-export const PROMPT_PLAN = `# Plan Mode
-
+export const PROMPT_PLAN = `
 You are an autonomous planning agent. Your job is to create implementation plans from specs.

 **IMPORTANT: Do NOT ask questions. Do NOT wait for user input. Start working immediately.**

From 604bd84d8893a2931ccb824e34de8c2765e79a48 Mon Sep 17 00:00:00 2001
From: "yemyat (github)"
Date: Fri, 23 Jan 2026 16:27:59 +0800
Subject: [PATCH 8/8] feat: update build prompt

---
 src/templates/prompts.ts | 37 +++++++++++++------------------------
 1 file changed, 13 insertions(+), 24 deletions(-)

diff --git a/src/templates/prompts.ts b/src/templates/prompts.ts
index 3359987..2ef857b 100644
--- a/src/templates/prompts.ts
+++ b/src/templates/prompts.ts
@@ -1,5 +1,4 @@
-export const PROMPT_PLAN = `
-You are an autonomous planning agent. Your job is to create implementation plans from specs.
+export const PROMPT_PLAN = `You are an autonomous planning agent. Your job is to create implementation plans from specs.

 **IMPORTANT: Do NOT ask questions. Do NOT wait for user input. Start working immediately.**

@@ -197,16 +196,14 @@ their queries.ts and mutations.ts?
Report the patterns so we can follow them. COMPLETION: When the spec has been fully explored and tasks created, output exactly: DONE`; -export const PROMPT_BUILD = `# Build Mode - -## Context (Read First) - -You are working on a specific task that is mentioned below. The task is part of a larger spec. You have been iterating step by step on tasks from within that spec. +export const PROMPT_BUILD = `You are working on a specific task that is mentioned below. The task is part of a larger spec. You have been iterating step by step on tasks from within that spec. ### Spec Context **Spec:** {{spec_name}} **Spec File:** {{full_specs_file}} +You don't really need to read this full file unless you need further context on the product requirements beyond what is described below. + ### Task **Description:** {{task_description}} **Points:** {{task_points}} @@ -236,23 +233,18 @@ These locations were identified during planning—use them as your starting poin ### 1. Pre-Flight (Guardrails Check) - Read \`.ralph-wiggum/GUARDRAILS.md\` completely - Verify you understand the "Before Making Changes" rules -- Review the potential change locations and acceptance criteria above - -### 2. Understand Current State -- Search codebase before making changes -- Use subagents for complex reasoning if needed -- Don't assume anything is missing — confirm with code search +- Review the potential change locations based on context provided to you above and acceptance criteria above -### 3. Implement +### 2. Implement - Complete the assigned task only - Follow existing code conventions - Make all changes needed for the task to pass its acceptance criteria mentioned earlier -### 4. Post-Flight (Guardrails Check) +### 3. Post-Flight (Guardrails Check) - Verify ALL items in \`.ralph-wiggum/GUARDRAILS.md\` "After Making Changes": - Check off acceptance criteria in the spec: \`- [x] AC\` -### 5. Frontend Testing (Required for UI Changes) +### 4. 
Frontend Testing (Required for UI Changes) If the spec involves UI changes, you MUST verify in the browser: 1. Load the \`agent-browser\` skill 2. Navigate to the relevant page @@ -261,7 +253,7 @@ If the spec involves UI changes, you MUST verify in the browser: A frontend spec is NOT complete until browser verification passes. -### 6. Backend Testing (Required for API/Service Changes) +### 5. Backend Testing (Required for API/Service Changes) If the spec involves backend changes, you MUST run all relevant tests: 1. Unit tests — test individual functions/modules in isolation 2. Integration tests — test interactions between components @@ -271,12 +263,12 @@ Adjust commands based on project (check package.json or AGENTS.md for available A backend spec is NOT complete until all relevant test suites pass. -### 7. Update Plan +### 6. Update Plan - Move spec from "In Progress" to "Completed" in \`.ralph-wiggum/implementation.json\` - Check off the task in the spec: \`- [x] AC\` - Add any discovered issues as new specs if needed -### 8. Log Progress (Append to \`.ralph-wiggum/PROGRESS.md\`). Example below: +### 7. Log Progress (Append to \`.ralph-wiggum/PROGRESS.md\`). Example below: \`\`\`markdown ## [YYYY-MM-DD HH:MM] - @@ -288,9 +280,6 @@ A backend spec is NOT complete until all relevant test suites pass. - \`bun run typecheck\` → PASS - \`bun run test\` → PASS -**Files changed:** -- path/to/file.ts - **What was done:** @@ -300,14 +289,14 @@ A backend spec is NOT complete until all relevant test suites pass. --- \`\`\` -### 9. Commit & Push +### 8. Commit & Push \`\`\`bash git add -A git commit -m "feat: " git push \`\`\` -### 10. Signal Completion +### 9. Signal Completion When the task is done, output exactly: \`\`\`