Skip to content

fix(sdk): skip advance step when verification finds gaps#1454

Merged
trek-e merged 2 commits intogsd-build:mainfrom
odmrs:fix/skip-advance-on-gaps-found
Apr 1, 2026
Merged

fix(sdk): skip advance step when verification finds gaps#1454
trek-e merged 2 commits intogsd-build:mainfrom
odmrs:fix/skip-advance-on-gaps-found

Conversation

@odmrs
Copy link
Copy Markdown
Contributor

@odmrs odmrs commented Mar 28, 2026

Problem

The advance step in PhaseRunner.run() executes unconditionally after verify, marking phases as [x] (complete) in ROADMAP.md even when verification returns gaps_found. On subsequent auto runs, the SDK sees [x] and skips the phase entirely.

Impact observed: In a 6-phase project, phases 2 and 3 were marked "Complete" with zero lines of code written. ~$15 spent on research/plan/verify cycles that produced nothing because the SDK kept advancing past unfinished phases.

Fix

phase-runner.ts now checks if all verify steps passed before running advance:

const verifyPassed = steps.every(s => s.step !== PhaseStepType.Verify || s.success);
if (!halted && verifyPassed) {
  const advanceResult = await this.runAdvanceStep(...);
  steps.push(advanceResult);
} else if (!halted && !verifyPassed) {
  this.logger?.warn(`Skipping advance for phase ${phaseNumber}: verification found gaps`);
}

When verification fails, the phase remains [ ] so the next auto run re-attempts it.

Open question

Should the milestone runner continue to the next phase when a phase fails verification?

Currently it stops on first success: false (index.ts line ~197). Possible behaviors:

  1. Stop (current) — safest, user re-runs auto after fixing
  2. Retry then stop — re-attempt the phase (re-execute → re-verify) N times
  3. Config-drivenworkflow.on_verify_fail: "stop" | "retry"

Happy to implement whichever approach you prefer.

Test plan

  • npm run test:unit — 675 tests passing, zero breakage
  • npm run build — zero errors
  • Live test: phases with gaps_found stay [ ] in ROADMAP

Previously, the advance step ran unconditionally after verify,
marking phases as complete in ROADMAP.md even when gaps_found.
This caused subsequent auto runs to skip unfinished phases.

Now checks if all verify steps passed before advancing. When
verification fails, the phase remains incomplete so the next
auto run re-attempts it.
@odmrs odmrs requested a review from glittercowboy as a code owner March 28, 2026 14:54
Copy link
Copy Markdown
Collaborator

@jeremymcs jeremymcs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Validated review findings from local verification of commit 39d8688.

1) High: advance can still run when verification gaps persist

In sdk/src/phase-runner.ts:

  • advance gate uses verifyPassed = steps.every(... s.success) (around line 253)
  • but runVerifyStep() still returns success: true when gaps_found exhausts retries (around lines 911, 926, 932)

So persistent gaps can still pass the new gate and execute runAdvanceStep().

Recommended fix

Track explicit verify outcome and gate advance on outcome instead of s.success alone.

Example direction:

  • add a verify-specific marker to PhaseStepResult (e.g. verifyOutcome?: 'passed' | 'human_needed' | 'gaps_found')
  • in runVerifyStep(), when retries are exhausted with gaps_found, return success: false and verifyOutcome: 'gaps_found'
  • in run(), gate advance with verifyOutcome !== 'gaps_found'
  • if you flip verify to success: false on exhausted gaps, ensure retryOnce('verify', ...) does not trigger an extra full verify cycle unintentionally

2) Medium: tests still encode old behavior, missing regression

In sdk/src/phase-runner.test.ts:

  • tests around ~580 and ~662 still expect exhausted-gap verify to be success: true
  • no regression test currently asserts “persistent gaps => no advance / no phaseComplete”

Recommended test additions

Add tests that assert:

  • persistent gaps_found does not append PhaseStepType.Advance
  • tools.phaseComplete is not called in that case
  • verifier disabled path still advances (to preserve intended behavior)

These changes would align implementation with PR intent: "skip advance step when verification finds gaps".

@jeremymcs
Copy link
Copy Markdown
Collaborator

@odmrs if you can update your PR with the findings. I"ll merge it.

Address review findings from gsd-build#1454:

1. runVerifyStep now returns success:false when gaps persist after
   exhausting retries (was always returning success:true)
2. human_needed + callback accept correctly sets outcome to passed
3. retryOnce skips retry for verification outcomes (gaps_found,
   human_needed) which have their own internal retry logic
4. Updated 3 existing tests to expect success:false on exhausted gaps
5. Added 3 regression tests:
   - persistent gaps_found does NOT append Advance step
   - persistent gaps_found does NOT call phaseComplete
   - verifier disabled still advances normally
@odmrs
Copy link
Copy Markdown
Contributor Author

odmrs commented Mar 28, 2026

Thanks for the thorough review! Updated the PR with your findings — verify now properly returns success:false on exhausted gaps, and added the 3 regression tests you suggested. Will be more careful with these edge cases on future PRs. Mind taking another look?

odmrs added a commit to odmrs/get-shit-done that referenced this pull request Mar 28, 2026
Address review findings from gsd-build#1454:

1. runVerifyStep now returns success:false when gaps persist after
   exhausting retries (was always returning success:true)
2. human_needed + callback accept correctly sets outcome to passed
3. retryOnce skips retry for verification outcomes (gaps_found,
   human_needed) which have their own internal retry logic
4. Updated 3 existing tests to expect success:false on exhausted gaps
5. Added 3 regression tests:
   - persistent gaps_found does NOT append Advance step
   - persistent gaps_found does NOT call phaseComplete
   - verifier disabled still advances normally
Copy link
Copy Markdown
Collaborator

@trek-e trek-e left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adversarial Review: fix(sdk): skip advance step when verification finds gaps

Verdict: APPROVE (after addressing review feedback from jeremymcs)

The author has addressed all findings from the previous CHANGES_REQUESTED review by jeremymcs. Verifying the amendments:


Review finding 1 (High): advance can still run when verification gaps persist

Original issue: runVerifyStep() returned success: true even when gaps_found exhausted retries, so the advance gate (steps.every(s.success)) would pass.

Fix verified in commit 4bf9789:

  • runVerifyStep() now returns success: false when outcome is not passed (line: const verifySuccess = outcome === 'passed')
  • Error field set to verification_${outcome} for traceability
  • retryOnce now short-circuits on verification_* errors to avoid re-triggering verify's internal retry logic
  • The advance gate verifyPassed = steps.every(s => s.step !== PhaseStepType.Verify || s.success) now correctly blocks advance when verify returns success: false

Review finding 2 (Medium): tests encode old behavior, missing regression tests

Fix verified in commit 4bf9789:

  • Three existing tests updated from expect(verifyStep!.success).toBe(true) to expect(verifyStep!.success).toBe(false) for exhausted-gap scenarios
  • Three new regression tests added:
    1. "persistent gaps_found does NOT append Advance step" -- asserts stepTypes does not contain PhaseStepType.Advance
    2. "persistent gaps_found does NOT call phaseComplete" -- asserts tools.phaseComplete not called
    3. "verifier disabled still advances normally" -- asserts advance still works when verifier is skipped

All three tests are well-structured and directly assert the behaviors recommended in the review.

Additional changes (from earlier commits)

  • Step-skip optimization for auto_advance resume mode (skips research/plan if artifacts exist)
  • Phase success determination based on verify outcome rather than all-step history

Note: These additional changes overlap with PR #1443 (same author). Whichever PR merges first, the other will need a rebase. Both PRs should coordinate to avoid duplicate work.

No security issues, no prompt injection. The fix addresses a real $15+ waste scenario described in the PR.

@trek-e trek-e added the review: approved PR reviewed and approved by maintainer label Apr 1, 2026
@trek-e trek-e merged commit 4f2db2b into gsd-build:main Apr 1, 2026
5 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

review: approved PR reviewed and approved by maintainer

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants