Skip to content

fix(pm): resolve PM interview timeouts and crashes across all backends#231

Open
Q00 wants to merge 2 commits intomainfrom
fix/pm-interview-timeouts
Open

fix(pm): resolve PM interview timeouts and crashes across all backends#231
Q00 wants to merge 2 commits intomainfrom
fix/pm-interview-timeouts

Conversation

@Q00
Copy link
Copy Markdown
Owner

@Q00 Q00 commented Mar 27, 2026

Summary

  • Guard litellm ImportError in CLI pm.py and factory.py — graceful error message instead of crash
  • Bounded loop for ask_next_question() — max 8 auto-advances instead of unbounded recursion (C1 fix)
  • max_turns=5, timeout=90s for PM interview adapter — was max_turns=1 causing empty responses (C2 fix)
  • Application-level timeout added to ClaudeCodeAdapter (C3 fix)
  • Codex adapter allowed_tools=[] fix — empty list now emits "Do NOT use any tools" constraint, preventing Codex from making recursive MCP tool calls
  • Empty initial_context rejection with specific error message (M4 fix)
  • PM meta restore order — restore before load_state to prevent silent deferred tracking loss (M3 fix)
  • SKILL.md cleanup — removed dead meta.ask_user_question branch (M1), added cross-platform clipboard fallback (M2)

E2E Verification

Backend Before After Time
claude_code 120s timeout ✅ Success ~67s
codex 90s timeout (recursive MCP call) ✅ Success ~68s

Checks passed

  • ruff lint
  • ruff format (4 files auto-formatted)
  • mypy (0 issues)
  • pytest (251 passed, 4 skipped)

Test plan

  • Unit tests for ImportError guard (pm.py, factory.py)
  • Unit tests for bounded loop (pm_interview.py)
  • Unit tests for PM handler (timeout, meta restore)
  • E2E: claude_code backend → interview start + first question
  • E2E: codex backend → interview start + first question

🤖 Generated with Claude Code

- Guard litellm ImportError in pm.py and factory.py with helpful install message
- Convert unbounded ask_next_question() recursion to bounded loop (max 8 auto-advances)
- Set max_turns=5 and timeout=90s for PM interview LLM adapter (was max_turns=1)
- Add application-level timeout to ClaudeCodeAdapter
- Fix Codex adapter: allowed_tools=[] now emits "no tools" constraint (was silent)
- Reject empty initial_context with specific error message
- Restore PM meta before load_state to prevent silent deferred tracking loss
- Remove dead meta.ask_user_question branch from SKILL.md
- Add cross-platform clipboard fallback (pbcopy/wl-copy/xclip)

E2E verified: claude_code backend ~67s, codex backend ~68s (both under 90s timeout)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Copy link
Copy Markdown
Contributor

@ouroboros-agent ouroboros-agent bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review — ouroboros-agent[bot]

Verdict: REQUEST_CHANGES

Reviewing commit c34681d | Triggered by: automatic review

Branch: fix/pm-interview-timeouts | 12 files, +405/-112 | CI: MyPy Type Check pass 42s https://github.com/Q00/ouroboros/actions/runs/23636206279/job/68845903845 | Ruff Lint pass 15s https://github.com/Q00/ouroboros/actions/runs/23636206279/job/68845903857 | Test Python 3.12 pass 1m2s https://github.com/Q00/ouroboros/actions/runs/23636206271/job/68845903830 | Test Python 3.13 pass 1m5s https://github.com/Q00/ouroboros/actions/runs/23636206271/job/68845903814 | Test Python 3.14 pass 1m16s https://github.com/Q00/ouroboros/actions/runs/23636206271/job/6884

Issue Requirements

No linked issue detected in PR body.

Previous Review Follow-up

First review — no previous findings.

Blocking Findings

# File:Line Severity Confidence Finding
1 src/ouroboros/mcp/tools/pm_handler.py:323 High High The new PM-interview adapter settings are still bypassed in the normal MCP server path. _get_engine() prefers self.llm_adapter over the freshly configured adapter, and the server wires PMInterviewHandler with a shared llm_adapter, so the advertised max_turns=5, timeout=90, allowed_tools=[], and per-request cwd never apply in production. That means the timeout/empty-response fixes this PR is trying to land are not actually used for the default handler construction.
2 src/ouroboros/providers/codex_cli_adapter.py:84 High High The new “no tools” prompt is applied far more broadly than intended because allowed_tools=None is collapsed to [] at construction time. After that, _build_prompt() cannot distinguish “caller omitted tool constraints” from “caller explicitly forbade all tools”, so default Codex adapters now emit “Do NOT use any tools or MCP calls” even for existing call sites that intentionally pass allowed_tools=None. This is a regression for non-PM interview flows and any default Codex-backed completion path.
3 src/ouroboros/providers/factory.py:94 Medium High The factory now accepts cwd from callers like PMInterviewHandler, but the claude_code branch still drops it entirely. Even if finding #1 is fixed, Claude-backed PM interviews will continue to run in the process working directory instead of the request’s cwd, so repo-relative behavior remains inconsistent across backends.
4 src/ouroboros/mcp/tools/pm_handler.py:780 Medium Medium _handle_answer() now restores PM metadata before verifying that the interview state can be loaded. On a bad or stale session_id, that mutates the long-lived engine with deferred items / pending reframe state and then returns an error, leaving contaminated in-memory state for the next request if there is no replacement pm_meta. Previously the engine stayed untouched on load failure.
5 skills/pm/SKILL.md:90 Low High Step 5 now correctly allows “no clipboard tool found; skipping clipboard copy”, but Step 6 still instructs the agent to tell the user “(Clipboard에 복사되었습니다)” unconditionally. That leaves the skill internally contradictory and will cause false success messaging on Linux/CI environments without a clipboard utility.

Test Coverage

Missing coverage around the production PM handler wiring: tests/unit/mcp/tools/test_pm_handler.py only exercises _get_engine() when llm_adapter is not injected, so it never catches that src/ouroboros/mcp/tools/pm_handler.py:323 ignores the new adapter kwargs under the actual server composition path.

Missing Codex regression coverage for the default allowed_tools=None case: tests/unit/providers/test_codex_cli_adapter.py only asserts the explicit-tools prompt path and does not verify the prompt built by a default adapter, which is why the new blanket “Do NOT use any tools” behavior slipped through.

Missing Claude cwd propagation coverage: tests/unit/providers/test_factory.py and tests/unit/providers/test_claude_code_adapter.py do not assert that a cwd passed into create_llm_adapter(..., backend="claude_code") reaches the SDK options, so the backend inconsistency remains untested.

Design Notes

The direction is good: the PR is trying to make PM interviews backend-aware and bounded instead of relying on unbounded recursion and transport timeouts. The main architectural problem is that adapter policy now lives in three places at once (PMInterviewHandler, create_llm_adapter(), and each concrete adapter), so the effective runtime behavior depends on which composition path instantiated the handler. That drift is what caused the production-path bypass and the backend inconsistency here.

Follow-up Recommendations

  • Re-run a maintainer spot check on the exact HEAD commit if additional pushes land after this review.
  • Confirm any non-blocking cleanup separately so blocker scope stays tied to merge risk.

Files Reviewed

  • skills/pm/SKILL.md
  • src/ouroboros/bigbang/pm_interview.py
  • src/ouroboros/cli/commands/pm.py
  • src/ouroboros/mcp/tools/pm_handler.py
  • src/ouroboros/providers/claude_code_adapter.py
  • src/ouroboros/providers/codex_cli_adapter.py
  • src/ouroboros/providers/factory.py
  • tests/unit/bigbang/test_decide_later_items.py
  • tests/unit/cli/test_pm_overwrite.py
  • tests/unit/mcp/tools/test_pm_handler.py
  • tests/unit/providers/test_claude_code_adapter.py
  • tests/unit/providers/test_factory.py

Reviewed by ouroboros-agent[bot] via Codex deep analysis

Copy link
Copy Markdown
Contributor

@ouroboros-agent ouroboros-agent bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review — ouroboros-agent[bot]

Verdict: APPROVE

Reviewing commit
't5c15b98' | Triggered by: new push

Branch:
'tfix/pm-interview-timeouts' | 13 files, +543/-115 | CI: no checks reported

Issue Requirements

No linked issue detected in PR body.

Previous Review Follow-up

Previous Finding Status
PM handler ignored the PM-specific adapter settings when the server injected a shared adapter. RESOLVED — now rebuilds a dedicated adapter whenever is configured, and the new regression test covers that server-composed path (, ).
Codex default adapters incorrectly emitted a blanket "Do NOT use any tools" constraint when was omitted. RESOLVED — the adapter now preserves vs and only emits the text-only constraint for an explicit empty list, with coverage for both cases (, ).
Claude-backed adapters dropped caller-provided . RESOLVED — the factory now forwards , the adapter stores it, and SDK options use that configured directory (, , ).
Resume-on-load-failure restored PM metadata too early and could contaminate a long-lived engine. RESOLVED — now loads interview state before restoring PM metadata, and the regression test asserts is not called on load failure (, ).
The PM skill promised clipboard copy even when no clipboard tool exists. RESOLVED — Step 6 now makes the clipboard message conditional on Step 5 succeeding ().

Blocking Findings

No blocking findings verified on the current PR HEAD.

Test Coverage

Coverage is materially better on this revision:

  • The PM handler tests now exercise the dedicated-adapter rebuild path and cwd propagation ().
  • The adapter regressions that drove the prior review are covered for both Claude and Codex defaults (, ).
  • The bounded auto-advance guard and persistence-error propagation are covered in the PM interview tests ().

Design Notes

This revision closes the main correctness gaps from the prior review without broadening the surface area much. The strongest part of the fix is that the runtime behavior and the tests now line up: PM interview requests get a dedicated backend adapter with explicit timeout/tool policy, the unbounded auto-advance recursion is replaced with a bounded loop that preserves state updates, and the resume path no longer mutates long-lived engine state before load succeeds.

Follow-up Recommendations

  • GitHub currently reports the PR as , so the author should rebase or resolve branch conflicts before merge even though the reviewed diff itself looks ready.
  • A targeted end-to-end check on the exact merge commit would still be worthwhile once the branch is conflict-free, since there are no GitHub checks reported on the current head.

Files Reviewed

  • skills/pm/SKILL.md
  • src/ouroboros/bigbang/pm_interview.py
  • src/ouroboros/cli/commands/pm.py
  • src/ouroboros/mcp/tools/pm_handler.py
  • src/ouroboros/providers/claude_code_adapter.py
  • src/ouroboros/providers/codex_cli_adapter.py
  • src/ouroboros/providers/factory.py
  • tests/unit/bigbang/test_decide_later_items.py
  • tests/unit/cli/test_pm_overwrite.py
  • tests/unit/mcp/tools/test_pm_handler.py
  • tests/unit/providers/test_claude_code_adapter.py
  • tests/unit/providers/test_codex_cli_adapter.py
  • tests/unit/providers/test_factory.py

Reviewed by ouroboros-agent[bot] via manual deep analysis fallback after Codex timeout

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant