Conversation
Issue follow-up comments now reuse the original issue_request Codex/OMX session instead of restarting analysis. The runner captures a resumable OMX session id from Codex session artifacts, stores it on the launched session, and uses it later to resume the same discussion thread. Constraint: General issue comments must keep /approve implementation behavior unchanged Constraint: Resume lookup should reuse stored session state instead of inventing a parallel ledger Rejected: Start a fresh issue_request session for every follow-up | loses discussion continuity and repeats analysis Rejected: Store only tmux session identifiers | not sufficient for `omx resume` Confidence: high Scope-risk: moderate Reversibility: clean Directive: Any new resumable stages should use storage session lookup plus stored omx_session_id, not prompt-text heuristics alone Tested: uv run pytest -q; uv run ruff check .; uv run ty check; uv run deptry . Not-tested: Live resume behavior against a real long-running OMX issue session in this branch
The Feature/#1 implementation was functionally correct, but CI still ran the repository quality gate and failed on automatic Ruff formatting. This commit applies the required formatting without changing behavior so the branch can merge cleanly. Constraint: Behavior and verification outcomes from the Feature/#1 implementation must remain unchanged Rejected: Merge with failing quality job | violates branch quality expectations Confidence: high Scope-risk: narrow Reversibility: clean Directive: Run 23 files left unchanged before final push when CI enforces repository formatting Tested: uv run ruff format .; uv run ruff check .; uv run pytest -q; uv run ty check; uv run deptry . Not-tested: Additional live GitHub webhook/resume interactions beyond existing local verification
Preserve issue discussions by resuming the original OMX session
Review-round completion was previously accepted if any signed PR comment existed, which allowed stale signatures to satisfy verification and duplicate comments to queue follow-up work more than once. This change verifies the exact expected signature and records processed review events so repeated deliveries stay idempotent. Constraint: Review completion is inferred from GitHub PR comments rather than a separate durable job callback Constraint: Must preserve the existing JSON storage model without adding dependencies Rejected: Trust only the latest signed PR comment | stale or duplicate comments can still satisfy the check incorrectly Rejected: Deduplicate purely in memory | process restarts would immediately lose the guard Confidence: high Scope-risk: moderate Reversibility: clean Directive: If webhook/event delivery semantics change, keep review-round handling idempotent across retries and replays Tested: ./.venv/bin/pytest -q Tested: ./.venv/bin/ruff check dani tests Tested: ./.venv/bin/ruff format --check dani tests Not-tested: Migration behavior for pre-existing data directories created before processed-events.json existed
The previous fix still performed duplicate detection as a separate read before recording the processed review event, which left a small race window for two concurrent deliveries to enqueue the same next round. This change makes the storage write itself the dedupe gate and also upgrades final verdict verification to require an exact matching signature instead of any signed PR comment. Constraint: Must keep JSON-file persistence and current service architecture Rejected: Leave review dedupe as check-then-record | concurrent duplicate deliveries can still double-enqueue Rejected: Keep final verdict on latest signed comment | stale signed review comments can satisfy the wrong stage Confidence: high Scope-risk: narrow Reversibility: clean Directive: Treat signature-based completion checks consistently across stages; avoid check-then-act dedupe for replayable events Tested: ./.venv/bin/pytest -q Tested: ./.venv/bin/ruff check dani tests Tested: ./.venv/bin/ruff format --check dani tests Not-tested: Real concurrent webhook delivery against the live GitHub integration
Fix exact review-round verification and dedupe duplicate completions
Session cleanup was implicit and depended on tmux exiting on its own, which left lifecycle completion ambiguous and could strand tmux sessions after successful or failed runs. This change adds an explicit session finalization step in DaniService, records when and why a session ended, and teaches OmxRunner how to kill an existing tmux session safely. Tests now cover both successful and failed finalization paths as well as the low-level tmux close behavior. Constraint: Must preserve the existing omx resume flow that uses omx_session_id across fresh tmux shells Rejected: Full stage lifecycle state machine | too broad for the issue and unnecessary to stop stray tmux sessions Confidence: high Scope-risk: narrow Reversibility: clean Directive: Keep tmux cleanup centralized in service finalization so future lifecycle changes do not duplicate termination logic Tested: python -m pytest -q; python -m ruff check . Not-tested: Real tmux + omx integration against a live GitHub event stream
Fix tmux session cleanup for OMX job lifecycle
Issue #2 needs stronger prompt contracts so agents explain what they understood, state the expected outcome, and show real execution evidence before review verdicts. Constraint: Keep the change limited to prompt/template behavior and regression tests Rejected: Add service-level semantic validation for comment contents | too invasive before locking prompt contract Confidence: high Scope-risk: narrow Reversibility: clean Directive: If stricter enforcement is needed later, add structured output or service-side validation rather than piling more prose into prompts Tested: python -m pytest -q tests/test_prompts.py Not-tested: End-to-end agent comment quality across real repositories
The new Issue #2 prompt requirements were correct but too verbose in prose form, so this follow-up rewrites the issue-request, review, and final-verdict instructions as compact checklists without changing the contract. Constraint: Preserve the new required sections and evidence expectations while making prompts easier to scan Rejected: Revert the new requirements entirely | loses the explicit expected-outcome and real-result guidance Confidence: high Scope-risk: narrow Reversibility: clean Directive: Prefer checklist-style prompt constraints over long prose when tightening stage contracts Tested: python -m pytest -q tests/test_prompts.py Not-tested: Full end-to-end agent behavior with live GitHub comments
The agent-facing prompt surface should stay simple and use gh directly, while PyGithub remains an internal dani runtime surface for event handling and repository orchestration. Constraint: Preserve existing prompt contract requirements while removing the helper-specific instructions from OMX-facing templates Rejected: Keep the PyGithub helper in prompts as a thin wrapper | still leaks dani internals into OMX sessions Confidence: high Scope-risk: narrow Reversibility: clean Directive: Keep GitHub transport details out of OMX prompts unless the session truly needs a dani-internal API Tested: python -m pytest -q tests/test_prompts.py tests/test_service.py tests/test_github.py Not-tested: Live gh-authenticated OMX session against a real repository
Implementation prompts should describe the real PR update flow, and review prompts should require -review plus concrete verification evidence without hard-coding product-surface categories. Constraint: Keep the prompt contracts compact while aligning them with the actual agent workflow Rejected: Keep web/cli/backend-specific checklist items | too prescriptive and awkward for mixed or non-standard surfaces Confidence: high Scope-risk: narrow Reversibility: clean Directive: Prefer general evidence language unless a stage is truly tied to a single product surface Tested: python -m pytest -q tests/test_prompts.py tests/test_service.py tests/test_github.py Not-tested: Live review-round execution through OMX with the code-review skill
Issue #2: require expected outcome and real-result evidence in prompts
Dani previously treated a signed review_round PR comment as the trigger for the next review round, which let the same unchanged PR get reviewed three times in a row with no repair pass in between. This commit turns review comments into implementation follow-ups, requires implementation rounds on existing PRs to publish an exact signed PR comment, and advances back into review/final verdict from that implementation signature so the PR loop alternates correctly. Constraint: PR triage must rely on exact Dani signatures rather than free-form PR comment text Constraint: Existing issue /approve flow and first-time PR creation behavior must keep working Rejected: Trigger next stages from PR body edits | pull_request.edited is noisier and easier to mis-triage than signed PR comments Rejected: Keep review_round -> review_round chaining | repeats identical review findings without any fix session Confidence: high Scope-risk: moderate Reversibility: clean Directive: Keep signed PR comments as the transition event for existing PR implementation rounds unless webhook edited-event handling gains strict diff/dedupe guards Tested: PYTHONPATH=. ./.venv/bin/pytest -q Tested: ./.venv/bin/ruff check . Tested: ./.venv/bin/ty check Not-tested: Live GitHub webhook delivery against a real repository after the workflow change
The webhook layer now recognizes pushes to main and the sync engine can merge those commits into dev directly, only escalating to OMX when a merge conflict requires agent judgment. Constraint: dev accepts direct pushes from dani automation Constraint: Conflict resolution must stay available through OMX sessions Rejected: Open a sync PR for every main update | unnecessary overhead when clean merges can push directly Confidence: high Scope-risk: moderate Reversibility: clean Directive: Keep push-event normalization aligned with the dev sync workflow so main updates continue to enqueue sync jobs Tested: make check Tested: uv run pytest Not-tested: Live GitHub webhook delivery against a production repository
GitHub Actions can run the sync flow without a configured global user.name/user.email, so the syncer now supplies its automation identity to both the merge and commit subprocesses. I kept the fix local to subprocess environment injection and added a regression test that captures the env passed into both commands, rather than mutating repository config during sync. Constraint: CI runners may not have git user.name/user.email configured Rejected: Configure git identity in repository config during sync | mutates local repo state unnecessarily Confidence: high Scope-risk: narrow Reversibility: clean Directive: Keep automation identity injection on every git command that can materialize merge metadata, not only commit Tested: make check; make test Not-tested: Live GitHub Actions run after push
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Issues closed by merging this PR
What changed
Session continuity
Review pipeline hardening
Runner and service reliability
Files touched
dani/service.pydani/omx_runner.pydani/storage.pydani/github.pydani/models.pydani/prompts.pytests/test_service.pytests/test_omx_runner.pytests/helpers.pyREADME.mdDiff at a glance
Commits included