Issue #2: require expected outcome and real-result evidence in prompts #9
Merged
Issue #2 needs stronger prompt contracts so agents explain what they understood, state the expected outcome, and show real execution evidence before review verdicts.

Constraint: Keep the change limited to prompt/template behavior and regression tests
Rejected: Add service-level semantic validation for comment contents | too invasive before locking prompt contract
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: If stricter enforcement is needed later, add structured output or service-side validation rather than piling more prose into prompts
Tested: python -m pytest -q tests/test_prompts.py
Not-tested: End-to-end agent comment quality across real repositories
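The directive above leaves service-side validation as a possible later step. As a rough illustration only, such a check might look like the sketch below. It is not part of this change: the function name `missing_sections` and the section headings are hypothetical, and the sketch assumes agent comments use markdown headings.

```python
import re

# Hypothetical section headings; the real prompt contract may name them differently.
REQUIRED_SECTIONS = ("Understanding", "Expected Outcome", "Real Result")

# Matches a markdown heading line such as "## Real Result".
_HEADING = re.compile(r"^#{1,6}[ \t]+(.+?)[ \t]*$", flags=re.MULTILINE)


def missing_sections(comment, required=REQUIRED_SECTIONS):
    """Return the required headings that are absent from the comment.

    The comparison is case-insensitive and only looks at heading lines,
    so a section name mentioned in running text does not count.
    """
    found = {m.group(1).strip().lower() for m in _HEADING.finditer(comment)}
    return [name for name in required if name.lower() not in found]
```

A caller could reject or flag a review comment whenever `missing_sections(comment)` returns a non-empty list. Checking headings only confirms that the sections exist, not that their contents are meaningful.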
The new Issue #2 prompt requirements were correct but too verbose in prose form, so this follow-up rewrites the issue-request, review, and final-verdict instructions as compact checklists without changing the contract.

Constraint: Preserve the new required sections and evidence expectations while making prompts easier to scan
Rejected: Revert the new requirements entirely | loses the explicit expected-outcome and real-result guidance
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: Prefer checklist-style prompt constraints over long prose when tightening stage contracts
Tested: python -m pytest -q tests/test_prompts.py
Not-tested: Full end-to-end agent behavior with live GitHub comments
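To illustrate the checklist-style approach, here is a minimal sketch of a checklist prompt fragment together with the kind of regression test that could guard it. The names `REVIEW_CHECKLIST` and `render_checklist` and the item wording are assumptions made for this example; the actual templates live in dani/prompts.py and may be structured differently.

```python
# Hypothetical checklist items; the actual wording lives in dani/prompts.py.
REVIEW_CHECKLIST = (
    "State what you understood from the issue",
    "State the expected outcome",
    "Show real execution evidence before giving a verdict",
)


def render_checklist(title, items=REVIEW_CHECKLIST):
    """Render a title line followed by one unchecked checkbox per item."""
    lines = [f"{title}:"]
    lines += [f"- [ ] {item}" for item in items]
    return "\n".join(lines)


def test_review_prompt_keeps_contract():
    """Regression check: every contract item must survive prompt rewrites."""
    prompt = render_checklist("Review requirements")
    for item in REVIEW_CHECKLIST:
        assert item in prompt
```

Keeping the items in a single tuple means the prompt text and the test share one source of truth, so a rewrite that drops a requirement also has to change the contract itself.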
The agent-facing prompt surface should stay simple and use gh directly, while PyGithub remains an internal dani runtime surface for event handling and repository orchestration.

Constraint: Preserve existing prompt contract requirements while removing the helper-specific instructions from OMX-facing templates
Rejected: Keep the PyGithub helper in prompts as a thin wrapper | still leaks dani internals into OMX sessions
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: Keep GitHub transport details out of OMX prompts unless the session truly needs a dani-internal API
Tested: python -m pytest -q tests/test_prompts.py tests/test_service.py tests/test_github.py
Not-tested: Live gh-authenticated OMX session against a real repository
Implementation prompts should describe the real PR update flow, and review prompts should require -review plus concrete verification evidence without hard-coding product-surface categories.

Constraint: Keep the prompt contracts compact while aligning them with the actual agent workflow
Rejected: Keep web/cli/backend-specific checklist items | too prescriptive and awkward for mixed or non-standard surfaces
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: Prefer general evidence language unless a stage is truly tied to a single product surface
Tested: python -m pytest -q tests/test_prompts.py tests/test_service.py tests/test_github.py
Not-tested: Live review-round execution through OMX with the code-review skill
Summary
What changed
Issue request
Review and final verdict
Real Result
Tests
Files changed
dani/prompts.py
tests/test_prompts.py
Verification
python -m pytest -q tests/test_prompts.py
Closes #2