Description
Summary
Two bugs in ouroboros_evaluate that prevent APPROVED verdicts for non-standard projects (e.g., Odoo CLI scripts that don't use pyproject.toml).
Bug 1: trigger_consensus silently ignored when Stage 2 < 0.80
Expected: When trigger_consensus=true is passed, Stage 3 (advocate + contrarian + judge) runs regardless of Stage 2 score.
Actual: Stage 3 is gated behind Stage 2's 0.80 threshold. If Stage 2 scores below 0.80, trigger_consensus=true has no effect — Stage 3 never runs.
Impact: CLI scripts (non-module code) consistently score 0.72 in Stage 2 semantic evaluation. The trigger_consensus parameter cannot override this, making 3-model consensus unreachable for these artifact types.
Suggested fix: When trigger_consensus=true, bypass the Stage 2 threshold and proceed directly to Stage 3 consensus. The whole point of forcing consensus is to get a second opinion when Stage 2 results are disputed.
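The proposed gating could look roughly like the sketch below. It is not Ouroboros code: should_run_consensus and CONSENSUS_THRESHOLD are hypothetical names, and the behaviour when trigger_consensus is false (falling back to the 0.80 threshold) is an assumption made only to complete the example.

```python
# Hypothetical sketch of the proposed Stage 3 gate; names are illustrative only.
CONSENSUS_THRESHOLD = 0.80  # Stage 2 threshold quoted in this report


def should_run_consensus(stage2_score: float, trigger_consensus: bool) -> bool:
    """Return True when Stage 3 (advocate + contrarian + judge) should run."""
    # Proposed change: an explicit request always wins over the Stage 2 score.
    if trigger_consensus:
        return True
    # Assumed fallback: keep the existing threshold gate when nothing is forced.
    return stage2_score >= CONSENSUS_THRESHOLD
```

With this shape, the 0.72 case from the reproduction reaches Stage 3 only when the caller asks for it, and unforced evaluations keep the same gate.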
Bug 2: ArtifactCollector fails silently without pyproject.toml/setup.py/package.json
Location: adapter.py:99-110 (_looks_like_project_root), adapter.py:145-160 (_project_dir_from_artifact), artifact_collector.py:44-45
Expected: When artifact contains file paths (e.g., File: /path/to/code.py), the collector reads those files from disk and bundles them for the semantic evaluator.
Actual: The collector requires a project root directory to enforce path boundary checks. _extract_project_dir() walks up from file paths looking for pyproject.toml, setup.py, or package.json. If none exist (common in Odoo, Django, and many enterprise projects), project_dir = None, and the collector returns an empty bundle. The semantic evaluator then only sees the artifact text, not the actual source files.
Additional issues:
- _project_dir_from_artifact() (line 150) only matches Write: and Edit: prefixes, not File: as expected
- Adding pyproject.toml to fix file collection triggers auto-detection of the wrong language presets (languages.py:68-71: the Python preset runs ruff check . and python -m compileall -q src/, which fail for Odoo projects)
- The config.yaml project_dir setting requires an MCP server restart to take effect (no hot reload)
Suggested fix:
- Accept working_dir from the evaluate tool call as an explicit project_dir override (highest priority in _extract_project_dir)
- Match the File: prefix in addition to Write: and Edit: in _project_dir_from_artifact
- Allow disabling language presets when pyproject.toml exists (e.g., [tool.ouroboros] language = "none")
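The first two suggestions can be sketched together as below. This is a minimal illustration under stated assumptions, not the actual adapter.py code: resolve_project_dir, _PATH_PREFIX, and _ROOT_MARKERS are hypothetical names, and the marker-file walk is reconstructed from the description in this report.

```python
import re
from pathlib import Path
from typing import Optional

# Write: and Edit: are the prefixes reported as matched today; File: is the proposed addition.
_PATH_PREFIX = re.compile(r"^\s*(?:File|Write|Edit):\s*(\S.*?)\s*$", re.MULTILINE)
# Marker files named in this report as signalling a project root.
_ROOT_MARKERS = ("pyproject.toml", "setup.py", "package.json")


def resolve_project_dir(artifact: str, working_dir: Optional[str] = None) -> Optional[Path]:
    """Pick a project root for path boundary checks (hypothetical helper)."""
    # Proposed: an explicit working_dir from the tool call has the highest priority.
    if working_dir:
        return Path(working_dir)
    # Otherwise walk up from each referenced file looking for a marker file.
    for match in _PATH_PREFIX.finditer(artifact):
        file_path = Path(match.group(1))
        for parent in file_path.parents:
            if any((parent / marker).exists() for marker in _ROOT_MARKERS):
                return parent
    # No override and no marker found: the caller still has to handle None.
    return None
```

Putting the override first means projects such as Odoo, which have no marker file, can still give the collector a boundary to enforce without adding a pyproject.toml.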
Environment
- Ouroboros version: 0.25.1 (installed via uv)
- Python: 3.14
- Project type: Odoo 17 EE (uses odoo.cfg, no pyproject.toml)
- Artifact: 818-line standalone CLI script with 83 passing unit tests
Reproduction
# This scores 0.72 consistently; trigger_consensus has no effect
ouroboros_evaluate(
    session_id="any",
    artifact="<full 818-line Python source code>",
    artifact_type="code",
    trigger_consensus=True,  # silently ignored
    acceptance_criterion="Create a 4-stage CLI pipeline...",
    seed_content="goal: ...\nacceptance_criteria: ...",
)
# Result: Stage 2 score=0.72, REJECTED, Stage 3 never runs

# This should read files from disk but returns an empty bundle
ouroboros_evaluate(
    session_id="any",
    artifact="Write: /path/to/project/script.py",  # file exists on disk
    working_dir="/path/to/project",  # no pyproject.toml here
)
# Result: ArtifactCollector returns empty bundle, score=0.10