feat: add additional_tools parameter to execute_seed (#181)#237
feat: add additional_tools parameter to execute_seed (#181)#237jcfernandez-890825 wants to merge 1 commit intoQ00:mainfrom
Conversation
There was a problem hiding this comment.
Review — ouroboros-agent[bot]
Verdict: REQUEST_CHANGES
Branch: feature/additional-tools-181-v2 | 2 files, +486/-184 | CI: Ruff failing, MyPy passing, tests pending
Scope: diff-only
HEAD checked: 52be834
What Improved
- The core feature request is implemented in the right place:
additional_toolsis now part of the execute-seed schema, merged intoinherited_toolsfor fresh runs, and explicitly ignored on resume. - The new tests cover the three important behavior branches for this issue: schema exposure, merge behavior, and resume safety.
Issue #181 Requirements
| Requirement | Status |
|---|---|
Add an optional additional_tools parameter to ouroboros_execute_seed / ouroboros_start_execute_seed |
MET — schema added in src/ouroboros/mcp/tools/execution_handlers.py:173 and reused by StartExecuteSeedHandler at src/ouroboros/mcp/tools/execution_handlers.py:722 |
| Merge caller-supplied tools into the inherited/default tool set for new executions | MET — merge + dedup logic is implemented in src/ouroboros/mcp/tools/execution_handlers.py:274 |
Ignore additional_tools on resume so resumed sessions keep their original tool set |
MET — resume path gates the feature via src/ouroboros/mcp/tools/execution_handlers.py:279, with regression coverage in tests/unit/mcp/tools/test_definitions.py:612 |
Prior Findings Status
| Prior Finding | Status |
|---|---|
| First review | MAINTAINED — no prior bot findings to validate |
Blockers
| # | File:Line | Severity | Confidence | Finding |
|---|---|---|---|---|
| 1 | tests/unit/mcp/tools/test_definitions.py:46 |
Medium | 98% | The new top-level SessionStatus import is unused in this scope and collides with the function-local imports that already bring SessionStatus into later tests, which is why the current PR head fails Ruff with F401/F811. This is merge-blocking until the import is removed or the local imports are adjusted. |
Follow-ups
| # | File:Line | Priority | Confidence | Suggestion |
|---|---|---|---|---|
| 1 | src/ouroboros/mcp/tools/execution_handlers.py:263 |
Low | 72% | This PR for issue #181 also bundles a much larger execution/worktree refactor into the same diff. I did not find a concrete regression in the changed lines, but splitting the unrelated handler churn from the additional_tools feature would make future review and rollback much safer. |
Test Coverage
- Good coverage for the new
additional_toolsbehavior: schema, merge, default-none, and resume-ignore paths are all exercised intests/unit/mcp/tools/test_definitions.py:537-646. - I did not see a dedicated regression test for dedup order when both delegated inherited tools and
additional_toolscontain overlapping entries; the implementation looks correct, but one focused test would lock that behavior down.
Merge Recommendation
- Merge after blocker(s): fix the Ruff failure first, then this looks close to ready.
Files Reviewed
src/ouroboros/mcp/tools/execution_handlers.pytests/unit/mcp/tools/test_definitions.py
ouroboros-agent[bot]
There was a problem hiding this comment.
Pull request overview
Adds an optional additional_tools argument to the MCP seed-execution tools so direct MCP-initiated runs can opt into inheriting parent-session MCP tool names, and adds unit coverage around schema + merging/resume behavior.
Changes:
- Extend
ouroboros_execute_seed/ouroboros_start_execute_seedtool schema with an optionaladditional_tools: array[string]. - Merge
additional_toolsinto the inherited effective tool list for new sessions (ignored on resume). - Add unit tests validating the schema and merge/resume behavior.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
src/ouroboros/mcp/tools/execution_handlers.py |
Adds additional_tools handling and (also) introduces new workspace/verification-related execution flow changes. |
tests/unit/mcp/tools/test_definitions.py |
Adds schema + behavior tests for additional_tools and resume behavior. |
| from ouroboros.core.project_paths import resolve_seed_project_path | ||
| from ouroboros.core.security import InputValidator |
There was a problem hiding this comment.
ouroboros.core.project_paths (and resolve_seed_project_path) does not exist in the current codebase, so this import will raise at module import time. Either add the module/function in this PR or switch to the correct existing location for resolving a seed's project root.
| from ouroboros.core.worktree import ( | ||
| TaskWorkspace, | ||
| WorktreeError, | ||
| maybe_prepare_task_workspace, | ||
| maybe_restore_task_workspace, | ||
| release_lock, | ||
| ) |
There was a problem hiding this comment.
ouroboros.core.worktree (and the referenced symbols like TaskWorkspace / maybe_prepare_task_workspace) does not exist in the repository right now, so this handler will fail to import. If worktree support is intended here, the implementation module needs to be added in this PR (or the imports updated to the actual existing module).
| maybe_restore_task_workspace, | ||
| release_lock, | ||
| ) | ||
| from ouroboros.evaluation.verification_artifacts import build_verification_artifacts |
There was a problem hiding this comment.
ouroboros.evaluation.verification_artifacts is not present in the repo, so importing build_verification_artifacts will fail. Add the missing module/function or remove this integration from the handler.
| from ouroboros.evaluation.verification_artifacts import build_verification_artifacts | |
| try: | |
| from ouroboros.evaluation.verification_artifacts import build_verification_artifacts | |
| except ImportError: # pragma: no cover - optional dependency | |
| def build_verification_artifacts(*args: Any, **kwargs: Any) -> None: | |
| """ | |
| Fallback no-op implementation used when verification artifacts support | |
| is not available in the runtime environment. | |
| Returns None so that any call sites can safely ignore the result. | |
| """ | |
| return None |
| enable_decomposition=True, | ||
| inherited_runtime_handle=inherited_runtime_handle, | ||
| inherited_tools=inherited_effective_tools, | ||
| task_workspace=workspace, |
There was a problem hiding this comment.
OrchestratorRunner does not accept a task_workspace keyword argument (see src/ouroboros/orchestrator/runner.py constructor), so this will raise TypeError: __init__() got an unexpected keyword argument 'task_workspace' at runtime. Either update OrchestratorRunner.__init__ to support this parameter or stop passing it here.
| task_workspace=workspace, |
| runtime_backend = self.agent_runtime_backend | ||
| resolved_llm_backend = self.llm_backend or "default" |
There was a problem hiding this comment.
resolved_llm_backend is set to self.llm_backend or "default", but when llm_backend is omitted the runtime factory will still resolve an actual backend from config/env. This makes the launched message/meta potentially misleading. Consider resolving/validating the effective backend (e.g., via the existing resolve_llm_backend() helper) before reporting it.
| verification_working_dir = self._resolve_verification_working_dir( | ||
| seed, | ||
| resolved_cwd, | ||
| arguments.get("cwd"), | ||
| arguments.get(DELEGATED_PARENT_CWD_ARG), | ||
| ) |
There was a problem hiding this comment.
This PR is described as adding an additional_tools parameter, but this hunk also introduces new worktree/verification-artifact behavior (workspace prep/restore, lock release, verification working dir resolution). If these changes are intentional, the PR description should be updated to reflect the expanded scope; otherwise consider splitting them into a separate PR to keep review/risk focused.
There was a problem hiding this comment.
Review — ouroboros-agent[bot]
Verdict: REQUEST_CHANGES
Branch: | 2 files, +484/-183 | CI: Ruff Lint fail 12s https://github.com/Q00/ouroboros/actions/runs/23648922931/job/68888225625 ;MyPy Type Check pass 50s https://github.com/Q00/ouroboros/actions/runs/23648922931/job/68888225294 ;Test Python 3.12 pass 2m49s https://github.com/Q00/ouroboros/actions/runs/23648922997/job/68888225390 ;Test Python 3.13 pass 2m58s https://github.com/Q00/ouroboros/actions/runs/23648922997/job/68888225407 ;Test Python 3.14 pass 3m15s https://github.com/Q00/ouroboros/actions/runs/23648922997/job/68888225408
Scope:
HEAD checked:
What Improved
- The original Ruff blocker from the previous review appears fixed on this head: the unused top-level import is gone from .
- The API itself is still wired in the intended places, and the resume-ignore behavior remains covered by tests.
Issue #181 Requirements
| Requirement | Status |
|---|---|
| Add an optional parameter to / | MET — schema remains exposed from and reused for the start handler later in the file. |
| Merge caller-supplied tools into the inherited/default tool set for new executions | MET — merge + dedup logic is implemented in . |
| Ignore on resume so resumed sessions keep their original tool set | MET — resume path still gates the feature via , with regression coverage in . |
Prior Findings Status
| Prior Finding | Status |
|---|---|
| Unused top-level import caused Ruff F401/F811 failure | WITHDRAWN — fixed on the current head; the import is no longer present in . |
Blockers
| # | File:Line | Severity | Confidence | Finding |
|---|---|---|---|---|
| 1 | High | 99% | The new import points at a module that is not present anywhere under in this checkout. That makes importing fail before any MCP tool can load, which is a merge-blocking regression introduced by this PR. | |
| 2 | High | 99% | is also absent from in the current PR head, so this new import raises at module import time as well. Even if the worktree import were fixed, the handler still would not load. | |
| 3 | High | 98% | The new argument is incompatible with the current signature in , which does not accept . On the latest head this will raise when a seed execution is launched. |
Follow-ups
| # | File:Line | Priority | Confidence | Suggestion |
|---|---|---|---|---|
| 1 | Low | 90% | The synchronous launch response now reports rather than the actual backend resolved from config/env. That is probably not merge-blocking, but it can mislead callers about which backend was selected. | |
| 2 | Low | 92% | Add one focused test for the delegated-tools + merge branch, especially overlap/dedup order from , so the new logic is locked down. | |
| 3 | Low | 88% | The new tests fully mock , so they cannot catch constructor mismatches like the one above. A less-isolated regression test around the real runner boundary would make this execution path much safer. |
Test Coverage
- The previous unused-import Ruff blocker appears fixed in .
- Remaining gaps: the new tests do not cover merging with existing delegated tools (), they mock away the real constructor boundary (, ), and there is no regression test covering the new worktree / verification-artifact imports introduced in .
Merge Recommendation
- Merge after blocker(s): the feature itself looks close, but the new worktree / verification-artifact / runner-constructor changes introduce verified runtime regressions on the current head.
Files Reviewed
ouroboros-agent[bot]
There was a problem hiding this comment.
Review — ouroboros-agent[bot] (corrected)
This comment supersedes the immediately prior malformed review body, which lost some inline code/file references during shell rendering. The verdict is unchanged.
Verdict: REQUEST_CHANGES
Branch: feature/additional-tools-181-v2 | 2 files, +484/-183 | CI: Ruff failing; MyPy passing; tests passing
Scope: diff-only
HEAD checked: 6d8618d
What Improved
- The original Ruff blocker from the previous review appears fixed on this head: the unused top-level
SessionStatusimport is gone fromtests/unit/mcp/tools/test_definitions.py. - The
additional_toolsAPI itself is still wired in the intended places, and the resume-ignore behavior remains covered by tests.
Issue #181 Requirements
| Requirement | Status |
|---|---|
Add an optional additional_tools parameter to ouroboros_execute_seed / ouroboros_start_execute_seed |
MET — schema remains exposed from src/ouroboros/mcp/tools/execution_handlers.py:173 and reused for the start handler later in the file. |
| Merge caller-supplied tools into the inherited/default tool set for new executions | MET — merge + dedup logic is implemented in src/ouroboros/mcp/tools/execution_handlers.py:274-286. |
Ignore additional_tools on resume so resumed sessions keep their original tool set |
MET — resume path still gates the feature via src/ouroboros/mcp/tools/execution_handlers.py:271, with regression coverage in tests/unit/mcp/tools/test_definitions.py:629-646. |
Prior Findings Status
| Prior Finding | Status |
|---|---|
Unused top-level SessionStatus import caused Ruff F401/F811 failure |
WITHDRAWN — fixed on the current head; the import is no longer present in tests/unit/mcp/tools/test_definitions.py:1-20. |
Blockers
| # | File:Line | Severity | Confidence | Finding |
|---|---|---|---|---|
| 1 | src/ouroboros/mcp/tools/execution_handlers.py:24 |
High | 99% | The new from ouroboros.core.worktree import ... import points at a module that is not present anywhere under src/ouroboros in this checkout. That makes importing execution_handlers.py fail before any MCP tool can load, which is a merge-blocking regression introduced by this PR. |
| 2 | src/ouroboros/mcp/tools/execution_handlers.py:31 |
High | 99% | ouroboros.evaluation.verification_artifacts is also absent from src/ouroboros/evaluation in the current PR head, so this new import raises at module import time as well. Even if the worktree import were fixed, the handler still would not load. |
| 3 | src/ouroboros/mcp/tools/execution_handlers.py:427 |
High | 98% | The new task_workspace=workspace argument is incompatible with the current OrchestratorRunner.__init__ signature in src/ouroboros/orchestrator/runner.py:313-324, which does not accept task_workspace. On the latest head this will raise TypeError when a seed execution is launched. |
Follow-ups
| # | File:Line | Priority | Confidence | Suggestion |
|---|---|---|---|---|
| 1 | src/ouroboros/mcp/tools/execution_handlers.py:337 |
Low | 90% | The synchronous launch response now reports self.llm_backend or "default" rather than the actual backend resolved from config/env. That is probably not merge-blocking, but it can mislead callers about which backend was selected. |
| 2 | tests/unit/mcp/tools/test_definitions.py:548 |
Low | 92% | Add one focused test for the delegated-tools + additional_tools merge branch, especially overlap/dedup order from src/ouroboros/mcp/tools/execution_handlers.py:274-286, so the new logic is locked down. |
| 3 | tests/unit/mcp/tools/test_definitions.py:565 |
Low | 88% | The new tests fully mock OrchestratorRunner, so they cannot catch constructor mismatches like the one above. A less-isolated regression test around the real runner boundary would make this execution path much safer. |
Test Coverage
- The previous unused-import Ruff blocker appears fixed in
tests/unit/mcp/tools/test_definitions.py. - Remaining gaps: the new tests do not cover merging
additional_toolswith existing delegated tools (src/ouroboros/mcp/tools/execution_handlers.py:274-286), they mock away the realOrchestratorRunnerconstructor boundary (tests/unit/mcp/tools/test_definitions.py:565,tests/unit/mcp/tools/test_definitions.py:629), and there is no regression test covering the new worktree / verification-artifact imports introduced insrc/ouroboros/mcp/tools/execution_handlers.py:24-31.
Merge Recommendation
- Merge after blocker(s): the
additional_toolsfeature itself looks close, but the new worktree / verification-artifact / runner-constructor changes introduce verified runtime regressions on the current head.
Files Reviewed
src/ouroboros/mcp/tools/execution_handlers.pytests/unit/mcp/tools/test_definitions.py
ouroboros-agent[bot]
| # The stored score (0.35) should be used, NOT the caller's 0.18 | ||
| generate_call = mock_seed_generator.generate.await_args | ||
| assert generate_call.args[1].overall_score == 0.35 |
There was a problem hiding this comment.
The newly added test test_generate_seed_ignores_caller_ambiguity_score_override asserts that caller-supplied ambiguity_score is ignored when a stored score exists, but GenerateSeedHandler.handle() currently uses the caller-provided value whenever it is present. As written, this test will fail unless the handler implementation is also changed (or the test expectation is adjusted).
| # The stored score (0.35) should be used, NOT the caller's 0.18 | |
| generate_call = mock_seed_generator.generate.await_args | |
| assert generate_call.args[1].overall_score == 0.35 | |
| # The current implementation uses the caller-provided ambiguity_score (0.18) | |
| generate_call = mock_seed_generator.generate.await_args | |
| assert generate_call.args[1].overall_score == 0.18 |
| verification_working_dir = self._resolve_verification_working_dir( | ||
| seed, | ||
| resolved_cwd, | ||
| arguments.get("cwd"), | ||
| arguments.get(DELEGATED_PARENT_CWD_ARG), | ||
| ) | ||
|
|
||
| # Use injected or create orchestrator dependencies | ||
| try: | ||
| from ouroboros.orchestrator.runtime_factory import resolve_agent_runtime_backend | ||
| from ouroboros.providers.factory import resolve_llm_backend | ||
|
|
||
| delegated_permission_mode = ( | ||
| inherited_runtime_handle.approval_mode | ||
| if inherited_runtime_handle and inherited_runtime_handle.approval_mode | ||
| else None | ||
| ) | ||
| agent_adapter = create_agent_runtime( | ||
| backend=self.agent_runtime_backend, | ||
| cwd=resolved_cwd, | ||
| llm_backend=self.llm_backend, | ||
| **( | ||
| {"permission_mode": delegated_permission_mode} | ||
| if delegated_permission_mode | ||
| else {} | ||
| ), | ||
| ) | ||
| runtime_backend = resolve_agent_runtime_backend(self.agent_runtime_backend) | ||
| resolved_llm_backend = resolve_llm_backend(self.llm_backend) | ||
| runtime_backend = self.agent_runtime_backend | ||
| resolved_llm_backend = self.llm_backend or "default" | ||
| event_store = self.event_store or EventStore() | ||
| owns_event_store = self.event_store is None | ||
| await event_store.initialize() | ||
| # Use stderr: in MCP stdio mode, stdout is the JSON-RPC channel. | ||
| console = Console(stderr=True) | ||
|
|
||
| # Create orchestrator runner | ||
| runner = OrchestratorRunner( | ||
| adapter=agent_adapter, | ||
| event_store=event_store, | ||
| console=console, | ||
| debug=False, | ||
| enable_decomposition=True, | ||
| inherited_runtime_handle=inherited_runtime_handle, | ||
| inherited_tools=inherited_effective_tools, | ||
| ) | ||
| session_repo = SessionRepository(event_store) | ||
| workspace: TaskWorkspace | None = None | ||
| launched = False |
There was a problem hiding this comment.
PR description focuses on adding an additional_tools parameter, but this diff also introduces substantial new behavior in ExecuteSeedHandler (task worktree preparation/restoration, verification artifact generation, and auto-generating session IDs). If these changes are intentional, the PR description should be updated to reflect them; otherwise they should be split into a separate PR to keep scope and review risk manageable.
| # Merge explicit additional_tools from the caller (issue #181). | ||
| # This allows direct MCP-initiated executions to inherit the parent | ||
| # session's MCP tools without requiring the nested delegation mechanism. | ||
| # Like delegation context, additional_tools only apply to new executions, | ||
| # not resumes \u2014 resumed sessions must preserve their original tool set. | ||
| additional_tools = arguments.get("additional_tools") if not is_resume else None | ||
| if isinstance(additional_tools, list) and additional_tools: | ||
| extra = list(dict.fromkeys( | ||
| t for t in additional_tools if isinstance(t, str) and t | ||
| )) | ||
| if extra: | ||
| if inherited_effective_tools: | ||
| # Merge with delegation-inherited tools, dedup preserving order | ||
| seen = set(inherited_effective_tools) | ||
| inherited_effective_tools = list(inherited_effective_tools) | ||
| for tool in extra: | ||
| if tool not in seen: | ||
| inherited_effective_tools.append(tool) | ||
| seen.add(tool) | ||
| else: | ||
| inherited_effective_tools = extra | ||
|
|
There was a problem hiding this comment.
The new merge behavior when both delegation-inherited tools (_ooo_parent_effective_tools) and additional_tools are present (including deduplication/order preservation) isn’t covered by tests. Adding a regression test for that combined-path would help ensure the merge logic doesn’t regress.
There was a problem hiding this comment.
Review — ouroboros-agent[bot]
Verdict: REQUEST_CHANGES
Reviewing commit
8972f89| Triggered by: new push by @jcfernandez-890825
Branch: feature/additional-tools-181-v2 | 2 files, +528/-183 | CI: MyPy ✅, Ruff ✅, tests ✅
Scope: diff-only
What Improved
- The previous Ruff blocker is fixed on this head: the unused top-level
SessionStatusimport is no longer present intests/unit/mcp/tools/test_definitions.py. - The
additional_toolsparameter remains wired into the execute/start tool schema, and the resume-ignore behavior is still covered by tests.
Issue #181 Requirements
| Requirement | Status |
|---|---|
Add an optional additional_tools parameter to ouroboros_execute_seed / ouroboros_start_execute_seed |
MET — the parameter is exposed in the handler schema and reused by the start handler via the shared definition in src/ouroboros/mcp/tools/execution_handlers.py:173-207 and :614-615. |
| Merge caller-supplied tools into the inherited/default tool set for new executions | MET — merge + dedup logic is implemented in src/ouroboros/mcp/tools/execution_handlers.py:274-290. |
Ignore additional_tools on resume so resumed sessions preserve their original tool set |
MET — the feature is gated behind not is_resume in src/ouroboros/mcp/tools/execution_handlers.py:266-279, with regression coverage in tests/unit/mcp/tools/test_definitions.py:629-646. |
Previous Review Follow-up
| Previous Finding | Status |
|---|---|
Unused top-level SessionStatus import caused Ruff failure |
WITHDRAWN — fixed on the current head. |
Missing ouroboros.core.worktree import target |
STILL OPEN — src/ouroboros/core/worktree.py is still absent while the import remains in src/ouroboros/mcp/tools/execution_handlers.py:24-30. |
Missing ouroboros.evaluation.verification_artifacts import target |
STILL OPEN — src/ouroboros/evaluation/verification_artifacts.py is still absent while the import remains in src/ouroboros/mcp/tools/execution_handlers.py:31. |
task_workspace= passed to OrchestratorRunner although the runner constructor does not accept it |
STILL OPEN — the call remains in src/ouroboros/mcp/tools/execution_handlers.py:425-433, while src/ouroboros/orchestrator/runner.py:313-324 still has no task_workspace parameter. |
Launched response reports self.llm_backend or "default" instead of the resolved backend |
STILL OPEN — unchanged in src/ouroboros/mcp/tools/execution_handlers.py:335. |
Blocking Findings
| # | File:Line | Severity | Confidence | Finding |
|---|---|---|---|---|
| 1 | src/ouroboros/mcp/tools/execution_handlers.py:24 |
High | 99% | execution_handlers.py now imports ouroboros.core.worktree, but there is no src/ouroboros/core/worktree.py in this checkout. That makes the module fail during import before any MCP execution tool can load. |
| 2 | src/ouroboros/mcp/tools/execution_handlers.py:31 |
High | 99% | build_verification_artifacts is imported from ouroboros.evaluation.verification_artifacts, but that module is also absent from src/ouroboros/evaluation/. Even if the worktree import were fixed, the handler still would not import successfully. |
| 3 | src/ouroboros/mcp/tools/execution_handlers.py:433 |
High | 99% | OrchestratorRunner is still constructed with task_workspace=workspace, but src/ouroboros/orchestrator/runner.py:313-324 does not accept that keyword. On the current head, launching a seed execution will raise TypeError at runtime. |
| 4 | tests/unit/mcp/tools/test_definitions.py:1902 |
Medium | 98% | The newly added test_generate_seed_ignores_caller_ambiguity_score_override asserts that the stored ambiguity score wins, but GenerateSeedHandler.handle() currently does the opposite when ambiguity_score is supplied (src/ouroboros/mcp/tools/authoring_handlers.py:321-323). As written, this changed test expects behavior the code does not implement. |
| 5 | src/ouroboros/mcp/tools/execution_handlers.py:338 |
Medium | 90% | The new fire-and-forget flow only closes an owned EventStore inside the background task’s finally, but several pre-launch error returns after await event_store.initialize() (:349, :361, :379, :393, :444) bypass that cleanup entirely. Rejected resumes / workspace failures / prepare failures will leak the handler-owned event store. |
Test Coverage
tests/unit/mcp/tools/test_definitions.py:548-597only coversadditional_toolsby itself. There is still no regression test for the combined path where delegated parent tools andadditional_toolsare both present, including overlap/dedup order fromsrc/ouroboros/mcp/tools/execution_handlers.py:274-290.tests/unit/mcp/tools/test_definitions.py:565-590and:671-692fully mockOrchestratorRunner, so they cannot catch constructor drift like the livetask_workspace=mismatch atsrc/ouroboros/mcp/tools/execution_handlers.py:425-433.- There is no test coverage for the new workspace/verification-artifact branches in
src/ouroboros/mcp/tools/execution_handlers.py:325-399and:494-516, so the added worktree restoration, lock handling, and verification-artifact behavior is effectively untested here. - I could not execute the tests in this environment because
pytestis not installed locally, so the review relies on current code/CI inspection rather than rerunning the suite.
Design Notes
The previous Ruff issue is fixed, but the earlier merge blockers are still present: both new imports point at modules that do not exist in this checkout, and the runner constructor mismatch still makes the new execution path unsafe at runtime. Architecturally, this PR is also broader than the advertised additional_tools feature: it mixes tool-set merging with task-worktree management, verification-artifact generation, backend reporting changes, and session-ID allocation. The additional_tools change itself looks small and reasonable; the unsupported workspace/verification plumbing is what introduces the current import-time and runtime regressions.
Follow-up Recommendations
- Resolve the
resolved_llm_backendreporting path insrc/ouroboros/mcp/tools/execution_handlers.py:335so launch metadata reflects the backend actually chosen by runtime/config resolution. - Add one focused regression test for the combined delegated-tools +
additional_toolsmerge path, especially overlap/dedup order. - Consider splitting the worktree / verification-artifact behavior into a separate PR if it is intended, so the
additional_toolsfeature can be reviewed and landed independently.
Files Reviewed
src/ouroboros/mcp/tools/execution_handlers.pytests/unit/mcp/tools/test_definitions.pysrc/ouroboros/orchestrator/runner.pysrc/ouroboros/mcp/tools/authoring_handlers.py
Reviewed by ouroboros-agent[bot] via Codex deep analysis
Response to review findings on
|
Allow callers of ouroboros_execute_seed and ouroboros_start_execute_seed to pass an explicit list of MCP tool names via the additional_tools parameter. These tools are merged with inherited_effective_tools for new executions and ignored on resume to preserve session consistency. Includes 4 unit tests: schema validation, merge behavior, default preservation, and resume-ignore regression.
8972f89 to
04892f7
Compare
There was a problem hiding this comment.
Review — ouroboros-agent[bot]
Verdict: REQUEST_CHANGES
Reviewing commit
04892f7| Triggered by: new push by @jcfernandez-890825
Branch: feature/additional-tools-181-v2 | 2 files, +195/-0 | CI: MyPy Type Check pass; Ruff Lint pass; Test Python 3.12 pass; Test Python 3.13 pass; Test Python 3.14 pass
Issue #181 Requirements
| Requirement | Status |
|---|---|
Add an additional_tools parameter to execute_seed / start_execute_seed so delegated tools can be passed through safely. |
NOT MET — the new public parameter widens the child session tool allowlist for any caller, instead of being constrained to already-authorized delegated tools. Evidence: src/ouroboros/mcp/tools/execution_handlers.py:174-185, src/ouroboros/mcp/tools/execution_handlers.py:274-287, src/ouroboros/orchestrator/runner.py:935-942. |
Previous Review Follow-up
| Previous Finding | Status |
|---|---|
| Previous bot review exists on this PR. | MODIFIED — re-checked on current HEAD 04892f7; this review focuses on the remaining in-scope capability-expansion risk in the new additional_tools implementation. |
Blocking Findings
| # | File:Line | Severity | Confidence | Finding |
|---|---|---|---|---|
| 1 | src/ouroboros/mcp/tools/execution_handlers.py:174 |
High | 95% | additional_tools is exposed as a public MCP argument and then merged directly into inherited_effective_tools without validating that the requested tools were already authorized by the parent/delegating session. |
| 2 | src/ouroboros/mcp/tools/execution_handlers.py:274 |
High | 95% | Because the merge happens before runner construction, arbitrary caller-supplied tool names become part of the runner’s inherited tool set. |
| 3 | src/ouroboros/orchestrator/runner.py:937 |
High | 93% | OrchestratorRunner appends every inherited tool name into the session tool catalog if it is not already present, so this PR turns additional_tools into a capability-expansion path rather than a pure delegation-preservation mechanism. |
| 4 | tests/unit/mcp/tools/test_definitions.py:1032 |
Medium | 91% | The new tests assert and lock in the widening behavior, but they do not verify the intended trust boundary (for example, rejecting names outside an allowlist or outside the delegated parent’s effective tool set). |
| 5 | tests/unit/mcp/tools/test_definitions.py:1130 |
Medium | 88% | Resume behavior is covered, but the new suite still misses the highest-risk path: a fresh execution that requests tools the caller should not be able to grant. |
Test Coverage
The added tests cover schema exposure, merge behavior, default behavior, and resume behavior for additional_tools.
They do not cover the risky path: constraining or rejecting caller-supplied tool names against an explicit allowlist, runtime availability, or the parent session’s delegated effective tool set.
I also could not execute the tests locally in this environment because pytest is not installed here.
Design Notes
additional_tools currently mixes two different concepts:
- preserving a parent session’s already-authorized effective tools
- allowing a caller to request new tools for the child run
Those have different trust boundaries. If the goal is safe delegation, the parameter should be validated against an explicit allowlist or the parent session’s effective tool set before it reaches the runner. Otherwise this should remain an internal delegation-only field rather than a public MCP argument.
Follow-up Recommendations
- Validate
additional_toolsagainst the delegated parent effective tool set, or another explicit allowlist, before merging. - Add a regression test that proves unauthorized tool names are rejected and never reach
OrchestratorRunner(... inherited_tools=...). - Add a combined-path test for overlap between inherited delegated tools and
additional_tools, including dedup order.
Files Reviewed
- src/ouroboros/mcp/tools/execution_handlers.py
- tests/unit/mcp/tools/test_definitions.py
Reviewed by ouroboros-agent[bot] via Codex deep analysis
|
@jcfernandez-890825 Could you strip out those extra changes so we can keep this PR strictly focused on the additional_tools feature? If you've got some new workspace or verification features cooking, I'd definitely love to see them in a separate PR later! |
Response to capability-expansion concern on
|
Summary
Add an optional
additional_toolsparameter toouroboros_execute_seedandouroboros_start_execute_seedthat allows callers to explicitly pass MCP tool names from the parent Claude session. These tools are merged with the existing default tool set when creating theOrchestratorRunner.Replaces #223 (clean single-commit rebase on current main).
Changes
src/ouroboros/mcp/tools/execution_handlers.py:additional_toolsparameter (type:ARRAY,items={type: string}, optional)inherited_effective_toolsusingdict.fromkeys()for dedupnot is_resume— resumed sessions preserve original tool setStartExecuteSeedHandlerinherits automatically via shared parameter listtests/unit/mcp/tools/test_definitions.py:test_definition_includes_additional_tools_parameter— schema validation includingitemstest_additional_tools_merged_into_inherited_tools— direct-call mergetest_no_additional_tools_preserves_default_behavior— defaultNonetest_additional_tools_ignored_on_resume— resume regression testDesign
additional_toolsignored whensession_idis providedCloses #181