Integrate sandbox-runtime (srt) and enhance configuration options#3
Integrate sandbox-runtime (srt) and enhance configuration options#3Oops-maker wants to merge 5 commits intoTashanGKD:mainfrom
Conversation
There was a problem hiding this comment.
Pull request overview
This PR integrates Anthropic’s sandbox-runtime (srt) as the preferred OS-level sandbox backend (with legacy sandbox-exec/bwrap fallback), adds configuration toggles, and updates docs/tests to cover the new behavior.
Changes:
- Add
srtsettings generation (app/agent/srt_config.py) and switch sandbox backend selection to prefersrtwhen available/enabled. - Wrap MCP stdio server commands with
srt --settings ...when enabled, and add an Agent Links runtimesrtCLI wrapper. - Add/adjust tests + local CI helper + docs + Docker image to support
srtinstallation and configuration.
Reviewed changes
Copilot reviewed 19 out of 19 changed files in this pull request and generated 6 comments.
Show a summary per file
| File | Description |
|---|---|
| tests/test_srt_integration.py | Adds a shim-based integration test for run_in_os_sandbox using a fake srt. |
| tests/test_srt_config.py | Adds unit tests for srt settings generation, detection, and config flag parsing. |
| tests/test_profile_helper_publish_sync.py | Updates auth override fixture to use client.app instead of importing main.app. |
| tests/test_profile_helper_mvp.py | Same fixture pattern update as publish_sync test. |
| tests/test_agent_sdk.py | Makes MCP server assertion compatible with optional srt wrapping. |
| tests/test_agent_links_runtime_srt.py | Adds tests for Agent Links runtime srt CLI wrapper generation/reuse. |
| scripts/ci_local.sh | Adds srt availability reporting + optional npm install attempt. |
| docs/sandbox-isolation.md | Documents Phase 3 srt integration and new behavior/structure. |
| docs/config.md | Documents SANDBOX_USE_SRT and installation/verification steps. |
| app/services/profile_helper/sessions.py | Adds a session ID listing helper (but currently duplicated). |
| app/services/agent_links_runtime.py | Adds optional srt-wrapped claude CLI shim for Agent Links runtime. |
| app/core/config.py | Adds SANDBOX_USE_SRT + ENABLE_SEMANTIC_SEARCH env parsing. |
| app/agent/srt_config.py | New module to build/write srt JSON settings for main + MCP sandboxes. |
| app/agent/sandbox_exec.py | Prefers srt backend when enabled/available; updates detection/logging. |
| app/agent/discussion.py | Wraps MCP stdio server commands with srt when enabled. |
| README.md | Documents new env vars and points to config doc as source of truth for srt. |
| README.en.md | Same as README.md in English. |
| Dockerfile | Installs Node.js + @anthropic-ai/sandbox-runtime and Linux deps for srt. |
| .env.example | Adds SANDBOX_USE_SRT and ENABLE_SEMANTIC_SEARCH examples. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| # Tracks whether the previous turn completed cleanly (ResultMessage received). | ||
| # If False, the subprocess may be in a dirty state and needs reconnection. | ||
| _turn_complete: dict[str, bool] = {} | ||
| _srt_cli_wrappers: dict[str, str] = {} # workdir -> wrapper path |
There was a problem hiding this comment.
_srt_cli_wrappers caches wrapper paths by workdir, but entries are never removed when session workspaces are cleaned up in _cleanup_orphans(). In a long-running server with many sessions, this dict can grow without bound and retain stale paths. Consider deleting cache entries for removed workdirs as part of orphan cleanup (or avoid global caching and just check for the wrapper file on disk).
|
建议按照功能合并下提交,并且修复下 copilot 提出的问题 |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: bf33ee8a31
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| if SANDBOX_AVAILABLE: | ||
| sandbox_type = "macos-sandbox-exec" if MACOS_SANDBOX else "linux-bwrap" | ||
| logger.info("[SandboxExec] OS sandbox available: %s", sandbox_type) | ||
| SANDBOX_AVAILABLE: bool = SRT_AVAILABLE or MACOS_SANDBOX or LINUX_BWRAP |
There was a problem hiding this comment.
Honor SANDBOX_USE_SRT when computing sandbox availability
SANDBOX_AVAILABLE is now true whenever srt exists, even if SANDBOX_USE_SRT=false. In that configuration, callers such as run_expert_reply_sandboxed() still choose the sandbox path, but run_in_os_sandbox() skips srt and falls through to the legacy backend; on hosts without sandbox-exec/bwrap, this leads to launch-time failures instead of the intended graceful fallback. This makes the new config toggle brittle and can break expert-reply execution on machines that only have srt installed.
Useful? React with 👍 / 👎.
| --with httpx \ | ||
| pytest -m "not integration" -v --tb=short | ||
| pytest -m "not integration" -v --tb=short; then | ||
| UNIT_RC=$? |
There was a problem hiding this comment.
Propagate unit-test failures from ci_local.sh
Inside if ! uv run ...; then, assigning UNIT_RC=$? captures the exit status of the negated test (which is 0 on unit-test failure), not the failing uv command. Because the script also never exits with a stored nonzero unit-test code later, a failed unit-test phase can still produce an overall success exit when subsequent steps pass, which undermines this script as a reliable local CI gate.
Useful? React with 👍 / 👎.
Add get_sandbox_use_srt() to app/core/config.py. This boolean flag (default: true) controls whether the system prefers sandbox-runtime (srt) over the legacy bwrap/sandbox-exec code path for OS-level sandboxing. Operators can set SANDBOX_USE_SRT=false to fall back. Part of the sandbox-runtime integration work. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Remote force-push reverted two earlier fixes (commit 202e785). Re-applying them: 1. app/services/profile_helper/sessions.py: re-add list_ids() function that returns all active session IDs after cleanup. Required by agent_links API which imports and calls profile_sessions.list_ids(). 2. tests/test_agent_links_api.py: re-add session_id parameter to _fake_stream_chat() mock. The actual API passes session_id as a keyword argument to stream_chat(). Note: test_copy_mcp_to_workspace_with_env failure is a pre-existing issue from remote (test expects streamableHttp MCP copy, but code intentionally skips it). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…ntries Normalize libs metadata JSON formatting and add test sample entries for topiclab_shared experts and moderator_modes. Add .github/copilot-instructions.md for repository-specific Copilot CLI guidance. These changes were created locally during workspace sync and are being committed to keep the repo tidy. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…EARCH (env)\n\nCo-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Change Type
Description
Related Issue
Checklist
pytest -q -m "not integration".env:bash scripts/ci_local.sh