feat(security): add sandboxed agent execution (host/docker/apple), fail-closed retries, dead-letter queue, and ops docs #30 by manish-raana · Pull Request #55 · TinyAGI/tinyclaw

manish-raana · 2026-02-14T07:40:28Z

Summary

This PR adds sandbox support for TinyClaw agent execution with runtime modes:

host (existing behavior)
docker (ephemeral container per invocation)
apple (runtime-command adapter)

It preserves existing queue/team routing behavior and adds fail-closed security controls, retry/dead-letter handling, diagnostics commands, and documentation.

This is an enhancement for #30.

Why

Previously, agent invocations ran directly on the host. This change introduces isolated runtime options and explicit runtime/env validation for safer, more predictable operation.

Changes

Runtime and Invocation

Added src/lib/runner.ts:
- HostRunner, DockerRunner, AppleRunner
- timeout enforcement
- env allowlist enforcement (OPENAI_API_KEY, ANTHROPIC_API_KEY)
- fail-closed sandbox behavior
- container-to-host path mapping
Refactored src/lib/invoke.ts to execute through runner abstraction while preserving response parsing.
Added sandbox lifecycle events:
- sandbox_invocation_start
- sandbox_invocation_end
- sandbox_invocation_error
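
To illustrate the env allowlist item above, here is a minimal sketch of how such a filter could look. The two variable names come from this PR description; the helper name `filterEnv` and its signature are hypothetical and not taken from src/lib/runner.ts.

```typescript
// Allowlist named in the PR description; every other variable is dropped.
const ENV_ALLOWLIST: readonly string[] = ["OPENAI_API_KEY", "ANTHROPIC_API_KEY"];

type Env = Record<string, string | undefined>;

// Hypothetical helper (not the PR's code): keeps only allowlisted
// variables that are actually set to a non-empty string.
function filterEnv(
  source: Env,
  allowlist: readonly string[] = ENV_ALLOWLIST
): Record<string, string> {
  const out: Record<string, string> = {};
  for (const name of allowlist) {
    const value = source[name];
    if (typeof value === "string" && value.length > 0) {
      out[name] = value;
    }
  }
  return out;
}

// Example: only OPENAI_API_KEY survives; unrelated secrets never reach the sandbox.
const sandboxEnv = filterEnv({
  OPENAI_API_KEY: "sk-test",
  AWS_SECRET_ACCESS_KEY: "should-not-leak",
  HOME: "/home/me",
});
```

Iterating over the allowlist rather than over the source environment keeps the behavior fail-closed: a variable that is not explicitly named can never be forwarded.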

Config and Types

Extended src/lib/types.ts:
- global sandbox config
- per-agent sandbox_mode
- queue retry metadata (attempt, firstSeenAt, errorClass)
Extended src/lib/config.ts:
- sandbox defaults
- getSandboxConfig(...)
- QUEUE_DEAD_LETTER
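
As a rough picture of the type additions listed above, the shapes could look like the sketch below. Only the names `sandbox_mode`, `max_attempts`, `apple.runtime_command`, `attempt`, `firstSeenAt`, and `errorClass` appear in this PR; the interface names, the remaining fields, and the `resolveSandboxMode` helper are assumptions for illustration and may not match src/lib/types.ts or getSandboxConfig(...).

```typescript
type SandboxMode = "host" | "docker" | "apple";

// Assumed shape of the global sandbox config.
interface SandboxConfig {
  mode: SandboxMode;
  max_attempts: number;
  apple?: { runtime_command: string };
}

// Assumed shape of a per-agent entry; sandbox_mode is optional.
interface AgentConfig {
  name: string;
  sandbox_mode?: SandboxMode;
}

// Queue retry metadata fields named in the PR description.
interface QueueRetryMeta {
  attempt: number;
  firstSeenAt: number; // assumption: epoch milliseconds
  errorClass?: "terminal" | "transient";
}

// Hypothetical resolution rule: a per-agent sandbox_mode, when present,
// overrides the global mode.
function resolveSandboxMode(globalCfg: SandboxConfig, agent: AgentConfig): SandboxMode {
  return agent.sandbox_mode ?? globalCfg.mode;
}

const globalCfg: SandboxConfig = { mode: "host", max_attempts: 3 };
const coderMode = resolveSandboxMode(globalCfg, { name: "coder", sandbox_mode: "docker" });
const writerMode = resolveSandboxMode(globalCfg, { name: "writer" });
```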

Queue Reliability

Updated src/queue-processor.ts:
- error classification (terminal vs transient)
- retry transient failures up to sandbox.max_attempts
- dead-letter routing to ~/.tinyclaw/queue/dead-letter
- heartbeat error dedupe
- [send_file: ...] path mapping for sandbox outputs
- redaction-safe error logging
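
The retry and dead-letter behavior described above can be sketched as follows. This assumes that `attempt` counts failed attempts so far and that terminal errors skip retries entirely; the function names and the message-based classifier are hypothetical stand-ins, not the logic in src/queue-processor.ts.

```typescript
type ErrorClass = "terminal" | "transient";

interface RetryMeta {
  attempt: number;
  firstSeenAt: number;
  errorClass?: ErrorClass;
}

// Hypothetical classifier: treats configuration-style failures as terminal
// and everything else as transient. The real rules may differ.
function classifyError(err: Error): ErrorClass {
  return /config|invalid|unknown sandbox mode/i.test(err.message)
    ? "terminal"
    : "transient";
}

// Decides where a failed message goes next and returns updated metadata.
function routeFailure(meta: RetryMeta, err: Error, maxAttempts: number) {
  const errorClass = classifyError(err);
  const next: RetryMeta = { ...meta, attempt: meta.attempt + 1, errorClass };
  // Retry only transient failures that still have attempts left.
  const route =
    errorClass === "transient" && next.attempt < maxAttempts
      ? "retry"
      : "dead-letter";
  return { route, meta: next };
}

const first = routeFailure({ attempt: 0, firstSeenAt: 0 }, new Error("connection reset"), 3);
const last = routeFailure({ attempt: 2, firstSeenAt: 0 }, new Error("connection reset"), 3);
const fatal = routeFailure({ attempt: 0, firstSeenAt: 0 }, new Error("unknown sandbox mode: dokcer"), 3);
```

With a limit of 3, the first and second transient failures are retried, the third goes to the dead-letter directory, and a terminal error is dead-lettered immediately.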

CLI and Setup

Added tinyclaw.sh sandbox subcommands:
- show
- set <host|docker|apple>
- doctor
- build-image
Updated lib/setup-wizard.sh to configure sandbox settings.
Added Dockerfile.agent-runner.

Documentation

Added:
- docs/SANDBOX.md
- docs/MIGRATION-SANDBOX.md
- docs/OPERATIONS-RUNBOOK.md
Updated:
- README.md
- docs/INSTALL.md
- docs/AGENTS.md
- docs/QUEUE.md
- docs/TROUBLESHOOTING.md

Behavior Notes

Docker/Apple modes are fail-closed.
Container runs are ephemeral (docker run --rm per invocation), so docker ps may not show long-lived containers.
Existing installs remain compatible (host mode unless changed by user).

Validation

npx tsc --noEmit ✅
npm run build ✅
bash -n tinyclaw.sh lib/setup-wizard.sh ✅
Manual smoke: Telegram queue flow, retry/dead-letter flow, Docker mount fix verified.

Known Limitations

Dockerfile.agent-runner is minimal; a production image should include the required provider CLIs (claude, codex).
The Apple runtime depends on the configured sandbox.apple.runtime_command and on runtime environment parity.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: c3f32ff6e8


chatgpt-codex-connector · 2026-02-14T07:45:50Z

src/lib/runner.ts

+    args.push('--user', '1000:1000');
+    args.push('--mount', `type=bind,src=${request.workingDir},dst=${containerWorkingDir}`);


Run sandbox container as the current host UID/GID

The sandbox runner hard-codes --user 1000:1000 while bind-mounting the agent workspace. This causes write failures whenever that workspace is owned by a different UID/GID (common on macOS with UID 501, or on root-owned deployments). Provider CLIs then cannot persist conversation state or output files, and containerized invocations fail with permission errors. Derive the UID/GID from the running process (or make it configurable) instead of assuming 1000.
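
One possible shape for that fix is sketched below. The helper name `dockerUserArg` and the fallback parameter are hypothetical; the IDs are passed in as arguments so the logic is easy to test, and the commented call site shows how Node's `process.getuid()` / `process.getgid()` (unavailable on Windows) could feed it.

```typescript
// Hypothetical helper: builds the value for docker's --user flag from the
// caller's UID/GID, falling back to a configurable default when the
// platform does not expose them.
function dockerUserArg(
  uid: number | undefined,
  gid: number | undefined,
  fallback: string = "1000:1000"
): string {
  if (uid === undefined || gid === undefined) {
    return fallback;
  }
  return `${uid}:${gid}`;
}

// Possible call site in Node.js (getuid/getgid only exist on POSIX platforms):
//   args.push('--user', dockerUserArg(process.getuid?.(), process.getgid?.()));

const macUser = dockerUserArg(501, 20);
const noPosixIds = dockerUserArg(undefined, undefined);
```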


chatgpt-codex-connector · 2026-02-14T07:45:50Z

src/lib/runner.ts

+        return runContainer(request, 'docker');
+    }
+
+    return runContainer(request, 'apple');


Validate sandbox mode instead of defaulting to apple

runInSandbox falls through to Apple mode for any unrecognized sandbox.mode, so a typo or corrupted config value silently runs the wrong runtime and surfaces misleading Apple runtime errors instead of an explicit configuration failure. Unknown modes should be rejected with a terminal config error.
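
A minimal sketch of such validation follows. The three mode names come from this PR; `parseSandboxMode`, `SandboxConfigError`, and the `terminal` flag are hypothetical names chosen for illustration, not identifiers from src/lib/runner.ts.

```typescript
const SANDBOX_MODES = ["host", "docker", "apple"] as const;
type SandboxMode = (typeof SANDBOX_MODES)[number];

// Hypothetical error type; the `terminal` flag marks it as non-retryable.
class SandboxConfigError extends Error {
  readonly terminal = true;
  constructor(message: string) {
    super(message);
    this.name = "SandboxConfigError";
  }
}

// Accepts only the three known modes; anything else fails loudly instead
// of silently falling through to the apple runtime.
function parseSandboxMode(value: unknown): SandboxMode {
  const known: readonly string[] = SANDBOX_MODES;
  if (typeof value === "string" && known.indexOf(value) !== -1) {
    return value as SandboxMode;
  }
  throw new SandboxConfigError(
    `Unknown sandbox.mode: ${JSON.stringify(value)} (expected one of ${SANDBOX_MODES.join(", ")})`
  );
}

const parsedMode = parseSandboxMode("docker");
```

A dispatcher can then switch over the parsed value with one explicit case per mode, so a misspelling such as "dokcer" fails at config time rather than at runtime.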


Detailed code review of the sandbox execution feature covering:
- Security issues (secret exposure via --env args, hard-coded UID)
- Bugs (unvalidated sandbox mode fallthrough)
- Design concerns (unbounded buffers, falsy-check defaults)
- Code quality items (style reformatting, missing validation)

https://claude.ai/code/session_01QD8DbEFLdkbLik4hgvrgDm

mczabca-boop · 2026-02-16T05:58:58Z

PR Review Summary

Request changes. There are 2 blocking issues that should be fixed before approval, plus 2 high-priority follow-up issues that should be addressed in this PR or explicitly tracked.

Blocking Issues

1. Hard-coded container user 1000:1000 can cause cross-environment permission failures
   - Evidence: src/lib/runner.ts:295, src/lib/runner.ts:296
   - Issue: --user 1000:1000 with a bind-mounted workspace can fail when the host workspace owner is not UID/GID 1000 (common on macOS and some deployments), preventing provider CLIs from writing session/output files.
   - Recommendation: derive UID/GID dynamically (for example via process.getuid()/process.getgid()) or make it configurable.
2. Unknown sandbox.mode silently falls back to apple mode
   - Evidence: src/lib/runner.ts:363, src/lib/runner.ts:368, src/lib/runner.ts:372
   - Issue: only host and docker are explicitly handled; any other value routes to runContainer(..., 'apple'), which turns config mistakes into misleading apple runtime errors.
   - Recommendation: explicitly validate mode and throw a terminal configuration error for unknown values.

Non-blocking / High-priority Suggestions

1. restricted network mode is currently equivalent to default (semantic mismatch)
   - Evidence: src/lib/runner.ts:129, src/lib/runner.ts:130, src/lib/runner.ts:131, src/lib/runner.ts:282
   - Issue: everything except none maps to bridge, so restricted currently has no distinct behavior.
   - Recommendation: implement real restricted behavior, or remove/clarify the mode in docs to avoid false security expectations.
2. Numeric sandbox defaults use ||, which discards valid 0 values
   - Evidence: src/lib/config.ts:140, src/lib/config.ts:141, src/lib/config.ts:142
   - Issue: in JS/TS, 0 is falsy, so explicit user values like 0 get replaced by defaults.
   - Recommendation: use ?? (or explicit undefined/null checks) for numeric fields.
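
The difference between the two operators is easy to show in isolation. In the sketch below, `max_attempts` is used only as a stand-in numeric field; the review does not say which fields sit at config.ts:140-142, and the default of 3 is an assumption.

```typescript
// Assumed default, for illustration only.
const DEFAULT_MAX_ATTEMPTS = 3;

// `||` replaces every falsy value, including an explicit 0.
function withOr(value: number | null | undefined): number {
  return value || DEFAULT_MAX_ATTEMPTS;
}

// `??` replaces only null and undefined, so an explicit 0 is preserved.
function withNullish(value: number | null | undefined): number {
  return value ?? DEFAULT_MAX_ATTEMPTS;
}

const orResult = withOr(0);           // 3: the user's 0 is lost
const nullishResult = withNullish(0); // 0: the user's 0 is kept
const missingResult = withNullish(undefined); // 3: default applies
```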

Local Validation Notes

Container runtime validation could not be executed in the current WSL environment because Docker CLI/WSL integration is not available.
All four issues above were confirmed via code-level inspection and line-level evidence.

Ready-to-paste PR Overall Comment

Requesting changes due to 2 blocking issues:
(1) hard-coded container UID/GID at src/lib/runner.ts:295, and
(2) unknown sandbox.mode silently falling back to apple at src/lib/runner.ts:372.
I also left 2 follow-up suggestions: restricted network currently behaves like default (src/lib/runner.ts:129, src/lib/runner.ts:282), and numeric sandbox defaults should use nullish checks to preserve valid 0 values (src/lib/config.ts:140).

manish-raana changed the title ~~feat(security): add sandboxed agent execution (host/docker/apple), fail-closed retries, dead-letter queue, and ops docs~~ feat(security): add sandboxed agent execution (host/docker/apple), fail-closed retries, dead-letter queue, and ops docs #30 Feb 14, 2026

chatgpt-codex-connector bot reviewed Feb 14, 2026


jlia0 requested review from mczabca-boop and shwdsun February 14, 2026 23:36

jlia0 added this to TinyAGI Roadmap Feb 23, 2026

github-project-automation bot moved this to Todo in TinyAGI Roadmap Feb 23, 2026

manish-raana closed this Feb 23, 2026

manish-raana force-pushed the main branch from c3f32ff to 8d38030 February 23, 2026 19:45

github-project-automation bot moved this from Todo to Done in TinyAGI Roadmap Feb 23, 2026

manish-raana wants to merge 0 commits into TinyAGI:main from manish-raana:main