Skip to content

Comments

feat(security): add sandboxed agent execution (host/docker/apple), fail-closed retries, dead-letter queue, and ops docs #30#55

Closed
manish-raana wants to merge 0 commit intoTinyAGI:mainfrom
manish-raana:main
Closed

feat(security): add sandboxed agent execution (host/docker/apple), fail-closed retries, dead-letter queue, and ops docs #30#55
manish-raana wants to merge 0 commit intoTinyAGI:mainfrom
manish-raana:main

Conversation

@manish-raana
Copy link

@manish-raana manish-raana commented Feb 14, 2026

Summary

This PR adds sandbox support for TinyClaw agent execution with runtime modes:

  • host (existing behavior)
  • docker (ephemeral container per invocation)
  • apple (runtime-command adapter)

It preserves existing queue/team routing behavior and adds fail-closed security controls, retry/dead-letter handling, diagnostics commands, and documentation.

This is an enhancement for #30.

Why

Agent invocations were executed directly on host. This change introduces isolated runtime options and explicit runtime/env validation for safer and more predictable operations.

Changes

Runtime and Invocation

  • Added src/lib/runner.ts:
    • HostRunner, DockerRunner, AppleRunner
    • timeout enforcement
    • env allowlist enforcement (OPENAI_API_KEY, ANTHROPIC_API_KEY)
    • fail-closed sandbox behavior
    • container-to-host path mapping
  • Refactored src/lib/invoke.ts to execute through runner abstraction while preserving response parsing.
  • Added sandbox lifecycle events:
    • sandbox_invocation_start
    • sandbox_invocation_end
    • sandbox_invocation_error

Config and Types

  • Extended src/lib/types.ts:
    • global sandbox config
    • per-agent sandbox_mode
    • queue retry metadata (attempt, firstSeenAt, errorClass)
  • Extended src/lib/config.ts:
    • sandbox defaults
    • getSandboxConfig(...)
    • QUEUE_DEAD_LETTER

Queue Reliability

  • Updated src/queue-processor.ts:
    • error classification (terminal vs transient)
    • retry transient failures up to sandbox.max_attempts
    • dead-letter routing to ~/.tinyclaw/queue/dead-letter
    • heartbeat error dedupe
    • [send_file: ...] path mapping for sandbox outputs
    • redaction-safe error logging

CLI and Setup

  • Added tinyclaw.sh sandbox subcommands:
    • show
    • set <host|docker|apple>
    • doctor
    • build-image
  • Updated lib/setup-wizard.sh to configure sandbox settings.
  • Added Dockerfile.agent-runner.

Documentation

  • Added:
    • docs/SANDBOX.md
    • docs/MIGRATION-SANDBOX.md
    • docs/OPERATIONS-RUNBOOK.md
  • Updated:
    • README.md
    • docs/INSTALL.md
    • docs/AGENTS.md
    • docs/QUEUE.md
    • docs/TROUBLESHOOTING.md

Behavior Notes

  • Docker/Apple modes are fail-closed.
  • Container runs are ephemeral (docker run --rm per invocation), so docker ps may not show long-lived containers.
  • Existing installs remain compatible (host mode unless changed by user).

Validation

  • npx tsc --noEmit
  • npm run build
  • bash -n tinyclaw.sh lib/setup-wizard.sh
  • Manual smoke: Telegram queue flow, retry/dead-letter flow, Docker mount fix verified.

Known Limitations

  • Dockerfile.agent-runner is minimal; production image should include required provider CLIs (claude, codex).
  • Apple runtime depends on configured sandbox.apple.runtime_command and runtime environment parity.

@manish-raana manish-raana changed the title feat(security): add sandboxed agent execution (host/docker/apple), fail-closed retries, dead-letter queue, and ops docs feat(security): add sandboxed agent execution (host/docker/apple), fail-closed retries, dead-letter queue, and ops docs #30 Feb 14, 2026
Copy link

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: c3f32ff6e8

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines 295 to 296
args.push('--user', '1000:1000');
args.push('--mount', `type=bind,src=${request.workingDir},dst=${containerWorkingDir}`);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Run sandbox container as the current host UID/GID

The sandbox runner hard-codes --user 1000:1000 while bind-mounting the agent workspace, which causes write failures whenever that workspace is owned by a different UID/GID (common on macOS with UID 501 or root-owned deployments), so provider CLIs cannot persist conversation state or output files and containerized invocations fail with permission errors; derive UID/GID from the running process (or make it configurable) instead of assuming 1000.

Useful? React with 👍 / 👎.

return runContainer(request, 'docker');
}

return runContainer(request, 'apple');

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Validate sandbox mode instead of defaulting to apple

runInSandbox falls through to Apple mode for any unrecognized sandbox.mode, so a typo or corrupted config value silently runs the wrong runtime and surfaces misleading Apple runtime errors instead of an explicit configuration failure; unknown modes should be rejected with a terminal config error.

Useful? React with 👍 / 👎.

jlia0 pushed a commit that referenced this pull request Feb 16, 2026
Detailed code review of the sandbox execution feature covering:
- Security issues (secret exposure via --env args, hard-coded UID)
- Bugs (unvalidated sandbox mode fallthrough)
- Design concerns (unbounded buffers, falsy-check defaults)
- Code quality items (style reformatting, missing validation)

https://claude.ai/code/session_01QD8DbEFLdkbLik4hgvrgDm
@mczabca-boop
Copy link
Collaborator

mczabca-boop commented Feb 16, 2026

PR Review Summary

Request changes. There are 2 blocking issues that should be fixed before approval, plus 2 high-priority follow-up issues that should be addressed in this PR or explicitly tracked.

Blocking Issues

  1. Hard-coded container user 1000:1000 can cause cross-environment permission failures
    Evidence: src/lib/runner.ts:295, src/lib/runner.ts:296
    Issue: --user 1000:1000 with a bind-mounted workspace can fail when the host workspace owner is not UID/GID 1000 (common on macOS and some deployments), preventing provider CLIs from writing session/output files.
    Recommendation: derive UID/GID dynamically (for example via process.getuid()/process.getgid()) or make it configurable.

  2. Unknown sandbox.mode silently falls back to apple mode
    Evidence: src/lib/runner.ts:363, src/lib/runner.ts:368, src/lib/runner.ts:372
    Issue: only host and docker are explicitly handled; any other value routes to runContainer(..., 'apple'), which turns config mistakes into misleading apple runtime errors.
    Recommendation: explicitly validate mode and throw a terminal configuration error for unknown values.

Non-blocking / High-priority Suggestions

  1. restricted network mode is currently equivalent to default (semantic mismatch)
    Evidence: src/lib/runner.ts:129, src/lib/runner.ts:130, src/lib/runner.ts:131, src/lib/runner.ts:282
    Issue: everything except none maps to bridge, so restricted currently has no distinct behavior.
    Recommendation: implement real restricted behavior, or remove/clarify the mode in docs to avoid false security expectations.

  2. Numeric sandbox defaults use ||, which discards valid 0 values
    Evidence: src/lib/config.ts:140, src/lib/config.ts:141, src/lib/config.ts:142
    Issue: in JS/TS, 0 is falsy, so explicit user values like 0 get replaced by defaults.
    Recommendation: use ?? (or explicit undefined/null checks) for numeric fields.

Local Validation Notes

  1. Container runtime validation could not be executed in the current WSL environment because Docker CLI/WSL integration is not available.
  2. All four issues above were confirmed via code-level inspection and line-level evidence.

Ready-to-paste PR Overall Comment

Requesting changes due to 2 blocking issues:
(1) hard-coded container UID/GID at src/lib/runner.ts:295, and
(2) unknown sandbox.mode silently falling back to apple at src/lib/runner.ts:372.
I also left 2 follow-up suggestions: restricted network currently behaves like default (src/lib/runner.ts:129, src/lib/runner.ts:282), and numeric sandbox defaults should use nullish checks to preserve valid 0 values (src/lib/config.ts:140).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

2 participants