Description
branch: project-sandbox-deps
Spec: Project-Level Sandbox Dependencies
Overview
Allow projects to declare additional packages or custom Dockerfile layers that get installed into the sandbox image. Dependencies are declared in a .agent-loop/ directory in the project root (not the worktree), so they are shared and cached across all worktrees for the same project.
Two modes are supported, with a clear priority:
- Dockerfile.sandbox (full control): takes precedence if present
- dependencies (simple package list): auto-generates a Dockerfile layer
Both use ARG BASE_IMAGE / FROM ${BASE_IMAGE} so they are agent-agnostic — ralph injects the correct base image via --build-arg at build time. The same project config works regardless of which agent is used.
Project-level images are tagged as agent-loop-sandbox-{agent}-{project}:v{hash} where {project} is derived from the repo directory name and {hash} is an 8-character content hash of the base image tag + project file content.
Additionally, the existing base image content hash is shortened from 12 to 8 characters for readability.
Architecture
Image layering:
docker/sandbox-templates:claude-code (upstream base)
│
▼
docker/agent-loop/claude/Dockerfile (agent base — build-essential, jq, etc.)
│
▼ (only if .agent-loop/ config exists in project)
.agent-loop/Dockerfile.sandbox (custom project layer)
OR
.agent-loop/dependencies (auto-generated project layer)
│
▼
agent-loop-sandbox-claude-myproject:v{hash} (final image used by sandbox)
Lookup flow in ensure_sandbox():
1. Build/cache base image (existing logic, unchanged)
2. Resolve project_dir = repo root (git rev-parse --show-toplevel)
3. Check {project_dir}/.agent-loop/Dockerfile.sandbox → if exists, use it
4. Else check {project_dir}/.agent-loop/dependencies → generate Dockerfile
5. Else → use base image directly (no project layer)
6. Content hash = SHA256(base_tag + project_file_content)[:8]
7. If image tag exists → reuse cached; else build with --build-arg BASE_IMAGE=<base_tag>
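Steps 3 to 5 of the lookup can be sketched as a small function. This is a minimal illustration, not the actual ralph code: the function name follows the spec, but returning (None, None) when no config exists is an assumed convention, since the spec only says the type is None in that case.

```python
import os
from typing import Optional, Tuple


def find_project_config(project_dir: str) -> Tuple[Optional[str], Optional[str]]:
    """Return (kind, path) for the first project config found.

    Assumption: (None, None) is returned when neither file exists.
    """
    config_dir = os.path.join(project_dir, ".agent-loop")
    # Order matters: Dockerfile.sandbox takes precedence over dependencies.
    candidates = (
        ("dockerfile", "Dockerfile.sandbox"),
        ("dependencies", "dependencies"),
    )
    for kind, filename in candidates:
        path = os.path.join(config_dir, filename)
        if os.path.isfile(path):
            return kind, path
    return None, None
```

A caller would fall back to the base image tag whenever the returned kind is None.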
File locations:
<project-root>/
.agent-loop/
dependencies # Option A: one apt package per line, # comments
Dockerfile.sandbox # Option B: full Dockerfile (takes precedence)
1. Content Hash Length
Shorten the content hash from 12 to 8 characters throughout. Applies to both base and project image tags.
Current: agent-loop-sandbox-claude:v3a8f2b1c9d0e
New: agent-loop-sandbox-claude:v3a8f2b1c
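As a rough sketch of the truncation, assuming the base hash is computed over the agent Dockerfile content (the spec does not say exactly what the base hash covers), the helper names short_hash and base_image_tag below are hypothetical:

```python
import hashlib


def short_hash(*parts: str, length: int = 8) -> str:
    """Hex SHA-256 of the concatenated parts, truncated to `length` characters."""
    digest = hashlib.sha256("".join(parts).encode("utf-8")).hexdigest()
    return digest[:length]


def base_image_tag(agent: str, dockerfile_content: str) -> str:
    # Assumption: the base tag hashes the agent Dockerfile content only.
    return f"agent-loop-sandbox-{agent}:v{short_hash(dockerfile_content)}"
```

Eight hex characters give 32 bits of hash, which is a trade of collision resistance for readability; for a handful of locally cached image tags that is likely acceptable.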
2. Dependencies File Format
Plain text list of apt packages:
# .agent-loop/dependencies
# One package per line. Comments and blank lines are ignored.
openjdk-21-jdk
python3-venv
nodejs
Parsing rules:
- Strip # comments (including an inline # comment after a package name)
- Strip leading/trailing whitespace
- Skip empty lines
- Remaining lines are package names passed to apt-get install -y --no-install-recommends
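The parsing rules above could be implemented along these lines. This is a minimal sketch using only the standard library; it does not validate package names, which the spec does not require.

```python
from typing import List


def parse_dependencies(content: str) -> List[str]:
    """Return package names from a dependencies file, ignoring comments and blanks."""
    packages = []
    for line in content.splitlines():
        # Drop everything from the first '#' onward, then trim whitespace.
        name = line.split("#", 1)[0].strip()
        if name:
            packages.append(name)
    return packages
```

For example, parse_dependencies("jq  # JSON tool\n\n# note\ncurl\n") yields ["jq", "curl"].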
3. Dockerfile.sandbox Format
A standard Dockerfile using ARG BASE_IMAGE:
ARG BASE_IMAGE
FROM ${BASE_IMAGE}
USER root
RUN apt-get update && apt-get install -y --no-install-recommends openjdk-21-jdk \
&& rm -rf /var/lib/apt/lists/*
USER agent
Ralph builds with:
docker build --build-arg BASE_IMAGE=<base_tag> -t <project_tag> -f Dockerfile.sandbox <context_dir>
The build context is the .agent-loop/ directory, so files within it can be COPY'd into the image if needed.
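The build invocation might be assembled as an argument list for subprocess.run, roughly as follows. The function name custom_build_command is hypothetical, and passing the full path to -f (rather than the bare file name) is a choice made here so the command does not depend on the process's working directory.

```python
import os
from typing import List


def custom_build_command(project_dir: str, base_tag: str, project_tag: str) -> List[str]:
    """Build the docker argv for a project's Dockerfile.sandbox."""
    context_dir = os.path.join(project_dir, ".agent-loop")
    dockerfile = os.path.join(context_dir, "Dockerfile.sandbox")
    return [
        "docker", "build",
        "--build-arg", f"BASE_IMAGE={base_tag}",
        "-t", project_tag,
        "-f", dockerfile,
        context_dir,  # build context: files here can be COPY'd into the image
    ]
```

The resulting list can be handed to subprocess.run(cmd, check=True) without a shell, which avoids quoting problems in paths and tags.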
4. Generated Dockerfile (from dependencies)
When dependencies exists but Dockerfile.sandbox does not, ralph generates a Dockerfile in memory:
ARG BASE_IMAGE
FROM ${BASE_IMAGE}
USER root
RUN apt-get update && apt-get install -y --no-install-recommends \
pkg1 pkg2 pkg3 \
&& rm -rf /var/lib/apt/lists/*
USER agent
Written to a temp file only during docker build, then cleaned up.
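A sketch of the generator, assuming the template shown above. Rejecting an empty package list is an assumption of this example; the spec does not say what should happen when the dependencies file contains no packages.

```python
from typing import List


def generate_project_dockerfile(packages: List[str]) -> str:
    """Render the project-layer Dockerfile for a list of apt packages."""
    if not packages:
        # Assumption: an empty list is treated as a caller error.
        raise ValueError("at least one package is required")
    lines = [
        "ARG BASE_IMAGE",
        "FROM ${BASE_IMAGE}",
        "USER root",
        "RUN apt-get update && apt-get install -y --no-install-recommends \\",
        "    " + " ".join(packages) + " \\",
        "    && rm -rf /var/lib/apt/lists/*",
        "USER agent",
    ]
    return "\n".join(lines) + "\n"
```

Because the output is a pure function of the package list, the same dependencies file always produces the same Dockerfile text, which keeps the content hash stable.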
5. Project Image Tagging
Tag format: agent-loop-sandbox-{agent}-{project}:v{hash}
- {agent}: agent name (e.g., claude)
- {project}: os.path.basename(project_dir) (e.g., elasticsearch)
- {hash}: SHA256(base_tag + project_dockerfile_content)[:8]
The hash incorporates the base image tag (not just digest), so a base image rebuild cascades to project image rebuilds.
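The tag computation can be sketched as below. The function name and parameters follow the implementation plan; the body is an illustration only. Note that this sketch does not normalize the project name, and the spec does not say how directory names containing characters Docker rejects in repository names (uppercase letters, for instance) should be handled.

```python
import hashlib
import os


def project_image_tag(agent: str, project_name: str, base_tag: str, config_content: str) -> str:
    """Compute agent-loop-sandbox-{agent}-{project}:v{hash} for a project layer."""
    digest = hashlib.sha256((base_tag + config_content).encode("utf-8")).hexdigest()
    return f"agent-loop-sandbox-{agent}-{project_name}:v{digest[:8]}"


def project_name_from_dir(project_dir: str) -> str:
    # Hypothetical helper: strip a trailing separator so basename is not empty.
    return os.path.basename(os.path.normpath(project_dir))
```

Since base_tag is part of the hashed input, a different base tag yields a different project tag even when the project files are unchanged.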
6. --rebuild Flag
--rebuild forces everything: re-pull upstream base, rebuild agent base image, rebuild project layer. No separate flag needed — content-addressed caching handles dependency file changes automatically.
7. Selftest Enhancement
When ralph selftest is run from a directory containing .agent-loop/dependencies or .agent-loop/Dockerfile.sandbox, add a check that verifies the project image builds successfully. Report as "build project image" with the project tag.
Implementation Plan
Each step follows this structure:
- Implement: Write the code
- Test: Write pytest tests
- Verify: Run tests, fix failures until all pass
- Review: Code review for bugs, edge cases, and conventions
- Address feedback: Fix review findings, re-run tests, re-review until clean
- Update spec: Mark the step [done] and record any decisions or deviations
Spec maintenance rules
- Mark each step [done] when complete.
- Record design decisions that emerged during implementation as notes under the step.
- Minor deviations (e.g. method name changes, reordered logic) should be noted and the spec updated to match.
- Significant design changes (e.g. new subcommands, changed architecture, removed features) require pausing for user review before proceeding.
Step 1: Shorten content hash to 8 characters [done]
Files:
- scripts/ralph: Sandbox.content_hash() method
- tests/test_ralph.py: TestSandboxContentHash, TestSandboxEnsureImage, and any other tests asserting tag format
Implement:
- In Sandbox.content_hash(), change [:12] to [:8]
Test:
- Update TestSandboxContentHash assertions for 8-char hashes
- Update any TestSandboxEnsureImage assertions that check tag length
- Search for any other tests asserting 12-char hashes and update them
Verify: Run pytest tests/test_ralph.py -v. Fix any failures and re-run until all pass.
Review: Ensure no hardcoded 12-char hash assumptions remain anywhere.
Address feedback: Fix all review findings. Re-run tests. Re-review if changes were substantial.
Step 2: Add dependencies file parsing [done]
Files:
- scripts/ralph: New static method Sandbox.parse_dependencies(content)
- tests/test_ralph.py: New TestSandboxParseDependencies class
Implement:
- Add Sandbox.parse_dependencies(content) static method that:
  - Splits content into lines
  - Strips inline # comments (everything from # to end of line)
  - Strips whitespace
  - Skips empty lines
  - Returns a list of package names
Test:
- Basic package list parsing
- Comment-only lines skipped
- Inline comments stripped (pkg # comment → pkg)
- Blank lines skipped
- Whitespace handling (leading/trailing)
- Empty content returns empty list
Verify: Run pytest tests/test_ralph.py::TestSandboxParseDependencies -v.
Review: Edge cases in comment parsing.
Address feedback: Fix all review findings. Re-run tests. Re-review if changes were substantial.
Step 3: Add project Dockerfile generation [done]
Files:
- scripts/ralph: New static method Sandbox.generate_project_dockerfile(packages)
- tests/test_ralph.py: New TestSandboxGenerateProjectDockerfile class
Implement:
- Add Sandbox.generate_project_dockerfile(packages) static method that:
  - Takes a list of package names
  - Returns a Dockerfile string with ARG BASE_IMAGE, FROM ${BASE_IMAGE}, USER root, apt-get update && install, USER agent
  - Joins packages with spaces in a single apt-get install line
Test:
- Single package generates correct Dockerfile
- Multiple packages joined correctly
- Output contains ARG BASE_IMAGE and FROM ${BASE_IMAGE}
- Output switches to USER root then back to USER agent
- Output includes --no-install-recommends and rm -rf /var/lib/apt/lists/*
Verify: Run pytest tests/test_ralph.py::TestSandboxGenerateProjectDockerfile -v.
Review: Dockerfile best practices (layer ordering, cleanup).
Address feedback: Fix all review findings. Re-run tests. Re-review if changes were substantial.
Step 4: Add project image detection and building [done]
Files:
- scripts/ralph: New methods: Sandbox.find_project_config(project_dir), Sandbox.project_image_tag(agent, project_name, base_tag, config_content), Sandbox.ensure_project_image(agent, base_tag, project_dir, force_rebuild)
- tests/test_ralph.py: New test classes for each method
Implement:
- Sandbox.find_project_config(project_dir): static method that checks for .agent-loop/Dockerfile.sandbox then .agent-loop/dependencies. Returns a tuple (type, path) where type is "dockerfile", "dependencies", or None if neither exists.
- Sandbox.project_image_tag(agent, project_name, base_tag, config_content): static method that computes content hash from base_tag + config_content and returns agent-loop-sandbox-{agent}-{project_name}:v{hash}.
- Sandbox.ensure_project_image(agent, base_tag, project_dir, force_rebuild=False):
  - Calls find_project_config(project_dir); if None, returns base_tag
  - Reads the config file content
  - If type is "dependencies", parses packages and generates Dockerfile content
  - If type is "dockerfile", reads the file content directly
  - Computes project tag via project_image_tag()
  - If tag exists locally and not force_rebuild, return cached tag
  - Otherwise: for "dockerfile", build with .agent-loop/ as build context and -f Dockerfile.sandbox; for "dependencies", write generated Dockerfile to a temp file and build with temp dir as context
  - Always pass --build-arg BASE_IMAGE=<base_tag>
  - Return the project tag
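The temp-file handling for the "dependencies" path might look like the sketch below. The function name build_generated_image and the injected run callable are hypothetical, introduced so the example can be exercised without Docker; in ralph the call would presumably go through subprocess.run.

```python
import os
import tempfile
from typing import Callable, List


def build_generated_image(
    dockerfile_text: str,
    base_tag: str,
    project_tag: str,
    run: Callable[[List[str]], None],
) -> str:
    """Write the generated Dockerfile to a temp dir, build it, then clean up."""
    # TemporaryDirectory removes the directory even if the build raises.
    with tempfile.TemporaryDirectory() as context_dir:
        dockerfile = os.path.join(context_dir, "Dockerfile")
        with open(dockerfile, "w", encoding="utf-8") as fh:
            fh.write(dockerfile_text)
        run([
            "docker", "build",
            "--build-arg", f"BASE_IMAGE={base_tag}",
            "-t", project_tag,
            "-f", dockerfile,
            context_dir,  # temp dir doubles as the (otherwise empty) build context
        ])
    return project_tag
```

In production use, run could be lambda cmd: subprocess.run(cmd, check=True); in tests, a recording stub makes it easy to assert on the arguments and on cleanup.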
Test:
- find_project_config: prefers Dockerfile.sandbox over dependencies, returns None when neither exists, returns correct type and path for each
- project_image_tag: includes project name and agent in tag, hash is 8 chars, different content produces different hash, same content produces same hash
- ensure_project_image: returns base tag when no project config, builds when tag missing, skips build when tag exists, force_rebuild forces build, passes --build-arg BASE_IMAGE correctly, uses .agent-loop/ as build context for Dockerfile.sandbox, uses -f Dockerfile.sandbox for custom Dockerfiles
Verify: Run pytest tests/test_ralph.py -v -k "project".
Review: Temp file cleanup, build context correctness, content hash determinism.
Address feedback: Fix all review findings. Re-run tests. Re-review if changes were substantial.
Step 5: Wire project images into ensure_sandbox and process_issue [done]
Files:
- scripts/ralph: Sandbox.ensure_sandbox(), process_issue(), main()
- tests/test_ralph.py: Update TestSandboxEnsureSandbox, TestProcessIssueSandbox, TestMainSandboxFlags
Implement:
- Update Sandbox.ensure_sandbox() signature to accept project_dir=None and force_rebuild=False
- Inside ensure_sandbox(): after ensure_image(agent) returns base_tag, call ensure_project_image(agent, base_tag, project_dir, force_rebuild) to get final_tag. Use final_tag for sandbox creation.
- In process_issue(), resolve repo_root via git.output("rev-parse", "--show-toplevel") and pass it to ensure_sandbox() as project_dir. Pass force_rebuild through from caller.
- Remove the early sandbox.ensure_image(agent, force_rebuild=rebuild) call from main() (line 1721) since image building now happens inside ensure_sandbox() which has the project context.
- Pass force_rebuild=rebuild through process_issue() and poll_loop() to ensure_sandbox().
Test:
- TestSandboxEnsureSandbox: verify ensure_project_image is called with correct args, final tag used for sandbox creation
- TestProcessIssueSandbox: verify repo_root resolved and passed through
- TestMainSandboxFlags: verify --rebuild still forces rebuild of both layers
Verify: Run pytest tests/test_ralph.py -v. Fix any failures.
Review: Ensure no code path bypasses project image resolution. Verify --rebuild cascades correctly through both layers.
Address feedback: Fix all review findings. Re-run tests. Re-review if changes were substantial.
Step 6: Enhance selftest for project images [done]
Files:
- scripts/ralph: selftest() function
- tests/test_ralph.py: TestSelftest class
Implement:
- After the existing "build image" check in selftest(), detect if the current directory has .agent-loop/ config via Sandbox.find_project_config(os.getcwd())
- If project config exists, add a check that builds the project image: call ensure_project_image(agent, base_tag, os.getcwd())
- Report as "build project image" with the project tag on success
Test:
- selftest reports project image check when .agent-loop/dependencies exists in cwd
- selftest reports project image check when .agent-loop/Dockerfile.sandbox exists in cwd
- selftest skips project image check when no .agent-loop/ directory in cwd
- selftest reports failure when project image build fails
Verify: Run pytest tests/test_ralph.py::TestSelftest -v.
Review: Ensure selftest cleanup handles project images correctly.
Address feedback: Fix all review findings. Re-run tests. Re-review if changes were substantial.
Step 7: Run all checks [done]
Implement:
- Run the full test suite: pytest tests/test_ralph.py -v
- Run python3 -c "import py_compile; py_compile.compile('scripts/ralph', doraise=True)" to verify syntax
- Fix any failures
Verify: All checks pass clean.
Step 8: Create commit
Implement:
- Stage all changes and create a commit with a descriptive message summarizing the feature.
Verify: git log -1 shows the commit.
Conventions
- Language: Python 3 (stdlib only, no third-party dependencies)
- Tests: pytest with unittest.mock.patch, tmp_path fixture for filesystem isolation
- Error messages: Prefix with ralph:
- Exit codes: 0=success, 1=runtime error
- Imports: Use from conftest import import_script in tests to import the extensionless ralph script
- Docker: Content-addressed image tags, subprocess.run for all Docker CLI calls