Tart macOS VM Sandbox Backend #30

---
branch: tart-sandbox-backend
---
# Spec: Tart macOS VM Sandbox Backend

## Overview

Ralph currently runs Claude Code inside Docker (Linux) sandboxes. For iOS/Swift projects that need Xcode, SwiftUI, and SwiftData, this is insufficient — those frameworks require macOS. This feature adds Tart (a Virtualization.framework wrapper by Cirrus Labs) as an alternative sandbox backend, enabling full `xcodebuild test` validation in macOS VMs on Apple Silicon.

Projects opt in by adding `.agent-loop/config.json` with `"type": "tart"` to their repo. The existing Docker backend remains the default and is unchanged.

Key design decisions:
- **Backend abstraction**: `SandboxBackend` base class with `DockerSandbox` (existing, renamed) and `TartSandbox` (new)
- **Image caching**: Content-addressed Tart templates (named by a hash of the base image name + dependencies file). APFS copy-on-write clones make creating a per-branch VM near-instant.
- **Command execution**: `tart exec` via guest agent (no SSH needed)
- **VM user**: Default `admin` user (Cirrus Labs image default, has sudo for brew/npm)
- **VM limit**: Apple's macOS software license agreement (SLA) permits at most 2 macOS VMs per host; ralph checks the running count before starting a VM and fails with a clear error
- **Proxy reuse**: Existing credential proxy stays as Docker container; Tart VMs reach it via host IP instead of `host.docker.internal`
- **Network isolation**: Deferred to follow-up (pf firewall approach documented in plan)
- **Dependencies**: `.agent-loop/dependencies` for Docker stays as apt packages; for Tart it's a shell script. The `config.json` type determines parsing.
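
The content-addressed naming can be sketched as follows. This is a minimal illustration that follows the scheme given for `_template_name` in Step 4 (SHA-256 over the base image name, a newline, and the dependencies content, truncated to 12 hex characters); the free-function form and the parameter names are assumptions made for the example, not the code in `scripts/ralph`.

```python
import hashlib


def template_name(agent: str, base_image: str, dependencies_content: str = "") -> str:
    """Derive a content-addressed template VM name.

    The hash covers the base image name and the dependencies script, so a
    change to either one yields a new template name, while unchanged inputs
    map back to the already cached template.
    """
    payload = f"{base_image}\n{dependencies_content}".encode("utf-8")
    digest = hashlib.sha256(payload).hexdigest()[:12]
    return f"agent-loop-template-{agent}-{digest}"
```

Because the name is a pure function of its inputs, checking whether a template is cached reduces to a name lookup in the VM list.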

## Architecture

```
.agent-loop/                     (in target project repo)
├── config.json                  (NEW — sandbox type + settings)
│   {"type": "tart", "base_image": "ghcr.io/cirruslabs/macos-sequoia-xcode:latest"}
├── dependencies                 (existing — apt packages for Docker, shell script for Tart)
└── Dockerfile.sandbox           (existing — custom Docker image, Docker only)

scripts/ralph                    (in dotfiles repo)
├── SandboxBackend               (NEW — base class with interface)
├── DockerSandbox(SandboxBackend) (RENAMED from Sandbox)
├── TartSandbox(SandboxBackend)  (NEW)
├── load_sandbox_config()        (NEW — reads config.json)
└── create_sandbox_backend()     (NEW — factory)
```

**Tart VM lifecycle:**
```
Base image (OCI registry)
  → tart clone → Template VM (agent-loop-template-{agent}-{hash})
                    dependencies installed, stopped, cached
  → tart clone → Per-branch VM (agent-loop-{agent}-{branch})
                    tart run --no-graphics --dir=workspace:{worktree}
                    kept running between iterations
                    tart exec for all commands
```

**Proxy access from Tart VM:**
```
Tart VM (NAT) → host gateway IP (192.168.64.1) → proxy port → Docker proxy container
```
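
Discovering the gateway address could look roughly like this. The sketch assumes the macOS `route -n get default` output contains a `gateway: <ip>` line, and it only covers the first and last options from Step 5 (parsed gateway, then the hardcoded `192.168.64.1`); the intermediate `ipconfig getifaddr en0` fallback is left out. The function names are hypothetical.

```python
import re
from typing import Optional

# Final fallback named in the spec for Tart's NAT network.
FALLBACK_GATEWAY = "192.168.64.1"

# Matches a line such as "    gateway: 192.168.64.1".
_GATEWAY_LINE = re.compile(r"^\s*gateway:\s*(\d{1,3}(?:\.\d{1,3}){3})\s*$")


def parse_default_gateway(route_output: str) -> Optional[str]:
    """Return the IPv4 gateway from `route -n get default` output, if any."""
    for line in route_output.splitlines():
        match = _GATEWAY_LINE.match(line)
        if match:
            return match.group(1)
    return None


def resolve_proxy_host(route_output: str) -> str:
    """Use the parsed gateway when available, else the hardcoded fallback."""
    return parse_default_gateway(route_output) or FALLBACK_GATEWAY
```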

---

## Implementation Plan

### Step 1: Add .agent-loop/config.json support [done]

**Files:**
- `scripts/ralph` — Add `load_sandbox_config()` function
- `tests/test_ralph.py` — Tests for config loading

**Implement:**
1. Add `load_sandbox_config(project_dir)` function near the existing `find_project_config` method (around line 144). It reads `.agent-loop/config.json` from the project directory. Returns a dict with at least `{"type": "docker"}` as default. If the file exists, parse it with `json.load()`. Validate that `type` is either `"docker"` or `"tart"`, raise `ValueError` for unknown types.
2. For `"tart"` type, the config may include: `base_image` (str, required for tart), `cpu` (int, optional), `memory_gb` (int, optional).
3. This is a standalone module-level function, not a method on Sandbox.
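
A minimal sketch of the loader described above. Splitting validation into a separate `validate_sandbox_config` helper, the exact error wording, and rejecting a tart config without `base_image` are assumptions made for illustration; the spec only states that `base_image` is required for tart.

```python
import json
from pathlib import Path

VALID_SANDBOX_TYPES = ("docker", "tart")


def validate_sandbox_config(config):
    """Apply the docker default and reject unknown or incomplete configs."""
    if not isinstance(config, dict):
        raise ValueError("ralph: .agent-loop/config.json must contain a JSON object")
    validated = {"type": "docker", **config}  # a missing type defaults to docker
    if validated["type"] not in VALID_SANDBOX_TYPES:
        raise ValueError(
            f"ralph: unknown sandbox type {validated['type']!r} "
            f"(expected one of: {', '.join(VALID_SANDBOX_TYPES)})"
        )
    # Assumption: a tart config without base_image is treated as an error.
    if validated["type"] == "tart" and not validated.get("base_image"):
        raise ValueError('ralph: sandbox type "tart" requires "base_image"')
    return validated


def load_sandbox_config(project_dir):
    """Read .agent-loop/config.json; fall back to docker when it is absent."""
    config_path = Path(project_dir) / ".agent-loop" / "config.json"
    if not config_path.is_file():
        return {"type": "docker"}
    with config_path.open(encoding="utf-8") as handle:
        # Malformed JSON raises json.JSONDecodeError, a ValueError subclass.
        return validate_sandbox_config(json.load(handle))
```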

**Test:**
- `TestLoadSandboxConfig`: test missing config.json returns the docker default, test docker type, test tart type with base_image, test missing type defaults to docker, test unknown type raises ValueError, test malformed JSON raises.

**Verify:** Run `pytest tests/test_ralph.py -v -k TestLoadSandboxConfig`. Fix any failures.

**Review:** Ensure config validation catches bad input early with clear error messages.

**Address feedback:** Fix all review findings. Re-run tests.

### Step 2: Extract SandboxBackend base class and rename Sandbox to DockerSandbox [done]

**Files:**
- `scripts/ralph` — Add base class, rename existing class

**Implement:**
1. Add a `SandboxBackend` base class between the `Git` class and the current `Sandbox` class. It defines interface methods that raise `NotImplementedError`: `ensure_image`, `ensure_sandbox`, `setup_git_config`, `run_iteration`, `preflight_check`, `cleanup_sandbox`, `prune_sandboxes`, `remove_sandbox`. Also add `proxy_host()` method. Move the `sandbox_name` static method here (shared logic).
2. Rename `class Sandbox` to `class DockerSandbox(SandboxBackend)`.
3. Add `def proxy_host(self): return "host.docker.internal"` to DockerSandbox.
4. Update all references to `Sandbox` throughout the file: `main()` line ~2243 (`Sandbox(dotfiles_dir)` → `DockerSandbox(dotfiles_dir)`), `prune-sandboxes` subcommand line ~2118, `cleanup_sandbox` static calls that reference `Sandbox.sandbox_name` or `Sandbox.cleanup_sandbox`, and `selftest()`.
5. Do NOT change any behavior — this is a pure rename/extract refactor.
6. Note: `Sandbox` has duplicate method definitions (lines ~107-220 and ~221-317 have `generate_project_dockerfile`, `find_project_config`, `project_image_tag`, `ensure_project_image` defined twice). During the rename, remove the first set of duplicates (keep the second set that's actually used). Verify with tests that the kept methods are the ones being tested.
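
The shape of the extracted interface might look like the sketch below. Method signatures are taken from Steps 4 and 5 where the spec gives them. The body of `sandbox_name` is a guess based on the `agent-loop-{agent}-{branch}` pattern in the lifecycle diagram, with a hypothetical `/` to `-` substitution, and `DockerSandbox` is reduced to a stub.

```python
class SandboxBackend:
    """Interface shared by sandbox backends; subclasses override each method."""

    @staticmethod
    def sandbox_name(agent, branch):
        # Assumption: slashes in branch names are replaced for a flat VM name.
        return f"agent-loop-{agent}-{branch.replace('/', '-')}"

    def ensure_image(self, agent, force_rebuild=False):
        raise NotImplementedError

    def ensure_sandbox(self, agent, branch, worktree_path, **kwargs):
        raise NotImplementedError

    def setup_git_config(self, sandbox_name, user, email):
        raise NotImplementedError

    def run_iteration(self, sandbox_name, spec_content, model, env_vars=None):
        raise NotImplementedError

    def preflight_check(self, sandbox_name, agent, proxy_port):
        raise NotImplementedError

    def cleanup_sandbox(self, agent, branch):
        raise NotImplementedError

    def prune_sandboxes(self, agent):
        raise NotImplementedError

    def remove_sandbox(self, name):
        raise NotImplementedError

    def proxy_host(self):
        raise NotImplementedError


class DockerSandbox(SandboxBackend):
    """Stub standing in for the renamed Sandbox class."""

    def __init__(self, dotfiles_dir):
        self.dotfiles_dir = dotfiles_dir

    def proxy_host(self):
        return "host.docker.internal"
```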

**Test:**
- All existing `TestSandbox*` test classes should pass with minimal changes (update any explicit `Sandbox(...)` constructor calls to `DockerSandbox(...)`).
- Add a basic test that `SandboxBackend` methods raise `NotImplementedError`.

**Verify:** Run `pytest tests/test_ralph.py -v`. ALL existing tests must pass. Fix any failures.

**Review:** Verify this is a pure refactor — no behavior changes. Grep for any remaining references to `Sandbox(` that should be `DockerSandbox(`.

**Address feedback:** Fix all findings. Re-run full test suite.

### Step 3: Add backend factory and update process_issue/main [done]

**Files:**
- `scripts/ralph` — Add factory, update callers
- `tests/test_ralph.py` — Update test fixtures

**Implement:**
1. Add `create_sandbox_backend(sandbox_type, dotfiles_dir, **kwargs)` factory function. For `"docker"`, returns `DockerSandbox(dotfiles_dir)`. For `"tart"`, it will return `TartSandbox(...)` (stub for now — raise `NotImplementedError("tart backend not yet implemented")`).
2. Modify `process_issue()` signature: replace `sandbox` parameter with `dotfiles_dir`. Inside `process_issue`, after resolving the worktree (`work_dir`), call `config = load_sandbox_config(repo_root)` then `sandbox = create_sandbox_backend(config["type"], dotfiles_dir, **config)`.
3. Replace the hardcoded `"host.docker.internal"` in env_vars (line ~1871) with `sandbox.proxy_host()`.
4. Update `main()`: instead of `sandbox = Sandbox(dotfiles_dir)` → pass `dotfiles_dir` to `process_issue`.
5. Update `poll_loop()` similarly: accept `dotfiles_dir` instead of `sandbox`, pass it to `process_issue`.
6. Update all callers in `main()` that pass `sandbox` to pass `dotfiles_dir` instead.
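
At this stage the factory is little more than a dispatch on the type string. The sketch below uses a stand-in `DockerSandbox` so it runs on its own; the `ValueError` wording for unknown types is an assumption.

```python
class DockerSandbox:
    """Stand-in for the real class; only what the factory needs."""

    def __init__(self, dotfiles_dir):
        self.dotfiles_dir = dotfiles_dir


def create_sandbox_backend(sandbox_type, dotfiles_dir, **kwargs):
    """Return the backend for sandbox_type (tart is still a stub in Step 3)."""
    if sandbox_type == "docker":
        return DockerSandbox(dotfiles_dir)
    if sandbox_type == "tart":
        raise NotImplementedError("tart backend not yet implemented")
    raise ValueError(f"ralph: unknown sandbox type {sandbox_type!r}")


# Usage as described for process_issue: the whole config dict is forwarded,
# so its "type" key also arrives in kwargs alongside the positional argument.
config = {"type": "docker"}
sandbox = create_sandbox_backend(config["type"], "/path/to/dotfiles", **config)
```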

**Test:**
- `TestCreateSandboxBackend`: docker returns DockerSandbox, tart raises NotImplementedError (for now), unknown raises ValueError.
- Update `TestProcessIssueSandbox` fixtures: mock `load_sandbox_config` to return `{"type": "docker"}`, mock `create_sandbox_backend` to return a mock sandbox. Verify `proxy_host()` is called for env vars.
- Update `TestMainSandboxFlags` to match new signatures.

**Verify:** Run `pytest tests/test_ralph.py -v`. All tests pass.

**Review:** Ensure no remaining hardcoded `host.docker.internal` strings. Ensure `sandbox.proxy_host()` is used everywhere the proxy URL is constructed.

**Address feedback:** Fix all findings. Re-run tests.

### Step 4: Implement TartSandbox — image and template management [done]

**Files:**
- `scripts/ralph` — TartSandbox class (first half)
- `tests/test_ralph.py` — Unit tests

**Implement:**
1. Add `TartSandbox(SandboxBackend)` class. Constructor accepts `config` dict (with `base_image`, optional `cpu`, `memory_gb`, `dependencies_content`). Store config. Initialize `self._vm_procs = {}` dict for tracking running VMs.
2. Implement `_template_name(self, agent)`: hash = SHA256(base_image + "\n" + dependencies_content)[:12], returns `f"agent-loop-template-{agent}-{hash}"`.
3. Implement `_list_vms(self)`: runs `tart list --format json`, returns parsed list. Returns `[]` on failure.
4. Implement `_running_vm_count(self)`: filters `_list_vms()` for `State == "Running"`, returns count.
5. Implement `_check_vm_limit(self)`: if `_running_vm_count() >= 2`, raise `RuntimeError` with message: `"ralph: cannot start VM — {count} macOS VMs already running (Apple SLA permits max 2). Stop an existing VM with: tart stop <name>"`.
6. Implement `_wait_for_guest_agent(self, vm_name, timeout=120)`: poll `tart exec <vm> -- echo ok` every 2 seconds until success or timeout. Raise `RuntimeError` on timeout with message including VM name and timeout.
7. Implement `ensure_image(self, agent, force_rebuild=False)`:
   - Compute template name via `_template_name(agent)`
   - Check if template already exists in `_list_vms()` (by name match)
   - If exists and not force_rebuild, return template name
   - If force_rebuild and exists, delete old template first (`tart delete`)
   - Run `_check_vm_limit()` before starting any VM
   - Clone base image: `tart clone {base_image} {template}`
   - If dependencies_content is non-empty: start VM headless (`tart run {template} --no-graphics` as Popen), wait for guest agent, execute dependencies via `tart exec -i {template} -- bash -e` with dependencies as stdin, then stop VM (`tart stop {template}`) and wait for the Popen to finish
   - Return template name
8. Read dependencies content: when `create_sandbox_backend` creates a TartSandbox, it should read `.agent-loop/dependencies` if it exists and pass the content as `dependencies_content` in the config dict. Update `create_sandbox_backend` to handle this.
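
The running-VM count and the guest agent wait can be sketched as below. The `probe`, `sleep`, and `clock` parameters are hypothetical injection points added so the loop can be tested without a VM. The case-insensitive `State` comparison is also an addition; the spec compares against `"Running"` exactly. The `tart exec <vm> -- echo ok` probe is taken from the step description.

```python
import subprocess
import time

MAX_MACOS_VMS = 2


def running_vm_count(vms):
    """Count entries of the parsed `tart list` output that are running."""
    return sum(1 for vm in vms if str(vm.get("State", "")).lower() == "running")


def can_start_vm(vms, limit=MAX_MACOS_VMS):
    """True when starting one more VM stays within the limit."""
    return running_vm_count(vms) < limit


def guest_agent_ready(vm_name):
    """Probe the guest agent once; any non-zero exit counts as not ready."""
    result = subprocess.run(
        ["tart", "exec", vm_name, "--", "echo", "ok"],
        capture_output=True,
        text=True,
    )
    return result.returncode == 0


def wait_for_guest_agent(vm_name, timeout=120, interval=2,
                         probe=guest_agent_ready,
                         sleep=time.sleep, clock=time.monotonic):
    """Poll the guest agent until it answers or the timeout elapses."""
    deadline = clock() + timeout
    while True:
        if probe(vm_name):
            return
        if clock() >= deadline:
            raise RuntimeError(
                f"ralph: guest agent in VM {vm_name!r} "
                f"did not respond within {timeout}s"
            )
        sleep(interval)
```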

**Test:**
- `TestTartTemplateName`: deterministic hash, changes with base_image, changes with dependencies
- `TestTartListVms`: mock subprocess, test parse, test failure returns []
- `TestTartCheckVmLimit`: 0 running passes, 1 running passes, 2 running raises with clear message
- `TestTartWaitForGuestAgent`: succeeds on first try, succeeds after retries, times out raises RuntimeError
- `TestTartEnsureImage`: template exists returns cached, force_rebuild deletes and recreates, dependencies installed via tart exec, no dependencies skips install step

**Verify:** Run `pytest tests/test_ralph.py -v -k TestTart`. Fix any failures.

**Review:** Check that VM limit errors include actionable guidance. Check that the Popen for `tart run` is properly cleaned up in ensure_image (stop + wait).

**Address feedback:** Fix all findings. Re-run tests.

### Step 5: Implement TartSandbox — sandbox lifecycle and command execution [done]

**Files:**
- `scripts/ralph` — TartSandbox remaining methods
- `tests/test_ralph.py` — Unit tests

**Implement:**
1. `ensure_sandbox(self, agent, branch, worktree_path, **kwargs)`:
   - Generate name via `self.sandbox_name(agent, branch)`
   - Check if VM exists and is running → reuse
   - If exists but stopped → delete and recreate
   - Call `_check_vm_limit()` before starting
   - Clone from template: `tart clone {template} {name}`
   - Start headless with directory sharing: `tart run {name} --no-graphics --dir=workspace:{worktree_path}` as Popen, store in `self._vm_procs[name]`
   - Register atexit handler (once) to stop all VMs on exit
   - Wait for guest agent
   - Return name
2. `setup_git_config(self, sandbox_name, user, email)`: run `tart exec {name} -- git config --global user.name {user}`, same for email, same for `safe.directory *`.
3. `run_iteration(self, sandbox_name, spec_content, model, env_vars=None)`:
   - Write spec: `tart exec -i {name} -- tee /tmp/spec.md` with input=spec_content, stdout devnull
   - Build claude command with env vars: `cd '/Volumes/My Shared Files/workspace' && env KEY=VAL ... claude -p '...' --model {model} --dangerously-skip-permissions --effort high`
   - Execute: `tart exec {name} -- bash -c "{claude_cmd}"`
   - Read spec: `tart exec {name} -- cat /tmp/spec.md`
   - Return (exit_code, updated_spec)
4. `proxy_host(self)`: first try `tart exec {name} -- route -n get default` and parse gateway IP. Fallback to `ipconfig getifaddr en0` on the host. Final fallback `192.168.64.1`. Cache the result after first discovery.
5. `cleanup_sandbox(self, agent, branch)`: `tart stop {name}; tart delete {name}`. Remove from `_vm_procs`.
6. `remove_sandbox(self, name)`: same as cleanup but by name.
7. `prune_sandboxes(self, agent)`: list VMs with matching prefix, check if associated worktree path exists (derive from VM name), delete orphans.
8. `preflight_check(self, sandbox_name, agent, proxy_port)`: check token, proxy health, VM responsive (`tart exec echo ok`). Skip network isolation check (log note that Tart VMs don't have network isolation). Return list of failure messages.
9. For `exec_output`, `check_in_sync`, `reset_to_host`, `sync_to_host`: these Docker-specific methods on DockerSandbox use `docker sandbox exec`. TartSandbox needs equivalent implementations using `tart exec`. For `sync_to_host`, since Tart uses VirtioFS (shared directory), host and VM see the same files — commits in the VM are immediately visible on the host. So `sync_to_host` can verify the commit exists on host and return True. `check_in_sync` always returns True (shared filesystem). `reset_to_host` is a no-op (shared filesystem).
10. Update `create_sandbox_backend` to create `TartSandbox` for type "tart" (remove the NotImplementedError stub).
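
One way to build the `run_iteration` command without leaving room for shell injection is to quote every token with `shlex` and pass the result to `tart exec` as an argument list. This is a sketch under assumptions: the helper names and the environment variable name check are hypothetical, and the `claude` flags are copied from the step description rather than verified against the CLI.

```python
import re
import shlex

SHARED_DIR = "/Volumes/My Shared Files/workspace"
_ENV_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def build_claude_command(prompt, model, env_vars=None):
    """Return a bash command line that runs claude in the shared workspace."""
    env_args = []
    for key, value in (env_vars or {}).items():
        if not _ENV_NAME.fullmatch(key):
            raise ValueError(f"ralph: invalid environment variable name: {key!r}")
        env_args.append(f"{key}={value}")
    claude_argv = [
        "env", *env_args,
        "claude", "-p", prompt,
        "--model", model,
        "--dangerously-skip-permissions",
        "--effort", "high",
    ]
    # shlex.join quotes each token, so values containing spaces, quotes, or
    # semicolons reach `env` and `claude` as single arguments.
    return f"cd {shlex.quote(SHARED_DIR)} && {shlex.join(claude_argv)}"


def tart_exec_argv(vm_name, shell_command):
    """Argument list for subprocess; avoids a second layer of shell quoting."""
    return ["tart", "exec", vm_name, "--", "bash", "-c", shell_command]
```

Passing the list from `tart_exec_argv` to `subprocess.run` without `shell=True` means only the in-VM `bash -c` interprets the command string.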

**Test:**
- `TestTartEnsureSandbox`: creates new VM, reuses running VM, deletes stopped VM and recreates, VM limit check
- `TestTartSetupGitConfig`: correct tart exec commands
- `TestTartRunIteration`: writes spec, runs claude with env vars, reads spec back, handles write failure, handles read failure
- `TestTartProxyHost`: gateway parsing, fallback to en0, final fallback
- `TestTartCleanupSandbox`: stops and deletes
- `TestTartPruneSandboxes`: removes orphans, keeps active
- `TestTartPreflightCheck`: all pass, token missing, proxy down, VM unresponsive
- `TestTartSyncToHost`: returns True (shared filesystem)
- `TestTartCheckInSync`: returns True always
- Update `TestCreateSandboxBackend`: tart type now returns TartSandbox instance

**Verify:** Run `pytest tests/test_ralph.py -v`. ALL tests pass.

**Review:** Check that env vars in run_iteration are properly shell-escaped to prevent injection. Check that atexit cleanup handles the case where VMs are already stopped. Check that proxy_host caching works correctly.

**Address feedback:** Fix all findings. Re-run full test suite.

**Notes:**
- Moved `ITERATION_PROMPT` from `DockerSandbox` to `SandboxBackend` base class so both backends can reference it.
- Added `import atexit` and `import shlex` to script imports.
- `_atexit_registered` is a class-level flag; the actual atexit handler is a no-op safety net since `cleanup_sandbox` is the primary cleanup path.
- `prune_sandboxes` uses stopped-state as the orphan heuristic since Tart VMs don't store workspace metadata. Running VMs and templates are preserved.
- `create_sandbox_backend` already returned `TartSandbox` (no `NotImplementedError` stub to remove — that was handled in Step 4).

### Step 6: Wire up CLI and prerequisites [done]

**Files:**
- `scripts/ralph` — Update CLI, selftest, prerequisites

**Implement:**
1. Check for each backend's tools lazily rather than in `check_dependencies_prereq()`, because the sandbox type depends on the project and is not known when that function runs. Add a `check_prerequisites()` method to `SandboxBackend`. `DockerSandbox.check_prerequisites()` verifies `docker` is available. `TartSandbox.check_prerequisites()` verifies both `docker` (for the proxy) and `tart` are available.
2. Update `selftest()`: accept an optional `sandbox_type` parameter (default "docker"). When "tart", use TartSandbox-specific checks: build template, clone test VM, verify tart exec works, verify proxy reachable from VM via host IP, verify Claude auth via proxy, cleanup. The tart selftest still uses the Docker proxy (proxy stays unchanged).
3. Add `--type docker|tart` flag to the selftest subcommand parser in `main()`.
4. Update `prune-sandboxes` to detect sandbox type or accept `--type` flag. When type is "tart", use `TartSandbox.prune_sandboxes`. Default remains docker.
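
The per-backend prerequisite check can be illustrated with `shutil.which`. For brevity this sketch uses one flat function keyed by sandbox type instead of the `check_prerequisites()` methods described above; the `REQUIRED_TOOLS` table, the `which` parameter, and the message wording are assumptions.

```python
import shutil

# Tools each backend needs on PATH. The Tart backend still needs docker
# because the credential proxy keeps running as a Docker container.
REQUIRED_TOOLS = {
    "docker": ("docker",),
    "tart": ("docker", "tart"),
}


def check_prerequisites(sandbox_type, which=shutil.which):
    """Return a list of failure messages; an empty list means all tools exist."""
    failures = []
    for tool in REQUIRED_TOOLS[sandbox_type]:
        if which(tool) is None:
            failures.append(f"ralph: required tool not found on PATH: {tool}")
    return failures
```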

**Test:**
- `TestSelftest`: add test for `--type tart` routing
- `TestMainSandboxFlags`: test `--type` flag parsing
- Test prerequisite checks for TartSandbox

**Verify:** Run `pytest tests/test_ralph.py -v`. All tests pass.

**Review:** Check that the selftest cleanup always runs (even on failure). Check error messages when tart is not installed.

**Address feedback:** Fix all findings. Re-run tests.

**Notes:**
- `check_prerequisites()` added to `SandboxBackend` (abstract), `DockerSandbox` (checks docker), and `TartSandbox` (checks tart + docker).
- `selftest()` refactored into shared preamble (token, prerequisites, proxy) + `_selftest_docker()` and `_selftest_tart()` helper functions.
- Introduced `_SelftestAbort` exception to abort early from helper functions while preserving cleanup in the main `selftest()` finally block.
- Cleanup is now always attempted (remove_sandbox is idempotent), rather than conditionally based on `sandbox_created` flag.
- The prerequisites check adds one extra check to the selftest count (9 for docker without project image, 10 with).
- `--type` flag validates against `("docker", "tart")` with exit code 2 for unknown types.

### Step 7: Run all checks [done]

**Implement:**
1. Run the full test suite: `pytest tests/test_ralph.py -v`
2. Run shellcheck on any shell scripts that were modified
3. Run `python3 -c "import py_compile; py_compile.compile('scripts/ralph', doraise=True)"` to verify syntax
4. Fix any failures and commit the fixes

**Verify:** All checks pass clean.

### Step 8: Fix TartSandbox atexit cleanup with class-level VM tracking [done]

**Files:**
- `scripts/ralph` — Move `_vm_procs` to class variable, implement `_atexit_stop_all`
- `tests/test_ralph.py` — Tests for atexit behavior

**Implement:**
1. Change `_vm_procs` from an instance variable to a class variable: add `_vm_procs = {}` at class level on `TartSandbox`, and remove `self._vm_procs = {}` from `__init__`.
2. Implement `_atexit_stop_all`: iterate `TartSandbox._vm_procs.items()`, call `tart stop` for each VM (best-effort, suppress errors), then `proc.wait()` for each tracked Popen. Clear the dict afterward.
3. All existing code that reads/mutates `self._vm_procs` continues to work since Python resolves instance attribute reads to the class variable when no instance attribute shadows it, and dict mutations (subscript assignment, `.pop()`) mutate the class-level dict in place.
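
The class-level tracking and the exit handler can be sketched as follows. `track` and the injectable `run` parameter are hypothetical names added for the example; the 10-second wait matches the timeout recorded in the notes for this step.

```python
import subprocess


class TartSandbox:
    # Class-level dict: shared by every instance, so one exit handler can see
    # VMs started through any of them.
    _vm_procs = {}

    def track(self, name, proc):
        # In-place mutation through self still hits the class-level dict,
        # because no instance attribute named _vm_procs shadows it.
        self._vm_procs[name] = proc

    @classmethod
    def _atexit_stop_all(cls, run=subprocess.run):
        """Best-effort: stop every tracked VM, then wait for its process."""
        for name, proc in list(cls._vm_procs.items()):
            try:
                run(["tart", "stop", name], capture_output=True)
            except (OSError, subprocess.SubprocessError):
                pass  # the VM may already be stopped, or tart may be missing
            try:
                proc.wait(timeout=10)
            except subprocess.TimeoutExpired:
                pass
        cls._vm_procs.clear()
```

Registration would be a single `atexit.register(TartSandbox._atexit_stop_all)` guarded by the `_atexit_registered` flag. Assigning `self._vm_procs = {}` anywhere would create an instance attribute and hide that instance's VMs from the handler.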

**Test:**
- Test that `_atexit_stop_all` calls `tart stop` for each tracked VM and waits for each Popen process.
- Test that `_atexit_stop_all` handles empty `_vm_procs` gracefully.
- Test that `_atexit_stop_all` clears `_vm_procs` after cleanup.
- Test that `_vm_procs` is shared across instances (class-level).

**Verify:** Run `pytest tests/test_ralph.py -v`. Fix any failures.

**Review:** Ensure no code path reassigns `self._vm_procs = ...` (which would shadow the class variable). All mutations must be in-place (subscript, `.pop()`).

**Address feedback:** Fix findings, re-run tests.

**Notes:**
- `_vm_procs = {}` added as class variable on `TartSandbox`; `self._vm_procs = {}` removed from `__init__`.
- `_atexit_stop_all` now iterates `TartSandbox._vm_procs.items()`, calls `tart stop` for each VM (suppressing errors), waits for each Popen with a 10-second timeout, then clears the dict.
- Test classes that mutate `_vm_procs` (`TestTartEnsureSandbox`, `TestTartProxyHost`, `TestTartCleanupSandbox`, `TestTartRemoveSandbox`) now save/restore the class-level dict in `setup_method`/`teardown_method` to prevent cross-test pollution.

### Step 9: Fix config unpacking, interface consistency, and VM list caching [done]

**Files:**
- `scripts/ralph` — Multiple small fixes
- `tests/test_ralph.py` — Update affected tests

**Implement:**
1. **Config unpacking** (critical): In `process_issue`, pop `"type"` from the config dict before passing `**config` to `create_sandbox_backend`, so `type` is not passed twice (once as positional, once in kwargs). Also pop `"project_dir"` since it's added to config and consumed by `create_sandbox_backend` but not a valid `TartSandbox`/`DockerSandbox` constructor argument.
2. **Base class interface**: Add `check_in_sync`, `reset_to_host`, and `sync_to_host` to `SandboxBackend` as abstract methods (raising `NotImplementedError`), since both `DockerSandbox` and `TartSandbox` implement them and `process_issue` calls them polymorphically.
3. **Signature consistency**: Remove the unused `workdir` parameter from `DockerSandbox.run_iteration`. The base class signature is `run_iteration(self, sandbox_name, spec_content, model, env_vars=None)` and `process_issue` never passes `workdir`.
4. **Extract shared folder constant**: Add `SHARED_DIR = "/Volumes/My Shared Files/workspace"` as a class constant on `TartSandbox`. Use it in `run_iteration` where the path is currently hardcoded in the `cd` command.
5. **VM list caching**: Add a `_vm_list_cache` tuple of `(timestamp, result)` to `TartSandbox` (class-level). In `_list_vms`, return the cached result if it's less than 2 seconds old. This avoids redundant `tart list` subprocess calls when `_vm_state` and `_check_vm_limit` are called in quick succession within `ensure_sandbox`.
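
The TTL cache can be illustrated in isolation. The spec stores the `(timestamp, result)` tuple as a class variable on `TartSandbox`; the small helper class, the injected `fetch` and `clock` callables, and the `invalidate` method below are assumptions made to keep the example self-contained and testable.

```python
import time


class VmListCache:
    """Cache the result of an expensive listing call for a short TTL."""

    TTL_SECONDS = 2.0

    def __init__(self, fetch, clock=time.monotonic):
        self._fetch = fetch    # callable that performs the real `tart list`
        self._clock = clock    # monotonic clock, immune to wall-clock jumps
        self._entry = None     # (timestamp, result), or None when empty

    def get(self):
        now = self._clock()
        if self._entry is not None and now - self._entry[0] < self.TTL_SECONDS:
            return self._entry[1]
        result = self._fetch()
        self._entry = (now, result)
        return result

    def invalidate(self):
        """Drop the cached entry, e.g. right after cloning or deleting a VM."""
        self._entry = None
```

Calling `invalidate` after any operation that changes VM state is one way to address the review concern about a stale list masking real changes.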

**Test:**
- Verify `process_issue` no longer passes `type` in kwargs to `create_sandbox_backend`.
- Test that `SandboxBackend.check_in_sync`, `.reset_to_host`, `.sync_to_host` raise `NotImplementedError`.
- Verify `DockerSandbox.run_iteration` works without `workdir` parameter.
- Verify `TartSandbox.SHARED_DIR` is used in `run_iteration` output.
- Test VM list caching: two calls within 2 seconds hit cache, call after expiry fetches fresh.

**Verify:** Run `pytest tests/test_ralph.py -v`. All tests pass.

**Review:** Check that removing `workdir` from Docker doesn't break any callers (grep for `workdir`). Ensure cache TTL is short enough to not mask real VM state changes.

**Address feedback:** Fix findings, re-run tests.

**Notes:**
- Config unpacking: `type` is now popped from the config dict in `process_issue` before `**config` is passed to `create_sandbox_backend`. `project_dir` was already handled by the factory's tart branch (popped in `create_sandbox_backend`).
- Base class interface: `check_in_sync`, `reset_to_host`, `sync_to_host` added to `SandboxBackend` as abstract methods raising `NotImplementedError`.
- `DockerSandbox.run_iteration`: removed `workdir` parameter; `process_issue` no longer passes `workdir=work_dir` to `run_iteration`. Docker sandbox exec's `-w` flag is still used by `exec_output` and `check_in_sync` (separate methods).
- `TartSandbox.SHARED_DIR = "/Volumes/My Shared Files/workspace"` added as class constant, used in `run_iteration`.
- VM list cache: `_vm_list_cache` class variable as `(timestamp, result)` tuple with 2-second TTL using `time.monotonic()`. Tests save/restore the cache in setup/teardown.

### Step 10: Run all checks

**Implement:**
1. Run the full test suite: `pytest tests/test_ralph.py -v`
2. Run `python3 -c "import py_compile; py_compile.compile('scripts/ralph', doraise=True)"` to verify syntax
3. Fix any failures and commit the fixes

**Verify:** All checks pass clean.

---

## Conventions

- **Language:** Python 3 (stdlib only, no third-party dependencies)
- **Tests:** pytest with `unittest.mock` for subprocess mocking. Test classes named `TestXxx`, test methods `test_xxx`.
- **Error messages:** Prefix with `ralph:` (e.g., `ralph: cannot start VM — ...`)
- **Exit codes:** 0=success, 1=runtime error, 2=usage error
- **Imports:** `subprocess.run` for external commands, `json` for config parsing.
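
A test in this style might look like the sketch below. `list_vms` is a simplified stand-in for `TartSandbox._list_vms`, and the sample JSON fields are illustrative rather than the exact `tart list` schema.

```python
import json
import subprocess
from unittest import mock


def list_vms():
    """Simplified stand-in for TartSandbox._list_vms."""
    result = subprocess.run(
        ["tart", "list", "--format", "json"],
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        return []
    return json.loads(result.stdout)


class TestTartListVms:
    def test_parses_json(self):
        fake = subprocess.CompletedProcess(
            args=[], returncode=0,
            stdout='[{"Name": "vm1", "State": "running"}]', stderr="",
        )
        # Patch subprocess.run so no real tart binary is needed.
        with mock.patch("subprocess.run", return_value=fake) as run:
            assert list_vms() == [{"Name": "vm1", "State": "running"}]
        run.assert_called_once()

    def test_failure_returns_empty(self):
        fake = subprocess.CompletedProcess(args=[], returncode=1, stdout="", stderr="boom")
        with mock.patch("subprocess.run", return_value=fake):
            assert list_vms() == []
```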

