branch: ralph-multi-agent
Spec: Multi-Agent Support for Ralph
Overview
Ralph is currently hardcoded to use Claude Code as its AI coding agent. This spec adds support for Cursor Agent CLI (cursor-agent) as an alternative backend, selectable via --agent cursor. The change introduces:
- An --agent <name> CLI flag (default: claude)
- A single Docker image with both agent CLIs installed
- Agent-specific CLI invocation in the container entrypoint
- Docker volume-based credential storage (no auth tokens as env vars or on host disk)
- Read-only host config mounts for agent rules/settings (symlinked into the agent's config dir)
- Auto-detection of auth failures with re-prompt
Architecture
ralph --agent cursor --issue 42
│
├─ 1. Auth check (Docker volume "ralph-<agent>-auth")
│ └─ Volume path: /home/ralph/.<agent>/
│ └─ If missing credential:
│ ├─ claude (macOS): extract from Keychain, pipe into volume
│ ├─ claude (Linux): extract from ~/.claude/.credentials.json, pipe into volume
│ ├─ claude (no creds): run `claude setup-token` on host, then extract
│ └─ cursor (any): run `cursor-agent login` on host, then extract
│
├─ 2. Docker image (single image, both CLIs)
│ └─ ralph:uid-<UID> (claude + cursor-agent both installed)
│
├─ 3. Container mounts
│ ├─ -v <worktree>:/work (source code)
│ ├─ -v ralph-<agent>-auth:/home/ralph/.<agent>/ (credentials, persistent)
│ ├─ -v ~/.claude/:/home/ralph/.claude-host/:ro (config/rules, read-only)
│ ├─ -v ~/.cursor/:/home/ralph/.cursor-host/:ro (config/rules, read-only)
│ ├─ -v ~/.gitconfig:/home/ralph/.gitconfig:ro (existing)
│ └─ -v ~/.ssh:/home/ralph/.ssh:ro (existing)
│
├─ 4. Entrypoint
│ ├─ Symlink config from .<agent>-host/ into .<agent>/ (skip credential files)
│ ├─ Git config setup (existing, shared)
│ ├─ Branch on $AGENT:
│ │ ├─ claude: claude -p --dangerously-skip-permissions --model $MODEL --reasoning-effort high
│ │ └─ cursor: cursor-agent -p --force --trust --sandbox disabled --model $MODEL
│ └─ HEAD tracking + optional push (existing, shared)
│
└─ 5. Auth failure recovery
└─ Capture container output, grep for auth patterns
└─ If auth failure: clear credential from volume, re-run auth flow, retry
Prompt text is identical for both agents — it's generic enough that both understand the one-task-per-iteration contract.
1. CLI Interface
New flag: --agent <name> (default: claude)
- Valid values: claude, cursor
- Invalid agent name → error with exit 2
Model defaults per agent (when --model not specified):
- claude → sonnet
- cursor → sonnet-4 (verify during implementation; cursor model names may differ)
The --model flag overrides the default with pass-through (no name translation).
No ralph auth subcommand — auth is fully automated on demand.
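The flag handling described above can be sketched as follows. This is a minimal POSIX sh illustration, not the code in scripts/ralph: the function name parse_agent_args is hypothetical, other flags are simply skipped, and the cursor default sonnet-4 is the unverified name given in this spec.

```shell
#!/bin/sh
# Sketch of --agent / --model parsing. parse_agent_args is a hypothetical
# name; AGENT, MODEL and MODEL_EXPLICIT follow the names used in this spec.

parse_agent_args() {
  AGENT=claude        # default agent
  MODEL=""            # filled in after parsing
  MODEL_EXPLICIT=0    # becomes 1 only when --model is given

  while [ $# -gt 0 ]; do
    case "$1" in
      --agent)
        [ $# -ge 2 ] || { echo "ralph: --agent requires a value" >&2; return 2; }
        AGENT=$2; shift 2 ;;
      --model)
        [ $# -ge 2 ] || { echo "ralph: --model requires a value" >&2; return 2; }
        MODEL=$2; MODEL_EXPLICIT=1; shift 2 ;;
      *)
        shift ;;      # other flags are out of scope for this sketch
    esac
  done

  # Validate the agent name; usage errors return status 2.
  case "$AGENT" in
    claude|cursor) ;;
    *) echo "ralph: unknown agent '$AGENT' (expected claude or cursor)" >&2
       return 2 ;;
  esac

  # Apply the per-agent default only when --model was not passed.
  if [ "$MODEL_EXPLICIT" -eq 0 ]; then
    case "$AGENT" in
      claude) MODEL=sonnet ;;
      cursor) MODEL=sonnet-4 ;;   # name flagged as unverified in the spec
    esac
  fi
}
```

Tracking MODEL_EXPLICIT separately from MODEL keeps the pass-through rule simple: an explicit --model value is never rewritten, whatever the agent.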
2. Credential Storage
Docker named volume: One volume per agent (ralph-claude-auth, ralph-cursor-auth), mounted at the agent's expected config path inside the container (/home/ralph/.claude/, /home/ralph/.cursor/).
Credentials never appear as:
- Environment variables in docker run -e
- Files on the host filesystem (outside Keychain / agent-managed config)
First-run auth flow (Claude):
- Check volume: docker run --rm -v ralph-claude-auth:/check alpine test -f /check/.credentials.json
- If missing, check host credential source:
  - macOS: security find-generic-password -s "Claude Code-credentials" -w → extract accessToken via jq
  - Linux: read ~/.claude/.credentials.json
- If no host credentials found, run claude setup-token interactively on the host (opens browser)
- Extract credential and pipe into volume via stdin (never as env var):
    printf '%s' "$cred_json" | docker run --rm -i -v ralph-claude-auth:/dest alpine \
      sh -c 'cat > /dest/.credentials.json && chmod 600 /dest/.credentials.json'
- Determine the exact .credentials.json format during implementation; it needs to match what claude expects to read natively
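A possible shape for the two volume helpers, generalized over the agent name. The helper names check_auth_volume and pipe_to_auth_volume appear in the Step 2 notes; the bodies here are an assumption built from the docker commands above, and auth_volume is a hypothetical convenience function.

```shell
#!/bin/sh
# Sketch only: bodies are assumed from the commands in this section.

# Volume name follows the ralph-<agent>-auth scheme.
auth_volume() { printf 'ralph-%s-auth' "$1"; }

# Returns 0 when the credential file already exists in the agent's volume.
# usage: check_auth_volume <agent> <credential-file>
check_auth_volume() {
  docker run --rm -v "$(auth_volume "$1"):/check" alpine \
    test -f "/check/$2"
}

# Reads the credential from stdin and writes it into the volume, mode 0600.
# The file name is passed as a positional argument to the inner sh, so the
# credential itself never appears in argv or in the environment.
# usage: ... | pipe_to_auth_volume <agent> <credential-file>
pipe_to_auth_volume() {
  docker run --rm -i -v "$(auth_volume "$1"):/dest" alpine \
    sh -c 'cat > "/dest/$1" && chmod 600 "/dest/$1"' sh "$2"
}
```

A caller would then write something like `printf '%s' "$cred_json" | pipe_to_auth_volume claude .credentials.json` after `check_auth_volume` reports a miss.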
First-run auth flow (Cursor):
- Check volume for credential file (exact path TBD; discover where cursor-agent login stores creds)
- If missing, check host (~/.cursor/ or ~/.config/Cursor/)
- If no host credentials, run cursor-agent login interactively on the host (opens browser)
- Extract and pipe into volume (same pattern as Claude)
Auth failure recovery:
- Capture container stdout/stderr
- On non-zero exit, grep output for auth-related patterns: unauthorized, invalid.*token, invalid.*key, authentication, 401, 403, please log in, etc.
- If auth failure detected: delete stored credential from volume, re-run auth flow, retry the container
- If non-auth failure: existing behavior (mark issue status:needs-attention)
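The pattern check can be sketched as a single case-insensitive extended grep over the captured output. The function name is_auth_failure matches the Step 4 notes; the body is an assumption and uses only the patterns listed in this section.

```shell
#!/bin/sh
# Sketch: returns 0 if the captured output looks like an auth failure.
# usage: is_auth_failure <captured-output-file>
is_auth_failure() {
  grep -Eiq \
    'unauthorized|invalid.*(token|key)|authentication|401|403|please log in' \
    "$1"
}
```

Bare numeric patterns such as 401 also match unrelated text that happens to contain those digits, so the list may need tightening if false positives show up.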
3. Host Config Mounts
Mount agent config directories read-only at alternate paths:
- ~/.claude/ → /home/ralph/.claude-host/:ro
- ~/.cursor/ → /home/ralph/.cursor-host/:ro
These provide agent rules, settings, and project conventions to the container without conflicting with the writable auth volume.
Entrypoint symlink logic:
- Symlink files from .<agent>-host/ into .<agent>/ (the volume-backed dir)
- Skip credential files (.credentials.json, auth tokens, etc.) to avoid overwriting volume-persisted credentials
- Run this on every container start to keep config fresh
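One way the symlink pass could look, as a hedged sketch. The function name symlink_agent_config and the per-agent skip files (.credentials.json for claude, auth.json for cursor) come from the Step 6 notes; the optional second argument for the home directory is a hypothetical addition that makes the function easy to exercise outside the container.

```shell
#!/bin/sh
# Sketch only. Variables are global because POSIX sh has no "local".
# usage: symlink_agent_config <agent> [home-dir]
symlink_agent_config() {
  agent=$1
  home=${2:-/home/ralph}
  host_dir="$home/.${agent}-host"
  dest_dir="$home/.${agent}"

  # Credential file that must never be shadowed by a host symlink.
  case "$agent" in
    claude) skip=.credentials.json ;;
    cursor) skip=auth.json ;;
    *)      skip= ;;
  esac

  [ -d "$host_dir" ] || return 0     # host config not mounted: no-op
  mkdir -p "$dest_dir"

  for src in "$host_dir"/* "$host_dir"/.[!.]*; do
    [ -e "$src" ] || continue        # unmatched glob stays literal; skip it
    name=${src##*/}
    if [ "$name" = "$skip" ]; then
      continue
    fi
    # Real files already in the volume win; only symlinks are refreshed.
    if [ -e "$dest_dir/$name" ] && [ ! -L "$dest_dir/$name" ]; then
      continue
    fi
    # -n keeps ln from descending into an existing symlink to a directory.
    ln -sfn "$src" "$dest_dir/$name"
  done
}
```

Only top-level entries are linked here; whether nested directories need per-file treatment is left open.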
4. Docker Image
Single image with both CLIs installed. Agent selected at runtime via $AGENT env var.
Dockerfile changes:
- Keep existing base: debian:bookworm-slim + git, curl, jq, ripgrep, fd-find, openssh-client, etc.
- Keep existing Claude install: nodejs, npm, npm install -g @anthropic-ai/claude-code
- Add Cursor install: curl https://cursor.com/install -fsSL | bash (verify this works in Docker build context; may need to run before USER ralph if installer requires root)
- Verify both claude and cursor-agent binaries are on PATH
Image tags unchanged: ralph:uid-<UID> or ralph:custom-<hash>
5. Entrypoint
Receives $AGENT env var (default: claude). Shared logic remains the same (git config, HEAD tracking, push). Agent-specific logic branches for CLI invocation.
Claude flags: --dangerously-skip-permissions --model $MODEL --reasoning-effort high
Cursor flags: --force --trust --sandbox disabled --model $MODEL
Prompt delivery: Both CLIs support -p (print/pipe mode). Use stdin heredoc. If cursor-agent doesn't support stdin in -p mode, fall back to passing prompt as a positional argument (verify during implementation).
The prompt text is identical for both agents.
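The branching can be sketched as one function that pipes the same prompt to whichever CLI is selected. The flags are copied verbatim from this spec and have not been checked against either CLI; run_agent is a hypothetical name, and stdin delivery for cursor-agent is the unverified assumption noted above.

```shell
#!/bin/sh
# Sketch only: reads $AGENT (default claude) and $MODEL from the environment.
# usage: run_agent <prompt-text>
run_agent() {
  case "${AGENT:-claude}" in
    claude)
      printf '%s\n' "$1" |
        claude -p --dangerously-skip-permissions \
          --model "$MODEL" --reasoning-effort high ;;
    cursor)
      printf '%s\n' "$1" |
        cursor-agent -p --force --trust --sandbox disabled \
          --model "$MODEL" ;;
    *)
      # Defensive: ralph validates the agent name before starting the container.
      echo "ralph: unknown agent '${AGENT}'" >&2
      return 1 ;;
  esac
}
```

If cursor-agent turns out not to read the prompt from stdin in -p mode, only the cursor branch changes: the prompt would be passed as a trailing positional argument instead of through the pipe.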
Implementation Plan
Each step follows this structure:
- Implement: Write the code
- Test: Write BATS tests
- Verify: Run tests, fix failures until all pass
- Review: Code review for bugs, edge cases, and conventions
- Address feedback: Fix review findings, re-run tests, re-review until clean
- Update spec: Mark the step [done] and record any decisions or deviations
Spec maintenance rules
- Mark each step [done] when complete.
- Record design decisions that emerged during implementation as notes under the step.
- Minor deviations (e.g. flag name changes, reordered logic) should be noted and the spec updated to match.
- Significant design changes (e.g. new subcommands, changed architecture, removed features) require pausing for user review before proceeding.
Step 1: Add --agent flag and agent-specific model defaults [done]
Files:
- scripts/ralph: Add --agent to arg parser, set per-agent model default
Implement:
- Add AGENT=claude to defaults section
- Add MODEL_EXPLICIT=0 flag to track whether --model was explicitly set
- Add --agent) case to the while arg parser
- After parsing, validate agent name (claude or cursor); error exit 2 on invalid
- If MODEL_EXPLICIT=0, set MODEL based on agent:
  - claude → sonnet
  - cursor → sonnet-4 (verify correct name)
- Update usage comment header to document --agent
Test:
- Default agent is claude, default model is sonnet
- --agent cursor sets model default to sonnet-4
- --agent cursor --model gpt-5 overrides to gpt-5
- --agent invalid → exit 2
Verify: Run tests. ralph --help shows new flag.
Review: Backwards compatibility — all existing usage unchanged.
Step 2: Implement Docker volume auth for Claude [done]
Files:
- scripts/ralph: Replace hardcoded Keychain extraction with volume-based auth
Implement:
- Create ensure_auth_volume() function that:
  - Takes agent name as argument
  - Checks if credential exists in the named volume (ralph-<agent>-auth)
  - Returns 0 if exists, 1 if not
- Create setup_claude_auth() function that:
  - Checks host for existing credentials:
    - macOS: Keychain Claude Code-credentials → jq extract accessToken
    - Linux: ~/.claude/.credentials.json
  - If no host credentials: run claude setup-token interactively, then re-check
  - Pipe credential into volume via stdin (never env var, never host disk)
- Replace the hardcoded Keychain block (lines 140-147) with volume auth check + setup
- Remove CLAUDE_CODE_OAUTH_TOKEN from docker run env vars
Test:
- Volume check correctly detects missing/present credentials
- macOS Keychain extraction still works
- Error messages guide user correctly
Verify: Run with Claude, verify no auth env vars in docker run.
Review: No credentials leaked via env vars, logs, or host filesystem.
Notes:
- check_auth_volume() and pipe_to_auth_volume() are generic helpers taking agent name and credential file as args
- setup_claude_auth() tries: (1) macOS Keychain, (2) ~/.claude/.credentials.json, (3) interactive claude setup-token
- ensure_auth() dispatches to the right setup function based on agent; cursor placeholder returns error until Step 3
- CLAUDE_CODE_OAUTH_TOKEN env var completely removed from docker run; credentials now live only in Docker volumes
- BATS tests added for: volume check skip, volume check miss + setup, no env var leak, Keychain extraction, filesystem fallback, missing CLI error
Step 3: Implement Docker volume auth for Cursor [done]
Files:
- scripts/ralph: Add Cursor auth alongside Claude auth
Implement:
- Create setup_cursor_auth() function that:
  - Checks host for existing credentials (discover where cursor-agent login stores them)
  - If no host credentials: run cursor-agent login interactively
  - Pipe credential into volume
- Create ensure_auth() dispatcher that calls the right setup function based on $AGENT
- Determine the exact credential file path and format for Cursor (implementation discovery)
Test:
- Cursor auth flow prompts correctly when no credentials exist
- Credential stored in volume and persists across runs
Verify: Run with --agent cursor, verify auth flow works end-to-end.
Review: Same security properties as Claude auth — no env vars, no host disk.
Notes:
- setup_cursor_auth() checks two host paths: ~/.cursor/auth.json and ~/.config/Cursor/auth.json
- Falls back to interactive cursor-agent login if no host credentials found
- Credential file is auth.json (stored in ralph-cursor-auth Docker volume)
- ensure_auth() already existed from Step 2; cursor case updated from placeholder to real implementation
- BATS tests added for: volume check skip, volume check miss + setup, volume name verification, alternate config path, missing CLI error
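The host lookup in these notes amounts to probing two paths in order. A minimal sketch, assuming a hypothetical helper name find_cursor_host_credential; the two paths are the ones recorded above.

```shell
#!/bin/sh
# Sketch: prints the first existing Cursor credential path, else returns 1.
find_cursor_host_credential() {
  for path in "$HOME/.cursor/auth.json" "$HOME/.config/Cursor/auth.json"; do
    if [ -f "$path" ]; then
      printf '%s\n' "$path"
      return 0
    fi
  done
  return 1
}
```

A non-zero return is the signal for setup_cursor_auth() to fall back to the interactive cursor-agent login.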
Step 4: Add auth failure detection and re-prompt [done]
Files:
- scripts/ralph: Add output capture and auth failure detection to container run
Implement:
- Capture container stdout/stderr to a variable or temp file
- On non-zero exit, check output against auth failure patterns:
  - Case-insensitive grep for: unauthorized, invalid.*(token|key|credential), authentication, 401, 403, please log in, expired
- If auth failure detected:
  - Delete credential from volume: docker run --rm -v ralph-<agent>-auth:/auth alpine rm -f /auth/<credential-file>
  - Re-run auth setup flow
  - Retry the container (limit to 1 retry to avoid infinite loops)
- If non-auth failure: existing behavior (status:needs-attention)
Test:
- Auth failure pattern detection works for known error strings
- Non-auth failures are NOT treated as auth failures
- Retry limit prevents infinite loops
Verify: Simulate auth failure, verify re-prompt and retry behavior.
Review: Pattern matching is broad enough to catch real auth errors but not false-positive on unrelated errors.
Notes:
- Added three helper functions: agent_credential_file() (returns credential filename per agent), is_auth_failure() (grep-based pattern matching on captured output), clear_auth_volume() (removes credential file from Docker volume)
- Container output captured via tee to a temp file while still displaying to stdout
- pipestatus[1] (zsh) used to get docker exit code from the pipe
- Auth retry limited to 1 attempt via auth_retried flag
- On auth failure: clear volume → re-run ensure_auth → retry container
- On retry failure (auth or not): falls through to existing needs-attention labeling
- Pattern matches: unauthorized, invalid.*(token|key|credential), authentication failed, 401, 403, please log in, expired.*token, token.*expired
- Updated existing "needs-attention on container failure" test to handle auth volume check in docker stub
- BATS tests added for: auth failure retry, non-auth failure passthrough, retry limit, pattern matching, volume clearing (claude and cursor)
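The retry-once flow in these notes can be sketched as a wrapper around the container run. This is an illustration, not the script's code: run_with_auth_retry is a hypothetical name, and it expects run_container, is_auth_failure, clear_auth_volume and ensure_auth to be defined elsewhere, each taking the arguments shown. To stay within POSIX sh it writes output to a temp file and replays it afterwards, whereas the notes describe streaming through tee and reading zsh's pipestatus.

```shell
#!/bin/sh
# Sketch only. Expects these functions to exist in the calling script:
#   run_container <agent>, is_auth_failure <file>,
#   clear_auth_volume <agent>, ensure_auth <agent>
# usage: run_with_auth_retry <agent>
run_with_auth_retry() {
  agent=$1
  log=$(mktemp) || return 1
  auth_retried=0            # allows exactly one auth-triggered retry

  while :; do
    rc=0
    run_container "$agent" >"$log" 2>&1 || rc=$?
    cat "$log"              # replay captured output for the user

    if [ "$rc" -eq 0 ]; then
      break
    fi
    if [ "$auth_retried" -eq 0 ] && is_auth_failure "$log"; then
      auth_retried=1
      clear_auth_volume "$agent"
      ensure_auth "$agent" || break   # re-auth failed: keep the container's rc
      continue
    fi
    break                   # non-auth failure, or the retry already happened
  done

  rm -f "$log"
  return "$rc"
}
```

A non-zero return after the loop is where the existing needs-attention labeling would take over, whether or not the failure was auth-related.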
Step 5: Update Dockerfile to install both CLIs [done]
Files:
- docker/ralph/Dockerfile: Add Cursor agent CLI installation
Implement:
- Keep existing Claude install: npm install -g @anthropic-ai/claude-code
- Add Cursor install: curl https://cursor.com/install -fsSL | bash
  - Verify this works in Docker build context
  - May need to run before USER ralph if installer requires root
  - Verify cursor-agent binary is on PATH after install
- Verify both claude and cursor-agent are functional in the built image
Test:
- docker run <image> which claude → found
- docker run <image> which cursor-agent → found
- docker run <image> claude --version → works
- docker run <image> cursor-agent --version → works
Verify: Build image, run verification commands.
Review: Image size acceptable, both CLIs work, no conflicts.
Notes:
- Cursor installer runs as root (before USER ralph) since it installs to system paths
- command -v cursor-agent verification ensures the build fails fast if the installer doesn't place the binary on PATH
- Both CLI installs are separate RUN layers for better Docker cache behavior
Step 6: Update entrypoint for agent-specific invocation and config symlinks [done]
Files:
- docker/ralph/entrypoint.sh: Add agent branching, config symlink logic
Implement:
- Add config symlink step at the top of entrypoint:
  - For the active agent, symlink files from .<agent>-host/ into .<agent>/
  - Skip credential files (.credentials.json, auth tokens) to preserve volume-stored credentials
  - Handle case where host mount doesn't exist (no-op)
- Build prompt text into a variable (extract from current heredoc)
- Branch on ${AGENT:-claude}:
  - claude: pipe prompt to claude -p --dangerously-skip-permissions --model "$MODEL" --reasoning-effort high
  - cursor: pipe prompt to cursor-agent -p --force --trust --sandbox disabled --model "$MODEL"
- If cursor-agent doesn't support stdin in -p mode, fall back to positional arg
- Keep shared logic unchanged: git config, HEAD tracking, push
Test:
- Config files are correctly symlinked from host mount
- Credential files are NOT overwritten by symlinks
- Claude invocation matches current behavior
- Cursor invocation uses correct flags
Verify: Run container with each agent, verify config symlinks and CLI invocation.
Review: Symlink logic is safe (no overwrite of credentials), prompt identical for both.
Notes:
- symlink_agent_config() function iterates files in .<agent>-host/, skips credential files per agent, and creates symlinks in .<agent>/
- Skip patterns: .credentials.json for claude, auth.json for cursor
- Existing non-symlink files in the agent dir are preserved (protects volume-persisted data)
- Existing symlinks are updated via ln -sf to keep config fresh on each run
- $AGENT defaults to claude when unset (${AGENT:-claude})
- Prompt text extracted into $PROMPT_TEXT variable, piped to whichever agent CLI is selected
- Claude flags: -p --dangerously-skip-permissions --model $MODEL --reasoning-effort high
- Cursor flags: -p --force --trust --sandbox disabled --model $MODEL
- Unknown agent values cause exit 1 (defensive; should never happen since ralph validates)
- BATS tests cover: both agents invoked with correct flags, config symlinks, credential file preservation, missing host dir, symlink updates, default agent
Step 7: Update Docker run in ralph for volume and config mounts [done]
Files:
- scripts/ralph: Update docker run in process_issue for new mount scheme
Implement:
- Add auth volume mount: -v ralph-<agent>-auth:/home/ralph/.<agent>/
- Add host config mounts (read-only):
  - -v "$HOME/.claude:/home/ralph/.claude-host:ro" (if dir exists)
  - -v "$HOME/.cursor:/home/ralph/.cursor-host:ro" (if dir exists)
- Add -e "AGENT=$AGENT" to docker run
- Remove -e "CLAUDE_CODE_OAUTH_TOKEN=$OAUTH_TOKEN" (no longer needed)
- Keep existing mounts: worktree, git dir, gitconfig, ssh, spec file
Test:
- Default --agent claude mounts correct volume and config
- --agent cursor mounts correct volume and config
- No auth env vars in docker run command
- Existing mounts (worktree, git, ssh) unchanged
Verify: Run with each agent, inspect docker run args.
Review: No hardcoded Claude references remain. Volume names are agent-specific.
Notes:
- run_container() updated with three new mount/env additions: auth volume, host config mounts, and AGENT env var
- Auth volume: -v ralph-<agent>-auth:/home/ralph/.<agent>/ (uses agent name for both volume name and mount path)
- Host config mounts: conditionally added only when ~/.claude or ~/.cursor directories exist on host, mounted read-only at .<agent>-host
- config_mounts local array built dynamically based on directory existence checks
- -e AGENT=$AGENT passes the selected agent to the entrypoint for CLI branching
- No CLAUDE_CODE_OAUTH_TOKEN or any auth env vars in docker run (confirmed removed)
- Existing mounts (worktree, git dir, gitconfig, ssh, spec file) unchanged
- BATS tests added for: auth volume mount (claude + cursor), AGENT env var (claude + cursor), host config read-only mount, skip missing config dirs, no auth env vars, existing mounts preserved
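A reduced sketch of how the docker run arguments could be assembled. It is an assumption, not the script's run_container(): the three-argument signature is hypothetical, the positional parameters stand in for the zsh config_mounts array so the example stays POSIX sh, and the unchanged mounts (git dir, gitconfig, ssh, spec file) are omitted.

```shell
#!/bin/sh
# Sketch only; the signature is hypothetical.
# usage: run_container <agent> <worktree> <image>
run_container() {
  agent=$1 worktree=$2 image=$3

  set --                                    # rebuild "$@" as the arg list
  set -- "$@" -v "${worktree}:/work"
  set -- "$@" -v "ralph-${agent}-auth:/home/ralph/.${agent}/"

  # Host config dirs are mounted read-only, and only when they exist.
  for name in claude cursor; do
    if [ -d "$HOME/.$name" ]; then
      set -- "$@" -v "$HOME/.$name:/home/ralph/.$name-host:ro"
    fi
  done

  # The agent name is the only auth-adjacent value passed via -e.
  set -- "$@" -e "AGENT=${agent}"

  docker run "$@" "$image"
}
```

Building the list incrementally keeps each optional mount to one conditional and makes the final command easy to assert on in a BATS test with a stubbed docker.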
Step 8: Update CLAUDE.md documentation [done]
Files:
- CLAUDE.md: Update ralph section with new CLI options
Implement:
- Add --agent <name> to the Options table (default: claude, also: cursor)
- Update architecture notes about Docker volume auth
- Add examples showing cursor usage
Test:
- Documentation accurately reflects implementation
Verify: Read and verify accuracy.
Step 9: Run all checks [done]
Implement:
- Run shellcheck on scripts/ralph and docker/ralph/entrypoint.sh
- Run zsh -n scripts/ralph
- Run bash -n docker/ralph/entrypoint.sh
- Run BATS tests if any exist for ralph
- Fix any failures
Verify: All checks pass clean.
Notes:
- docker/ralph/entrypoint.sh: shellcheck clean, bash syntax check clean, all 26 BATS tests pass
- scripts/ralph: Fixed three legitimate shellcheck findings (SC2053: unquoted == RHS on lines 585/626, SC2295: unquoted expansion in ${..} on line 346). Remaining warnings are zsh idioms that shellcheck doesn't understand ($match[], $pipestatus, local at top level, quoted regex in =~)
- zsh -n and test_ralph.bats could not run in this container (no zsh installed); these tests require the host environment
Step 10: Create commit
Implement:
- Stage all changes and create a commit: "Add multi-agent support to ralph (Claude + Cursor)"
Verify: git log -1 shows the commit.
Conventions
- Language: zsh for scripts/ralph, bash for docker/ralph/entrypoint.sh
- Tests: BATS with temp directories for isolation
- Error messages: Prefix with ralph:
- Exit codes: 0=success, 1=runtime error, 2=usage error