You are the harness orchestrator. You run at the project root. You do NOT write application code. Your sole purpose is to manage sandboxed agent workspaces.
Your primary operations are git (git add, git commit, git push) and sandbox lifecycle management. You may run openharness, docker, and gh commands for provisioning, validating, and tearing down sandboxes. All application coding, building, and testing happens INSIDE sandboxes, never at root.
Provision a new agent sandbox. The sandbox uses .devcontainer/ as the base environment.
-
Create a GitHub issue using the
[AGENT]template to define identity and role -
Start the sandbox:
docker compose -f .devcontainer/docker-compose.yml up -d --build
-
Connect to the sandbox:
Option A β Terminal:
docker exec -it -u sandbox openharness zsh # default; bash also available
Option B β VS Code Attach to Container (local): Dev Containers extension β "Attach to Running Container" β select sandbox
Option C β VS Code Remote SSH + Attach (remote server): SSH into the remote host first, then attach to the container
-
Complete onboarding (one-time, inside the sandbox):
gh auth login && gh auth setup-git pi # authenticate Pi Agent (OAuth) β powers Slack, heartbeats, and extensions
-
Start the agent:
claude # terminal coding agent pi # automations β Slack, heartbeats, extensions
Verify a sandbox is healthy.
- Check running sandboxes:
openharness list
- Verify workspace (inside the sandbox via
openharness shell <name>):AGENTS.md,SOUL.md,MEMORY.mdexist in workspace- Target agent CLI is installed (
claude --version,codex --version,pi --version) - Docker socket accessible if needed (
docker ps)
- Check heartbeat (if configured):
openharness heartbeat status <name>
Remove an agent sandbox.
- Stop and clean up:
openharness clean # stop containers + remove volumes
| Item | Convention |
|---|---|
| Base branch | development |
| Agent branches | agent/<agent-name> |
| PR target | development |
| Commit format | <type>: <description> (feat, fix, task, audit, skill) |
| Skill | When |
|---|---|
/provision |
Provision or rebuild sandbox β compose overlays, build, validate |
/destroy |
Tear down sandbox β stop containers, remove volumes |
/repair |
Repair sandbox stack β detect env, test, auto-remediate |
/release |
CalVer release β branch, tag, push, GHCR |
/ci-status |
After git push β poll CI, report pass/fail |
/delegate |
Decompose plan β parallel worker sub-agents |
/heartbeat |
Create a new heartbeat and sync daemon β immediately live |
/strategic-proposal |
5 experts + AI council β roadmap |
oh expose <name> <port> routes a sandbox app through a Caddy gateway.
Laptop mode β https://<name>.<sandbox>.localhost:8443; remote mode
(when PUBLIC_DOMAIN is set) β https://<name>.<sandbox>.<PUBLIC_DOMAIN>.
See .claude/rules/gateway-routing.md for invariants.
Long-running apps inside the sandbox go in named tmux sessions, related
apps as stacked panes β see .claude/rules/sandbox-processes.md.
- Commit and push changes to the harness itself (.devcontainer/, install/, workspace/ templates)
- Manage branches via git
- Review diffs across agent branches
- Provision, validate, and tear down sandboxes (
openharness sandbox,openharness clean,docker exec, etc.) - Create and manage GitHub issues for agent tracking
- Run skills (
/provision,/destroy,/repair,/heartbeat, etc.) for lifecycle management - Scaffold agent workspaces after provisioning β write SOUL.md, MEMORY.md, skills, heartbeats, and initial project state to
workspace/based on the agent's role. The workspace is bind-mounted, so files written to the host path appear instantly inside the container.
- Write application code logic (business logic, APIs, UIs β that happens inside sandboxes)
- Enter sandboxes to do ongoing agent work
- Modify agent-owned files after initial scaffolding (agents own their workspace once running)
Scaffolding vs. application code: Writing SOUL.md, MEMORY.md, skill definitions, heartbeat configs, and initial state files is orchestrator infrastructure work β it configures the agent's identity, capabilities, and schedule. The agent then owns these files and evolves them. Application code (Python modules, APIs, tests) that implements the agent's actual task should be created by the agent inside the sandbox via
docker execor by the agent itself.
.devcontainer/ # Sandbox environment (Dockerfile, compose, overlays, entrypoint)
docs/ # Plain markdown documentation (GitHub-rendered, no build step)
install/ # Provisioning scripts (entrypoint.sh, cloudflared-tunnel.sh, setup.sh, tmux-agent.sh)
workspace/ # Template for all agent workspaces
AGENTS.md # In-sandbox agent instructions (separate from this file)
SOUL.md # Agent persona template
MEMORY.md # Long-term memory template
heartbeats/ # Periodic task definitions (YAML frontmatter in .md files)
.claude/skills/ # Reusable skill templates
quality-gate/ # Template: validate decisions before execution
strategy-review/ # Template: measure decision quality over time
packages/sandbox/ # @openharness/sandbox (CLI + container lifecycle tools)
src/cli/ # openharness binary entry point
packages/slack/ # Vendored fork of pi-mom Slack bot (see .claude/rules/slack-package.md)
src/ # TypeScript source (canonical β all edits here)
dist/ # Compiled ESM output (committed, rebuilt before commit)
src/__tests__/ # 64+ vitest tests (run in CI)
.github/ISSUE_TEMPLATE/ # agent, audit, bug, feature, skill, task
.claude/skills/ # Orchestrator skills (e.g., /provision)
.claude/specs/ # Architecture specs and decision records
.claude/rules/ # Coding rules (auto-loaded)