A distributed worker coordination system that uses tmux as the execution surface and Redis as the control plane.
PaneBus lets you dispatch commands to workers running in tmux panes, with automatic health monitoring, crash recovery, and reliable message delivery.
The Problem: You need to run multiple long-lived worker processes (AI agents, background jobs, processing pipelines) and coordinate them reliably. Traditional approaches require complex container orchestration or custom process management.
The Solution: PaneBus treats tmux panes as lightweight execution containers. Redis provides the coordination layer. You get:
- Reliable message delivery via Redis BLMOVE (at-least-once semantics)
- Automatic health monitoring with TTL-based heartbeats
- Crash recovery with intelligent respawn
- Crash loop protection to prevent runaway respawns
- Dead letter queue for failed commands
- Simple CLI for management and dispatch
┌─────────────────────────────────────────────────────────────┐
│                            Redis                            │
│  ┌─────────┐  ┌─────────┐  ┌─────────┐  ┌─────────────────┐ │
│  │cmd:%1   │  │cmd:%2   │  │hb:%1    │  │panes:active     │ │
│  │(queue)  │  │(queue)  │  │(TTL key)│  │(set of pane ids)│ │
│  └─────────┘  └─────────┘  └─────────┘  └─────────────────┘ │
└─────────────────────────────────────────────────────────────┘
        │            │               ▲
        │            │               │ heartbeat
        ▼            ▼               │
 ┌──────────────┐  ┌──────────────┐  │
 │  Worker %1   │  │  Worker %2   │──┘
 │ (tmux pane)  │  │ (tmux pane)  │
 └──────────────┘  └──────────────┘
        ▲                 ▲
        │ respawn         │ health check
        │                 │
 ┌──────────────────────────────────┐
 │             Watchdog             │
 │  - monitors heartbeats           │
 │  - respawns crashed workers      │
 │  - quarantines crash-loopers     │
 └──────────────────────────────────┘
| Component | Role |
|---|---|
| Worker | Runs in a tmux pane, pulls commands from its Redis queue, sends heartbeats |
| Watchdog | Monitors all workers, respawns crashed ones, manages crash loop protection |
| Admin | Dispatches commands, queries status, manages the dead letter queue |
| CLI | Command-line interface for all operations |
# Clone the repository
git clone https://github.com/yourusername/pane-bus.git
cd pane-bus
# Install with pip (editable mode for development)
pip install -e .
# Or install dependencies directly
pip install -r requirements.txt

- Python 3.11+
- Redis 6.2+ (for BLMOVE support)
- tmux 3.0+ (workers must run inside tmux panes)
# Using Docker
docker run -d --name redis -p 6379:6379 redis:7-alpine
# Or install locally
brew install redis && brew services start redis

Important: Workers must run inside tmux panes. The worker automatically detects its pane ID from tmux (e.g., %0, %1). Running outside tmux generates a random standalone ID which is harder to work with.
# Create a new tmux session
tmux new-session -d -s panebus -n main
# Split into panes and start workers
tmux split-window -t panebus:main
tmux send-keys -t panebus:main.0 'panebus worker --role processor' Enter
tmux send-keys -t panebus:main.1 'panebus worker --role processor' Enter
# Attach to see the workers
tmux attach -t panebus

Or use the bootstrap script:
./scripts/bootstrap.sh

panebus watchdog

# Simple natural language task dispatch
panebus ask Fix the failing tests
# With specific working directory
panebus ask -d /path/to/project Add user authentication
# Fire and forget (don't wait for completion)
panebus ask --no-wait Refactor the payment module
# Verbose mode (show detailed logs)
panebus ask -v Investigate the codebase

The ask command automatically orchestrates the full workflow:
- Review - Analyze the codebase
- Plan - Create implementation plan
- Work - Execute the plan
- Compound - Document learnings (if valuable)
# Ping a specific worker
panebus dispatch %1 ping
# Send a prompt to a worker
panebus dispatch %1 prompt --payload '{"text": "Hello, agent!"}'
# Broadcast to all workers
panebus broadcast ping
# Dispatch to all workers with a specific role
panebus dispatch-role processor ping

# System overview
panebus status
# List all panes
panebus panes
# Check the dead letter queue
panebus dlq list
# View recent events
panebus events

For orchestrated multi-agent workflows with visual debugging:
# Set up tmux session with 5 specialized panes
panebus workflow-setup
# This creates:
# Pane 0: Orchestrator - coordinates workflow phases
# Pane 1: Planner - creates implementation plans
# Pane 2: Reviewer - analyzes codebase
# Pane 3: Worker - executes the plan
# Pane 4: Compounder - documents learnings
# Then from another terminal, send tasks:
panebus ask Fix the authentication bug
# Tear down when done
panebus workflow-teardown

Note: For simple tasks, you may not need the full multi-agent setup. Consider using Claude Code directly:
claude -p "Fix the authentication bug"

PaneBus is configured via environment variables:
| Variable | Default | Description |
|---|---|---|
| PANEBUS_REDIS_URL | redis://localhost:6379/0 | Redis connection URL |
| PANEBUS_HEARTBEAT_INTERVAL | 10 | Seconds between heartbeats |
| PANEBUS_HEARTBEAT_TTL | 60 | Seconds before a pane is considered dead |
| PANEBUS_COMMAND_TIMEOUT | 300 | Default command timeout in seconds |
| PANEBUS_MAX_QUEUE_DEPTH | 100 | Maximum commands per queue |
| PANEBUS_RESPAWN_LIMIT | 5 | Respawns before quarantine |
| PANEBUS_RESPAWN_WINDOW | 300 | Window for counting respawns (seconds) |
| PANEBUS_LOG_LEVEL | INFO | Logging level |
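
To illustrate how these variables and their defaults fit together, here is a minimal sketch that reads them from the environment into a frozen dataclass. The `Settings` class and `load_settings` function are hypothetical names and are not taken from PaneBus's own `config.py`; only the variable names and defaults come from the table. The check that the TTL exceeds the heartbeat interval is also an assumption, added because a shorter TTL would make every worker look dead between heartbeats.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    """Hypothetical settings container; defaults mirror the table above."""

    redis_url: str = "redis://localhost:6379/0"
    heartbeat_interval: int = 10
    heartbeat_ttl: int = 60
    command_timeout: int = 300
    max_queue_depth: int = 100
    respawn_limit: int = 5
    respawn_window: int = 300
    log_level: str = "INFO"


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from PANEBUS_* variables, falling back to the defaults."""
    defaults = Settings()
    overrides = {}
    for name, default in vars(defaults).items():
        raw = env.get(f"PANEBUS_{name.upper()}")
        if raw is None:
            continue
        # Integer-valued settings are parsed; string settings are kept as-is.
        overrides[name] = int(raw) if isinstance(default, int) else raw
    settings = Settings(**overrides)
    # Assumed sanity check (not confirmed by the source).
    if settings.heartbeat_ttl <= settings.heartbeat_interval:
        raise ValueError(
            "PANEBUS_HEARTBEAT_TTL must exceed PANEBUS_HEARTBEAT_INTERVAL"
        )
    return settings
```

Calling `load_settings()` with no arguments reads the live process environment; passing a plain dict makes the function easy to exercise in tests.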
PANEBUS_REDIS_URL=redis://localhost:6379/0
PANEBUS_HEARTBEAT_INTERVAL=10
PANEBUS_HEARTBEAT_TTL=60
PANEBUS_LOG_LEVEL=DEBUG

# Send a natural language task (recommended)
panebus ask TASK...
# Options:
# --wait/--no-wait Wait for workflow completion (default: wait)
# --timeout N Timeout in seconds (default: 600)
# -d, --working-dir Working directory (default: current)
# -v, --verbose Show detailed logs
# Examples:
panebus ask Fix the failing tests
panebus ask -d /path/to/project Add user authentication
panebus ask --no-wait Refactor the payment module

# Set up multi-agent tmux session
panebus workflow-setup [--session NAME] [--no-attach]
# Tear down session
panebus workflow-teardown [SESSION]
# Check workflow status
panebus workflow-status WORKFLOW_ID [--json]
# List workflows
panebus workflows [--count N] [--active] [--json]

# Start a worker inside a tmux pane (auto-detects pane ID)
panebus worker --role processor
# Options:
# --role Worker role for routing (default: worker)

Note: Run workers inside tmux panes. The pane ID (e.g., %0) is auto-detected and used for command routing. Running outside tmux generates a random standalone-* ID.
# Start the watchdog
panebus watchdog
# Options:
# --check-interval Seconds between health checks (default: 30)

# System status
panebus status
# List panes (optionally filter by role)
panebus panes [--role ROLE]
# Dispatch a command to a specific pane
panebus dispatch PANE_ID COMMAND_TYPE [--payload JSON] [--timeout MS]
# Dispatch to all panes with a role
panebus dispatch-role ROLE COMMAND_TYPE [--payload JSON]
# Broadcast to all panes
panebus broadcast COMMAND_TYPE [--payload JSON]
# Dead letter queue management
panebus dlq list [--limit N]
panebus dlq purge
# View events
panebus events [--count N]

| Type | Description | Payload |
|---|---|---|
| ping | Health check | None |
| prompt | Send text to Claude Code | {"text": "...", "context": {"working_dir": "..."}} |
| shutdown | Graceful shutdown | None |
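
Hand-writing the JSON payload for a prompt command is error-prone once the text contains quotes. The sketch below builds the payload shape from the table with `json.dumps` and shell-quotes the resulting command line with `shlex.join`. The helper names `prompt_payload` and `dispatch_command` are hypothetical and exist only for this illustration; the `panebus dispatch ... --payload` form is the one documented in this README.

```python
import json
import shlex
from typing import Optional


def prompt_payload(text: str, working_dir: Optional[str] = None) -> str:
    """Serialize a prompt payload: {"text": ..., "context": {"working_dir": ...}}."""
    payload: dict = {"text": text}
    if working_dir is not None:
        payload["context"] = {"working_dir": working_dir}
    return json.dumps(payload)


def dispatch_command(pane_id: str, payload: str) -> str:
    """Return a shell-safe command line for dispatching a prompt to one pane."""
    argv = ["panebus", "dispatch", pane_id, "prompt", "--payload", payload]
    return shlex.join(argv)


if __name__ == "__main__":
    payload = prompt_payload("Fix the failing tests", working_dir="/path/to/project")
    print(dispatch_command("%1", payload))
```

Because the quoting is done by `shlex.join`, the printed line can be pasted into a POSIX shell without further escaping.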
PaneBus workers invoke Claude Code CLI for prompt commands:
# Install Claude Code CLI
npm install -g @anthropic-ai/claude-code
# Install plugins (optional but recommended)
claude /install-plugin https://github.com/EveryInc/compound-engineering-plugin

When a worker receives a prompt command, it runs:
claude --dangerously-skip-permissions -p "your prompt text"

The --dangerously-skip-permissions flag enables autonomous execution without permission prompts.
Example dispatch with working directory:
panebus dispatch %1 prompt --payload '{"text": "Fix the failing tests", "context": {"working_dir": "/path/to/project"}}'

Override the handle_command method in your worker subclass:
from panebus.worker import Worker
from panebus.schema import Command


class MyWorker(Worker):
    def handle_command(self, cmd: Command) -> None:
        if cmd.type == "prompt":
            text = cmd.payload.get("text", "")
            # Process the prompt...
            self.log.info("processed_prompt", text=text[:50])
        elif cmd.type == "custom_action":
            # Handle custom command type
            pass
        else:
            super().handle_command(cmd)

PaneBus uses Redis's BLMOVE command for reliable queue processing:
- Commands are pushed to a pane's queue (cmd:%pane_id)
- Worker atomically moves command to processing list (processing:%pane_id)
- Worker executes the command
- Worker acknowledges by removing from processing list
- If worker crashes, watchdog can recover unacknowledged commands
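
The steps above can be sketched with redis-py roughly as follows. This is an illustration of the general BLMOVE pattern, not PaneBus's actual worker code: `process_next` and `recover_unacked` are hypothetical names, `client` is assumed to be a `redis.Redis` instance, and the sketch assumes producers `LPUSH` onto the queue so that popping from the right gives FIFO order.

```python
from typing import Callable, Optional


def process_next(
    client,
    pane_id: str,
    handler: Callable[[str], None],
    timeout: int = 5,
) -> Optional[str]:
    """Pull one command, run it, and acknowledge it. Returns None on timeout."""
    queue = f"cmd:{pane_id}"
    processing = f"processing:{pane_id}"
    # BLMOVE pops from the queue and pushes onto the processing list atomically,
    # so the command is never "in flight" outside Redis.
    raw = client.blmove(queue, processing, timeout, src="RIGHT", dest="LEFT")
    if raw is None:
        return None
    # If the handler raises (or the process dies), the command stays in the
    # processing list; this is what gives at-least-once delivery.
    handler(raw)
    # Acknowledge: remove one occurrence of the command from the processing list.
    client.lrem(processing, 1, raw)
    return raw


def recover_unacked(client, pane_id: str) -> int:
    """Move unacknowledged commands back to the queue; returns how many moved."""
    queue = f"cmd:{pane_id}"
    processing = f"processing:{pane_id}"
    moved = 0
    # Moving newest-first to the consuming end leaves the oldest command next in line.
    while client.lmove(processing, queue, src="LEFT", dest="RIGHT") is not None:
        moved += 1
    return moved
```

Since a recovered command may already have been partially executed before the crash, handlers in a design like this should be safe to run more than once.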
Workers send heartbeats every 10 seconds (configurable):
- Worker sets a Redis key (hb:%pane_id) with 60-second TTL
- Watchdog checks if heartbeat key exists
- Missing heartbeat = dead worker
- Watchdog respawns using tmux respawn-pane
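
A minimal sketch of the heartbeat cycle described above, assuming a redis-py client. The function names are hypothetical and the real watchdog may differ; in particular, the `-k` flag (kill any process still in the pane) and the respawn command string passed to `tmux respawn-pane` are assumptions.

```python
import time
from typing import Iterable, List


def send_heartbeat(client, pane_id: str, ttl: int = 60) -> None:
    """Worker side: refresh the heartbeat key; Redis expires it after `ttl` seconds."""
    client.set(f"hb:{pane_id}", str(int(time.time())), ex=ttl)


def find_dead_panes(client, pane_ids: Iterable[str]) -> List[str]:
    """Watchdog side: a pane whose heartbeat key has expired is considered dead."""
    return [pane_id for pane_id in pane_ids if not client.exists(f"hb:{pane_id}")]


def respawn_argv(pane_id: str, command: str) -> List[str]:
    """Build the tmux invocation that restarts `command` in the given pane."""
    return ["tmux", "respawn-pane", "-k", "-t", pane_id, command]
```

The watchdog would pass the result of `respawn_argv` to something like `subprocess.run(..., check=True)`. With the default 10-second interval and 60-second TTL, a worker has to miss several consecutive heartbeats before it is treated as dead.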
To prevent runaway respawns:
- Each respawn is recorded with timestamp
- If 5 respawns occur within 5 minutes, pane is quarantined
- Quarantined panes require manual intervention
- Use panebus status to see quarantined panes
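
The quarantine rule can be expressed as a sliding-window counter. The sketch below is an in-memory illustration with a hypothetical `RespawnTracker` class; how PaneBus actually stores respawn timestamps (most likely in Redis) is not shown in this README. The defaults correspond to PANEBUS_RESPAWN_LIMIT and PANEBUS_RESPAWN_WINDOW.

```python
from collections import deque
from typing import Deque


class RespawnTracker:
    """Counts respawns of one pane inside a sliding time window."""

    def __init__(self, limit: int = 5, window: float = 300.0) -> None:
        self.limit = limit
        self.window = window
        self._times: Deque[float] = deque()

    def record(self, now: float) -> bool:
        """Record a respawn at time `now`; return True if the pane should be quarantined."""
        self._times.append(now)
        # Drop respawns that have fallen out of the window.
        while self._times and now - self._times[0] > self.window:
            self._times.popleft()
        return len(self._times) >= self.limit


if __name__ == "__main__":
    tracker = RespawnTracker()
    for t in (0, 60, 120, 180, 240):
        quarantined = tracker.record(t)
    print(quarantined)  # the fifth respawn within 300 seconds trips the limit
```

A worker that crashes occasionally never accumulates enough entries in the window, while one that dies on startup reaches the limit quickly and stops being respawned.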
Failed commands go to the DLQ:
- Command execution fails or times out
- Error details recorded with original command
- Commands can be inspected and retried
- Purge old entries when no longer needed
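
As a rough sketch of what recording and inspecting a failed command could look like with redis-py: the key name `dlq`, the entry fields, and both function names are assumptions made for illustration, since the README does not specify PaneBus's DLQ storage format.

```python
import json
import time
from typing import List, Optional

DLQ_KEY = "dlq"  # hypothetical key name, not confirmed by the source


def dead_letter(
    client,
    pane_id: str,
    command: dict,
    error: str,
    now: Optional[float] = None,
) -> dict:
    """Store the original command together with the error that caused the failure."""
    entry = {
        "pane_id": pane_id,
        "command": command,
        "error": error,
        "failed_at": time.time() if now is None else now,
    }
    client.lpush(DLQ_KEY, json.dumps(entry))
    return entry


def list_dead_letters(client, limit: int = 10) -> List[dict]:
    """Return up to `limit` entries, most recent first."""
    return [json.loads(raw) for raw in client.lrange(DLQ_KEY, 0, limit - 1)]
```

Keeping the original command inside the entry is what makes a later retry possible: the command can be re-dispatched unchanged once the root cause is fixed.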
# Install dev dependencies
pip install -e ".[dev]"
# Run tests
pytest
# Run with coverage
pytest --cov=panebus --cov-report=term-missing
# Run specific test file
pytest tests/test_worker.py -v

pane-bus/
├── src/panebus/
│ ├── __init__.py
│ ├── admin.py # Service discovery and dispatch
│ ├── cli.py # Click CLI commands
│ ├── config.py # Configuration management
│ ├── logging.py # Structured logging setup
│ ├── redis_client.py # Redis operations wrapper
│ ├── schema.py # Pydantic models
│ ├── watchdog.py # Health monitoring
│ └── worker.py # Worker implementation
├── tests/
│ ├── conftest.py # Pytest fixtures
│ ├── test_admin.py
│ ├── test_redis_client.py
│ ├── test_schema.py
│ └── test_worker.py
├── scripts/
│ └── bootstrap.sh # tmux setup script
├── pyproject.toml
└── README.md
- Type hints on all public functions
- Pydantic V2 for data validation
- structlog for structured logging
- Ruff for linting and formatting
- Run multiple watchdog instances with leader election (not yet implemented)
- Use Redis Sentinel or Redis Cluster for HA
- Consider running workers across multiple tmux sessions on different hosts
PaneBus emits events to a Redis stream (events):
# View events
panebus events --count 100
# Or directly from Redis
redis-cli XRANGE events - + COUNT 100

Key events to monitor:
- worker_started / worker_stopped
- command_received / command_completed / command_failed
- pane_respawned / pane_quarantined
- heartbeat_missed
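
For quick monitoring, the stream can also be summarized from Python. The sketch below tallies event names using redis-py's `xrange`. It assumes a client created with `decode_responses=True` and assumes each stream entry carries the event name in a field called `event`; that field name is a guess, so check a real entry with `redis-cli XRANGE events - + COUNT 1` first.

```python
from collections import Counter


def count_event_types(client, count: int = 100, field: str = "event") -> Counter:
    """Tally event names across the first `count` entries of the `events` stream."""
    # Same query as: redis-cli XRANGE events - + COUNT <count>
    entries = client.xrange("events", min="-", max="+", count=count)
    # Each entry is an (entry_id, fields) pair; `field` is an assumed field name.
    return Counter(fields.get(field, "unknown") for _entry_id, fields in entries)
```

A sudden rise in `command_failed` or `pane_respawned` counts relative to `command_completed` is the kind of signal worth alerting on.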
- Each pane has its own queue (horizontal scaling by adding panes)
- Use roles to route commands to specific worker types
- Queue depth limits prevent memory exhaustion
- Check Redis connectivity: redis-cli ping
- Verify worker is registered: panebus panes
- Check queue depth: redis-cli LLEN cmd:%pane_id
- Look at worker logs for errors
- Ensure watchdog is running: panebus watchdog
- Check if pane is quarantined: panebus status
- Verify tmux target is correct
- Check watchdog logs for respawn errors
- List DLQ entries: panebus dlq list
- Check error messages for root cause
- Fix the issue and retry manually
- Purge processed entries: panebus dlq purge
MIT License - see LICENSE for details.
Contributions welcome! Please:
- Fork the repository
- Create a feature branch
- Add tests for new functionality
- Ensure all tests pass
- Submit a pull request