Cloud platform for running autonomous AI coding agents on tasks in git repositories.
Agents work in isolated sandboxes, users observe and interact via WebSocket chat, and an analytics dashboard aggregates telemetry. Agents build project knowledge over time via semantic memory and codebase RAG.
License: MIT | Runtime: Bun | Repository: github.com/beshkenadze/pihordev
- System Design
- Core Concepts
- Architecture
- Data Flow
- Agent Lifecycle
- Memory System
- WebSocket Protocol
- Authentication & Authorization
- Data Model
- Analytics Dashboard
- Tech Stack
- Monorepo Structure
- Testing Strategy
- CI/CD
- Quick Start
- Environment Variables
- Production Roadmap
graph TD
subgraph Browser
UI[Dashboard / Chat UI]
end
subgraph "Web App (Next.js)"
API[API Routes]
WS[WebSocket Gateway]
Auth[better-auth]
end
subgraph "Data Layer"
DB[(PostgreSQL<br/>timescaledb-ha:pg16)]
Vec[(pgvector +<br/>pgvectorscale)]
end
subgraph "Agent Infrastructure"
Queue[Sidequest.js Queue]
Worker[Agent Worker]
Sandbox[OpenSandbox Container]
PiAgent[pi-coding-agent]
end
subgraph "Memory Layer (Mastra)"
WM[Working Memory]
SR[Semantic Recall]
GRAG[Codebase GraphRAG]
end
subgraph "External"
LLM[LLM Providers]
Git[Git Repository]
Email[Resend]
end
UI -->|HTTP| API
UI -->|WebSocket| WS
API --> Auth
API --> DB
API -->|Create Task| Queue
Queue -->|Dequeue| Worker
Worker -->|Spawn| Sandbox
Sandbox --> PiAgent
PiAgent -->|LLM calls server-side| LLM
Worker -->|Clone| Git
Worker -->|Events| WS
WS -->|Stream| UI
Worker --> WM
Worker --> SR
Worker --> GRAG
WM --> DB
SR --> Vec
GRAG --> Vec
Auth --> Email
| Concept | Description |
|---|---|
| Project | Linked git repository with persistent memory (working memory + codebase index) |
| Task | Feature or bug described in natural language; one agent per task |
| Agent | pi-coding-agent running inside an isolated OpenSandbox container, enriched with project memory |
| Session | Live WebSocket connection between browser and running agent |
| Persona | Needs | Views |
|---|---|---|
| Org Admin | Spend control, team comparison, adoption metrics, error monitoring | Org dashboard, projects |
| Team Lead | Team throughput, agent utilization, cost per project | Team dashboard, task board |
| Developer | Create tasks, observe agents in real-time, intervene when needed | Task creation, live chat |
Single Next.js 16 application at the repository root, running on Bun.
| Path | Purpose |
|---|---|
/app |
Routes, layouts, server components, API route handlers |
/components |
Reusable UI components (shadcn/ui, charts, dashboard, agent) |
/lib |
Shared utilities, domain logic, integration helpers |
/docker |
Docker Compose (PostgreSQL + OpenSandbox) and agent base image |
The project will evolve into a Bun workspaces + Turborepo monorepo with two apps and nine internal packages.
| App | Purpose |
|---|---|
apps/web |
Next.js 16 App Router — dashboard UI, agent chat, API routes (/api/auth/[...all], /api/v1/...) |
apps/agent-worker |
Sidequest.js worker process that dequeues tasks and spawns agent sandboxes |
| Package | Purpose |
|---|---|
packages/db |
ZenStack v3 schema (.zmodel) + Kysely client. Encryption plugin for sensitive fields. better-auth integration for ACL |
packages/auth |
better-auth with organization plugin (teams) + passkey plugin. Roles: owner, admin, member |
packages/agent |
AgentManager: creates sandbox, clones repo, writes skills, starts agent session with memory. TelemetryCollector records events |
packages/memory |
Mastra-based three-layer memory: working memory, semantic recall, codebase GraphRAG |
packages/queue |
Sidequest.js on PostgreSQL backend. Per-org concurrency limits |
packages/ws |
WebSocket gateway for real-time agent ↔ browser communication |
packages/analytics |
Kysely queries over telemetry tables for dashboard metrics |
packages/email |
React Email templates + Resend for transactional email |
packages/shared |
Shared TypeScript types, model pricing constants, utilities |
User creates task
→ Sidequest queue (PostgreSQL-backed, per-org concurrency)
→ Agent Worker dequeues
→ AgentManager:
1. Create OpenSandbox container
2. Clone git repo → checkout feature branch
3. Write platform skills + AGENTS.md
4. Index codebase if first task (README, docs, configs → GraphRAG)
5. Inject memory: working memory + semantic recall + codebase RAG
6. Start pi-coding-agent session (LLM calls server-side)
7. Subscribe to events → stream via WebSocket to browser
→ On completion:
• Update working memory with learnings
• Index new code into GraphRAG
• Record telemetry to dashboard
→ User can connect/disconnect via WebSocket at any time
stateDiagram-v2
[*] --> Queued: Task created
Queued --> Starting: Worker dequeues
Starting --> Running: Sandbox ready + agent started
Running --> Running: Tool calls, LLM turns
Running --> Attention: Agent needs input
Attention --> Running: User sends steer/followUp
Running --> Completed: Task done
Running --> Error: Failure
Completed --> [*]
Error --> [*]
Running --> Aborted: User aborts
Aborted --> [*]
Sandbox isolation: Each task gets its own OpenSandbox container with a full git clone. Egress allowed, ingress blocked. API keys never enter the sandbox — LLM calls are proxied server-side. Tools (bash, read, write, edit) are proxied through the OpenSandbox SDK.
Platform skills injected into every sandbox: git-workflow, test-runner, code-review, dependency-mgmt, project-setup.
LLM providers (via pi-ai): Anthropic, OpenAI, Google Gemini, GitHub Copilot, Z.AI/GLM, OpenRouter, Groq, Ollama.
Three layers of memory per project, powered by Mastra SDK:
graph LR
subgraph "Per-Project Memory"
WM["Working Memory<br/>(structured JSON)"]
SR["Semantic Recall<br/>(cross-task RAG)"]
GR["Codebase GraphRAG<br/>(repo knowledge graph)"]
end
WM -->|"Tech stack, conventions,<br/>team rules, known issues"| Agent
SR -->|"Relevant context from<br/>completed tasks"| Agent
GR -->|"README, docs, configs<br/>graph traversal"| Agent
Agent -->|"Update learnings"| WM
Agent -->|"Save task summary"| SR
Structured JSON scratchpad persisted across tasks. Contains: tech stack, conventions, team rules, deployment config, known issues, completed features. Agent updates it after each task via updateWorkingMemory tool. Stored in PostgreSQL via @mastra/pg.
RAG over past task conversations. When a new task starts, searches for relevant context from completed tasks in the same project. Cross-thread search scoped by resourceId = projectId. Uses pgvector embeddings with text-embedding-3-small.
Knowledge graph built from repo docs (README, ARCHITECTURE.md, key configs). Indexed on first clone, re-indexed on significant changes. Combines vector similarity with graph traversal for deeper context retrieval. Uses pgvectorscale StreamingDiskANN index for production-grade filtered search (28x lower p95 latency vs HNSW at scale).
All three layers are injected into agent context via the pi extension before_agent_start hook. After task completion, agent_end hook saves learnings back to memory.
Auth required on upgrade. Messages are JSON with type field.
| Type | Description |
|---|---|
agent_event |
Tool calls, LLM output, streaming content |
agent_status |
Status changes: running, done, error |
agent_attention |
Agent needs user input |
| Type | Description |
|---|---|
steer |
Redirect agent's approach |
followUp |
Add context or instructions |
abort |
Kill the running agent |
set_model |
Change LLM provider/model |
Built on better-auth with organization plugin (teams enabled) + passkey plugin.
- Registration: Email signup with verification (React Email + Resend)
- Login: Email/password or passkey (WebAuthn)
- Passkey: Optional enrollment in profile settings (post-registration)
- Organizations: Multi-tenant with invite flow
- Roles: owner → admin → member (enforced via ZenStack ACL)
| Role | Permissions |
|---|---|
| Owner | Full CRUD on everything in org, billing |
| Admin | Manage projects, tasks, members (not billing) |
| Member | Create tasks, view dashboard, observe agents |
Cross-org isolation enforced at the database layer — users in Org A cannot see Org B data.
Single PostgreSQL instance (timescale/timescaledb-ha:pg16) hosts all data.
One database, two clients — a deliberate MVP trade-off. Mastra auto-manages its own schema as part of its API contract.
| Client | Manages | Migrations |
|---|---|---|
| ZenStack v3 (Kysely) | App data, ACL, encryption | zen migrate dev |
Mastra (PgStore + PgVector) |
Memory, threads, embeddings | Auto-created by Mastra |
Both clients share the same DATABASE_URL. Queue tables (Sidequest.js) are also auto-managed in the same database.
organization, team, user, member, project, task, agent_instance, api_key
Sensitive fields (API key values) encrypted at rest via ZenStack v3 encryption plugin.
agent_event, daily_usage, daily_error_summary, daily_tool_usage
threads, messages, embeddings, working_memory
Vector indexes via pgvectorscale StreamingDiskANN (disk-based, not HNSW) for production-grade filtered search on project-scoped embeddings.
Job queue metadata stored in the same PostgreSQL instance.
erDiagram
organization ||--o{ team : has
organization ||--o{ member : has
organization ||--o{ project : owns
user ||--o{ member : "belongs to"
team ||--o{ member : "grouped in"
project ||--o{ task : contains
task ||--o| agent_instance : "runs as"
agent_instance ||--o{ agent_event : emits
organization ||--o{ api_key : has
project ||--o{ working_memory : "memory for"
project ||--o{ embeddings : "vectors for"
Telemetry from AgentSession events. No OpenTelemetry pipeline — direct Kysely queries over telemetry tables.
| Page | Metrics |
|---|---|
| Overview | KPI cards (total tasks, success rate, avg cost, active agents), daily usage trend |
| Usage & Cost | Breakdown by project, team, model. Token consumption over time |
| Performance | p50/p95 latency, success rate over time |
| Errors | Error table grouped by type, sparklines |
| Tools & Agents | Tool usage frequency, agent model distribution |
| Live Agents | Currently running agents with status, model, token count |
(dashboard)/— Overview, Usage, Performance, Errors, Tools, Live Agents, Projects, Settings(auth)/— Login, Signup, Invite accept
| Layer | Technology | Notes |
|---|---|---|
| Runtime & PM | Bun | Package manager and runtime |
| Monorepo | Turborepo | Build caching, parallel execution |
| Linting | Biome v2 | Replaces ESLint + Prettier |
| Frontend | Next.js 16 (App Router) | SSR, API routes, WebSocket |
| UI Components | shadcn/ui + AI elements | Chat/AI components |
| Charts | Tremor | Dashboard visualizations |
| Styling | Tailwind CSS v4 | Utility-first CSS |
| Auth | better-auth | Org + passkey plugins |
| ORM / ACL | ZenStack v3 (Kysely engine) | No Prisma dependency |
| Encryption | ZenStack v3 encryption plugin | Sensitive fields at rest |
| Memory | @mastra/memory | Threads, semantic recall, working memory |
| RAG | @mastra/rag | MDocument, chunking, GraphRAG |
| Vectors | @mastra/pg (pgvector + pgvectorscale) | StreamingDiskANN indexes |
| Database | PostgreSQL (timescaledb-ha:pg16) | App + telemetry + queue + vectors |
| Queue | Sidequest.js | PostgreSQL-backed job queue |
| Agent | pi-coding-agent SDK | Multi-provider LLM, tools, skills |
| Sandbox | Alibaba OpenSandbox | Isolated containers |
| React Email + Resend | Transactional email | |
| WebSocket | ws | Real-time communication |
| Validation | Zod | Runtime type validation |
| Testing | Vitest, Playwright, Testing Library | Unit, E2E, component tests |
| CI/CD | GitHub Actions | Lint, test, build, Docker publish |
| Registry | ghcr.io | Docker images |
pihordev/
├── app/ # Next.js 16 App Router
│ ├── (auth)/ # Login, signup, invite
│ ├── (dashboard)/ # Dashboard pages
│ └── api/ # API routes
├── components/
│ └── ui/ # shadcn/ui components
├── lib/ # Utilities (cn, types, etc.)
├── docker/ # Docker Compose + agent base image
├── docs/ # Plans and specs
├── packages/
│ ├── db/ # ZenStack v3 schema + Kysely client
│ └── shared/ # Types, constants, utilities
├── biome.json
├── next.config.ts
└── package.json
pihordev/
├── apps/
│ ├── web/ # Next.js dashboard + agent chat
│ │ ├── app/
│ │ │ ├── (auth)/ # Login, signup, invite
│ │ │ ├── (dashboard)/ # All dashboard pages
│ │ │ └── api/ # Auth + v1 API routes
│ │ └── components/
│ │ ├── ui/ # shadcn/ui
│ │ ├── ai/ # AI elements (chat)
│ │ ├── charts/ # Tremor
│ │ ├── dashboard/ # KPI cards, usage charts
│ │ └── agent/ # Chat, status badge
│ │
│ └── agent-worker/ # Sidequest worker process
│
├── packages/
│ ├── db/ # ZenStack v3 schema + Kysely client
│ ├── auth/ # better-auth (org + passkey)
│ ├── agent/ # AgentManager + sandbox tools
│ ├── memory/ # Mastra memory (3 layers)
│ ├── queue/ # Sidequest.js config + jobs
│ ├── ws/ # WebSocket gateway
│ ├── analytics/ # Dashboard Kysely queries
│ ├── email/ # React Email + Resend
│ └── shared/ # Types, constants, utilities
│
├── skills/ # Platform skills for agents
├── seed/ # Seed data generator
├── docker/ # Docker Compose + agent base image
├── .github/ # CI/CD workflows + templates
├── turbo.json
├── biome.json
└── package.json
| Level | Tool | Scope | Coverage Target |
|---|---|---|---|
| Unit | Vitest | All packages | > 80% |
| Integration | Vitest + testcontainers | DB, auth, agent, memory, API endpoints | All endpoints + agent + memory |
| E2E | Playwright (Chromium) | Core user flows | Critical paths |
| Lint/Format | Biome v2 | Entire codebase | Zero violations |
- agent: Sandbox tool proxying, timeout/error handling, cost calculation
- memory: Working memory CRUD, semantic recall, GraphRAG indexing/querying, pi extension hooks
- analytics: Time range queries, group by, percentiles, empty results
- ws: Message routing, broadcast, auth rejection, disconnect handling
- queue: Job execution, retry, concurrency limits
- agent: Full lifecycle (start → clone → tools → events → stop), concurrent agents, error recovery
- memory: Real PostgreSQL + pgvectorscale: index → recall → verify context injection
- auth: Signup → login → passkey enrollment → org invite → role verification
- db: ACL enforcement (owner/admin/member), cross-org isolation, encrypted field roundtrip
- Login → overview dashboard → KPI cards visible
- Create project → create task → agent starts → chat messages appear
- Send steer message → appears in chat
- Navigate all dashboard pages
- Invite member, create API key
- 1 org, 3 teams, 8 users, 5 projects, 30 tasks, 30 days of telemetry
- Small pre-indexed repo for memory integration testing
Lint & Format (Biome) → Type Check → Unit Tests → Integration Tests (PostgreSQL service) → E2E (Playwright)
- Concurrency: cancel in-progress runs per branch
- PostgreSQL service:
timescale/timescaledb-ha:pg16with health checks - Artifacts: unit coverage, E2E results on failure
- All jobs use
bun install --frozen-lockfile
- Build + push Docker images to
ghcr.io:- Main app image
- Agent base image (Node.js + git + dev tools)
- Uses GitHub Container Registry with build caching
- CI must pass before merge
- At least 1 review on PRs
- No direct push to
main
# Prerequisites: Bun, Docker
# Install dependencies
bun install
# Start services (PostgreSQL + OpenSandbox)
docker compose -f docker/docker-compose.yml up -d
# Run migrations
bun turbo db:migrate
# Seed data (optional)
bun turbo db:seed
# Dev server
bun turbo dev --filter=webbun turbo build # Build all packages
bun turbo typecheck # Type check
bun biome check . # Lint & format check
bun biome check --write . # Auto-fix lint issues
bun turbo test:unit # Unit tests
bun turbo test:integration # Integration tests (needs PostgreSQL)
bun turbo test:e2e # E2E tests (needs running app + DB)
bun turbo test:unit --filter=memory # Tests for a single package| Variable | Description |
|---|---|
DATABASE_URL |
PostgreSQL connection string (postgres://dev:dev@localhost:5432/pihordev) |
BETTER_AUTH_SECRET |
Auth secret key (random string) |
RESEND_API_KEY |
Resend email API key |
OPENAI_API_KEY |
OpenAI API key (for embeddings: text-embedding-3-small) |
OPENSANDBOX_URL |
OpenSandbox server URL (http://localhost:8080) |
ENCRYPTION_KEY |
ZenStack field encryption key (32-byte hex) |
Features not implemented in MVP, planned for production:
| Feature | Technology | Purpose |
|---|---|---|
| Container orchestration | Kubernetes + OpenSandbox K8s runtime | Auto-scaling agent containers |
| High-throughput queue | BullMQ + Redis | Replace Sidequest.js for scale |
| Telemetry warehouse | ClickHouse | High-volume analytics storage |
| Caching layer | Redis | Dashboard cache, WS pub/sub, memory cache |
| SSO/SAML | better-auth plugins | Enterprise authentication |
| Push notifications | Browser Push API | Real-time alerts |
| Auto PR creation | GitHub App | Agents create PRs on completion |
| Budget alerts | Custom | Cost threshold notifications |
Note on vector storage: pgvectorscale with StreamingDiskANN handles production scale (28x lower p95 vs Pinecone s1 at 50M vectors). A dedicated vector DB (Qdrant/Pinecone) is not needed.