An operating system for building software products with Claude Code. One human, many AI agents, structured workflows from idea to deployed product.
Building products with AI is fast but chaotic. Without structure, you get:
- Claude jumping to code before understanding the problem
- Inconsistent UI across projects (hand-rolled everything, no design system)
- "Tests pass" but the app looks broken in a browser
- No deployment pipeline — code sits unshipped
- Context lost between sessions — you re-explain the same things
AI-Factory separates concerns into three strict modes with distinct AI roles, enforced by a CLAUDE.md operating system that Claude reads at the start of every session.
Human = Product Owner (decides WHAT)
Claude = Product Manager → Designer → Engineer (decides HOW)
The factory handles everything between "I have an idea" and "it's live on the internet."
graph LR
A["Idea"] --> B["Product Taste<br/><i>challenge it</i>"]
B --> C["Spec<br/><i>/opsx:propose</i>"]
C --> D["Design<br/><i>3-pass</i>"]
D --> E["Build<br/><i>TDD</i>"]
E --> F["Test"]
F --> G["QA<br/><i>browser</i>"]
G --> H["Ship<br/><i>PR</i>"]
H --> I["Deploy"]
I --> J["Verify<br/><i>prod QA</i>"]
J --> K["Archive<br/><i>specs</i>"]
K -->|"next task"| C
style B fill:#6b3d7a,color:#fff
style C fill:#2d4a7a,color:#fff
style D fill:#5a3d7a,color:#fff
style E fill:#3d6b4f,color:#fff
style G fill:#4a6b7a,color:#fff
style H fill:#3d6b4f,color:#fff
style I fill:#7a5a2d,color:#fff
style J fill:#4a6b7a,color:#fff
Every step has a skill, stack profile, or template backing it. Nothing is ad hoc.
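The pipeline in the diagram can be sketched as an ordered list with a single loop-back edge. This is only an illustration: the stage identifiers and the `next_stage` helper are hypothetical names, not part of the factory itself.

```python
# Hypothetical model of the stages in the diagram above.
# The identifiers are illustrative; the factory does not define them.
STAGES = [
    "idea", "product_taste", "spec", "design", "build",
    "test", "qa", "ship", "deploy", "verify", "archive",
]


def next_stage(current: str) -> str:
    """Return the stage that follows `current`.

    Archive loops back to spec, mirroring the "next task" edge.
    """
    if current not in STAGES:
        raise ValueError(f"unknown stage: {current}")
    if current == "archive":
        return "spec"
    return STAGES[STAGES.index(current) + 1]
```

Note that the loop returns to Spec rather than to Idea or Product Taste, matching the "next task" edge in the diagram.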
| Mode | AI Role | What It Does | Outputs |
|---|---|---|---|
| Spec | Product Manager | Challenges the idea, writes specs, generates tasks | Proposal, design doc, specs, task list |
| Design | Product Designer | Three passes: UX architecture → component mapping → visual design | Sitemap, journey maps, component specs, style tokens |
| Execution | Engineer | TDD implementation, code review, browser QA, shipping | Production code, tests, PR |
Modes never mix. The Designer doesn't write code. The Engineer doesn't change specs. This separation is the core discipline that prevents AI from cutting corners.
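To make the separation concrete, here is a minimal sketch that treats each mode as owning a fixed set of outputs, taken from the table above. In the factory the rule is enforced by instructions in CLAUDE.md rather than by code, so `Mode`, `ALLOWED_OUTPUTS`, and `check_output` are hypothetical names used only for illustration.

```python
from enum import Enum


class Mode(Enum):
    SPEC = "spec"
    DESIGN = "design"
    EXECUTION = "execution"


# Outputs per mode, copied from the table above (lowercased).
ALLOWED_OUTPUTS = {
    Mode.SPEC: {"proposal", "design doc", "specs", "task list"},
    Mode.DESIGN: {"sitemap", "journey maps", "component specs", "style tokens"},
    Mode.EXECUTION: {"production code", "tests", "pr"},
}


def check_output(mode: Mode, artifact: str) -> None:
    """Raise PermissionError if `artifact` does not belong to `mode`."""
    if artifact.lower() not in ALLOWED_OUTPUTS[mode]:
        raise PermissionError(
            f"{artifact!r} is not an output of {mode.value} mode"
        )
```

Under this model, `check_output(Mode.DESIGN, "production code")` raises, which is the "Designer doesn't write code" rule expressed as a guard.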
Design Mode runs three sequential passes before any code is written:
graph LR
P1["Pass 1: UX Architecture<br/><i>sitemap, journeys, hierarchy<br/>3-click rule</i>"]
P2["Pass 2: Component Design<br/><i>shadcn/ui mapping, 5 states<br/>charts, icons</i>"]
P3["Pass 3: Visual Design<br/><i>tokens, theme, interactions<br/>mockups</i>"]
P1 --> P2 --> P3
style P1 fill:#5a3d7a,color:#fff
style P2 fill:#4a3d7a,color:#fff
style P3 fill:#3d3d7a,color:#fff
- Pass 1 answers why things go where: information hierarchy, user journeys with click counts, and task flows.
- Pass 2 answers what components to use: it maps every screen region to specific shadcn/ui components and designs all 5 states (empty, loading, populated, error, overflow).
- Pass 3 answers how it looks: style tokens, theme config, and interaction specs. Because it is informed by passes 1-2, visual design is no longer decorating in the dark.
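The five-state requirement from Pass 2 lends itself to a simple completeness check. The sketch below assumes a component spec is a dictionary keyed by state name; that shape and the `missing_states` helper are assumptions for illustration, not the format the factory actually uses.

```python
# The five states named in Pass 2.
REQUIRED_STATES = ("empty", "loading", "populated", "error", "overflow")


def missing_states(component_spec: dict[str, str]) -> list[str]:
    """Return the required states that have no (or an empty) description."""
    return [state for state in REQUIRED_STATES if not component_spec.get(state)]


# Hypothetical spec for a table component with only two states designed.
orders_table = {
    "empty": "Illustration plus a call-to-action button",
    "loading": "Skeleton rows",
}
print(missing_states(orders_table))  # ['populated', 'error', 'overflow']
```

A check like this could gate the hand-off from Pass 2 to Pass 3, so that no component reaches visual design with undefined states.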
Within Execution Mode, the Engineer adopts different mindsets depending on the task:
| Posture | Relationship to Code | Activated By |
|---|---|---|
| Builder | Adds new code. TDD rhythm, forward momentum. | test-driven-development, writing-plans, executing-plans |
| Reviewer | Questions existing code. Skeptical, looking for what's wrong. | code-review, structural-review |
| Debugger | Investigates failures. Hypothesis-driven, no guessing. | systematic-debugging |
| Tester | Uses the app as a real user. Evidence-driven, not code-driven. | qa (browser QA) |
| Shipper | Gets code landed. Changelog, version, PR. No corners cut. | ship, finishing-a-development-branch |
A builder adds, a reviewer questions, a debugger investigates, a tester uses, a shipper packages. Mixing postures weakens all of them.
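The posture table reads naturally as a lookup from skill name to posture. In the sketch below the skill names come from the table, while `POSTURE_BY_SKILL` and `single_posture` are hypothetical helpers that express the "don't mix postures" rule as a check.

```python
# Skill -> posture, taken from the "Activated By" column above.
POSTURE_BY_SKILL = {
    "test-driven-development": "builder",
    "writing-plans": "builder",
    "executing-plans": "builder",
    "code-review": "reviewer",
    "structural-review": "reviewer",
    "systematic-debugging": "debugger",
    "qa": "tester",
    "ship": "shipper",
    "finishing-a-development-branch": "shipper",
}


def single_posture(skills: list[str]) -> str:
    """Return the one posture shared by `skills`.

    Raises KeyError for an unknown skill and ValueError if the skills
    span zero or several postures.
    """
    postures = {POSTURE_BY_SKILL[skill] for skill in skills}
    if len(postures) != 1:
        raise ValueError(f"expected exactly one posture, got {sorted(postures)}")
    return postures.pop()
```

For example, `single_posture(["ship", "finishing-a-development-branch"])` returns `"shipper"`, while combining `"qa"` with `"code-review"` raises.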
graph TB
subgraph "Human Layer"
PO["Product Owner"]
end
subgraph "AI Factory"
SUP["Supervisor Claude"]
subgraph "Mode 1: Spec"
OS["OpenSpec"]
end
subgraph "Mode 2: Design"
UX["UX Architecture"]
CD["Component Design"]
VD["Visual Design"]
end
subgraph "Mode 3: Execution"
SP["Superpowers"]
CR["Code Review"]
QA["Browser QA"]
SH["Ship"]
end
end
subgraph "Knowledge Layer"
ST["20 Stack Profiles"]
SK["10 Skills"]
TM["3 Templates"]
end
PO -->|"idea"| SUP
SUP --> OS
SUP --> UX
SUP --> SP
OS -->|"specs"| UX
UX --> CD --> VD
VD -->|"deliverables"| SP
SP --> CR --> QA --> SH
ST -.->|"standards"| SP
SK -.->|"workflows"| SUP
20 stack profiles capture everything Claude needs to write idiomatic, tested, production-quality code. Each acts as a "senior engineer" for that technology — Claude reads it before writing any code.
| Stack | What It Covers |
|---|---|
| stacks/typescript/ | TypeScript + Node.js. Framework-agnostic base for any TS project |
| stacks/nextjs/ | Next.js App Router: Server Components, Server Actions, SSR/SSG/ISR |
| stacks/python/ | Python 3.11+: uv, Ruff, Pydantic, pytest, AI/ML patterns |
| stacks/swift/ | Swift/iOS: SwiftUI, MVVM, structured concurrency, SwiftData |
| stacks/kotlin/ | Kotlin/Android: Jetpack Compose, MVVM, Coroutines, Room, Hilt |
| stacks/react-native/ | React Native: Expo, Expo Router, cross-platform mobile |
| stacks/godot/ | Godot 4 + GDScript: game development, GUT testing, AI asset generation |
| Stack | What It Covers |
|---|---|
| stacks/node-backend/ | Express/Fastify: middleware, auth, Prisma/Drizzle |
| stacks/fastapi/ | FastAPI: Pydantic v2, async, FARM stack patterns |
| stacks/dotnet/ | .NET 8: Minimal APIs, EF Core, MediatR, C# 12+ |
| stacks/mcp/ | MCP server development: tool design, security, publishing |
| Stack | What It Covers |
|---|---|
| stacks/sql/ | PostgreSQL/SQLite: schema design, migrations, RLS, indexing |
| stacks/nosql/ | MongoDB/Redis/DynamoDB: document design, caching, aggregation |
| stacks/vector-db/ | Pinecone/pgvector/ChromaDB: embeddings, RAG, chunking |
| Stack | What It Covers |
|---|---|
| stacks/saas/ | Cloudflare + Supabase + Stripe: full SaaS stack with first-deploy guide |
| stacks/landing/ | Static sites: SEO, analytics, Astro/11ty, Cloudflare Pages |
| stacks/infra/ | CI/CD: GitHub Actions, Docker, Cloudflare, Railway |
| stacks/browser-qa/ | Browser QA: headless Chromium via gstack browse |
| stacks/ui/ | UI toolkit: shadcn/ui, Tailwind, Radix, Recharts, Lucide, design token pipeline |
| Stack | What It Covers |
|---|---|
| stacks/template-system/ | Stack profile scaffolding and validation |
./scripts/new-stack.sh <stack-name> # scaffold from template
./scripts/validate-stacks.sh # check all stacks for completeness
10 custom skills extend the factory workflow:
| Skill | When | What It Does |
|---|---|---|
| product-taste | Before proposing features | Challenges ideas: premise, persona, scope modes (expansion/hold/reduction) |
| structural-review | Before landing code | Paranoid audit: race conditions, trust boundaries, error handling, test gaps |
| ship | When ready to ship | Merge, test, review, changelog, version bump, OpenSpec archive, PR |
| qa | After implementing web features | 4-mode browser QA: diff-aware, full, quick, regression. Health score + screenshots |
| factory-retrospective | Periodic check-in | Cross-project retro: velocity, quality, session patterns, trend tracking |
| marketing-copy | When writing launch content | Platform-specific copy: Product Hunt, App Store, landing pages, social |
| openspec-propose | Starting a new feature | Propose a change with all artifacts (proposal, design, specs, tasks) |
| openspec-explore | Investigating an idea | Thinking partner for exploring requirements and clarifying scope |
| openspec-apply-change | Implementing tasks | Work through OpenSpec tasks with progress tracking |
| openspec-archive-change | After shipping | Archive completed changes into master specs |
| Template | For | Includes |
|---|---|---|
| templates/ai-product-template/ | Non-web products (games, CLIs, APIs) | CLAUDE.md, README, .gitignore, src/, tests/ |
| templates/web-product/ | Web products (SaaS, sites) | Above + 3-pass Design Mode, shadcn/ui, CI workflow, browser QA |
| templates/stack-profile/ | New stack profiles | 5 template files for consistent stack documentation |
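A completeness check for stack profiles, in the spirit of what validate-stacks.sh is described as doing, might look like the sketch below. It is not the script's actual logic. The source does not list the five template file names, so the required files are passed in as a parameter, and `incomplete_stacks` is a hypothetical helper.

```python
from pathlib import Path


def incomplete_stacks(stacks_dir, required_files):
    """Map each stack directory name to the required files it lacks.

    `required_files` is supplied by the caller; the real template file
    names are not given in this README, so none are assumed here.
    """
    report = {}
    stack_dirs = sorted(p for p in Path(stacks_dir).iterdir() if p.is_dir())
    for stack in stack_dirs:
        missing = [name for name in required_files if not (stack / name).is_file()]
        if missing:
            report[stack.name] = missing
    return report
```

An empty result means every stack directory contains every required file; a non-empty one names exactly what is missing and where.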
Three strict modes. Spec, design, and execution never mix. This prevents Claude from jumping to code before the problem is understood.
Stack profiles as senior engineers. Rather than hoping Claude knows best practices, the stack profile tells it exactly how to write code for that technology.
Three-pass design. UX architecture before component selection before visual design. Each pass informs the next. No more decorating in the dark.
Projects are independent. Each product lives in its own git repo under projects/. The factory provides workflow and standards; projects own their code.
End-to-end pipeline. From idea to deployed product: product taste → spec → design → build → test → QA → ship → deploy → verify → archive. Every step is covered.
Context hygiene. Clear the conversation after each major task. Memory files persist across clears, so institutional knowledge is retained without context bleed.
- Claude Code (CLI or Desktop)
- A Claude Pro or Team subscription
claude plugin install superpowers@claude-plugins-official
claude plugin install code-review@claude-plugins-official
claude plugin install commit-commands@claude-plugins-official
| Plugin | Role | What It Does |
|---|---|---|
| Superpowers | Engineering Team | TDD, code review, subagent-driven development, worktrees, debugging |
| Code Review | Quality Gate | Pull request review against plans and coding standards |
| Commit Commands | Git Automation | Commit, push, PR creation, branch cleanup |
OpenSpec is invoked via slash commands (/opsx:propose, /opsx:explore, /opsx:archive) — no separate install needed.
For web projects, install gstack browse for headless browser testing:
git clone https://github.com/garrytan/gstack.git ~/.claude/skills/gstack
cd ~/.claude/skills/gstack && ./setup
See stacks/browser-qa/setup.md for details.
- Clone this repo
- Install Claude Code and the plugins above
- Run claude from the repo root
- Create a new project:
  # Non-web product (game, CLI, API)
  cp -r templates/ai-product-template projects/your-project
  # Web product (SaaS, site)
  cp -r templates/web-product projects/your-project
- Start with /opsx:propose "your idea" to enter Spec Mode
The factory currently runs as independent Claude Code sessions — one per project, orchestrated by the human switching between terminals. The next evolution is a control plane that sits above project sessions:
- Knows what each project agent is doing, what's blocked, what just finished
- Surfaces only decisions that need human attention (not raw data)
- Routes information by type: alerts (blocking), notifications (state changes), status (heartbeats)
- Translates operator commands ("publish it") into session-level actions
See docs/drafts/factory-control-plane-vision.md for the full vision.
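The routing idea above can be sketched as events tagged with one of the three types. Since the control plane is still a vision, everything here is hypothetical: the `Kind`, `Event`, `route`, and `surface` names are invented for illustration, and the rule that only alerts reach the human is an assumption drawn from "alerts (blocking)".

```python
from dataclasses import dataclass
from enum import Enum


class Kind(Enum):
    ALERT = "alert"                # blocking
    NOTIFICATION = "notification"  # state change
    STATUS = "status"              # heartbeat


@dataclass(frozen=True)
class Event:
    project: str
    kind: Kind
    message: str


def route(events):
    """Group events by kind, preserving their original order."""
    buckets = {kind: [] for kind in Kind}
    for event in events:
        buckets[event.kind].append(event)
    return buckets


def surface(events):
    """Return only what the human must act on (assumed: alerts only)."""
    return [f"[{e.project}] {e.message}" for e in route(events)[Kind.ALERT]]
```

Notifications and heartbeats stay available in their buckets for a dashboard or log, but they do not interrupt the operator.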
See docs/plans/2026-03-14-roadmap.md.
Completed: Bucket 1 (workflow skills), Bucket 2 (browser QA + web support). Bucket 3 (control plane, analytics) in progress.
MIT. See LICENSE.