Plans fail because discovery happens too late. This framework catches gaps that would otherwise surface only during execution - evolved from 117 real plans and 195 handoffs.
"Initial idea was a custom booking system (8 weeks). Stage 0 discovered Calendly + Stripe does 90% of it. Shipped in 3 weeks, saved 5 weeks of engineering."
Full install (recommended - gives you commands, agent, and rule):

```bash
git clone https://github.com/primeline-ai/universal-planning-framework
cp -r universal-planning-framework/.claude/* your-project/.claude/
```

Minimal install (rule file only):

```bash
mkdir -p .claude/rules
curl -o .claude/rules/universal-planning.md \
  https://raw.githubusercontent.com/primeline-ai/universal-planning-framework/main/.claude/rules/universal-planning.md
```

Formal reasoning foundation (DSV). Most plans fail not because validation is missing, but because the wrong question gets validated. The framework is built on Decompose-Suspend-Validate - a principle that forces you to challenge your interpretation before committing to it. See Theoretical Foundation below.
FAILED conditions are mandatory. Every plan must define when to kill the project - not just when it succeeds. No FAILED condition = zombie project (anti-pattern #11).
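As a hypothetical illustration (the project, numbers, and wording below are invented, not from a real plan), a measurable FAILED condition in a plan's Success Criteria might look like:

```markdown
## Success Criteria
- SUCCESS: Checkout conversion reaches 2.5% within 30 days of launch
  (source: current baseline is 2.1% per analytics dashboard)
- FAILED: If conversion has not reached 2.3% after 60 days, kill the
  feature and revert to the old flow
```

Note that both conditions are observable numbers with a deadline - "conversion improves" alone would be flagged as a vague gate.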
21 anti-patterns with detection rules. Not guidelines - specific detection rules that catch vague gates ("looks good"), hallucinated estimates, assumed facts, discovery amnesia, and 17 more. Categorized as 12 Core + 5 AI-Specific + 4 Quality.
Autonomous hardening (Stage 1.5). Run /plan-refine and 6 adversarial perspectives stress-test your plan without asking you a single question. The Pedantic Lawyer catches vague gates. The Devil's Advocate challenges your core assumption. You get back a hardened plan with a change log.
Quality rubric with teeth. Grade C/B/A based on objective criteria. Numbers need sources. Gates must be observable. Delegated work needs input/output specs. FAILED conditions must be measurable.
Behavior specs + acceptance criteria. Optional Behavior Description (tech-agnostic, what the feature does) and Given/When/Then acceptance criteria give plans testable behavior coverage without forcing a separate spec document.
Reference Library. For coding domains, plans must link the official docs they consulted - so the person maintaining the output has the same sources.
8-domain detection. Software, AI/Agent, Business, Content, Infrastructure, Data & Analytics, Research, Multi-Domain. Each detected domain auto-includes its relevant sections.
```
/plan-new "Add OAuth login to Next.js app"
```

Claude runs through 3 stages:
- Stage 0: Discovers existing work, checks feasibility, challenges your approach (AHA Effect)
- Stage 1: Builds the plan with 5 CORE sections, domain-specific CONDITIONAL sections, and a confidence level
- Stage 1.5: Autonomously hardens the plan from 6 adversarial perspectives
- Stage 2: Meta-checks including Cold Start Test and Discovery Consolidation
```
/interview-plan path/to/plan.md
```

Framework-aware questions across 3 tiers: critical gaps, domain-specific probes, quality strengthening. Includes DSV checks for premature commitment and assumption mutation. References anti-patterns by number.

```
/plan-review path/to/plan.md
```

Objective assessment against the rubric. Returns grade, anti-patterns found, and top 3 improvements.

```
/plan-refine path/to/plan.md
```

6 perspectives stress-test the plan. Fixes structural issues, flags strategic decisions for you.
The framework is built on Decompose-Suspend-Validate (DSV) - a reasoning principle that prevents premature commitment at every stage of planning.
Most planning failures share a root cause: the planner validates an assumption without first questioning whether it's the right assumption. You check "Can we build X in 4 weeks?" when the real question is "Should we build X at all?" DSV prevents this by structuring thought into three phases:
| Phase | What it does | Framework mapping |
|---|---|---|
| Decompose | Break the problem into discrete, testable claims | Stage 0 checks 0.1-0.6 (Existing Work, Facts, Docs, Updates, Practices, Research) |
| Suspend | Challenge each claim - explore alternative interpretations before committing | Stage 0 checks 0.7-0.12 (Feasibility, ROI, AHA Effect, Competitive, Constraints, People Risk) |
| Validate | Test each claim independently with explicit methods and failure impacts | Stage 1 Assumptions section (VALIDATE BY + IMPACT IF WRONG) |
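To make the Validate row concrete, here is a hypothetical Assumptions entry (the feature, provider, and estimates are invented for illustration) using the VALIDATE BY + IMPACT IF WRONG structure:

```markdown
## Assumptions
- A1: The payment provider's API supports partial refunds.
  VALIDATE BY: Issue a partial refund in the provider sandbox before Phase 1.
  IMPACT IF WRONG: Refund flow must be rebuilt manually; adds roughly one phase.
```

Each claim gets its own explicit test and a stated cost of being wrong, so the least certain assumption can be validated first.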
The key insight is Suspend. Decomposing is natural. Validating is expected. But actively suspending your first interpretation - asking "what if this means something entirely different?" - is what most planners skip. That's why Stage 0's sparring checks (0.7-0.12) exist.
For time-pressured situations, DSV compresses to three questions:
- "What are the 2-3 key claims?" (Decompose)
- "What alternative interpretation haven't I considered?" (Suspend)
- "Which claim am I least sure about?" (Validate that one first)
This works standalone - even without the full framework. Use it before any decision where you catch yourself thinking "this is obvious."
12 checks in 3 priority tiers. The agent decides which to run based on context. Maps directly to the DSV phases above: checks 0.1-0.6 decompose, 0.7-0.12 suspend, Stage 1 Assumptions validates.
| Tier | Checks |
|---|---|
| Always | Existing Work, Feasibility, Better Alternatives (AHA Effect) |
| Usually | Factual Verification, Official Docs, ROI |
| Context-dependent | Updates, Best Practices, Deep Research, Competitive, Constraints, People Risk |
The AHA Effect (Check 0.9) is the single most valuable check:
- Custom CMS planned? "Strapi does 90% of it."
- 50 blog posts? "5 pillar posts + derivatives might outperform."
- Building from scratch? "This open-source project does 80%."
5 CORE sections (always required): Context & Why (+ optional Behavior Description), Success Criteria (with FAILED conditions + optional GWT acceptance criteria), Assumptions (with VALIDATE BY + IMPACT IF WRONG), Phases (with binary gates + review checkpoints), Verification (Automated + Manual + Ongoing Observability).
Optional structured formats: Behavior Description captures what a feature DOES in tech-agnostic language (3-5 sentences). Given/When/Then acceptance criteria complement FAILED conditions — FAILED covers what must NOT happen, GWT covers what MUST happen.
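A hypothetical example (feature and wording invented) of how GWT criteria complement a FAILED condition for an OAuth login feature:

```gherkin
# MUST happen (acceptance criterion):
Given a user with a valid Google account
When they complete the OAuth consent screen
Then they land on the dashboard with an active session

# Must NOT happen (FAILED condition, stated in Success Criteria):
# If more than 1% of login attempts error out in the first week,
# roll back to password-only login.
```

The GWT scenario is testable behavior; the FAILED condition is the kill switch - the two cover different failure modes.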
18 CONDITIONAL sections (domain-detected): Rollback, Risk, Post-Completion, Budget, User Validation, Legal, Security, Resume Protocol, Incremental Delivery, Delegation, Dependencies, Related Work, Timeline, Stakeholders, Reference Library, Learning & Knowledge Capture, Feedback Architecture, Completion Gate.
Coding domains size phases by scope (files, features, tests), not hours. Non-coding domains use time estimates as rough guides.
Plan Confidence Level: High / Medium / Low - assigned at plan header. Low confidence = Phase 1 must be a validation sprint.
6 adversarial perspectives stress-test the plan:
- Outside Observer - Goal clarity, End State, ambiguous metrics
- Pessimistic Risk Assessor - Single failure points, FAILED condition timeouts
- Pedantic Lawyer - Vague gates, delegation contracts, deployment completeness
- Skeptical Implementer - First blocker, unverified facts, cold start readiness
- The Manager - Resume Protocol, scope realism, deadline acknowledgment
- Devil's Advocate - Core assumption validity, 80/20 path, obsolescence risk
Structural fixes applied in place. Strategic decisions flagged as [Stage 1.5 Note:] for user review. Hardening Log appended as audit trail.
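As a sketch of what the output might look like (the wording below is invented, not the framework's exact format), a flagged strategic decision and a Hardening Log entry could read:

```markdown
[Stage 1.5 Note: Devil's Advocate - core assumption "users want SSO" is
unvalidated; consider making Phase 1 a validation sprint.]

## Hardening Log
- Pedantic Lawyer: replaced gate "login works" with "all 4 OAuth test
  cases pass in CI"
```

Structural fixes like the gate rewrite are applied silently; assumption-level challenges are left for the plan owner to decide.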
Stage 2 meta-checks: Delegation Strategy, Research Needs, Review Gates, Anti-Pattern Check (all 21), Cold Start Test, Plan Hygiene Protocol, Discovery Consolidation.
| Domain | Key Sections | Phase Sizing | Review Frequency |
|---|---|---|---|
| Software Development | Rollback, Risk, Delegation, Reference Library | Scope-based | Every 2 phases |
| Multi-Agent / AI | Risk, Delegation, Security, Reference Library | Scope-based | Every 2 phases |
| Business / Strategy | Timeline, Budget, Stakeholders, Validation | Time-based | Per milestone |
| Content / Marketing | Timeline, Validation, Legal, Feedback Architecture | Time-based | Per draft |
| Infrastructure / DevOps | Rollback, Risk, Dependencies, Reference Library | Mixed | Every phase |
| Data & Analytics | Risk, Rollback, Legal, Security, Reference Library | Mixed | Every phase |
| Research / Exploration | Incremental, Budget, Related Work, Learning | Time-based | Per finding |
| Multi-Domain | Union of matched domains | Most conservative | Most conservative |
| Grade | Criteria |
|---|---|
| C (Viable) | All 5 CORE + 1 CONDITIONAL. No critical anti-patterns (#3, #9, #11, #20, #21). |
| B (Solid) | C + Stage 0 + FAILED conditions + Confidence Level + Cold Start Test. Zero anti-patterns. |
| A (Excellent) | B + sparring (0.7-0.9) + Review Checkpoints + Reference Library (coding) + Replanning triggers. |
| Red Flags | Vague criteria, no FAILED conditions, assumptions untested, numbers without sources, delegated work without specs. Fix before implementing. |
Use for: New features, architecture changes (3+ files), multi-phase projects, anything with external dependencies.
Skip when ALL true: Single file, no external dependencies, <50 lines, no schema change, no user-facing change, rollback = git revert.
Traditional planning: Goal - Approach - Steps - Execute - "Oh crap, we didn't consider X."
This framework inverts it: Discovery - Constraints - Assumptions - THEN plan.
Stage 0 is where you find out your "simple feature" touches 6 systems, the existing code does 70% of what you need, your timeline was off by 3x, and there's a legal requirement you didn't know about.
The DSV principle explains why this works: most "overlooked" requirements weren't missing from the problem space - they were missing from the planner's interpretation of it. By forcing decomposition and suspension before validation, the framework catches gaps at the cheapest possible moment: before a single line of the plan is written.
The framework is domain-agnostic. Use it for software features, business launches, content creation, data pipelines, research projects - anything that needs a plan.
| Example | Domain | Grade | Key Features |
|---|---|---|---|
| Business Launch | Business / Strategy | A | AHA Effect saved 6 weeks, 8 sub-phases, full CONDITIONAL suite |
| Content Creation | Content / Marketing | B | 11 sub-phases, code examples, Resume Protocol |
| Software Feature | Software Development | B | OAuth implementation, Security section, Reference Library |
| Infrastructure CI/CD | Infrastructure / DevOps | A | Resume Protocol, Rollback, Reference Library, Review Checkpoints |
See CONTRIBUTING.md for guidelines. Key requirements:
- Example plans must pass their own anti-pattern self-audit honestly
- Use grades C/B/A only (no B+, A-, etc.)
- Coding examples must include a Reference Library
Found a gap the framework doesn't catch? Open an issue.
UPF is one piece of a progression. Each tier works independently - no hard dependencies.
```
You're here          You want this         Install this
-----------          -------------         ------------
Raw Claude Code  ->  Session memory    ->  Starter System (free)
                 ->  Workflow skills   ->  + Skills Bundle (free)
                 ->  Deep planning     ->  + UPF (free)        <- you are here
                 ->  Deep analysis     ->  + Quantum Lens (free)
                 ->  AI-powered system ->  + Course (paid)
```
| Component | What It Does | Link |
|---|---|---|
| Starter System | Session memory, handoffs, context awareness | GitHub |
| Skills Bundle | 5 workflow skills: debugging, delegation, planning, code review, config architecture | GitHub |
| UPF | Universal Planning Framework with deep multi-stage planning | You're reading it |
| Quantum Lens | Multi-perspective analysis + solution engineering (7 cognitive lenses) | GitHub |
| Course | Kairn + Synapse: AI-powered memory and knowledge graphs | primeline.cc |
The Skills Bundle includes a lightweight plan-and-execute skill for everyday planning. UPF is the deep version - use it when the stakes are high enough to justify Stage 0 discovery.
MIT License - see LICENSE file.
| Tool | What It Does | Deep Dive |
|---|---|---|
| Evolving Lite | Self-improving Claude Code plugin — memory, delegation, self-correction | Blog |
| Kairn | Persistent knowledge graph with context routing for AI | Blog |
| tmux Orchestration | Parallel Claude Code sessions with heartbeat monitoring | Blog |
| UPF | 3-stage planning with adversarial hardening | Blog |
| Quantum Lens | 7 cognitive lenses for multi-perspective analysis | Blog |
| PrimeLine Skills | 5 production-grade workflow skills for Claude Code | Blog |
| Starter System | Lightweight session memory and handoffs | Blog |
