- Add `trellis-exec login` command for interactive OAuth inside Docker with credential persistence in a named volume
- Forward host plugins and settings into containers, run as non-root `claude` user
- Fix container auth failure caused by `--bare` flag skipping OAuth credential loading in newer Claude Code versions
- Fix container mode default network: change from `none` to `bridge` so the Claude CLI can reach the Anthropic API
- Fix duplicate binary name in Docker inner command that caused "Unknown command: trellis-exec" when using `--container`
- Auto-build Docker image when missing instead of failing with a cryptic pull error
- Add `docker:slim` and `docker:browser` npm scripts for easier image builds
- Reject plans with phase headings but no task items, falling back to LLM decomposition instead of silently producing empty phases
- Implement Docker container mode (Layer 4): `--container` launches `docker run` with the project mounted, re-invoking trellis-exec inside the container with full tool access and OS-level isolation
- Add container launcher module with pure `buildDockerArgs` and `buildInnerCliArgs` functions
- Add multi-stage Dockerfile (`slim` ~200MB, `browser` ~1.5GB with Playwright)
- Add container e2e tests with graceful skip when Docker is unavailable
- Add `docs/container-mode.md` with full documentation on mounts, networking, resource limits, and troubleshooting
- BREAKING: Safe mode is now the default. Agents run with granular permission controls instead of `--dangerously-skip-permissions`. Use `--unsafe` for legacy unrestricted access.
- Add permission controls: `buildPermissionArgs()` with safe, unsafe, and container modes. Judge and reporter agents are read-only in all modes.
- Add git checkpoints: automatic commit + tag before each phase for recovery on failure
- Add budget enforcement: per-phase cap via `--max-phase-budget` and cumulative run-level caps via `--max-run-budget` and `--max-run-tokens`
- Add container mode plumbing (`--container` flag and related options) for Docker-based isolation
- Add `init-safety` subcommand to generate reference safety config for interactive Claude Code sessions
- Strip `tools:` from all agent frontmatter; tool permissions are now controlled entirely by the execution mode via CLI flags
- Add default 30-minute phase timeout in safe mode when no explicit timeout is set
- Display budget usage in summary report when limits are configured
- Fix browser acceptance tester output parsing when CLI returns content block arrays instead of plain strings
- Add verbose logging of raw browser-tester output for debugging parse failures
- Strengthen browser-tester agent output format instructions to improve JSON compliance
- Track sub-agent token usage (judge, fix) in phase summary reports for accurate cost reporting
- Fix outdated model defaults and add --timeout documentation
- Fix ASCII diagram alignment in README
- Change default orchestrator model from sonnet to opus
- Add default values to CLI reference flags in README
- Make test auto-detection language-agnostic: support Python (pytest), Go, Rust, Ruby (rspec), Java (Maven/Gradle), Elixir, and Makefile projects
- Extend web app detection for Django, Rails, and Phoenix frameworks (two-signal heuristic to avoid false positives on API-only projects)
- Replace JS-specific prompt examples with language-neutral alternatives across prompts, agents, and skills
- Fix `runSinglePhase` missing orchestrator correction pre-application (--phase runs now apply corrections correctly)
- Split `phaseRunner.ts` into focused modules: `judgeRunner.ts`, `browserRunner.ts`, `testDetector.ts`
- Move `RunContext` type from `cli.ts` to `src/types/runner.ts` (fixes inverted dependency)
- Remove backward-compatibility re-exports from `phaseRunner.ts`
- Fix `tryParseAssessment` mutating input object in-place (spread into new object)
- Cache spec/guidelines content in `RunContext` to avoid repeated disk reads during prompt building
- Wrap `realpathSync` in try/catch to handle broken symlinks gracefully
- Add `KNOWN_AGENT_TYPES` enum for `subAgentType` validation
- Log SHA fallback in `getChangedFiles` instead of silently falling back
- Validate fallback `JudgeAssessment` through Zod schema
- Fix immutability violations in phaseRunner corrective task injection and planEnricher `mergeResolvedField`
- Enable `noUnusedLocals`, `noUnusedParameters`, `noImplicitReturns` in tsconfig
- Replace greedy regex in `parseJudgeResult` with iterative JSON.parse scan
- Add truncation notice to reporter prompt when diff exceeds 50k chars
- Invert context authority: learnings (Current Understanding) positioned before spec with anti-hack instructions
- Add orchestrator self-correction via corrections field in PhaseReport
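
The entry about replacing a greedy regex with an iterative `JSON.parse` scan can be illustrated with a minimal sketch. The function name `findFirstJsonObject` and the exact scanning order are assumptions for illustration; the project's `parseJudgeResult` may work differently.

```typescript
// Hypothetical helper (not the project's actual code): find the first
// substring of `text` that parses as JSON, trying each "{" as a start
// and each later "}" (longest candidate first) as an end.
function findFirstJsonObject(text: string): unknown {
  for (
    let start = text.indexOf("{");
    start !== -1;
    start = text.indexOf("{", start + 1)
  ) {
    for (
      let end = text.lastIndexOf("}");
      end > start;
      end = text.lastIndexOf("}", end - 1)
    ) {
      try {
        // Return as soon as a candidate slice is valid JSON.
        return JSON.parse(text.slice(start, end + 1));
      } catch {
        // Not valid JSON for this candidate; try a shorter slice.
      }
    }
  }
  // No parseable object found anywhere in the text.
  return undefined;
}
```

Unlike a greedy regex such as `/\{.*\}/s`, this scan tolerates stray braces in surrounding prose because it only accepts a slice that actually parses.
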
- Fix markdown lint errors for blank lines around lists in README and docs
- Update README with browser testing architecture, auto-detection, verification pipeline, new CLI flags, and new agents
- Sync all docs with current implementation: fix orchestrator timeouts, remove stale REPL/worktree references, add browser and verification coverage to phase-runner and harness-comparison docs
- Fix browser acceptance empty-results loop: break on unparseable tester output instead of dispatching fixer on nothing
- Reinforce JSON output requirement in browser-tester prompt and agent definition
- Fix judge attempt numbering off-by-one (`passed on attempt 0` → `passed on attempt 1`)
- Add `judgeFixCycles` to PhaseReport; combine with `phaseRetries` in summary report Retries column
- Rewrite summary table with minimal box-drawing (`│ ─ ┼` delimiters)
- Show explicit message for empty browser acceptance results instead of `0/0 criteria passed`
- Add constraint generalization instruction to orchestrator prompt for better cross-phase learning
- Strengthen plain-text output instruction in phase-orchestrator agent to reduce markdown leakage
- Remove dead `phaseReport` field from SharedState schema; make `exitCode` required in CheckResultSchema
- Extract shared judge/rejudge prompt helpers and `applyJudgeOutcome()` to reduce duplication
- Inline single-use `buildSubAgentPrompt`/`buildSubAgentArgs` into `dispatchSubAgent()`
- Consolidate `getChangedFiles()` to use single `git status --porcelain` call (halves git spawns for no-ref path)
- Batch `applyReportToTasks` with Map lookup instead of per-task O(n) scans
- Add `prompts.test.ts` covering `buildRejudgePrompt`, `formatIssue`, `normalizeReport`, `parseJudgeResult`, `collectLearnings`
- Add tests for `reviewPhaseContract`, `detectTestCommand`, `selectJudgeModel`
- Add edge case tests for devServer, completionVerifier, and stateManager (350 total tests, up from 309)
- Add judge corrections mechanism: judge can return `corrections` (e.g., targetPath renames) that update tasks.json before the completion verifier runs, eliminating false-positive failures for `.module.css`/`.css` and `.jsx`/`.js` mismatches
- Reorder phase runner flow: completion verifier now runs after judge (not before) so corrections are applied first
- Remove `EXTENSION_VARIANTS` hack from completionVerifier; path reconciliation is now handled by the judge
- Auto-inject CLAUDE.md scaffolding task into phase-1 during compilation, giving all agents persistent project orientation that survives context compaction
- Add browser smoke and dev server integration tests with Playwright fixtures
- Add foundational CLAUDE.md for the trellis-exec project itself
- Consolidate `getChangedFiles`/`getChangedFilesRange` and `getDiffContent`/`getDiffContentRange` into single functions with optional `fromSha` parameter, eliminating duplication and simplifying callers
- Extract 13 prompt-building and normalization functions from `phaseRunner.ts` into new `src/runner/prompts.ts`, reducing phaseRunner from 2,202 to 1,510 lines
- Add subAgentType-aware execution guidance to phase orchestrator (scaffold/implement/test-writer strategies)
- Add Task Type Summary section to phase context for orchestrator awareness
- Clarify classifySubAgentType() role as orchestrator hint, not automatic dispatch trigger
- Add project-level web app detection (`detectWebApp`) that checks for frontend framework deps, build-tool configs, and HTML entry points
- Fix `requiresBrowserTest` heuristic to propagate across phases for web app projects: sticky propagation once any phase has UI output, and last-phase guarantee so end-of-build acceptance tests always run
- Fix token/cost extraction to read from nested `usage.input_tokens` (actual CLI format) instead of non-existent top-level `num_input_tokens`, with fallback for legacy format
- Add extension-variant tolerance to completion verifier so `.js` target paths resolve when `.jsx` exists on disk, preventing infinite retry loops
- Skip corrective tasks in contract review to suppress false warnings about missing acceptance criteria and target paths
- Fix sub-agent CLI calls (judge, fix, etc.) failing silently due to missing `--verbose` flag required by Claude CLI for `stream-json` with piped stdin
- Skip fix-judge retry loop when judge sub-agent process itself fails, avoiding wasted corrective task cycles on infrastructure errors
- Add tests for sub-agent CLI failure handling and stream-json `--verbose` invariant
- Add two-tier browser testing with Playwright (optional peer dependency)
  - Tier 1: per-phase deterministic smoke check (console errors, blank page detection, interactive element click test) runs before the judge on UI phases
  - Tier 2: end-of-build LLM-driven acceptance tests generated from spec criteria, with browser-fixer retry loop (default 3 retries)
- Add `requiresBrowserTest` flag to Phase schema, set by compiler prompt and deterministic heuristic
- Add `--dev-server`, `--save-e2e-tests`, `--browser-test-retries` CLI flags
- Add language-agnostic dev server autodiscovery (Node, Python, Rails, Go, Docker Compose)
- Add `browser-tester` (Opus) and `browser-fixer` (Sonnet) agent definitions
- Feed browser smoke results to judge prompt as additional evidence
- Include browser acceptance results in end-of-run summary report
- Upgrade phase learnings to authoritative "Spec Amendments" that take precedence over spec assumptions
- Add `constraint` decision tier for runtime/toolchain facts discovered during implementation (never evicted)
- Reorder phase context so amendments appear after spec/guidelines, giving discovered constraints last-word authority
- Add structured handoff template (Architecture State / Deviations from Spec / Watch List) to orchestrator prompt
- Add consistent output style instructions to phase-orchestrator for `[task-id]` progress format
- Add end-of-run summary report showing per-phase time, task completion, judge results, retries, token usage, and cost
- Switch CLI subprocess output from `--print` to `--output-format stream-json` to capture token usage from Claude CLI result events
- Extract and accumulate token usage (input/output tokens, cost) per phase
- Randomize orchestrator spinner message from 10 fun labels instead of always showing "Orchestrating…"
- Reframe README tagline as phased execution harness
- Start judge attempt count at 1 instead of 0 for user-friendly output
- Use descriptive log message when orchestrator starts, keep spinner as "Orchestrating…"
- Adjust spinner frames so bounce endpoints show 1 bar instead of 0
- Fix false projectRoot/git-root mismatch warning on case-insensitive filesystems (macOS) by using `realpathSync` for path canonicalization
- Add success log for completion verifier so users can confirm it ran
- Fix projectRoot resolution: auto-detect git root instead of defaulting to tasks.json directory, preventing infinite retry loops when tasks.json lives in a subdirectory like `.specs/feature/`
- Add "all paths missing" diagnostic in completion verifier to fail fast with a clear error instead of snowballing corrective tasks
- Add early projectRoot sanity warning when resolved path is inside `.specs/` or differs from the git root
- Relax sub-agent tool restrictions: implement agent gains Glob/Grep, scaffold agent gains Edit
- Increase default orchestrator timeout to 30 minutes, add `--long-run` flag for 2-hour timeout
- Add long-running phase protocol with intermediate commit reminders for reporter fallback safety
- Add lightweight completion verification pass (target path existence, TODO/FIXME scan) before judge
- Replace flat 20-entry decisionsLog cap with tiered learnings: architectural decisions never evicted
- Add pre-phase contract review that flags missing or vague acceptance criteria before execution
- Add long-running harness comparison doc analyzing Trellis against Anthropic research
- Fix caller mutation: deep-clone tasksJson in runPhases and shallow-copy ctx to prevent side effects
- Add stdin error handler in execClaude to surface backpressure/early close as a proper rejection
- Fail fast on unreadable spec/guidelines files instead of silently injecting error text into prompts
- Add guidelinesRef to LLM decompose fallback path in compilePlan
- Wrap loadState JSON.parse in try/catch with file path in error message
- Increase default orchestrator timeout to 15m, add --timeout CLI flag and reporter fallback sub-agent
- Allow judge to upgrade timed-out phases when committed work passes review
- Sync task statuses back to tasks.json after each phase completes
- Fix check command auto-detection to use range-based diff for committed test files
- Validate and rewrite all docs to match current native-tools architecture
- Archive 7 obsolete docs (REPL, worktree, old spec) to docs/_archive/
- Update README with all current CLI flags, compile options, env vars, and adaptive judge model
- Add adaptive judge model selection: Sonnet for small diffs (<150 lines, <3 tasks), Opus for larger work
- Add `--judge` flag (always|on-failure|never) to control when the judge runs
- Add `--judge-model` flag to explicitly override the judge model
- Use targeted re-judge prompt after fix (fix-only diff instead of full phase diff)
- Switch orchestrator sub-agent dispatch from CLI subprocess to native Agent tool
- Remove unused `modifiedFiles` and `schemaChanges` state fields
- Consolidate duplicate `stripCodeFences` utility
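
The adaptive judge model selection described above (Sonnet for small diffs, Opus for larger work, with an explicit override) can be sketched as follows. The thresholds come from the changelog entry. The function name `pickJudgeModel`, the `DiffStats` shape, and the reading that both conditions must hold for a diff to count as small are assumptions, not the project's confirmed implementation.

```typescript
type JudgeModel = "sonnet" | "opus";

// Hypothetical input shape; the real selector may take different arguments.
interface DiffStats {
  changedLines: number;
  taskCount: number;
}

// Assumption: a diff is "small" only when BOTH thresholds are met
// (<150 changed lines AND <3 tasks).
function pickJudgeModel(stats: DiffStats, override?: JudgeModel): JudgeModel {
  // An explicit override (as a flag like --judge-model would supply) wins.
  if (override !== undefined) return override;
  const isSmall = stats.changedLines < 150 && stats.taskCount < 3;
  return isSmall ? "sonnet" : "opus";
}
```

The point of the split is cost: cheap reviews go to the smaller model, while larger or multi-task diffs get the stronger one.
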
- Fix spinner bounce to show zero bars at both endpoints for symmetric animation
- Polish orchestrator spinner: add trailing ellipsis to label and smooth ping-pong with 1-frame dwell at bounce endpoints
- Fix spinner animation to bounce back and forth instead of jumping to start
- Normalize orchestrator spinner label to action word "Orchestrating…"
- Fix spinner leak: stop spinner in executePhase catch block so process exits after orchestrator timeout
- Add cross-phase learnings: surface `decisionsLog` entries from prior phases in orchestrator context
- Cap accumulated learnings at 20 entries with phase ID prefix for provenance
- Update phase-orchestrator agent to reference learnings and write forward-looking decisionsLog entries
- Add test coverage for streamParser, agentLauncher, and spinner (3 new test files)
- Expand phaseRunner tests with range-based judging and startSha tracking coverage
- Expand prompts tests with buildEnrichmentPrompt coverage
- Export `formatElapsed` from spinner for testability
- Fix markdown lint errors and migrate agnix config to tools array
- Add per-task git commits via orchestrator prompt (conventional commit format with scope and bullet summary)
- Add per-phase git commit in runner after judge passes, summarizing all completed tasks
- Track `startSha`/`endSha` per phase in PhaseReport for commit range tracking
- Fix judge silently auto-passing by using range-based diffs (`startSha..HEAD`) instead of working tree vs HEAD
- Add git helpers: `getCurrentSha`, `ensureInitialCommit`, `commitAll`, `getChangedFilesRange`, `getDiffContentRange`
- Add `getPhaseCommitRange` convenience function to state manager
- Use absolute path for phase report file in completion protocol so subagents resolve the correct location
- Stream orchestrator output in real-time using `claude --output-format stream-json` instead of buffered `--print`
- Add pause/resume to spinner so real-time output and elapsed-time indicator coexist cleanly
- Add NDJSON stream parser (`src/ui/streamParser.ts`) for extracting assistant text and result events
- Stream orchestrator stdout/stderr in real-time with `--verbose` flag
- Show actual recommendation (retry/halt/advance) in interactive prompt instead of misleading "continue" label
- Run judge even when phase fails if files were changed, preventing silent skip of judge/fix loop
- Default to "retry" instead of "halt" for transient failures like missing report files
- Store current phase report in `state.json` (`phaseReport` field) and clean up temp file after reading
- Auto-detect new test files after each phase and set check command from `package.json` or test runner configs
- Refactor `execClaude` to use options object instead of positional params
- Use Opus instead of Haiku for plan decomposition to handle ambiguous, architectural inputs
- Add stage-based progress messages during compile (parsing, decomposing, enriching, validating)
- Show elapsed time in spinner during long-running LLM calls
- Increase default compile timeout to 10 minutes with `--timeout` CLI flag for user override
- Add `onStderr` callback to `execClaude()` for real-time subprocess progress streaming
- Add simplified orchestration design document and native tools architecture doc
- Remove unused imports in phaseRunner test to fix lint warnings
- Fix executable bit on dist/cli.js lost during tsc rebuild, causing npx entrypoint to fail with "Permission denied"
- Replace REPL-mediated orchestrator with single `claude --print` invocation per phase using native tools (Read, Write, Edit, Bash, Glob, Grep)
- Orchestrator signals completion by writing `.trellis-phase-report.json` to disk instead of calling a REPL helper
- Remove worktree isolation (`--isolation` flag); phases run directly in project root
- Remove `turnLimit` and `maxConsecutiveErrors` CLI options (no longer applicable without REPL turn loop)
- Add previous-attempt context to phase prompt for retries (last report, judge issues, corrective tasks)
- Extract git-diff helpers into standalone `src/git.ts` module
- Replace `AgentLauncher` dependency in `compilePlan` with simple `query` callback
- Delete replManager, replHelpers, worktreeManager and all associated tests (~8,700 lines removed)
- Add input validation to dispatchSubAgent() — returns descriptive error instead of crashing on wrong argument format
- Add writeFile(path, content) REPL helper for simple file creation without spawning a sub-agent
- Add task completion gate on writePhaseReport() — rejects reports missing any task from tasksCompleted or tasksFailed
- Add stuck-loop detection — intervenes after 4 identical REPL outputs with alternative approach guidance
- Update orchestrator prompts with writeFile docs, dispatchSubAgent worked example, and stricter completion rules
- Fix safePath ancestor resolution for symlinked tmpdir paths
- Fix REPL async IIFE wrapper silently dropping sub-agent return values, causing orchestrator to skip tasks and prematurely complete phases
- Wrap dispatchSubAgent and runCheck in sandbox with auto-reporting via capturedConsole so results always appear in REPL output
- Fix statement-form IIFE to return the last `var` declaration's value
- Add adaptive REPL timeout: 30s for sync expressions, 5min for long-running helpers (dispatchSubAgent, runCheck, llmQuery) so sub-agents aren't killed prematurely
- Strengthen post-timeout feedback to prevent orchestrator from hallucinating that timed-out work was completed
- Pre-load spec and guidelines content into orchestrator phase context to eliminate warm-up turns
- Accept structured judge output objects (`{task, severity, description}`) as the primary format, with plain strings as fallback
- Normalize `detail` → `description` field before validation to handle LLM field name variance
- Fix "continue" action incorrectly mapping to "halt" when the judge recommends retry
- Show retry counter (`retries used: 1/2`) in the interactive phase prompt
- Log a message when max retries are exceeded instead of exiting silently
- Remove habit-tracker example files
- Add bouncing-bar spinner animation on stderr during LLM wait states (orchestrator launch, REPL turns, judge dispatch, plan compilation)
- Add `normalizeReport()` to validate and map orchestrator phase reports to the canonical schema, fixing Zod validation errors from LLM-style field names
- Detect and skip comment-only code blocks in the REPL turn loop; log raw orchestrator responses in verbose mode for debugging
- Increase verbose output limits from 200 to 500/1000 chars for code/results
- Add default file-existence check when no `--check` command is provided, verifying all phase `targetPaths` exist
- Include untracked files in `getChangedFiles()` and `getDiffContent()` so the judge reviews new files created by sub-agents
- Improve timeout error messages to prevent orchestrator from writing false "complete" reports after sub-agent timeouts
- Enforce all-tasks iteration in orchestrator prompts: `writePhaseReport()` must not be called until every task is attempted
- Document judge → fix correction loop in phase-runner.md
- Update README architecture diagram to show judge/fix as post-orchestration Phase Runner step
- Fix judge model reference (Sonnet → Opus) and remove stale env vars from README
- Fix markdown lint errors in docs (trailing punctuation, blank lines, code fence labels)
- Add comprehensive test coverage for critical execution paths (191 → 273 unit tests)
- New test files for replHelpers, extractCode, and compilePlan covering previously untested modules
- Add mergeWorktree tests for clean merges, conflicts, and missing branches
- Add CLI handler subprocess tests for handleStatus, handleCompile, and handleRun
- Add buildJudgePrompt edge case and runPhases halt action tests
- Export `stripCodeFences` from compilePlan for direct unit testing
- Fix interactive prompt to accept full words "retry"/"skip"/"quit" (not just single chars)
- Fix REPL variable scoping: `var` declarations now persist across eval turns for synchronous code
- Fix `searchFiles` to auto-detect glob patterns in first param instead of treating them as regex
- Make spec and guidelines file references explicit in phase context with `readFile()` examples
- Move judge invocation from orchestrator to phase runner — judge now runs as a system-controlled gate between phases using git diff for accurate changed-file detection
- Add lightweight fix agent (`agents/fix.md`) for targeted corrections from judge feedback
- Upgrade judge model from sonnet to opus for better reasoning on tricky issues
- Add bounded judge-fix correction loop (max 2 attempts) before surfacing issues
- Add `getChangedFiles()` and `getDiffContent()` git diff helpers to worktree manager
- Add progress logging during phase runner startup
- Fix cross-phase task dependency validation — tasks can now reference IDs from prior phases without being rejected as non-existent
- Store specRef, planRef, and guidelinesRef as relative paths in tasks.json, making it portable across machines
- Add `projectRoot` as required field in tasks.json, enabling tasks.json to live outside the project directory
- Introduce `RunContext` type that normalizes CLI flags + tasks.json refs into a single resolved config before execution
- Add `--spec`, `--plan`, `--guidelines` CLI override flags for the run command
- Refactor phaseRunner to accept pre-resolved `RunContext` instead of resolving paths internally
- Fix markdown lint errors in docs and README
- Format-agnostic plan compiler: accepts any well-structured technical plan, not just phase/task formatted plans
- New `buildDecomposePrompt` decomposes plans via LLM using full spec + plan + guidelines as context
- Add `--guidelines` CLI flag for the compile command
- Add optional `guidelinesRef` field to TasksJson schema
- Copy guidelines file to project root during execution, mirroring spec pattern
- Include guidelines reference in `buildPhaseContext` orchestrator context
- Replace spec-section-injection with spec file copy into project root for simpler, more reliable spec access
- Remove `readSpecSections` REPL helper and `parseSpecSections`; orchestrator now reads spec directly via `readFile()`
- Update phase-orchestrator and skill docs to reference `readFile('spec.md')` instead of `readSpecSections()`
- Fix readSpecSections to accept both array and varargs calling conventions
- Update phase-orchestrator prompt to direct agent to use pre-loaded spec sections
- Add test coverage for parseSpecSections, buildPhaseContext spec embedding, and varargs readSpecSections
- Pre-load spec sections into phase context prompt for --project-root compatibility
- Add graceful error handling to readSpecSections when spec file is missing
- Add `--project-root` CLI flag to decouple project root from tasks.json location
- Fix invalid git branch names when specRef is an absolute file path in worktree creation
- Add habit-tracker example specs (plan, spec, tasks, guidelines, pitch)
- Auto-fallback to LLM parsing when deterministic plan parser cannot identify phase boundaries
- Fix CLI entrypoint detection for `npx` and `npm link` by resolving symlinks with `realpathSync`
- Fix vitest config `resolve` option incorrectly nested inside `test` (ts2769)
- Rewrite orchestrator to use sequential `--print --continue` calls instead of persistent process
- Fix `--agent-file` → `--agent` CLI flag for sub-agents and orchestrator
- Add `--dangerously-skip-permissions` for headless sub-agent execution
- Disable orchestrator file tools via `--disallowedTools` to enforce REPL-only interaction
- Add `extractCode()` to parse JS from Claude responses, filtering natural language
- Add corrective nudge when orchestrator outputs natural language instead of JS
- Add REPL helper function docs to phase context
- Fix `dryRun` passthrough in `executePhase` (was hardcoded to `false`)
- Gate e2e CLI test behind `TRELLIS_E2E_CLAUDE` env var
- Split vitest config into unit and e2e with dedicated scripts
- Add `docs/cli-integration-architecture-changes.md` documenting the new architecture
- Update sub-agent prompt and agent files to use Write/Edit tools instead of text output
- Fix command injection in worktreeManager by replacing execSync with execFileSync
- Fix getState() ENOENT crash on first phase turn (returns empty initial state)
- Fix in-place mutation of phase tasks during retry (spread copies + unique IDs)
- Fix listener leak in agentLauncher orchestrator handle
- Add regex validation in searchFiles to prevent ReDoS and SyntaxError
- Add `--enrich` flag to compile CLI for opt-in LLM enrichment
- Fix cleanupWorktree running from inside the directory being deleted
- Track and clear sandbox timers on session destroy to prevent leaks
- Add 14 new security and edge-case tests
- Add docs/security.md documenting attack surface and mitigations
- Redesign README banner and add architecture diagram drafts
- Add Claude CLI pre-flight check with clear install message on failure
- Remove dead TRELLIS_EXEC_COMPACTION_THRESHOLD env var from help text
- Add tests for llmQuery default model and interactive mode promptForContinuation
- Document sub-agent permission enforcement model (Claude CLI --agent-file)
- Clean up stale TODO reference in skills doc
- Add e2e integration tests verifying §10 success criteria (compile, dry run, state round-trip, parallel scheduling, phase retry, handoff, REPL truncation, architectural validation)
- Add test fixtures: minimal Node.js test project, sample spec, and sample plan
- Add Group 2 claude CLI tests that skip gracefully when CLI is unavailable
- Add e2e integration tests documentation
- Package for dual npm CLI and Claude Code plugin distribution
- Add README with installation, CLI reference, architecture, configuration, and custom agents docs
- Update package.json with files, keywords, author, and prepublishOnly script
- Update plugin.json description and keywords
- Fix Zod 4 `z.record()` arity in agent linter
- Add eight orchestrator skills: compile, dispatch-agent, explore-codebase, manage-phase, quick-query, run, status, and verify-work
- Add skill architecture documentation explaining the rationale for skill-based orchestrator design
- Add agnix linter configuration
- Fix agent tools frontmatter to use YAML array syntax
- Fix agent frontmatter to use `tools` instead of `allowed-tools` per official Claude Code sub-agent docs
- Add Zod-based agent frontmatter linter with strict schema validation to catch unknown fields
- Add agent markdown files: phase-orchestrator, implement, test-writer, scaffold, and judge
- Update agent frontmatter to `allowed-tools` syntax matching current Claude Code format
- Add agnix and markdownlint-cli2 dev dependencies with `lint:code`, `lint:md`, and `lint:agents` scripts
- Add `.markdownlint-cli2.jsonc` config and fix all markdown lint errors across docs and agents
- Fix skill `allowed-tools` to space-delimited format per Agent Skills spec
- Add plan enricher (Stage 2) with targeted Haiku calls for ambiguous fields flagged by the deterministic parser
- Add prompt templates for enrichment and full-plan fallback parsing
- Add `compilePlan` pipeline wiring Stages 1–3: deterministic parse → targeted enrichment → fallback
- Add plan compiler architecture documentation
- Add CLI entry point (`src/cli.ts`) with `run`, `compile`, and `status` subcommands
- Argument parsing via Node built-in `util.parseArgs()` with environment variable fallbacks
- CLI flags override environment variables; environment variables override defaults
- Add `bin` field and `build` script to `package.json`
- Include compiled `dist/` output for direct installation
- Add CLI reference documentation (`docs/cli.md`)
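
The precedence rule stated above (CLI flags override environment variables, which override defaults) reduces to a short fallback chain. The sketch below is illustrative only: `resolveOption` and the `EXAMPLE_MODEL` variable are hypothetical names, and the real CLI reads its flags through `util.parseArgs()` rather than plain arguments.

```typescript
// Hypothetical helper: resolve one option using the precedence
// CLI flag > environment variable > built-in default.
function resolveOption(
  cliValue: string | undefined,
  envValue: string | undefined,
  defaultValue: string,
): string {
  // `??` only falls through on undefined/null, so an explicitly
  // passed empty string still counts as "set".
  return cliValue ?? envValue ?? defaultValue;
}

// Example with a made-up environment map (EXAMPLE_MODEL is not a real variable).
const exampleEnv: Record<string, string | undefined> = { EXAMPLE_MODEL: "sonnet" };

const fromFlag = resolveOption("opus", exampleEnv.EXAMPLE_MODEL, "haiku");
const fromEnv = resolveOption(undefined, exampleEnv.EXAMPLE_MODEL, "haiku");
const fromDefault = resolveOption(undefined, undefined, "haiku");
```
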
- Add phase runner — the deterministic outer loop that composes all sub-modules into the full execution pipeline
- Implements §6 Phase Runner Logic: phase iteration, orchestrator ↔ REPL turn loop, advance/retry/skip/halt action handling, worktree commits at phase boundaries, and resume from saved state
- Add phase runner documentation
- Add agent launcher module for managing claude CLI subprocesses (sub-agent dispatch, LLM queries, orchestrator sessions)
- Support real, dryRun, and mock operating modes for testability
- Add agent launcher documentation
- Add git worktree manager for isolating execution runs on separate branches (create, commit, merge, cleanup)
- Add check runner for executing user-defined verification commands as a deterministic gate after each task
- Add documentation for worktree manager and check runner modules
- Add REPL manager (replManager.ts) with vm-based sandboxed eval, expression-first async execution, output truncation, scaffold restoration, and consecutive error tracking
- Add REPL helper factory (replHelpers.ts) with filesystem helpers (readFile, listDir, searchFiles, readSpecSections, getState) and stubs for agent/LLM helpers
- Add REPL architecture documentation
- Add JSDoc comments to orchestrator, planParser, and scheduler functions
- Add deterministic plan parser (Stage 1 of plan compiler) that extracts phases, tasks, spec references, file paths, dependencies, sub-agent types, and acceptance criteria from plan.md without LLM calls
- Flag ambiguous fields for Stage 2 enrichment
- Move test fixtures to test/fixtures
- Add task scheduler with dependency resolution and parallel execution grouping
- Implement targetPaths overlap detection for implicit dependency serialization
- Add dependency validation (missing refs, self-refs, circular dependencies)
- Document scheduler grouping vs spec §10 #8 ordering rationale
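
The scheduler entries above describe dependency resolution, parallel grouping, and validation of missing, self, and circular references. A minimal sketch of that general technique is shown below. The `Task` shape and `groupIntoWaves` name are assumptions, and the sketch leaves out the targetPaths overlap detection entirely.

```typescript
// Hypothetical task shape; the project's real schema has more fields.
interface Task {
  id: string;
  dependsOn: string[];
}

// Groups tasks into "waves": each task in a wave depends only on tasks
// from earlier waves, so tasks within one wave could run in parallel.
function groupIntoWaves(tasks: Task[]): string[][] {
  const ids = new Set(tasks.map((t) => t.id));
  // Validate references before scheduling.
  for (const t of tasks) {
    for (const dep of t.dependsOn) {
      if (dep === t.id) throw new Error(`Task ${t.id} depends on itself`);
      if (!ids.has(dep)) throw new Error(`Task ${t.id} references missing task ${dep}`);
    }
  }
  const done = new Set<string>();
  const waves: string[][] = [];
  let remaining = tasks;
  while (remaining.length > 0) {
    // A task is ready once all of its dependencies are in earlier waves.
    const ready = remaining.filter((t) => t.dependsOn.every((d) => done.has(d)));
    // No progress with tasks still pending means there is a cycle.
    if (ready.length === 0) throw new Error("Circular dependency detected");
    for (const t of ready) done.add(t.id);
    waves.push(ready.map((t) => t.id));
    remaining = remaining.filter((t) => !done.has(t.id));
  }
  return waves;
}
```

For example, with `a` having no dependencies, `b` and `c` depending on `a`, and `d` depending on both, the function returns three waves: `a`, then `b` and `c` together, then `d`.
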
- Fix invalid --verbose flag in typecheck script
- Add oxlint linter with lint and lint:fix npm scripts
- Fix no-empty-file lint violation in test setup
- Migrate test suite from node:test to vitest with expect-style assertions
- Add vitest.config.ts scoped to *.test.ts files only
- Add test/ directory for shared test concerns
- Add typecheck script with verbose tsc output
- Add tsconfig.test.json for typechecking test files separately from build
- Add state persistence layer (stateManager) with atomic writes and Zod validation
- Add crash-safe trajectory logger with JSONL append
- Enable @types/node in tsconfig and add tsx for test execution
- Add test suites for stateManager and trajectoryLogger (19 tests)
- Add .npmrc with save-exact config
- Add core data model types and Zod schemas for tasks, state, and agents
- Add zod as a dependency
- Switch package type to ESM
- Add Trellis RLM Executor spec v3 documentation
- Add gitkeep files for agents, hooks, and skills directories
- Add project scaffolding (package.json, tsconfig, .gitignore)
- Add plugin manifest and version bump skill
- Add Claude Code settings