efremidze · efremidze · Feb 9, 2026 · Feb 9, 2026
diff --git a/.planning/codebase/ARCHITECTURE.md b/.planning/codebase/ARCHITECTURE.md
@@ -1,183 +1,122 @@
 # Architecture
 
-**Analysis Date:** 2026-01-23
+**Analysis Date:** 2026-02-09
 
 ## Pattern Overview
 
-**Overall:** Layered pipeline architecture with stage-based processing and LLM integration.
+**Overall:** Deterministic, schema-validated stage pipeline orchestrated by a CLI with pluggable LLM runtimes.
 
 **Key Characteristics:**
-- Sequential phase/stage execution with context passing
-- Schema-driven validation at stage boundaries
-- Pluggable LLM runtime providers (OpenAI, Copilot)
-- Configuration-driven task scheduling and action execution
-- Repository analysis and automated maintenance workflows
+- Fixed stage sequence with typed I/O enforced by JSON schema (`packages/core/src/stage.ts`, `packages/core/src/runner.ts`, `packages/schemas/stages/*.json`)
+- CLI-driven orchestration with report/output generation (`packages/cli/src/index.ts`)
+- Runtime adapters for LLM providers separated from orchestration (`packages/runtimes/openai/src/index.ts`, `packages/runtimes/copilot/src/index.ts`)
 
 ## Layers
 
-**CLI Layer:**
-- Purpose: Command-line interface and entry point for all workflows
-- Location: `packages/cli/src/index.ts`, `src/cli.ts`
-- Contains: Command definitions, stage orchestration, runtime initialization
-- Depends on: Core layer, Runtime providers
-- Used by: End users via CLI commands
+**CLI Orchestration:**
+- Purpose: Load config, enforce scope/budgets, build stages, run pipeline, write reports/publish payloads.
+- Location: `packages/cli/src/index.ts`
+- Contains: Command parsing, stage assembly, prompt/schema loading, output writing.
+- Depends on: Core library, runtimes, prompt files, schema files.
+- Used by: CLI executable (`package.json` -> `dist/cli/src/index.js`) and GitHub Action (`action.yml`).
 
-**Core Layer:**
-- Purpose: Core domain logic for repository analysis and workflow execution
+**Core Library:**
+- Purpose: Provide shared primitives for config, scope, budgets, schema validation, and stage execution.
 - Location: `packages/core/src/`
-- Contains: Stage definitions, schema validation, config loading, repository fact collection, prompt management
-- Depends on: None (no external dependencies)
-- Used by: CLI layer, phase implementations
-
-**Phase Layer:**
-- Purpose: Specific maintenance workflow steps
-- Location: `src/phases/`
-- Contains: Scan, Analyze, Plan, Propose, Act phases
-- Depends on: Core types, configuration
-- Used by: Groundkeeper orchestrator
-
-**Runtime Layer:**
-- Purpose: LLM provider abstraction
-- Location: `packages/runtimes/openai/src/`, `packages/runtimes/copilot/src/`
-- Contains: OpenAI and Copilot SDK integrations
-- Depends on: External APIs (OpenAI, Copilot)
-- Used by: CLI stage builders
-
-**Configuration Layer:**
-- Purpose: Configuration loading and validation
-- Location: `src/config.ts`, `packages/core/src/config.ts`
-- Contains: YAML parsing, schema validation, defaults merging
-- Depends on: js-yaml library
-- Used by: Groundkeeper, CLI runner
+- Contains: `config.ts`, `scope.ts`, `budgets.ts`, `schema.ts`, `runner.ts`, `stage.ts`, `repo.ts`, `prompt.ts`.
+- Depends on: Node FS/path and JSON/YAML parsing (`packages/core/src/config.ts`, `packages/core/src/schema.ts`).
+- Used by: CLI (`packages/cli/src/index.ts`).
+
+**Runtime Adapters:**
+- Purpose: Implement provider-specific `generateJson` behavior for LLM calls.
+- Location: `packages/runtimes/*/src/index.ts`
+- Contains: OpenAI fetch-based adapter and Copilot SDK adapter.
+- Depends on: Provider SDKs or APIs (`packages/runtimes/openai/src/index.ts`, `packages/runtimes/copilot/src/index.ts`).
+- Used by: CLI runtime selection (`packages/cli/src/index.ts`).
+
+**Schemas:**
+- Purpose: Define and validate stage input/output contracts and agent config schema.
+- Location: `packages/schemas/`
+- Contains: `agent.schema.json`, `stages/*.schema.json`.
+- Depends on: JSON schema validation in `packages/core/src/schema.ts`.
+- Used by: CLI stage setup and validation in `packages/cli/src/index.ts` and `packages/core/src/runner.ts`.
+
+**Prompts:**
+- Purpose: Stage-specific prompt templates for LLM interaction.
+- Location: `prompts/`
+- Contains: `system.md`, `analysis.md`, `plan.md`, `patch.md`, `verify.md`, `publish.md`.
+- Depends on: Prompt loader in `packages/core/src/prompt.ts`.
+- Used by: CLI stage execution in `packages/cli/src/index.ts`.
+
+**GitHub Action Wrapper:**
+- Purpose: Provide a composite action that runs the CLI and publishes reports.
+- Location: `action.yml`
+- Contains: Setup, install/build, CLI invocation, optional issue publishing.
+- Depends on: Built CLI (`dist/cli/src/index.js`) or global `groundkeeper` binary.
+- Used by: GitHub Actions workflows (e.g., `action.yml`).
 
 ## Data Flow
 
-**Main Workflow (CLI):**
-
-1. Load agent configuration (`packages/core/src/config.ts`)
-2. Collect repository facts (`packages/core/src/repo.ts`)
-3. Validate scope and budgets (`packages/core/src/scope.ts`, `packages/core/src/budgets.ts`)
-4. Build stage pipeline (`packages/cli/src/index.ts` - `buildStages()`)
-5. Execute stages sequentially with:
-   - Input validation via schema (`packages/core/src/schema.ts`)
-   - Stage-specific logic (preflight, analysis, plan, patch, verify, publish)
-   - Output validation via schema
-   - Log persistence to `.groundkeeper/logs/`
-6. Generate report and optional publish payload
-
-**Phase Workflow (Legacy/Alternative):**
-
-1. Load Groundkeeper config (`src/config.ts`)
-2. Initialize Groundkeeper context (`src/groundkeeper.ts`)
-3. Execute phases sequentially:
-   - **Scan**: Repository structure and metadata collection (`src/phases/ScanPhase.ts`)
-   - **Analyze**: Finding generation from scan results (`src/phases/AnalyzePhase.ts`)
-   - **Plan**: Action planning from findings (`src/phases/PlanPhase.ts`)
-   - **Propose**: Concrete proposal generation (`src/phases/ProposePhase.ts`)
-   - **Act**: Output generation and optional execution (`src/phases/ActPhase.ts`)
-4. Stop on first failure; continue on success
+**CLI Run Pipeline:**
 
-**State Management:**
+1. Load and validate agent config (`packages/core/src/config.ts`) from `.github/agent.yml`.
+2. Collect repo facts (`packages/core/src/repo.ts`) and file list (`packages/cli/src/index.ts`).
+3. Enforce scope and budgets (`packages/core/src/scope.ts`, `packages/core/src/budgets.ts`).
+4. Build stage definitions with prompt/templates and schemas (`packages/cli/src/index.ts`, `prompts/*.md`, `packages/schemas/stages/*.json`).
+5. Run stages sequentially with schema validation and logging (`packages/core/src/runner.ts`).
+6. Write report and optional publish payload to `.groundkeeper/` (`packages/cli/src/index.ts`).
 
-- Context object passed through all phases/stages: `GroundkeeperContext` (`src/types.ts`)
-- Each phase adds results to `context.results` dictionary
-- Stage results accumulated in array and logged
-- No global state; all data flows through parameters
+**State Management:**
+- Stage data is passed via `StageContext` and stage outputs chained as the next input (`packages/core/src/stage.ts`, `packages/core/src/runner.ts`).
 
 ## Key Abstractions
 
-**Phase:**
-- Purpose: Base class for workflow steps
-- Examples: `src/phases/ScanPhase.ts`, `src/phases/AnalyzePhase.ts`, `src/phases/PlanPhase.ts`
-- Pattern: Template method with `execute()` (logging/timing) and abstract `run()` (logic)
-
-**StageDefinition:**
-- Purpose: Typed stage with input/output schemas
-- Location: `packages/core/src/stage.ts`
-- Pattern: Generic type with `name`, schema paths, async `run()` function
-
-**Finding:**
-- Purpose: Represents a discovery from analysis
-- Location: `src/types.ts`
-- Fields: id, type, severity, message, file, suggestion
-- Used in: Analysis and planning phases
-
-**ProposedChange:**
-- Purpose: Represents actionable change (create, update, delete)
-- Location: `src/types.ts`
-- Fields: type, target, content, reason
-- Used in: Propose and Act phases
-
-**LlmRuntime:**
-- Purpose: Abstract LLM provider interface
-- Pattern: `{ generateJson(messages: LlmMessage[]): Promise<unknown> }`
-- Implementations: `packages/runtimes/openai/src/index.ts`, `packages/runtimes/copilot/src/index.ts`
+**StageDefinition/StageContext/StageResult:**
+- Purpose: Standardize stage execution contract.
+- Examples: `packages/core/src/stage.ts`, `packages/core/src/runner.ts`, `packages/cli/src/index.ts`.
+- Pattern: Stage pipeline with validated I/O and fail-fast behavior.
+
+**AgentConfig:**
+- Purpose: Configuration contract for scope, budgets, runtime, publish settings.
+- Examples: `packages/core/src/config.ts`, `.github/agent.yml`.
+- Pattern: Load/validate YAML then enforce constraints before running stages.
+
+**PromptPayload:**
+- Purpose: Normalize system/stage/schema/context payload passed to LLM.
+- Examples: `packages/core/src/prompt.ts`, `packages/cli/src/index.ts`.
+- Pattern: Build + serialize JSON payload for provider runtimes.
+
+**Runtime Adapters:**
+- Purpose: Provider-specific JSON generation from prompts.
+- Examples: `packages/runtimes/openai/src/index.ts`, `packages/runtimes/copilot/src/index.ts`.
+- Pattern: `generateJson(messages)` returns parsed JSON or throws errors.
 
 ## Entry Points
 
-**CLI Command: `groundkeeper run`:**
-- Location: `packages/cli/src/index.ts` (main), `src/cli.ts` (legacy)
-- Triggers: User executes `npm start` or `groundkeeper run` command
-- Responsibilities:
-  - Parse command-line options (config, directory)
-  - Load configuration and validate
-  - Collect repository facts
-  - Build and execute stage pipeline
-  - Generate report output
-  - Write publish payload if configured
-
-**CLI Command: `groundkeeper init`:**
-- Location: `src/cli.ts`
-- Triggers: User runs `groundkeeper init`
-- Responsibilities: Generate default `groundkeeper.yml` config template
-
-**CLI Command: `groundkeeper scan`:**
-- Location: `src/cli.ts`
-- Triggers: User runs `groundkeeper scan`
-- Responsibilities: Execute only scan phase
-
-**CLI Command: `groundkeeper analyze`:**
-- Location: `src/cli.ts`
-- Triggers: User runs `groundkeeper analyze`
-- Responsibilities: Execute scan and analyze phases
-
-**Groundkeeper.run():**
-- Location: `src/groundkeeper.ts`
-- Entry for programmatic use
-- Executes all phases in sequence
+**CLI Binary:**
+- Location: `packages/cli/src/index.ts`
+- Triggers: `groundkeeper run` via `package.json` bin (`dist/cli/src/index.js`).
+- Responsibilities: Parse args, load config, run stage pipeline, write outputs.
+
+**GitHub Action:**
+- Location: `action.yml`
+- Triggers: GitHub Actions workflows.
+- Responsibilities: Install/build CLI, run it, publish issue payload, upload artifacts.
 
 ## Error Handling
 
-**Strategy:** Fail-fast on validation or phase failure; continue on info-level findings.
+**Strategy:** Fail fast on schema or runtime errors, return structured stage results.
 
 **Patterns:**
-
-- **Schema Validation**: `validateSchema()` in `packages/core/src/schema.ts` returns error array; stage stops if errors present
-- **Phase Execution**: `Phase.execute()` wraps `run()` with try-catch; logs error and returns failed result
-- **Config Validation**: `ConfigLoader.validateConfig()` throws on invalid format or missing required fields
-- **Pipeline Stopping**: `Groundkeeper.executePhases()` breaks loop on first failed phase result
-- **Stage Dependency**: `runStages()` breaks loop on validation failure or execution error
+- Input/output validation errors stop the pipeline with `success: false` (`packages/core/src/runner.ts`).
+- Config validation throws with aggregated error list (`packages/core/src/config.ts`).
 
 ## Cross-Cutting Concerns
 
-**Logging:**
-- Approach: Console-based logging with phase/stage prefixes
-- Pattern: `console.log()` with `[${phaseName}]` or `[${stageName}]` prefix
-- Timing: Phase base class logs start/completion with duration
-- Files: Stage logs written to `.groundkeeper/logs/{stageName}.input.json` and `.groundkeeper/logs/{stageName}.output.json`
-
-**Validation:**
-- Approach: JSON schema validation at stage boundaries
-- Location: `packages/core/src/schema.ts` - `validateSchema()` recursive validator
-- Applied to: Stage input/output before/after execution
-- Configuration: YAML schema validation via regex patterns
-
-**Authentication:**
-- Approach: Environment variable-based token management
-- Tokens used: `GITHUB_TOKEN`, `OPENAI_API_KEY` (OpenAI runtime), Copilot SDK tokens
-- Read at: CLI initialization in `packages/cli/src/index.ts`
-- Scope checking: `packages/core/src/scope.ts` enforces allowed file patterns
+**Logging:** Stage start logs and JSON input/output logs in `.groundkeeper/logs` (`packages/core/src/runner.ts`).
+**Validation:** JSON schema validation for stages (`packages/core/src/schema.ts`) and manual config validation (`packages/core/src/config.ts`).
+**Authentication:** Provider credentials via env vars (`packages/cli/src/index.ts`) and GitHub token injection in action runtime (`action.yml`).
 
 ---
 
-*Architecture analysis: 2026-01-23*
+*Architecture analysis: 2026-02-09*