Epic: Codebase maintainability audit — structural refactors for long-term health

## Context

PR #482 (hub lock ownership refactor) demonstrated the pattern: **identify code that lives in module A but actually manages state owned by module B, then move it to B with a clean entry point.** That refactor was small (+118/−106) but significantly improved the hub system's maintainability.

This audit applies that same lens across the entire codebase (102k lines, 153 files). The goal is not to rewrite — it's to find every place where the code does the right thing but in a harder-to-maintain way, and plan incremental refactors.

Findings are grouped by severity and area. Each section is a potential sub-issue.

---

## 1. Stringly-Typed Domain Values (Pervasive)

**Impact:** Runtime bugs that should be compile-time errors. Every `match` on a string status/priority is a silent breakage vector.

**Locations:**
- `models.rs` — `status: String`, `priority: String` on `Issue` struct
- `trust_model.rs` — `model: String` ("local-only", "multi-tenant", "public-api")
- `issue_file.rs` — comment `kind: String` validated against `KNOWN_COMMENT_KINDS` list
- `findings.rs` / `issue_filing.rs` — `severity: String` throughout
- `token_usage.rs` — model names matched with `.contains()` on lowercase
- `lock_check.rs` — `read_auto_steal_config()` accepts bool/number/string for same field

**Fix:** Extract enums: `Status { Open, Closed }`, `Priority { Critical, High, Medium, Low }`, `CommentKind { Plan, Decision, Observation, ... }`, `Severity { Critical, High, Medium, Low, Info }`, `TrustModel { LocalOnly, MultiTenant, PublicApi }`. Derive `serde` with `#[serde(rename_all = "kebab-case")]` for backward-compatible serialization.

---

## 2. N+1 Query Patterns (Production Risk)

**Impact:** Linear DB round-trips per issue. Slow at scale, invisible until list views hit 100+ issues.

### 2a. `server/handlers/issues.rs:156-164` — `list_issues()`
For each issue in the result set, calls `db.get_labels(id)` and `db.get_blockers(id)` separately.
100 issues = 200 queries. Should be 2 batch queries returning `HashMap<i64, Vec<_>>`.

### 2b. `server/handlers/issues.rs:246-282` — `get_issue()`
Calls 6 separate queries per issue detail view: labels, comments, blockers, blocking, subissues, milestone.
**Fix:** `db.get_issue_detail(id)` returning a hydrated struct in 1-2 queries using JOINs.

### 2c. Silent error swallowing
Same handlers use `.unwrap_or_default()` on all those queries — if the DB is corrupted or locked, the user gets empty data with a 200 status, not a 500 error.

---

## 3. `main.rs` Command Dispatch (2,782 lines)

**Impact:** Every new command requires edits in 3+ places. Hidden aliases duplicate argument definitions verbatim.

### 3a. Clap definition duplication
`Commands::Create` (lines 299-323) and `IssueCommands::Create` (lines 465-489) have **identical field definitions** — title, description, priority, template, label, work, defer_id, parent. Same for Quick, List, Show, Close. ~10 hidden aliases × ~8 fields each.

### 3b. Per-arm resource initialization
`find_crosslink_dir()` called ~50 times, `get_db()` ~45 times across match arms. No lazy caching, no shared initialization.

### 3c. Swarm nested match (lines 2625-2757)
29 `SwarmCommands` arms, every one follows the same pattern: extract crosslink_dir, call `commands::swarm::<fn>()`. Should be a single `commands::swarm::dispatch(action, &crosslink_dir, json, quiet)` call.

### 3d. Inline business logic
- `Heartbeat` (lines 2504-2520): loads agent config, pushes heartbeat — should be in `daemon::heartbeat()`
- `Kickoff` (lines 2586-2611): hardcodes default `KickoffCommands::Launch` fields that are already defined as clap defaults
- `Config` (lines 2570-2576): bare vs. subcommand branching belongs in the config module

### 3e. JSON output branching
List, Search, Show commands all have `if json { run_json() } else { run() }` — repeated 3 times in `dispatch_issue()`. Should be handled at the command module level.

### 3f. External repo branching
List, Search, Show all copy-paste the same `if let Some(repo) = repo { external_issues::... } else { local::... }` pattern.

### 3g. Inconsistent param naming
Half the command enums use `action: FooCommands`, the other half use `command: FooCommands`. No rule for which.

---

## 4. `init.rs` Kitchen Sink (2,969 lines)

**Impact:** Changing Python detection requires reading through TUI rendering code. Five unrelated concerns in one file.

### Distinct concerns that should be separate modules:
1. **Python toolchain detection** (lines 26-210): `detect_python_prefix()`, `cpitd_is_installed()`, `install_cpitd()` — used by hooks/daemon too, not just init
2. **Git/signing configuration** (lines 212-286): `setup_driver_signing()` — security setup orthogonal to project init
3. **TUI walkthrough** (lines 956-1556, ~600 lines): `InitWalkthroughApp`, `draw_init_walkthrough()`, `run_tui_walkthrough()` — rendering logic that belongs in `tui/`
4. **Filesystem merging** (lines 365-545): `write_root_gitignore()`, `write_mcp_json_merged()`, `write_settings_json_merged()` — JSON merge utilities reused by config.rs
5. **The `run()` function** (lines 1598-2969, 1,370 lines): orchestrator that should delegate to the above

**Proposed structure:**
```
commands/init/
  mod.rs          (~300 lines, orchestration)
  python_env.rs   (~150 lines)
  signing.rs      (~80 lines)
  filesystem.rs   (~300 lines)
  tests.rs
tui/init_wizard.rs (~600 lines)
```

---

## 5. `signing.rs` Monolith (2,285 lines)

**Impact:** Five unrelated concerns in one file. Platform-specific branches make testing hard.

### 5a. Mixed concerns
- SSH key generation + Unix chmod + Windows icacls (lines 48-153, 105-line function)
- Git worktree detection (lines 246-281) — belongs in git-ops module
- AllowedSigners lifecycle (lines 423-556) — file I/O + CRUD + validation
- SSH signature pipeline (lines 566-762) — sign, verify, parse multi-format output
- Platform helpers (lines 764-807) — `home_dir_fallback()`, `make_temp_dir()`

### 5b. Repeated subprocess pattern
`Command::new("ssh-keygen")` appears 4+ times, each with its own output checking and error context. No shared `check_command_output()` helper.

### 5c. `AllowedSigners::parse()` uses `eprintln!()` for validation errors
Can't suppress in silent mode, can't test for specific warnings. Should return `Result` or use `tracing::warn!`.

### 5d. `generate_agent_key()` — 105 lines
Platform branches for Unix/Windows permissions are 57 lines of the function. Extract `set_key_permissions()`.

---

## 6. `seam.rs` Pipeline (2,149 lines)

**Impact:** Eight algorithmic concerns in one file. Hardcoded thresholds, Rust-specific detection mixed with language-agnostic collection.

### 6a. Eight concerns that should be separate modules
1. File system traversal + collection (lines 127-193)
2. Rust module boundary detection (lines 195-367) — only works for Rust
3. Directory-based fallback partitioning (lines 372-401)
4. Coverage validation + deduplication (lines 408-444)
5. Git coupling analysis (lines 453-519) — subprocess + output parsing
6. Graph-based partition merging with inlined union-find (lines 523-615)
7. Size-based partition adjustment (lines 622-691)
8. Partition count trimming (lines 763-803)

### 6b. Hardcoded thresholds with no configuration
`MAX_PARTITION_LINES = 2000`, `MIN_PARTITION_LINES = 200`, `GIT_LOG_DEPTH = 200`, `COUPLING_THRESHOLD = 3`.

### 6c. Inconsistent line counting
`ensure_complete_coverage()` creates `_uncategorized` partition with `line_count = 0`. `split_partition()` approximates per-file counts. Root path not threaded to all helpers.

### 6d. Label format is an implicit contract
Partitions use `::` for modules, `/a`/`/b` for splits, `+` for merges — no type safety.

---

## 7. TUI Duplication (8,718 lines across 7 files)

**Impact:** Adding a new list-based tab requires reimplementing navigation, search, detail scroll, and table rendering from scratch.

### 7a. Duplicated list rendering (~5 instances)
All tabs use identical vertical layout: header + content + footer. Table column construction, highlight styling, and status row formatting are copy-pasted.

### 7b. Duplicated input handlers (~21 occurrences of navigation pattern)
`Down/j → selected + 1`, `Up/k → selected - 1` appears in every tab. Search input handling duplicated between issues_tab and knowledge_tab. Detail scroll duplicated in 4 tabs.

**Fix:** Extract `ListNavigationHandler` and `SearchHandler` traits.

### 7c. Hardcoded colors scattered
`Color::Indexed(236)` (highlight bg) appears 5 times. Priority/status colors duplicated.

### 7d. Inconsistent state management
IssuesTab blocks UI thread on DB load. AgentsTab uses background thread + mpsc. ConfigTab uses mixed approach. No shared `BackgroundLoader<T>`.

### 7e. TUI code in `commands/`
`init.rs` ~600 lines, `config.rs` ~300 lines, `kickoff/wizard.rs` 1,193 lines of TUI rendering in command modules.

---

## 8. `compaction.rs` Monolith (3,508 lines)

### 8a. `apply()` — duplicated event reduction (lines 318-518)
13 of 14 event types follow identical pattern: check exists → mutate → update timestamp → track change. Label/relation/dependency add/remove are near-identical pairs.

### 8b. `compact()` — too many responsibilities (lines 144-279)
Mixes: lock acquisition, watermark reading, event collection, sorting, state reset, reduction loop, materialization, skew detection, checkpoint writing.

---

## 9. Duplicated Code Across Modules

### 9a. Test setup boilerplate — 19 files
`setup_test_db()` copy-pasted into 19 command test modules.

### 9b. Jaccard similarity — 2 implementations
`findings.rs` and `issue_filing.rs` both implement Jaccard with different thresholds (0.5 vs 0.7).

### 9c. String truncation — 3 implementations
`utils.rs::truncate()`, `utils.rs::truncate_slug()`, `kickoff/helpers.rs::slugify_with_max()`.

### 9d. JSON file read-or-empty — 3+ implementations
`init.rs`, `config.rs`, `knowledge/operations.rs` all implement the same pattern.

### 9e. Dynamic SQL construction — 3 instances
`db/issues.rs`, `db/token_usage.rs` (×2) build WHERE clauses with manual string concatenation. Extract `QueryBuilder`.

---

## 10. `hydration.rs` Complexity

### 10a. `hydrate_to_sqlite()` does too much (lines 74-359)
Clears data, deduplicates, hydrates 7 entity types, preserves SQLite-only issues, toggles FK constraints — all in one function.

### 10b. Three V1/V2 layout paths
`read_locks()`, `read_locks_v2()`, `read_locks_auto()` each reimplement error handling. Layout version not cached.

### 10c. Deduplication gaps
Groups by `display_id` but doesn't deduplicate by UUID.

---

## 11. `shared_writer` Brittleness

### 11a. `emit_compact_push()` — silent partial staging
If `cache_dir/issues/` doesn't exist, staging silently does nothing.

### 11b. `mutations.rs` — `Cell<i64>` for closure capture
If closure never executes (error path), `display_id.get()` is 0 with no guard.

### 11c. `hydrate_with_retry()` — undifferentiated retry
No distinction between transient and permanent errors. Single retry, no backoff.

---

## 12. Sync Layer Risks

### 12a. Lock retry without jitter
Multiple agents retry simultaneously after push conflict → thundering herd.

### 12b. PID-based stale lock detection
PID wrapping: recycled PID could cause incorrect lock auto-removal.

### 12c. 30-second polling timeout
300 iterations at 100ms = effective spin-wait under contention.

---

## 13. Business Logic in Wrong Layers

### 13a. Graph traversal in DB layer
`db/relations.rs` — DFS cycle detection in same file as SQL queries. Should be `utils::graph`.

### 13b. Server handler does post-filtering
`handlers/issues.rs` — search returns all issues, handler filters in memory. Should be in DB.

### 13c. Relation normalization inconsistency
`add_relation()` normalizes (smaller ID first), `get_related_issues()` queries both directions.

---

## 14. Remaining Single-File Modules

- `knowledge/operations.rs` (1,637 lines) → split into basic/sync/search
- `kickoff/launch.rs` (706 lines) → split into worktree/git/container/agent
- `kickoff/monitor.rs` (884 lines) → split into status/logs/report
- `kickoff/helpers.rs` (697 lines) → distribute to appropriate utility modules

---

## 15. Minor Issues

- `external.rs`: manual timeout loop for `git_ls_remote_ok()` instead of timeout crate
- `daemon.rs`: no SIGTERM signal handling
- `clock_skew.rs`: re-runs `git log` every detection cycle, no caching
- Magic numbers: similarity thresholds (0.5, 0.7), stale timeouts (30, 60), severity mappings — hardcoded without rationale
- `tui/issues_tab.rs:1109` has private copy of `format_issue_id()` from `utils.rs`

---

## Suggested Execution Order

### Phase 1 — Quick wins (low risk, high value)
- [ ] Extract `setup_test_db()` to shared test helper (19 files)
- [ ] Consolidate truncation/slugification (3 impls → 1)
- [ ] Extract JSON read-or-empty utility
- [ ] Deduplicate Jaccard similarity
- [ ] Introduce `Status`, `Priority`, `CommentKind` enums

### Phase 2 — Structural refactors (medium risk)
- [ ] Break `init.rs` into module directory
- [ ] Move TUI code out of `commands/` (init, config, kickoff wizards)
- [ ] Split `signing.rs` into `signing/` module directory
- [ ] Split `seam.rs` into `seam/` module directory
- [ ] Refactor `main.rs` dispatch (delegate nested matches, deduplicate aliases)

### Phase 3 — Performance & correctness
- [ ] Fix N+1 queries in `handlers/issues.rs` (batch labels/blockers)
- [ ] Add `db.get_issue_detail()` hydrated query
- [ ] Replace `.unwrap_or_default()` with proper error propagation in handlers
- [ ] Extract `QueryBuilder` for dynamic SQL
- [ ] Add jitter to lock retry loops

### Phase 4 — Deep refactors (higher risk)
- [ ] Refactor `compaction.rs` apply() to declarative field operations
- [ ] Split `hydrate_to_sqlite()` into per-entity functions
- [ ] Unify V1/V2 layout code paths
- [ ] Extract graph algorithms from `db/relations.rs`
- [ ] Split `knowledge/operations.rs`, `kickoff/launch.rs`, `kickoff/monitor.rs`

---

*This audit was performed on the `develop` branch at commit `83a59c14`. Findings reflect the codebase as of 2026-03-21.*

Epic: Codebase maintainability audit — structural refactors for long-term health #498

Description

Context

1. Stringly-Typed Domain Values (Pervasive)

2. N+1 Query Patterns (Production Risk)

2a. server/handlers/issues.rs:156-164 — list_issues()

2b. server/handlers/issues.rs:246-282 — get_issue()

2c. Silent error swallowing

3. main.rs Command Dispatch (2,782 lines)

3a. Clap definition duplication

3b. Per-arm resource initialization

3c. Swarm nested match (lines 2625-2757)

3d. Inline business logic

3e. JSON output branching

3f. External repo branching

3g. Inconsistent param naming

4. init.rs Kitchen Sink (2,969 lines)

Distinct concerns that should be separate modules:

5. signing.rs Monolith (2,285 lines)

5a. Mixed concerns

5b. Repeated subprocess pattern

5c. AllowedSigners::parse() uses eprintln!() for validation errors

5d. generate_agent_key() — 105 lines

6. seam.rs Pipeline (2,149 lines)

6a. Eight concerns that should be separate modules

6b. Hardcoded thresholds with no configuration

6c. Inconsistent line counting

6d. Label format is an implicit contract

7. TUI Duplication (8,718 lines across 7 files)

7a. Duplicated list rendering (~5 instances)

7b. Duplicated input handlers (~21 occurrences of navigation pattern)

7c. Hardcoded colors scattered

7d. Inconsistent state management

7e. TUI code in commands/

8. compaction.rs Monolith (3,508 lines)

8a. apply() — duplicated event reduction (lines 318-518)

8b. compact() — too many responsibilities (lines 144-279)

9. Duplicated Code Across Modules

9a. Test setup boilerplate — 19 files

9b. Jaccard similarity — 2 implementations

9c. String truncation — 3 implementations

9d. JSON file read-or-empty — 3+ implementations

9e. Dynamic SQL construction — 3 instances

10. hydration.rs Complexity

10a. hydrate_to_sqlite() does too much (lines 74-359)

10b. Three V1/V2 layout paths

10c. Deduplication gaps

11. shared_writer Brittleness

11a. emit_compact_push() — silent partial staging

11b. mutations.rs — Cell<i64> for closure capture

11c. hydrate_with_retry() — undifferentiated retry

12. Sync Layer Risks

12a. Lock retry without jitter

12b. PID-based stale lock detection

12c. 30-second polling timeout

13. Business Logic in Wrong Layers

13a. Graph traversal in DB layer

13b. Server handler does post-filtering

13c. Relation normalization inconsistency

14. Remaining Single-File Modules

15. Minor Issues

Suggested Execution Order

Phase 1 — Quick wins (low risk, high value)

Phase 2 — Structural refactors (medium risk)

Phase 3 — Performance & correctness

Phase 4 — Deep refactors (higher risk)

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions

2a. `server/handlers/issues.rs:156-164` — `list_issues()`

2b. `server/handlers/issues.rs:246-282` — `get_issue()`

3. `main.rs` Command Dispatch (2,782 lines)

4. `init.rs` Kitchen Sink (2,969 lines)

5. `signing.rs` Monolith (2,285 lines)

5c. `AllowedSigners::parse()` uses `eprintln!()` for validation errors

5d. `generate_agent_key()` — 105 lines

6. `seam.rs` Pipeline (2,149 lines)

7e. TUI code in `commands/`

8. `compaction.rs` Monolith (3,508 lines)

8a. `apply()` — duplicated event reduction (lines 318-518)

8b. `compact()` — too many responsibilities (lines 144-279)

10. `hydration.rs` Complexity

10a. `hydrate_to_sqlite()` does too much (lines 74-359)

11. `shared_writer` Brittleness

11a. `emit_compact_push()` — silent partial staging

11b. `mutations.rs` — `Cell<i64>` for closure capture

11c. `hydrate_with_retry()` — undifferentiated retry