diff --git a/AGENTS.md b/AGENTS.md
index 2dab179..d81a0cb 100644
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -33,15 +33,19 @@ Use this order:
 - prefer clarity and explicit boundaries over magical automation
 - preserve `/srv/abyss-stack` as the canonical deployed runtime root unless explicitly redesigned
 - preserve the split between normative platform docs, public-safe host facts, and private host facts
+- treat current-machine fit as a first-class runtime concern before latency-sensitive or accelerator-sensitive work
 
 ## Host-facts rule
 
 - `docs/REFERENCE_PLATFORM.md` owns the intended host posture.
 - `docs/REFERENCE_PLATFORM_SPEC.md` owns the machine-readable contract and capture destinations.
+- `docs/MACHINE_FIT_POLICY.md` owns the current-machine adaptation policy and capture destinations.
 - `scripts/aoa-doctor` answers readiness, not durable inventory.
 - `scripts/aoa-host-facts` captures durable host facts.
+- `scripts/aoa-machine-fit` captures the bounded current-machine runtime posture.
 - public-safe artifacts may live under `docs/reference-platform/`
 - private captures belong under `${AOA_STACK_ROOT}/Logs/host-facts/`
+- private machine-fit captures belong under `${AOA_STACK_ROOT}/Logs/machine-fit/`
 
 ## Repository reading order
 
diff --git a/README.md b/README.md
index a32922a..973751f 100644
--- a/README.md
+++ b/README.md
@@ -18,6 +18,7 @@ This repository is the right home for:
 - runtime-facing return and bounded context-rebuild policy for agent-facing routes
 - security, runbook, backup, and restore posture
 - normative host posture and machine-readable host-facts contracts
+- current-machine fit policy, driver freshness posture, and bounded machine-local tuning guidance
 - platform-adaptation policy and public-safe/private tuning record contracts
 - infra helper services that support AoA and ToS
 
@@ -59,22 +60,23 @@ This repository should not absorb:
 15. Read [docs/STORAGE_LAYOUT](docs/STORAGE_LAYOUT.md).
 16. Read [docs/REFERENCE_PLATFORM](docs/REFERENCE_PLATFORM.md).
 17. Read [docs/REFERENCE_PLATFORM_SPEC](docs/REFERENCE_PLATFORM_SPEC.md).
-18. Read [docs/PLATFORM_ADAPTATION_POLICY](docs/PLATFORM_ADAPTATION_POLICY.md).
-19. Read [docs/BRANCH_POLICY](docs/BRANCH_POLICY.md).
-20. Read [docs/MEMO_RUNTIME_SEAM](docs/MEMO_RUNTIME_SEAM.md).
-21. Read [docs/EVAL_RUNTIME_SEAM](docs/EVAL_RUNTIME_SEAM.md).
-22. Read [docs/PLAYBOOK_RUNTIME_SEAM](docs/PLAYBOOK_RUNTIME_SEAM.md).
-23. Read [docs/MODEL_PROFILES](docs/MODEL_PROFILES.md).
-24. Read [docs/CONTEXT_BUDGET_POLICY](docs/CONTEXT_BUDGET_POLICY.md).
-25. Read [docs/RECURRENCE_RUNTIME_POLICY](docs/RECURRENCE_RUNTIME_POLICY.md).
-26. Read [docs/DEPLOYMENT](docs/DEPLOYMENT.md).
-27. Read [docs/FIRST_RUN](docs/FIRST_RUN.md).
-28. Read [docs/DOCTOR](docs/DOCTOR.md).
-29. Read [docs/SECRETS_BOOTSTRAP](docs/SECRETS_BOOTSTRAP.md).
-30. Read [docs/LIFECYCLE](docs/LIFECYCLE.md).
-31. Read [docs/RUNBOOK](docs/RUNBOOK.md).
-32. Read [docs/SECURITY](docs/SECURITY.md).
-33. Read [docs/MIGRATION_FROM_OLD](docs/MIGRATION_FROM_OLD.md).
+18. Read [docs/MACHINE_FIT_POLICY](docs/MACHINE_FIT_POLICY.md).
+19. Read [docs/PLATFORM_ADAPTATION_POLICY](docs/PLATFORM_ADAPTATION_POLICY.md).
+20. Read [docs/BRANCH_POLICY](docs/BRANCH_POLICY.md).
+21. Read [docs/MEMO_RUNTIME_SEAM](docs/MEMO_RUNTIME_SEAM.md).
+22. Read [docs/EVAL_RUNTIME_SEAM](docs/EVAL_RUNTIME_SEAM.md).
+23. Read [docs/PLAYBOOK_RUNTIME_SEAM](docs/PLAYBOOK_RUNTIME_SEAM.md).
+24. Read [docs/MODEL_PROFILES](docs/MODEL_PROFILES.md).
+25. Read [docs/CONTEXT_BUDGET_POLICY](docs/CONTEXT_BUDGET_POLICY.md).
+26. Read [docs/RECURRENCE_RUNTIME_POLICY](docs/RECURRENCE_RUNTIME_POLICY.md).
+27. Read [docs/DEPLOYMENT](docs/DEPLOYMENT.md).
+28. Read [docs/FIRST_RUN](docs/FIRST_RUN.md).
+29. Read [docs/DOCTOR](docs/DOCTOR.md).
+30. Read [docs/SECRETS_BOOTSTRAP](docs/SECRETS_BOOTSTRAP.md).
+31. Read [docs/LIFECYCLE](docs/LIFECYCLE.md).
+32. Read [docs/RUNBOOK](docs/RUNBOOK.md).
+33. Read [docs/SECURITY](docs/SECURITY.md).
+34. Read [docs/MIGRATION_FROM_OLD](docs/MIGRATION_FROM_OLD.md).
 
 For the shortest next route by intent:
 - if you need the ecosystem center, layer map, or federation rules, go to [`Agents-of-Abyss`](https://github.com/8Dionysus/Agents-of-Abyss)
@@ -88,6 +90,7 @@ For the shortest next route by intent:
 - if you need the Windows host and WSL bridge workflow, read [docs/WINDOWS_BRIDGE](docs/WINDOWS_BRIDGE.md), [docs/WINDOWS_SETUP](docs/WINDOWS_SETUP.md), and [docs/WINDOWS_PERFORMANCE](docs/WINDOWS_PERFORMANCE.md)
 - if you need runtime benchmark ownership, storage, and manifest rules, read [docs/RUNTIME_BENCH_POLICY](docs/RUNTIME_BENCH_POLICY.md)
 - if you need normative host posture or machine-readable host-facts capture, read [docs/REFERENCE_PLATFORM](docs/REFERENCE_PLATFORM.md) and [docs/REFERENCE_PLATFORM_SPEC](docs/REFERENCE_PLATFORM_SPEC.md)
+- if you need to tune the runtime to the current machine, confirm driver freshness, or decide which preset the host should prefer, read [docs/MACHINE_FIT_POLICY](docs/MACHINE_FIT_POLICY.md)
 - if you need a compact record of platform-specific quirks, adaptations, and portability notes, read [docs/PLATFORM_ADAPTATION_POLICY](docs/PLATFORM_ADAPTATION_POLICY.md)
 - if you need the repo merge and branch discipline, read [docs/BRANCH_POLICY](docs/BRANCH_POLICY.md)
 - if you need the runtime-side memo mirror, recall seam, or export candidates, read [docs/MEMO_RUNTIME_SEAM](docs/MEMO_RUNTIME_SEAM.md)
@@ -164,6 +167,7 @@ The repository now includes:
 - render-truth helpers for actual composed runtime output
 - runtime benchmark policy, schema, and example artifacts
 - reference-platform schema and host-facts capture support
+- machine-fit schema and current-host adaptation capture support
 - platform-adaptation schema, example artifacts, and capture support
 - preset-aware composition helpers and preset introspection
 - Windows host bridge scripts and WSL guidance docs
@@ -174,9 +178,9 @@ The repository now includes:
 
 ## Current status
 
-`abyss-stack` is now a live multi-service runtime with stateful storage, local and Intel-aware inference paths, monitoring, host-facts capture, platform-adaptation logging, and landed federation advisory seams for sibling AoA repositories.
+`abyss-stack` is now a live multi-service runtime with stateful storage, local and Intel-aware inference paths, monitoring, host-facts capture, machine-fit capture, platform-adaptation logging, and landed federation advisory seams for sibling AoA repositories.
 The first live consumer step has now landed in `langchain-api` through opt-in `POST /run/federated`, which can consume advisory playbook and memo seams without changing the default `POST /run` path.
-The next large step is no longer whether the live runtime should consume those seams at all, but how broadly and how deeply the runtime loop should rely on them.
+The next large step is no longer bootstrap or mirror landing, or whether the live runtime should consume those seams at all; it is deciding how broadly and how deeply the runtime loop should rely on those already-landed seams.
 
 ## License
 
diff --git a/docs/DEPLOYMENT.md b/docs/DEPLOYMENT.md
index d15fa20..846639d 100644
--- a/docs/DEPLOYMENT.md
+++ b/docs/DEPLOYMENT.md
@@ -23,6 +23,7 @@ If you want the least-friction path, use:
 ```bash
 scripts/aoa-doctor
 scripts/aoa-first-run --strict
+scripts/aoa-machine-fit --mode private --write "${AOA_STACK_ROOT}/Logs/machine-fit/latest/latest.private.json"
 ```
 
 `aoa-first-run --strict` is strict about layout and bootstrapped config presence, but still ignores missing secrets on that first pass by design.
@@ -102,6 +103,15 @@ Use `--strict` if warnings should fail the command.
 
 From a Windows host, use `pwsh -File scripts/aoa.ps1 host-doctor` for the Windows+WSL readiness pass before invoking the Linux doctor.
 
+### `scripts/aoa-machine-fit`
+
+Captures the bounded current-machine runtime posture after the layout exists.
+Use it to record:
+- which preset this host should currently prefer
+- whether the relevant host packages are current in configured repos
+- what validated local tuning should be reused
+- whether the current host envelope is too noisy for latency-sensitive work
+
 ### `scripts/aoa-install-layout`
 
 Creates the non-destructive runtime directory skeleton under `${AOA_STACK_ROOT}`.
@@ -204,6 +214,7 @@ scripts/aoa-install-layout
 scripts/aoa-sync-configs
 scripts/aoa-bootstrap-configs
 scripts/aoa-check-layout --ignore-secrets --strict
+scripts/aoa-machine-fit --mode private --write "${AOA_STACK_ROOT}/Logs/machine-fit/latest/latest.private.json"
 scripts/aoa-sync-federation-surfaces --layer aoa-agents   # optional
 scripts/aoa-sync-federation-surfaces --layer aoa-routing  # optional
 scripts/aoa-sync-federation-surfaces --layer aoa-memo     # optional
diff --git a/docs/DOCTOR.md b/docs/DOCTOR.md
index f5f6e43..969112a 100644
--- a/docs/DOCTOR.md
+++ b/docs/DOCTOR.md
@@ -18,6 +18,8 @@ The current doctor pass looks at things like:
 - whether the optional vault path appears mounted
 - whether the stack root is the canonical `/srv/abyss-stack`
 - whether the selected runtime includes internal-only layers that should later be checked with `aoa-smoke --with-internal`
+- whether a current machine-fit record is missing for the deployed runtime root
+- whether the current host envelope looks noisy for latency-sensitive work
 
 ## Preset-aware and profile-aware behavior
 
@@ -39,6 +41,8 @@ Use `aoa-doctor` to decide whether a selected runtime is ready to start.
 
 Use `scripts/aoa-host-facts` to capture durable machine-readable host facts.
 
+Use `scripts/aoa-machine-fit` to capture the bounded current-machine runtime posture after host facts exist.
+
 The two surfaces complement each other and should not absorb each other's job.
 
 ## Usage
@@ -72,6 +76,7 @@ Durable host-facts capture:
 ```bash
 scripts/aoa-host-facts --mode public --write /tmp/reference-host.public.review.json
 scripts/aoa-host-facts --mode private --write "${AOA_STACK_ROOT}/Logs/host-facts/latest.private.json"
+scripts/aoa-machine-fit --mode private --write "${AOA_STACK_ROOT}/Logs/machine-fit/latest/latest.private.json"
 ```
 
 Keep `docs/reference-platform/reference-host.public.json` for later canonical-host refreshes, not routine local captures.
@@ -110,6 +115,7 @@ For a generic full bundle:
 scripts/aoa-doctor --preset agent-full
 scripts/aoa-first-run --strict
 scripts/aoa-check-layout --strict
+scripts/aoa-machine-fit --mode private --write "${AOA_STACK_ROOT}/Logs/machine-fit/latest/latest.private.json"
 scripts/aoa-smoke --with-internal --preset agent-full
 ```
 
@@ -119,5 +125,6 @@ For an Intel-aware full bundle:
 scripts/aoa-doctor --preset intel-full
 scripts/aoa-first-run --strict
 scripts/aoa-check-layout --strict
+scripts/aoa-machine-fit --mode private --write "${AOA_STACK_ROOT}/Logs/machine-fit/latest/latest.private.json"
 scripts/aoa-smoke --with-internal --preset intel-full
 ```
diff --git a/docs/FIRST_RUN.md b/docs/FIRST_RUN.md
index 7c15c31..2dfb0fe 100644
--- a/docs/FIRST_RUN.md
+++ b/docs/FIRST_RUN.md
@@ -51,18 +51,20 @@ Then validate the fully bootstrapped layout:
 scripts/aoa-check-layout --strict
 ```
 
-## Optional but recommended: capture host facts
+## Optional but recommended: capture host facts and machine fit
 
-Once the runtime roots exist, record both the public-safe and local-private host posture:
+Once the runtime roots exist, record both the public-safe and local-private host posture, then capture the bounded current-machine fit:
 
 ```bash
 scripts/aoa-host-facts --mode public --write /tmp/reference-host.public.review.json
 scripts/aoa-host-facts --mode private --write "${AOA_STACK_ROOT}/Logs/host-facts/latest.private.json"
+scripts/aoa-machine-fit --mode private --write "${AOA_STACK_ROOT}/Logs/machine-fit/latest/latest.private.json"
 ```
 
 Review the public artifact before commit.
 Do not commit the private artifact.
 Only refresh `docs/reference-platform/reference-host.public.json` when you are intentionally updating the reviewed canonical Linux reference host snapshot.
+Refresh the private machine-fit record when kernel, firmware, container runtime, or validated local tuning changes.
 
 ## Inspect the profile before launch
 
@@ -107,6 +109,8 @@ scripts/aoa-profile-modules --profile agentic --paths
 scripts/aoa-profile-endpoints --profile agentic
 scripts/aoa-render-services --profile agentic
 scripts/aoa-up --profile agentic
+scripts/aoa-smoke --profile agentic
+scripts/aoa-qwen-check --case exact-reply
 ```
 
 ### Intel-aware runtime
@@ -118,6 +122,8 @@ scripts/aoa-profile-modules --profile intel --paths
 scripts/aoa-profile-endpoints --profile intel
 scripts/aoa-render-services --profile intel
 scripts/aoa-up --profile intel
+scripts/aoa-smoke --profile intel
+scripts/aoa-qwen-check --case exact-reply
 ```
 
 ## Use a preset instead of spelling the whole composition
@@ -128,8 +134,21 @@ scripts/aoa-profile-endpoints --preset agent-full
 scripts/aoa-render-services --preset agent-full
 scripts/aoa-up --preset agent-full
 scripts/aoa-smoke --with-internal --preset agent-full
+scripts/aoa-qwen-bench --preset agent-full
 ```
 
+## Optional supervised local AI qualification
+
+Once the intended Qwen path is healthy, materialize the bounded local pilot and run the runtime wave:
+
+```bash
+scripts/aoa-local-ai-trials materialize
+scripts/aoa-local-ai-trials run-wave W0
+```
+
+That flow keeps machine-readable trial truth under `Logs/local-ai-trials/` and writes Markdown mirrors to `Dionysus/reports/local-ai-trials/`.
+Use [LOCAL_AI_TRIALS](LOCAL_AI_TRIALS.md) for the full contract.
+
 ## Compose optional layers manually
 
 ### Agent runtime plus tools
@@ -168,6 +187,7 @@ Then read:
 - [DEPLOYMENT](DEPLOYMENT.md)
 - [DOCTOR](DOCTOR.md)
 - [REFERENCE_PLATFORM_SPEC](REFERENCE_PLATFORM_SPEC.md)
+- [MACHINE_FIT_POLICY](MACHINE_FIT_POLICY.md)
 - [PRESETS](PRESETS.md)
 - [PROFILE_RECIPES](PROFILE_RECIPES.md)
 - [RENDER_TRUTH](RENDER_TRUTH.md)
diff --git a/docs/LOCAL_AI_TRIALS.md b/docs/LOCAL_AI_TRIALS.md
new file mode 100644
index 0000000..6f5b4e2
--- /dev/null
+++ b/docs/LOCAL_AI_TRIALS.md
@@ -0,0 +1,197 @@
+# LOCAL AI TRIALS
+
+## Purpose
+
+This document defines the bounded local-trial surface for supervised model trials on `abyss-stack`.
+
+It is narrower than a proof layer and narrower than a benchmark-only surface:
+
+- runtime truth stays local to `abyss-stack`
+- per-case trial packets stay explicit and reviewable
+- durable human+AI-readable summaries may be mirrored elsewhere
+- no new HTTP APIs are introduced for the trial surface
+
+## Canonical pilot in this runtime
+
+Current program:
+- `qwen-local-pilot-v1`
+
+Canonical baseline:
+- preset: `intel-full`
+- runtime path: `langchain-api /run`
+- local Qwen posture:
+  - `LC_OLLAMA_NUM_THREAD=6`
+  - `LC_OLLAMA_NUM_BATCH=32`
+  - `LC_OLLAMA_THINK=false`
+
+## Dual-surface reporting
+
+Runtime truth root:
+- `${AOA_STACK_ROOT}/Logs/local-ai-trials/qwen-local-pilot-v1/`
+
+Durable human+AI-readable mirror:
+- `/srv/Dionysus/reports/local-ai-trials/qwen-local-pilot-v1/`
+
+Keep the split explicit:
+
+- `abyss-stack` owns machine-readable trial truth and runtime-local artifacts
+- `Dionysus` may mirror curated Markdown reports and wave digests
+- do not move raw runtime truth into `Dionysus`
+- do not let the mirror become a shadow owner of runtime behavior
+
+## Required packet shape
+
+Each executed case must own one packet with:
+
+- `case.spec.json`
+- `run.manifest.json`
+- `result.summary.json`
+- `report.md`
+
+Each wave must own:
+
+- `wave-index.json`
+- `wave-index.md`
+
+The fixed report sections are:
+
+- `Goal`
+- `Inputs`
+- `Expected Result`
+- `Actual Result`
+- `Evidence`
+- `Boundary Check`
+- `Verdict`
+- `Failures`
+- `Follow-up`
+
+## Runner
+
+Use the runtime helper:
+
+```bash
+scripts/aoa-local-ai-trials materialize
+scripts/aoa-local-ai-trials run-wave W0
+scripts/aoa-local-ai-trials run-wave W1
+scripts/aoa-local-ai-trials run-wave W2
+scripts/aoa-local-ai-trials run-wave W3
+scripts/aoa-local-ai-trials prepare-wave W4 --lane docs
+scripts/aoa-local-ai-trials apply-case W4 <case-id>
+```
+
+What the helper does now:
+
+- materializes contracts and frozen case specs for `W0` through `W4`
+- writes planned wave indexes for later waves
+- executes `W0` on the intended local runtime path
+- executes `W1` through grounded local snippets on the same `langchain-api /run` path
+- executes `W2` through supervised read-only grounding on the same `langchain-api /run` path
+- executes `W3` through grounded exact-only selection on the same `langchain-api /run` path
+- prepares `W4` proposals through a staged supervised-edit flow
+- applies approved `W4` cases only after isolated worktree validation
+- restores the baseline after the parity sample
+
+What it does not do:
+
+- it does not introduce a new serving API
+- it does not upgrade runtime success into portable proof wording
+- it does not collapse `W4` into a silent monolithic mutator
+
+## W1 grounded execution
+
+Use:
+
+```bash
+scripts/aoa-qwen-run --prompt-file /tmp/example.prompt.txt --json
+```
+
+The `W1` runner:
+
+- reads only local text `source_refs`
+- stores bounded grounded excerpt capture in `grounding.txt`
+- builds `prompt.txt` from compact prompt slices derived from the same local refs
+- calls `aoa-qwen-run` with `temperature=0`
+- scores exact repo ownership and boundary confusion cases without introducing new HTTP APIs
+
+## W2 supervised read-only execution
+
+The `W2` runner:
+
+- requires a green `W1` gate before execution
+- captures local refs, HTTP `GET` evidence, and declared read-only command outcomes before prompting Qwen
+- stores `grounding.txt`, `prompt.txt`, `judge.prompt.txt`, and `evidence.summary.json` per case
+- uses a compact JSON answer contract instead of free-form prose
+- runs a second bounded judge pass through `aoa-qwen-run`
+- allows honest non-zero read-only command outcomes when the model reports them accurately and preserves boundaries
+- treats fabricated refs, paths, URLs, or commands as hard failures across the whole wave
+
+## W3 exact-only selection execution
+
+The `W3` runner:
+
+- requires a green `W2` gate before execution
+- captures local file refs and live HTTP source refs into `grounding.txt`, `prompt.txt`, and `evidence.summary.json`
+- uses `aoa-qwen-run` with `temperature=0`, `max_tokens=48`, and an exact-only plain-text answer contract
+- scores deterministically without a judge pass
+- treats silent widening as a case failure
+- treats unsafe-case mismatches or silent widening as wave-critical selection errors
+
+## W4 staged supervised edits
+
+The `W4` runner uses staged commands instead of `run-wave W4`.
+
+Use:
+
+```bash
+scripts/aoa-local-ai-trials prepare-wave W4 --lane docs
+scripts/aoa-local-ai-trials prepare-wave W4 --lane generated
+scripts/aoa-local-ai-trials apply-case W4 <case-id>
+```
+
+The `W4` flow:
+
+- requires a green `W3` gate before proposal preparation or apply
+- keeps docs-only and generated-refresh cases in separate lanes
+- prepares one proposal packet per case without mutating the target repo
+- keeps the public `prepare-wave W4` and `apply-case W4` interface stable while using a smaller staged internal docs flow
+- runs docs-lane `qwen_patch` preparation in four internal steps: `target-selection`, `alignment-plan`, `edit-spec exact`, and `edit-spec anchor fallback`
+- trims applicable root and nested `AGENTS.md` guidance to a bounded heading whitelist instead of copying full guide files into docs prompts
+- uses a hybrid docs mutation contract: `exact_replace` first, then `anchored_replace` if exact replacement is unavailable or ambiguous
+- fails closed when an edit-spec cannot be applied uniquely
+- builds `proposal.diff` deterministically inside the runner instead of accepting model-written raw unified diffs
+- uses `script_refresh` mode for generated cases and records the frozen builder command instead of asking the model for a diff
+- creates `approval.status.json` per case and requires explicit `approved` status before any mutation
+- runs every mutation first in an isolated git worktree
+- validates touched files against the frozen allowed-file scope before landing
+- reruns acceptance checks in the main repo only after the worktree passes
+- blocks generated-lane apply until docs lane has at least `5/6` passes and zero critical failures
+- continues docs-lane preparation across all cases even if one proposal is invalid
+
+W4-specific artifacts include:
+
+- `proposal.target.json`
+- `proposal.plan.json`
+- `proposal.edit-spec.json`
+- `proposal.prompt.txt`
+- `proposal.retry.prompt.txt`
+- `proposal.diff`
+- `proposal.summary.json`
+- `approval.status.json`
+- `worktree.manifest.json`
+
+W4 critical failures remain:
+
+- `unauthorized_scope_expansion`
+- `post_change_validation_failure`
+
+## Relationship to runtime benchmarks
+
+`aoa-qwen-bench` remains a bounded runtime benchmark helper.
+
+The local trial runner may reuse benchmark artifacts as evidence inside a case packet, but that reuse does not make the benchmark layer the owner of trial verdict meaning.
+
+Keep these boundaries:
+
+- runtime bench evidence is local machine truth
+- local trial packets are curated bounded case records
+- portable proof belongs in `aoa-evals`, not here
diff --git a/docs/MACHINE_FIT_POLICY.md b/docs/MACHINE_FIT_POLICY.md
new file mode 100644
index 0000000..a53f2dd
--- /dev/null
+++ b/docs/MACHINE_FIT_POLICY.md
@@ -0,0 +1,141 @@
+# MACHINE FIT POLICY
+
+## Purpose
+
+This document defines the bounded machine-fit layer for `abyss-stack`.
+
+The stack is not meant to run as if every host were interchangeable.
+It should:
+- discover what the current machine can actually do
+- prefer the strongest validated runtime path available on that machine
+- record driver and package freshness as part of runtime posture
+- keep that posture explicit enough for humans and agents to re-check later
+
+## What machine-fit is
+
+`machine-fit` is the current-host answer to:
+
+**what runtime selection, acceleration posture, and validated local tuning should this machine use right now?**
+
+It sits between:
+- `REFERENCE_PLATFORM.md`, which says what the stack is shaped for in general
+- host facts, which say what this host looks like
+- platform-adaptation records, which say what seam bent and what bounded change helped
+- runtime benchmarks, which say what latency or behavior was actually measured
+
+## What belongs here
+
+Use this layer for:
+- preferred preset or profile selection for the current host
+- current driver posture for visible accelerators
+- package freshness for the host packages that matter to the runtime path
+- validated local runtime settings such as bounded Ollama thread or batch posture
+- warnings about noisy host envelopes that can distort latency-sensitive work
+- compact refs to host facts, benchmark evidence, and adaptation records
+
+Do not use this layer for:
+- secret-bearing config
+- general troubleshooting diaries
+- broad capability marketing
+- proof-layer quality claims
+- authored doctrine from sibling AoA repositories
+
+## Relationship to other artifacts
+
+- `aoa-host-facts` records what the machine is
+- `aoa-machine-fit` records what runtime posture the machine should currently prefer
+- `aoa-platform-adaptation` records what specific seam bent and what bounded change helped
+- runtime benchmarks record measured behavior on the intended path
+
+The machine-fit layer is the operational bridge between inventory and retestable posture.
+
+## Artifact surfaces
+
+- `docs/machine-fit/schema.v1.json` defines the public contract
+- `docs/machine-fit/machine-fit.public.json.example` shows the intended public-safe shape
+- `${AOA_STACK_ROOT}/Logs/machine-fit/` is the local capture root
+
+## Capture modes
+
+### `public`
+
+Use when the artifact may live in git or be shared across machines.
+
+It should include:
+- hardware class
+- kernel release
+- visible accelerator posture
+- package freshness state
+- preferred preset or profile set
+- validated public-safe tuning keys
+- compact refs to public-safe host facts and reviewed adaptation examples when available
+
+It must not include:
+- hostnames
+- exact local-only paths
+- usernames or home directories unless intentionally public
+- secret-bearing env values
+
+### `private`
+
+Use when preserving the local machine record that operators and agents will actually consult.
+
+It may add:
+- local refs under `${AOA_STACK_ROOT}/Logs/`
+- fuller local driver and device posture
+- local benchmark refs
+- current host envelope warnings
+
+It still must not capture secrets.
+
+## Storage contract
+
+Recommended active tree:
+
+```text
+${AOA_STACK_ROOT}/Logs/machine-fit/
+  latest/
+    latest.private.json
+  records/
+    2026-03-29T230000Z__machine-fit__intel-core-ultra-9-285h/
+      machine-fit.private.json
+```
+
+Rules:
+- keep the JSON compact and export-friendly
+- reference bulky evidence instead of copying it
+- treat the machine-fit record as operational posture, not as benchmark truth
+- refresh it when kernel, firmware, drivers, container runtime, or validated local tuning changes
+
+## Strong record checklist
+
+A strong machine-fit record captures:
+- the current hardware class
+- the visible accelerator and driver posture
+- whether relevant host packages are current in configured repos
+- the preferred preset or profile set
+- the bounded validated runtime settings worth reusing
+- whether the current host envelope is quiet enough for latency-sensitive work
+- what to re-test when the machine drifts
+
+## Suggested commands
+
+Public-safe review:
+
+```bash
+scripts/aoa-machine-fit --mode public --write /tmp/machine-fit.public.review.json
+```
+
+Local private capture:
+
+```bash
+scripts/aoa-machine-fit \
+  --mode private \
+  --write "${AOA_STACK_ROOT}/Logs/machine-fit/latest/latest.private.json"
+```
+
+## Boundary to preserve
+
+`abyss-stack` may own the runtime-local record of what this machine should run and re-check.
+
+It does not own the global meaning of sibling AoA layers, and it does not replace runtime benchmarks or proof artifacts.
diff --git a/docs/PLATFORM_ADAPTATION_POLICY.md b/docs/PLATFORM_ADAPTATION_POLICY.md
index 62f6626..ffbd6f8 100644
--- a/docs/PLATFORM_ADAPTATION_POLICY.md
+++ b/docs/PLATFORM_ADAPTATION_POLICY.md
@@ -27,6 +27,7 @@ Do not use this surface for:
 
 ## Relationship to other artifacts
 - `aoa-host-facts` records what a concrete machine looks like
+- `aoa-machine-fit` records what runtime posture that machine should currently prefer
 - runtime benchmarks record measured runtime behavior
 - platform-adaptation records connect the two with bounded diagnosis and adaptation notes
 
diff --git a/docs/PROFILE_RECIPES.md b/docs/PROFILE_RECIPES.md
index 2a0934b..70361b4 100644
--- a/docs/PROFILE_RECIPES.md
+++ b/docs/PROFILE_RECIPES.md
@@ -78,6 +78,8 @@ scripts/aoa-render-services --profile agentic
 scripts/aoa-up --profile agentic
 scripts/aoa-wait --profile agentic
 scripts/aoa-smoke --profile agentic
+scripts/aoa-qwen-check --case exact-reply
+scripts/aoa-qwen-bench --profile agentic
 ```
 
 ## `intel`
@@ -102,6 +104,8 @@ scripts/aoa-render-services --profile intel
 scripts/aoa-up --profile intel
 scripts/aoa-wait --profile intel
 scripts/aoa-smoke --profile intel
+scripts/aoa-qwen-check --case exact-reply
+scripts/aoa-qwen-bench --profile intel
 ```
 
 ## `federation`
@@ -211,6 +215,7 @@ Preset form:
 aoa-preset-profiles --preset agent-tools --paths
 aoa-up --preset agent-tools
 aoa-smoke --with-internal --preset agent-tools
+aoa-qwen-bench --preset agent-tools
 ```
 
 ### `agentic + observability`
@@ -236,6 +241,7 @@ Preset form:
 aoa-preset-profiles --preset agent-observability --paths
 aoa-up --preset agent-observability
 aoa-smoke --with-internal --preset agent-observability
+aoa-qwen-bench --preset agent-observability
 ```
 
 ### `agentic + federation`
@@ -314,4 +320,5 @@ Preset form:
 aoa-preset-profiles --preset intel-full --paths
 aoa-up --preset intel-full
 aoa-smoke --with-internal --preset intel-full
+aoa-qwen-bench --preset intel-full
 ```
diff --git a/docs/REFERENCE_PLATFORM.md b/docs/REFERENCE_PLATFORM.md
index bd496dd..c27acfd 100644
--- a/docs/REFERENCE_PLATFORM.md
+++ b/docs/REFERENCE_PLATFORM.md
@@ -13,18 +13,21 @@
 This file is normative. It names the intended operating posture.
 
 Observed machine facts belong to the machine-readable host-facts layer described in [REFERENCE_PLATFORM_SPEC](REFERENCE_PLATFORM_SPEC.md).
+The current-host runtime choice belongs to [MACHINE_FIT_POLICY](MACHINE_FIT_POLICY.md).
 
 Recommended local review flow:
 
 ```bash
 scripts/aoa-host-facts --mode public --write /tmp/reference-host.public.review.json
 scripts/aoa-host-facts --mode private --write "${AOA_STACK_ROOT}/Logs/host-facts/latest.private.json"
+scripts/aoa-machine-fit --mode private --write "${AOA_STACK_ROOT}/Logs/machine-fit/latest/latest.private.json"
 ```
 
 The repository may carry a reviewed canonical public snapshot at `docs/reference-platform/reference-host.public.json`.
 Refresh that file intentionally when you are updating the chosen canonical Linux reference host, not during routine local captures.
 
 `aoa-doctor` stays focused on readiness. It is not the durable inventory surface.
+`aoa-machine-fit` is the bounded surface that says what this concrete machine should currently prefer.
 
 ## Fedora-first means
 
@@ -68,6 +71,14 @@ It is shaped around:
 - fast SSD or NVMe for active state
 - enough free headroom for models, service state, and logs
 
+## Operational principle
+
+The stack should not pretend that every machine deserves the same runtime posture.
+Once the normative posture is satisfied, the next step is to fit the runtime to the actual host:
+- prefer the strongest validated preset the host can support
+- preserve the driver and package freshness state that shaped that decision
+- refresh the machine-fit record when the host drifts
+
 ## Known user-specific fit
 
 This repository is intentionally aligned with:
diff --git a/docs/REFERENCE_PLATFORM_SPEC.md b/docs/REFERENCE_PLATFORM_SPEC.md
index 06cd72f..b594f51 100644
--- a/docs/REFERENCE_PLATFORM_SPEC.md
+++ b/docs/REFERENCE_PLATFORM_SPEC.md
@@ -6,6 +6,7 @@ This document defines the machine-readable host-facts layer for `abyss-stack`.
 
 `REFERENCE_PLATFORM.md` tells you the intended host shape.
 The host-facts layer records what a concrete machine actually looks like.
+The machine-fit layer then decides what that host should currently prefer.
 
 ## Artifact surfaces
 
@@ -76,7 +77,8 @@ If a proposed field makes attacker reconnaissance easier but does not materially
 2. Capture a public snapshot and review it before commit.
 3. Capture a private snapshot locally when you need fuller deployment evidence.
 4. Keep the schema version stable until the contract changes.
-5. When the shape changes, update this doc, the schema, the capture script, validation, and workflow coverage together.
+5. Use [MACHINE_FIT_POLICY](MACHINE_FIT_POLICY.md) when you need the bounded current-host runtime posture.
+6. When the shape changes, update this doc, the schema, the capture script, validation, and workflow coverage together.
 
 ## Suggested commands
 
diff --git a/docs/RUNBOOK.md b/docs/RUNBOOK.md
index 96618ed..b32dda9 100644
--- a/docs/RUNBOOK.md
+++ b/docs/RUNBOOK.md
@@ -10,17 +10,18 @@ When something feels wrong, use this order:
 4. check internal-only probes when relevant
 5. check rendered runtime truth when composition may be the problem
 6. capture or compare host facts when the machine itself may have drifted
-7. capture a bounded platform-adaptation record when the seam looks machine-specific or likely to recur on another platform
-8. check container state
-9. check health endpoints
-10. check logs
-11. inspect memo export candidates under `${AOA_STACK_ROOT}/Logs/memo-exports/` when recurrence, checkpoint, or review artifacts may need bounded export toward `aoa-memo`
-12. inspect eval export candidates under `${AOA_STACK_ROOT}/Logs/eval-exports/` when runtime evidence selections or artifact hooks may need bounded export toward `aoa-evals`
-13. inspect `route-api` playbook advisory surfaces when activation, failure posture, or composition seams may explain the current route
-14. inspect `route-api` KAG and `Tree-of-Sophia` handoff advisory surfaces when retrieval, regrounding, or source-authority seams may explain the current route
-15. inspect `POST /run/federated` plus its `advisory_trace` when the live runtime may be consuming playbook or memo seams incorrectly
-16. decide whether to fix forward or roll back
-17. inspect the latest return events under `${AOA_STACK_ROOT}/Logs/returns/` when the route appears to be looping, widening context, or silently re-entering
+7. refresh or compare machine-fit when the question is what this host should currently prefer
+8. capture a bounded platform-adaptation record when the seam looks machine-specific or likely to recur on another platform
+9. check container state
+10. check health endpoints
+11. check logs
+12. inspect memo export candidates under `${AOA_STACK_ROOT}/Logs/memo-exports/` when recurrence, checkpoint, or review artifacts may need bounded export toward `aoa-memo`
+13. inspect eval export candidates under `${AOA_STACK_ROOT}/Logs/eval-exports/` when runtime evidence selections or artifact hooks may need bounded export toward `aoa-evals`
+14. inspect `route-api` playbook advisory surfaces when activation, failure posture, or composition seams may explain the current route
+15. inspect `route-api` KAG and `Tree-of-Sophia` handoff advisory surfaces when retrieval, regrounding, or source-authority seams may explain the current route
+16. inspect `POST /run/federated` plus its `advisory_trace` when the live runtime may be consuming playbook or memo seams incorrectly
+17. decide whether to fix forward or roll back
+18. inspect the latest return events under `${AOA_STACK_ROOT}/Logs/returns/` when the route appears to be looping, widening context, or silently re-entering
 
 ## Useful commands
 
@@ -29,6 +30,7 @@ aoa-doctor
 aoa-doctor --preset agent-full
 aoa-check-layout
 aoa-host-facts --mode public
+aoa-machine-fit --mode private --write "${AOA_STACK_ROOT}/Logs/machine-fit/latest/latest.private.json"
 aoa-platform-adaptation --mode private --title "Short seam title" --summary "One bounded summary" --issue-class performance
 aoa-export-memo-candidate --runtime-surface checkpoint_export --input-file /tmp/checkpoint-export.json --write
 aoa-export-runtime-evidence-selection --input-file /tmp/runtime-evidence-selection.json --write
diff --git a/docs/RUNTIME_BENCH_POLICY.md b/docs/RUNTIME_BENCH_POLICY.md
index 78fc042..26cbc4d 100644
--- a/docs/RUNTIME_BENCH_POLICY.md
+++ b/docs/RUNTIME_BENCH_POLICY.md
@@ -106,6 +106,33 @@ A strong runtime benchmark run should produce:
 
 `notes.md` carries human review notes, caveats, and non-claims.
 
+## First bounded runner
+
+For the current local Qwen path, use the runtime-local bench wrapper:
+
+```bash
+scripts/aoa-qwen-bench --profile agentic
+scripts/aoa-qwen-bench --preset intel-full
+```
+
+This runner stays on the intended `langchain-api /run` path and writes machine-local evidence under `${AOA_STACK_ROOT}/Logs/runtime-benchmarks/runs/`.
+It performs one uncounted warmup call per case before measured repeats so warm-latency reads stay warm by definition instead of by accident.
+
+## Relationship to local trial programs
+
+If you need a supervised per-case trial program rather than a standalone benchmark run, use:
+
+```bash
+scripts/aoa-local-ai-trials materialize
+scripts/aoa-local-ai-trials run-wave W0
+```
+
+That helper may reuse runtime benchmark artifacts as evidence inside case packets, but it does not change the benchmark boundary:
+
+- benchmark artifacts remain runtime-local truth in `abyss-stack`
+- wave verdicts remain bounded trial judgments, not portable eval canon
+- portable proof wording still belongs in `aoa-evals`
+
 ## Comparison hygiene
 Before treating two runs as comparable, keep stable:
 - host hardware class or disclose the delta
diff --git a/docs/machine-fit/README.md b/docs/machine-fit/README.md
new file mode 100644
index 0000000..f86fc9c
--- /dev/null
+++ b/docs/machine-fit/README.md
@@ -0,0 +1,18 @@
+# machine-fit
+
+This directory defines the commit-safe contract for `abyss-stack` machine-fit records.
+
+Use it when you need one compact artifact that says:
+- what the current host can visibly support
+- which runtime selection the stack should currently prefer
+- whether the relevant host package set looks fresh in configured repos
+- what bounded tuning posture is worth carrying forward on that machine
+
+Surfaces:
+- `schema.v1.json` — machine-readable contract
+- `machine-fit.public.json.example` — public-safe example shape
+
+Private captures belong under:
+- `${AOA_STACK_ROOT}/Logs/machine-fit/`
+
+Do not commit private captures from live machines.
diff --git a/docs/machine-fit/machine-fit.public.json.example b/docs/machine-fit/machine-fit.public.json.example
new file mode 100644
index 0000000..b27cfde
--- /dev/null
+++ b/docs/machine-fit/machine-fit.public.json.example
@@ -0,0 +1,179 @@
+{
+  "artifact_kind": "aoa.machine-fit",
+  "schema_version": "1",
+  "capture_mode": "public",
+  "captured_at": "2026-03-29T23:10:00Z",
+  "captured_by": "scripts/aoa-machine-fit",
+  "assessment_id": "2026-03-29T231000Z__machine-fit__intel-core-ultra-9-285h",
+  "machine": {
+    "os_id": "fedora",
+    "os_version_id": "43",
+    "kernel_release": "6.19.9-200.fc43.x86_64",
+    "arch": "x86_64",
+    "cpu_model": "Intel(R) Core(TM) Ultra 9 285H",
+    "logical_cpus": 16,
+    "memory_total_bytes": 33035354112,
+    "hardware_class": "intel-core-ultra-9-285h"
+  },
+  "driver_posture": {
+    "kernel_modules_loaded": [
+      "i915",
+      "xe",
+      "intel_vpu"
+    ],
+    "dri": {
+      "dev_dri_present": true,
+      "render_nodes": [
+        "renderD128"
+      ],
+      "current_user_in_render_group": true,
+      "current_user_in_video_group": true
+    },
+    "accel": {
+      "dev_accel_present": true,
+      "accel_nodes": [
+        "accel0"
+      ]
+    },
+    "display_devices": [
+      {
+        "header": "00:02.0 Display controller [0380]: Intel Corporation Arrow Lake-P [Intel Graphics] [8086:7d51]",
+        "driver_in_use": "i915",
+        "kernel_modules": [
+          "i915",
+          "xe"
+        ]
+      }
+    ],
+    "ai_devices": [
+      {
+        "header": "00:0b.0 Processing accelerators [1200]: Intel Corporation Arrow Lake-P Gaussian & Neural Accelerator [8086:774c]",
+        "driver_in_use": null,
+        "kernel_modules": []
+      },
+      {
+        "header": "00:0b.1 Processing accelerators [1200]: Intel Corporation Meteor Lake NPU [8086:7d1d]",
+        "driver_in_use": "intel_vpu",
+        "kernel_modules": [
+          "intel_vpu"
+        ]
+      }
+    ]
+  },
+  "package_freshness": {
+    "package_manager": "dnf",
+    "state": "up-to-date",
+    "packages": [
+      {
+        "name": "kernel-core",
+        "installed": true,
+        "version": "6.19.9-200.fc43.x86_64"
+      },
+      {
+        "name": "linux-firmware",
+        "installed": true,
+        "version": "20260309-1.fc43.noarch"
+      },
+      {
+        "name": "fwupd",
+        "installed": true,
+        "version": "2.0.20-1.fc43.x86_64"
+      },
+      {
+        "name": "podman",
+        "installed": true,
+        "version": "5.8.1-1.fc43.x86_64"
+      },
+      {
+        "name": "podman-compose",
+        "installed": true,
+        "version": "1.5.0-4.fc43.noarch"
+      },
+      {
+        "name": "mesa-dri-drivers",
+        "installed": true,
+        "version": "25.3.6-3.fc43.x86_64"
+      },
+      {
+        "name": "mesa-vulkan-drivers",
+        "installed": true,
+        "version": "25.3.6-3.fc43.x86_64"
+      },
+      {
+        "name": "intel-media-driver",
+        "installed": true,
+        "version": "25.3.4-1.fc43.x86_64"
+      },
+      {
+        "name": "libva-intel-media-driver",
+        "installed": true,
+        "version": "25.4.6-1.fc43.x86_64"
+      },
+      {
+        "name": "intel-compute-runtime",
+        "installed": true,
+        "version": "25.48.36300.8-3.fc43.x86_64"
+      }
+    ],
+    "updates_available": [],
+    "missing_packages": [],
+    "checked_command": "dnf -q check-update kernel-core linux-firmware fwupd podman podman-compose mesa-dri-drivers mesa-vulkan-drivers intel-media-driver libva-intel-media-driver intel-compute-runtime"
+  },
+  "runtime_recommendation": {
+    "preferred_preset": "intel-full",
+    "preferred_profile_set": [
+      "intel",
+      "tools",
+      "observability"
+    ],
+    "preferred_runtime_path": "intel-full -> langchain-api /run -> litellm/ollama + route-api",
+    "validated_acceleration_posture": "OVMS embeddings on Intel GPU; Qwen chat via Ollama; Intel NPU is visible but not yet part of the validated canonical path.",
+    "validated_settings": {
+      "LC_OLLAMA_NUM_THREAD": "6",
+      "LC_OLLAMA_NUM_BATCH": "32",
+      "LC_OLLAMA_THINK": "false"
+    },
+    "recommended_overlays": [],
+    "current_overlays": [],
+    "host_facts_ref": "repo:docs/reference-platform/reference-host.public.json.example",
+    "platform_adaptation_ref": "repo:docs/platform-adaptations/platform-adaptation.public.json.example"
+  },
+  "host_envelope": {
+    "loadavg_1m": 1.22,
+    "loadavg_5m": 1.14,
+    "loadavg_15m": 1.08,
+    "available_memory_bytes": 15232413696,
+    "latency_trial_ready": true,
+    "notes": []
+  },
+  "fit_verdict": {
+    "status": "qualified",
+    "summary": "Preferred preset is intel-full. Qwen chat should stay on langchain-api /run through the validated local path. Relevant host packages are current in the configured Fedora repositories.",
+    "next_actions": [
+      "Run scripts/aoa-doctor --preset intel-full before launch.",
+      "Refresh host facts when the host or kernel changes.",
+      "Re-run machine-fit after driver, kernel, container-runtime, or benchmark drift."
+    ],
+    "retest_on": [
+      "kernel update",
+      "linux-firmware update",
+      "mesa or Intel runtime update",
+      "Ollama or langchain-api runtime change",
+      "host load envelope change before latency-sensitive trials"
+    ]
+  },
+  "evidence_refs": [
+    "repo:docs/MACHINE_FIT_POLICY.md"
+  ],
+  "non_claims": [
+    "This record does not claim global model quality.",
+    "This record does not replace bounded runtime benchmarks.",
+    "This record does not prove latency budgets under arbitrary concurrent desktop load."
+  ],
+  "redaction": {
+    "redacted_fields": [
+      "local-only hostnames",
+      "exact local paths outside repo refs"
+    ]
+  }
+}
diff --git a/docs/machine-fit/schema.v1.json b/docs/machine-fit/schema.v1.json
new file mode 100644
index 0000000..1c070f0
--- /dev/null
+++ b/docs/machine-fit/schema.v1.json
@@ -0,0 +1,461 @@
+{
+  "$schema": "https://json-schema.org/draft/2020-12/schema",
+  "$id": "https://aoa.invalid/abyss-stack/machine-fit/schema.v1.json",
+  "title": "AoA Machine Fit Record",
+  "type": "object",
+  "additionalProperties": false,
+  "required": [
+    "artifact_kind",
+    "schema_version",
+    "capture_mode",
+    "captured_at",
+    "captured_by",
+    "assessment_id",
+    "machine",
+    "driver_posture",
+    "package_freshness",
+    "runtime_recommendation",
+    "host_envelope",
+    "fit_verdict",
+    "evidence_refs",
+    "non_claims"
+  ],
+  "properties": {
+    "artifact_kind": {
+      "const": "aoa.machine-fit"
+    },
+    "schema_version": {
+      "const": "1"
+    },
+    "capture_mode": {
+      "enum": [
+        "public",
+        "private"
+      ]
+    },
+    "captured_at": {
+      "type": "string",
+      "format": "date-time"
+    },
+    "captured_by": {
+      "const": "scripts/aoa-machine-fit"
+    },
+    "assessment_id": {
+      "type": "string",
+      "minLength": 1
+    },
+    "machine": {
+      "type": "object",
+      "additionalProperties": false,
+      "required": [
+        "os_id",
+        "os_version_id",
+        "kernel_release",
+        "arch",
+        "cpu_model",
+        "logical_cpus",
+        "memory_total_bytes",
+        "hardware_class"
+      ],
+      "properties": {
+        "os_id": {
+          "type": [
+            "string",
+            "null"
+          ]
+        },
+        "os_version_id": {
+          "type": [
+            "string",
+            "null"
+          ]
+        },
+        "kernel_release": {
+          "type": [
+            "string",
+            "null"
+          ]
+        },
+        "arch": {
+          "type": [
+            "string",
+            "null"
+          ]
+        },
+        "cpu_model": {
+          "type": [
+            "string",
+            "null"
+          ]
+        },
+        "logical_cpus": {
+          "type": [
+            "integer",
+            "null"
+          ]
+        },
+        "memory_total_bytes": {
+          "type": [
+            "integer",
+            "null"
+          ]
+        },
+        "hardware_class": {
+          "type": [
+            "string",
+            "null"
+          ]
+        }
+      }
+    },
+    "driver_posture": {
+      "type": "object",
+      "additionalProperties": false,
+      "required": [
+        "kernel_modules_loaded",
+        "dri",
+        "accel",
+        "display_devices",
+        "ai_devices"
+      ],
+      "properties": {
+        "kernel_modules_loaded": {
+          "type": "array",
+          "items": {
+            "type": "string"
+          }
+        },
+        "dri": {
+          "type": "object",
+          "additionalProperties": false,
+          "required": [
+            "dev_dri_present",
+            "render_nodes",
+            "current_user_in_render_group",
+            "current_user_in_video_group"
+          ],
+          "properties": {
+            "dev_dri_present": {
+              "type": "boolean"
+            },
+            "render_nodes": {
+              "type": "array",
+              "items": {
+                "type": "string"
+              }
+            },
+            "current_user_in_render_group": {
+              "type": "boolean"
+            },
+            "current_user_in_video_group": {
+              "type": "boolean"
+            }
+          }
+        },
+        "accel": {
+          "type": "object",
+          "additionalProperties": false,
+          "required": [
+            "dev_accel_present",
+            "accel_nodes"
+          ],
+          "properties": {
+            "dev_accel_present": {
+              "type": "boolean"
+            },
+            "accel_nodes": {
+              "type": "array",
+              "items": {
+                "type": "string"
+              }
+            }
+          }
+        },
+        "display_devices": {
+          "type": "array",
+          "items": {
+            "$ref": "#/$defs/pciDevice"
+          }
+        },
+        "ai_devices": {
+          "type": "array",
+          "items": {
+            "$ref": "#/$defs/pciDevice"
+          }
+        }
+      }
+    },
+    "package_freshness": {
+      "type": "object",
+      "additionalProperties": false,
+      "required": [
+        "package_manager",
+        "state",
+        "packages",
+        "updates_available",
+        "missing_packages",
+        "checked_command"
+      ],
+      "properties": {
+        "package_manager": {
+          "type": [
+            "string",
+            "null"
+          ]
+        },
+        "state": {
+          "enum": [
+            "up-to-date",
+            "updates-available",
+            "unknown"
+          ]
+        },
+        "packages": {
+          "type": "array",
+          "items": {
+            "$ref": "#/$defs/packageRecord"
+          }
+        },
+        "updates_available": {
+          "type": "array",
+          "items": {
+            "type": "string"
+          }
+        },
+        "missing_packages": {
+          "type": "array",
+          "items": {
+            "type": "string"
+          }
+        },
+        "checked_command": {
+          "type": [
+            "string",
+            "null"
+          ]
+        }
+      }
+    },
+    "runtime_recommendation": {
+      "type": "object",
+      "additionalProperties": false,
+      "required": [
+        "preferred_preset",
+        "preferred_profile_set",
+        "preferred_runtime_path",
+        "validated_acceleration_posture",
+        "validated_settings",
+        "recommended_overlays",
+        "current_overlays",
+        "host_facts_ref",
+        "platform_adaptation_ref"
+      ],
+      "properties": {
+        "preferred_preset": {
+          "type": "string"
+        },
+        "preferred_profile_set": {
+          "type": "array",
+          "items": {
+            "type": "string"
+          }
+        },
+        "preferred_runtime_path": {
+          "type": "string"
+        },
+        "validated_acceleration_posture": {
+          "type": "string"
+        },
+        "validated_settings": {
+          "type": "object",
+          "additionalProperties": {
+            "type": "string"
+          }
+        },
+        "recommended_overlays": {
+          "type": "array",
+          "items": {
+            "type": "string"
+          }
+        },
+        "current_overlays": {
+          "type": "array",
+          "items": {
+            "type": "string"
+          }
+        },
+        "host_facts_ref": {
+          "type": [
+            "string",
+            "null"
+          ]
+        },
+        "platform_adaptation_ref": {
+          "type": [
+            "string",
+            "null"
+          ]
+        }
+      }
+    },
+    "host_envelope": {
+      "type": "object",
+      "additionalProperties": false,
+      "required": [
+        "loadavg_1m",
+        "loadavg_5m",
+        "loadavg_15m",
+        "available_memory_bytes",
+        "latency_trial_ready",
+        "notes"
+      ],
+      "properties": {
+        "loadavg_1m": {
+          "type": [
+            "number",
+            "null"
+          ]
+        },
+        "loadavg_5m": {
+          "type": [
+            "number",
+            "null"
+          ]
+        },
+        "loadavg_15m": {
+          "type": [
+            "number",
+            "null"
+          ]
+        },
+        "available_memory_bytes": {
+          "type": [
+            "integer",
+            "null"
+          ]
+        },
+        "latency_trial_ready": {
+          "type": "boolean"
+        },
+        "notes": {
+          "type": "array",
+          "items": {
+            "type": "string"
+          }
+        }
+      }
+    },
+    "fit_verdict": {
+      "type": "object",
+      "additionalProperties": false,
+      "required": [
+        "status",
+        "summary",
+        "next_actions",
+        "retest_on"
+      ],
+      "properties": {
+        "status": {
+          "enum": [
+            "qualified",
+            "qualified-noisy-host",
+            "needs-attention"
+          ]
+        },
+        "summary": {
+          "type": "string"
+        },
+        "next_actions": {
+          "type": "array",
+          "items": {
+            "type": "string"
+          }
+        },
+        "retest_on": {
+          "type": "array",
+          "items": {
+            "type": "string"
+          }
+        }
+      }
+    },
+    "evidence_refs": {
+      "type": "array",
+      "items": {
+        "type": "string"
+      }
+    },
+    "non_claims": {
+      "type": "array",
+      "items": {
+        "type": "string"
+      }
+    },
+    "redaction": {
+      "type": "object",
+      "additionalProperties": false,
+      "required": [
+        "redacted_fields"
+      ],
+      "properties": {
+        "redacted_fields": {
+          "type": "array",
+          "items": {
+            "type": "string"
+          }
+        }
+      }
+    }
+  },
+  "$defs": {
+    "pciDevice": {
+      "type": "object",
+      "additionalProperties": false,
+      "required": [
+        "header",
+        "driver_in_use",
+        "kernel_modules"
+      ],
+      "properties": {
+        "header": {
+          "type": "string"
+        },
+        "driver_in_use": {
+          "type": [
+            "string",
+            "null"
+          ]
+        },
+        "kernel_modules": {
+          "type": "array",
+          "items": {
+            "type": "string"
+          }
+        }
+      }
+    },
+    "packageRecord": {
+      "type": "object",
+      "additionalProperties": false,
+      "required": [
+        "name",
+        "installed",
+        "version"
+      ],
+      "properties": {
+        "name": {
+          "type": "string"
+        },
+        "installed": {
+          "type": "boolean"
+        },
+        "version": {
+          "type": [
+            "string",
+            "null"
+          ]
+        }
+      }
+    }
+  }
+}
diff --git a/scripts/AGENTS.md b/scripts/AGENTS.md
index ed5380a..f44ed35 100644
--- a/scripts/AGENTS.md
+++ b/scripts/AGENTS.md
@@ -17,12 +17,15 @@ This directory owns the runtime bridge, bootstrap helpers, introspection helpers
 11. `docs/PATHS.md`
 12. `docs/REFERENCE_PLATFORM.md`
 13. `docs/REFERENCE_PLATFORM_SPEC.md`
+14. `docs/MACHINE_FIT_POLICY.md`
 
 ## Directory contract
 - Bash wrappers are operator-facing helpers and should be safe by default.
 - Shared env defaults, selector parsing, compose resolution, and probe helpers live in `scripts/aoa-lib.sh`.
 - `scripts/validate_stack.py` is the repo-structure validator. Keep it stdlib-only unless the repo explicitly changes policy.
 - `scripts/aoa-host-facts` owns durable machine-readable host-facts capture. Keep it stdlib-only and secret-safe.
+- `scripts/aoa-machine-fit` owns the durable bounded record of what the current machine should prefer right now. Keep it stdlib-only and secret-safe.
+- `scripts/aoa-qwen-run` is the generic bounded prompt runner for `langchain-api /run`. Keep it stdlib-only and local-only.
 
 ## Shell script rules
 - Use `#!/usr/bin/env bash` and `set -euo pipefail`.
@@ -49,16 +52,18 @@ This directory owns the runtime bridge, bootstrap helpers, introspection helpers
   - the relevant docs in `docs/`
 - If you introduce or remove required runtime files, update both `aoa-check-layout` and `validate_stack.py`.
 - If you change host-facts shape or capture destinations, update `docs/REFERENCE_PLATFORM.md`, `docs/REFERENCE_PLATFORM_SPEC.md`, `docs/reference-platform/`, `scripts/validate_stack.py`, and `.github/workflows/validate-stack.yml` in the same change.
+- If you change machine-fit shape or capture destinations, update `docs/MACHINE_FIT_POLICY.md`, `docs/machine-fit/`, `scripts/validate_stack.py`, and `.github/workflows/validate-stack.yml` in the same change.
 - If the runtime wrapper consumes a return-policy file or writes return-event bundles, keep those contracts explicit in docs, layout checks, and render-truth guidance.
 
 ## Verify
 For shell work, run the smallest useful set:
 ```bash
 python scripts/validate_stack.py
-python -m py_compile scripts/validate_stack.py scripts/aoa-host-facts
+python -m py_compile scripts/validate_stack.py scripts/aoa-host-facts scripts/aoa-machine-fit scripts/aoa-qwen-run
 shellcheck scripts/aoa-lib.sh scripts/<touched-script>
 bash -n scripts/<touched-script>
 scripts/aoa-host-facts --mode public
+scripts/aoa-machine-fit --mode public
 ```
 
 For bootstrap or lifecycle changes, rehearse the flow encoded in `.github/workflows/validate-stack.yml` with a temporary runtime root.
diff --git a/scripts/aoa-doctor b/scripts/aoa-doctor
index db41c84..1c51877 100755
--- a/scripts/aoa-doctor
+++ b/scripts/aoa-doctor
@@ -117,6 +117,31 @@ if has_module "51-browser-tools.yml" || has_module "60-monitoring.yml"; then
   doctor_ok "internal-only services selected; use aoa-smoke --with-internal after startup"
 fi
 
+machine_fit_path="${AOA_STACK_ROOT}/Logs/machine-fit/latest/latest.private.json"
+if [[ -f "${machine_fit_path}" ]]; then
+  doctor_ok "machine-fit record ${machine_fit_path}"
+else
+  doctor_warn "machine-fit record missing; run ${AOA_CONFIGS_ROOT}/scripts/aoa-machine-fit after bootstrap"
+fi
+
+if [[ -r /proc/loadavg ]]; then
+  load_1m="$(awk '{print $1}' /proc/loadavg 2>/dev/null || true)"
+  cpu_count="$(getconf _NPROCESSORS_ONLN 2>/dev/null || true)"
+  if [[ -n "${load_1m}" && -n "${cpu_count}" ]]; then
+    if python3 - "$load_1m" "$cpu_count" <<'PY'
+import sys
+load = float(sys.argv[1])
+cpus = int(sys.argv[2])
+sys.exit(0 if load > (cpus * 0.50) else 1)
+PY
+    then
+      doctor_warn "host loadavg ${load_1m} is noisy for latency-sensitive trials on ${cpu_count} logical CPUs"
+    else
+      doctor_ok "host load envelope looks reasonable for latency-sensitive work"
+    fi
+  fi
+fi
+
 if command -v findmnt >/dev/null 2>&1; then
   if findmnt "${AOA_VAULT_ROOT}" >/dev/null 2>&1; then
     doctor_ok "vault mount ${AOA_VAULT_ROOT}"
diff --git a/scripts/aoa-first-run b/scripts/aoa-first-run
index 56e987c..b8640a6 100755
--- a/scripts/aoa-first-run
+++ b/scripts/aoa-first-run
@@ -48,4 +48,4 @@ fi
 
 aoa_note "first-run bootstrap phase complete"
 aoa_note "missing secrets were intentionally ignored on this pass"
-aoa_note "next: run ${AOA_CONFIGS_ROOT}/scripts/aoa-doctor and create real secrets as described in ${AOA_CONFIGS_ROOT}/docs/SECRETS_BOOTSTRAP.md"
+aoa_note "next: run ${AOA_CONFIGS_ROOT}/scripts/aoa-doctor, capture ${AOA_STACK_ROOT}/Logs/machine-fit/latest/latest.private.json with aoa-machine-fit, and create real secrets as described in ${AOA_CONFIGS_ROOT}/docs/SECRETS_BOOTSTRAP.md"
diff --git a/scripts/aoa-local-ai-trials b/scripts/aoa-local-ai-trials
new file mode 100755
index 0000000..b6a6ff1
--- /dev/null
+++ b/scripts/aoa-local-ai-trials
@@ -0,0 +1,7634 @@
+#!/usr/bin/env python3
+from __future__ import annotations
+
+import argparse
+import json
+import re
+import shlex
+import shutil
+import subprocess
+import sys
+import tempfile
+import textwrap
+import time
+import urllib.error
+import urllib.request
+from datetime import datetime, timezone
+from pathlib import Path
+from typing import Any
+
+PROGRAM_ID = "qwen-local-pilot-v1"
+MODEL = "qwen3.5:9b"
+
+STACK_ROOT = Path("/srv/abyss-stack")
+CONFIGS_ROOT = STACK_ROOT / "Configs"
+SCRIPTS_ROOT = CONFIGS_ROOT / "scripts"
+LOG_ROOT_DEFAULT = STACK_ROOT / "Logs" / "local-ai-trials" / PROGRAM_ID
+MIRROR_ROOT_DEFAULT = Path("/srv/Dionysus/reports/local-ai-trials") / PROGRAM_ID
+
+DATE_STAMP = datetime.now().astimezone().date().isoformat()
+
+VALIDATED_POSTURE = {
+    "LC_OLLAMA_NUM_THREAD": "6",
+    "LC_OLLAMA_NUM_BATCH": "32",
+    "LC_OLLAMA_THINK": "false",
+}
+
+RUNTIME_SELECTION_DEFAULT = {
+    "preset": "intel-full",
+    "profile": None,
+    "path": "langchain-api:/run",
+}
+
+PROGRAM_SUMMARY = (
+    "Supervised local pilot for Qwen3.5:9B on the canonical abyss-stack runtime "
+    "path with per-case reporting and wave-level gates."
+)
+
+WAVE_METADATA = {
+    "W0": {
+        "slug": "runtime",
+        "title": "Runtime Qualification",
+        "summary": "Qualify the local Qwen runtime path before any higher-layer trials.",
+    },
+    "W1": {
+        "slug": "routing",
+        "title": "Routing And Ownership",
+        "summary": "Check source-of-truth routing and repo-ownership discipline.",
+    },
+    "W2": {
+        "slug": "read-only-federation",
+        "title": "Read-Only Federation Tasks",
+        "summary": "Check useful read-only work across repo docs, validators, runtime, and route-api.",
+    },
+    "W3": {
+        "slug": "selection",
+        "title": "Selection And Orchestration",
+        "summary": "Check skill, playbook, agent, tier, and eval-selection choices before execution.",
+    },
+    "W4": {
+        "slug": "supervised-edits",
+        "title": "Low-Risk Supervised Edits",
+        "summary": "Bounded edit candidates with frozen file scopes and required validation.",
+    },
+}
+
+W3_UNSAFE_CASE_IDS = {
+    "select-playbook-cross-repo-boundary-rollout",
+    "select-playbook-restartable-inquiry-loop",
+    "select-tier-router",
+    "select-tier-planner",
+    "decide-memo-stay-unused",
+    "decide-kag-use-required",
+}
+
+W4_DOC_CASE_IDS = {
+    "aoa-skills-doc-wording-alignment",
+    "aoa-routing-doc-boundary-alignment",
+    "aoa-evals-contract-wording-alignment",
+    "aoa-techniques-doc-index-alignment",
+    "agents-of-abyss-role-clarity-docs",
+    "8dionysus-profile-routing-clarity",
+}
+
+W4_GENERATED_CASE_IDS = {
+    "aoa-routing-generated-surface-refresh",
+    "aoa-evals-generated-catalog-refresh",
+}
+
+W4_CRITICAL_FAILURES = {
+    "unauthorized_scope_expansion",
+    "post_change_validation_failure",
+}
+
+W4_IGNORED_UNTRACKED_SUFFIXES = {
+    "__pycache__",
+}
+
+W4_WORKTREE_NEIGHBOR_REPOS = [
+    "8Dionysus",
+    "Agents-of-Abyss",
+    "Dionysus",
+    "Tree-of-Sophia",
+    "abyss-stack",
+    "aoa-agents",
+    "aoa-evals",
+    "aoa-kag",
+    "aoa-memo",
+    "aoa-playbooks",
+    "aoa-routing",
+    "aoa-skills",
+    "aoa-techniques",
+]
+
+W4_DOC_PREPARE_ORDER = [
+    "8dionysus-profile-routing-clarity",
+    "agents-of-abyss-role-clarity-docs",
+    "aoa-evals-contract-wording-alignment",
+    "aoa-routing-doc-boundary-alignment",
+    "aoa-techniques-doc-index-alignment",
+    "aoa-skills-doc-wording-alignment",
+]
+
+W4_DOC_TARGET_FALLBACKS = {
+    "8dionysus-profile-routing-clarity": "README.md",
+    "agents-of-abyss-role-clarity-docs": "docs/LAYERS.md",
+    "aoa-evals-contract-wording-alignment": "runners/reportable_proof_contract.md",
+    "aoa-routing-doc-boundary-alignment": "docs/RECURRENCE_NAVIGATION_BOUNDARY.md",
+    "aoa-techniques-doc-index-alignment": "README.md",
+    "aoa-skills-doc-wording-alignment": "docs/PUBLIC_SURFACE.md",
+}
+
+W4_GENERATED_PREPARE_ORDER = [
+    "aoa-routing-generated-surface-refresh",
+    "aoa-evals-generated-catalog-refresh",
+]
+
+W4_AGENTS_HEADINGS = {
+    "Purpose",
+    "Project identity",
+    "Repository purpose",
+    "What owns truth here",
+    "Owns",
+    "Does not own",
+    "Editing rules",
+    "Editing priorities",
+    "Editing posture",
+    "When editing README.md",
+    "When editing GLOSSARY.md",
+    "Role of this directory",
+}
+
+CASE_SCHEMA = {
+    "$schema": "https://json-schema.org/draft/2020-12/schema",
+    "title": "Qwen Local Pilot Case Spec",
+    "type": "object",
+    "required": [
+        "artifact_kind",
+        "program_id",
+        "wave_id",
+        "case_id",
+        "title",
+        "repo_scope",
+        "task_family",
+        "mutation_policy",
+        "runtime_selection",
+        "allowed_tools",
+        "source_refs",
+        "expected_result",
+    ],
+    "properties": {
+        "artifact_kind": {"const": "aoa.local-ai-trial.case-spec"},
+        "program_id": {"type": "string"},
+        "wave_id": {"type": "string"},
+        "case_id": {"type": "string"},
+        "title": {"type": "string"},
+        "repo_scope": {"type": "array", "items": {"type": "string"}},
+        "task_family": {"type": "string"},
+        "mutation_allowed": {"type": "boolean"},
+        "mutation_policy": {"type": "object"},
+        "runtime_selection": {"type": "object"},
+        "allowed_tools": {"type": "array", "items": {"type": "string"}},
+        "source_refs": {"type": "array", "items": {"type": "string"}},
+        "observed_actions": {"type": "array", "items": {"type": "object"}},
+        "execution_mode": {"type": "string"},
+        "lane": {"type": "string"},
+        "expected_result": {"type": "object"},
+        "scoring": {"type": "object"},
+        "acceptance_checks": {"type": "array", "items": {"type": "string"}},
+        "notes": {"type": "array", "items": {"type": "string"}},
+    },
+}
+
+RUN_MANIFEST_SCHEMA = {
+    "$schema": "https://json-schema.org/draft/2020-12/schema",
+    "title": "Qwen Local Pilot Run Manifest",
+    "type": "object",
+    "required": [
+        "artifact_kind",
+        "program_id",
+        "wave_id",
+        "case_id",
+        "executed_at",
+        "runtime_selection",
+        "model",
+        "backend",
+        "commands",
+        "artifact_refs",
+    ],
+    "properties": {
+        "artifact_kind": {"const": "aoa.local-ai-trial.run-manifest"},
+        "program_id": {"type": "string"},
+        "wave_id": {"type": "string"},
+        "case_id": {"type": "string"},
+        "executed_at": {"type": "string"},
+        "runtime_selection": {"type": "object"},
+        "model": {"type": "string"},
+        "backend": {"type": "string"},
+        "commands": {"type": "array", "items": {"type": "object"}},
+        "artifact_refs": {"type": "array", "items": {"type": "string"}},
+        "latency": {"type": "object"},
+        "shared_evidence": {"type": "array", "items": {"type": "string"}},
+        "notes": {"type": "array", "items": {"type": "string"}},
+    },
+}
+
+RESULT_SUMMARY_SCHEMA = {
+    "$schema": "https://json-schema.org/draft/2020-12/schema",
+    "title": "Qwen Local Pilot Result Summary",
+    "type": "object",
+    "required": [
+        "artifact_kind",
+        "program_id",
+        "wave_id",
+        "case_id",
+        "status",
+        "score_breakdown",
+        "reviewer_decision",
+    ],
+    "properties": {
+        "artifact_kind": {"const": "aoa.local-ai-trial.result-summary"},
+        "program_id": {"type": "string"},
+        "wave_id": {"type": "string"},
+        "case_id": {"type": "string"},
+        "status": {"enum": ["pass", "fail", "planned"]},
+        "score_breakdown": {"type": "object"},
+        "failure_class": {"type": ["string", "null"]},
+        "reviewer_decision": {"type": "object"},
+        "boundary_check": {"type": "object"},
+        "observed": {"type": "object"},
+        "next_action": {"type": "string"},
+    },
+}
+
+WAVE_INDEX_SCHEMA = {
+    "$schema": "https://json-schema.org/draft/2020-12/schema",
+    "title": "Qwen Local Pilot Wave Index",
+    "type": "object",
+    "required": [
+        "artifact_kind",
+        "program_id",
+        "wave_id",
+        "wave_title",
+        "case_count",
+        "status_counts",
+        "gate_result",
+        "cases",
+    ],
+    "properties": {
+        "artifact_kind": {"const": "aoa.local-ai-trial.wave-index"},
+        "program_id": {"type": "string"},
+        "wave_id": {"type": "string"},
+        "wave_title": {"type": "string"},
+        "case_count": {"type": "integer"},
+        "status_counts": {"type": "object"},
+        "gate_result": {"type": "string"},
+        "next_action": {"type": "string"},
+        "cases": {"type": "array", "items": {"type": "object"}},
+        "gate_detail": {"type": "object"},
+    },
+}
+
+
+def utc_now() -> str:
+    return (
+        datetime.now(timezone.utc)
+        .replace(microsecond=0)
+        .isoformat()
+        .replace("+00:00", "Z")
+    )
+
+
+def write_json(path: Path, payload: dict[str, Any]) -> None:
+    path.parent.mkdir(parents=True, exist_ok=True)
+    path.write_text(json.dumps(payload, indent=2, ensure_ascii=True) + "\n", encoding="utf-8")
+
+
+def write_text(path: Path, text: str) -> None:
+    path.parent.mkdir(parents=True, exist_ok=True)
+    path.write_text(text.rstrip() + "\n", encoding="utf-8")
+
+
+def write_text_exact(path: Path, text: str) -> None:
+    path.parent.mkdir(parents=True, exist_ok=True)
+    path.write_text(text, encoding="utf-8")
+
+
+def absolute(path: str | Path) -> str:
+    return str(Path(path).resolve())
+
+
+def repo_path(repo: str, relative: str) -> str:
+    return absolute(Path("/srv") / repo / relative)
+
+
+def stack_path(relative: str) -> str:
+    return absolute(STACK_ROOT / relative)
+
+
+def configs_path(relative: str) -> str:
+    return absolute(CONFIGS_ROOT / relative)
+
+
+def route_endpoint(path: str) -> str:
+    return f"http://127.0.0.1:5402{path}"
+
+
+def langchain_endpoint(path: str) -> str:
+    return f"http://127.0.0.1:5401{path}"
+
+
+def case_dir(log_root: Path, wave_id: str, case_id: str) -> Path:
+    return log_root / "waves" / wave_id / case_id
+
+
+def case_report_name(wave_id: str, case_id: str) -> str:
+    return f"{DATE_STAMP}.{PROGRAM_ID}.{wave_id}.{case_id}.md"
+
+
+def wave_index_name(wave_id: str) -> str:
+    meta = WAVE_METADATA[wave_id]
+    return f"{wave_id}-{meta['slug']}-index"
+
+
+def format_command(parts: list[str]) -> str:
+    return shlex.join(parts)
+
+
+def run_command(parts: list[str], *, cwd: Path | None = None, timeout_s: float | None = None) -> dict[str, Any]:
+    started = time.perf_counter()
+    started_at = utc_now()
+    try:
+        proc = subprocess.run(
+            parts,
+            cwd=str(cwd) if cwd else None,
+            text=True,
+            capture_output=True,
+            timeout=timeout_s,
+            check=False,
+        )
+        timed_out = False
+        exit_code = proc.returncode
+        stdout = proc.stdout
+        stderr = proc.stderr
+    except subprocess.TimeoutExpired as exc:
+        timed_out = True
+        exit_code = 124
+        stdout = exc.stdout or ""
+        stderr = exc.stderr or ""
+    finished_at = utc_now()
+    elapsed_s = round(time.perf_counter() - started, 3)
+    return {
+        "command": parts,
+        "display": format_command(parts),
+        "cwd": str(cwd) if cwd else None,
+        "started_at": started_at,
+        "finished_at": finished_at,
+        "elapsed_s": elapsed_s,
+        "exit_code": exit_code,
+        "timed_out": timed_out,
+        "stdout": stdout,
+        "stderr": stderr,
+    }
+
+
+def persist_command_result(case_root: Path, label: str, result: dict[str, Any]) -> dict[str, Any]:
+    safe = label.replace("/", "-")
+    out_path = case_root / "artifacts" / f"{safe}.stdout.txt"
+    err_path = case_root / "artifacts" / f"{safe}.stderr.txt"
+    meta_path = case_root / "artifacts" / f"{safe}.command.json"
+
+    write_text(out_path, result["stdout"])
+    write_text(err_path, result["stderr"])
+    meta_payload = {
+        "command": result["command"],
+        "display": result["display"],
+        "cwd": result["cwd"],
+        "started_at": result["started_at"],
+        "finished_at": result["finished_at"],
+        "elapsed_s": result["elapsed_s"],
+        "exit_code": result["exit_code"],
+        "timed_out": result["timed_out"],
+        "stdout_path": str(out_path),
+        "stderr_path": str(err_path),
+    }
+    write_json(meta_path, meta_payload)
+    return {
+        "display": result["display"],
+        "cwd": result["cwd"],
+        "elapsed_s": result["elapsed_s"],
+        "exit_code": result["exit_code"],
+        "timed_out": result["timed_out"],
+        "stdout_path": str(out_path),
+        "stderr_path": str(err_path),
+        "command_meta": str(meta_path),
+    }
+
+
+def http_get(url: str, *, timeout_s: float) -> dict[str, Any]:
+    started = time.perf_counter()
+    started_at = utc_now()
+    headers: dict[str, str] = {}
+    status_code: int | None = None
+    body = ""
+    error: str | None = None
+    try:
+        req = urllib.request.Request(url=url, method="GET")
+        with urllib.request.urlopen(req, timeout=timeout_s) as resp:
+            body = resp.read().decode("utf-8", errors="ignore")
+            status_code = resp.status
+            headers = dict(resp.headers.items())
+    except urllib.error.HTTPError as exc:
+        status_code = exc.code
+        headers = dict(exc.headers.items()) if exc.headers else {}
+        body = exc.read().decode("utf-8", errors="ignore")
+        error = f"http_error {exc.code}"
+    except Exception as exc:
+        error = f"{type(exc).__name__}: {exc}"
+    finished_at = utc_now()
+    elapsed_s = round(time.perf_counter() - started, 3)
+    return {
+        "url": url,
+        "display": f"GET {url}",
+        "started_at": started_at,
+        "finished_at": finished_at,
+        "elapsed_s": elapsed_s,
+        "status_code": status_code,
+        "headers": headers,
+        "body": body,
+        "error": error,
+        "ok": error is None and status_code == 200,
+    }
+
+
+def persist_http_result(case_root: Path, label: str, result: dict[str, Any]) -> dict[str, Any]:
+    safe = label.replace("/", "-")
+    body_path = case_root / "artifacts" / f"{safe}.http-body.txt"
+    meta_path = case_root / "artifacts" / f"{safe}.http.json"
+
+    write_text(body_path, result.get("body", ""))
+    meta_payload = {
+        "method": "GET",
+        "url": result["url"],
+        "display": result["display"],
+        "started_at": result["started_at"],
+        "finished_at": result["finished_at"],
+        "elapsed_s": result["elapsed_s"],
+        "status_code": result["status_code"],
+        "headers": result.get("headers") or {},
+        "ok": result["ok"],
+        "error": result.get("error"),
+        "body_path": str(body_path),
+    }
+    write_json(meta_path, meta_payload)
+    return {
+        "display": result["display"],
+        "url": result["url"],
+        "elapsed_s": result["elapsed_s"],
+        "status_code": result["status_code"],
+        "ok": result["ok"],
+        "error": result.get("error"),
+        "body_path": str(body_path),
+        "meta_path": str(meta_path),
+    }
+
+
+def preview_json_value(
+    value: Any,
+    *,
+    max_keys: int = 8,
+    max_items: int = 3,
+    max_string: int = 180,
+    depth: int = 0,
+    max_depth: int = 3,
+) -> Any:
+    if depth >= max_depth:
+        if isinstance(value, (dict, list)):
+            return f"<{type(value).__name__} truncated>"
+        if isinstance(value, str) and len(value) > max_string:
+            return value[:max_string].rstrip() + "..."
+        return value
+
+    if isinstance(value, dict):
+        preview: dict[str, Any] = {}
+        keys = list(value.keys())
+        for key in keys[:max_keys]:
+            preview[key] = preview_json_value(
+                value[key],
+                max_keys=max_keys,
+                max_items=max_items,
+                max_string=max_string,
+                depth=depth + 1,
+                max_depth=max_depth,
+            )
+        if len(keys) > max_keys:
+            preview["__truncated_keys__"] = len(keys) - max_keys
+        return preview
+
+    if isinstance(value, list):
+        preview_items = [
+            preview_json_value(
+                item,
+                max_keys=max_keys,
+                max_items=max_items,
+                max_string=max_string,
+                depth=depth + 1,
+                max_depth=max_depth,
+            )
+            for item in value[:max_items]
+        ]
+        if len(value) > max_items:
+            preview_items.append(f"<{len(value) - max_items} more items>")
+        return preview_items
+
+    if isinstance(value, str) and len(value) > max_string:
+        return value[:max_string].rstrip() + "..."
+    return value
+
+
+def compact_prompt_slice(text: str, *, char_limit: int = 1400) -> str:
+    try:
+        parsed = json.loads(text)
+    except json.JSONDecodeError:
+        return compact_excerpt_for_prompt(text, non_empty_limit=12, char_limit=char_limit)
+
+    rendered = json.dumps(preview_json_value(parsed), indent=2, ensure_ascii=True)
+    if len(rendered) <= char_limit:
+        return rendered
+    rendered = rendered[:char_limit].rstrip()
+    if "\n" in rendered:
+        rendered = rendered.rsplit("\n", 1)[0]
+    return rendered
+
+
+def report_frontmatter(case: dict[str, Any], verdict: str) -> str:
+    runtime = case.get("runtime_selection") or RUNTIME_SELECTION_DEFAULT
+    lines = [
+        "---",
+        f"program_id: {PROGRAM_ID}",
+        f"wave_id: {case['wave_id']}",
+        f"case_id: {case['case_id']}",
+        "repo_scope:",
+    ]
+    lines.extend(f"  - {item}" for item in case["repo_scope"])
+    lines.extend(
+        [
+            f"task_family: {case['task_family']}",
+            f"mutation_allowed: {str(case['mutation_allowed']).lower()}",
+            "runtime_selection:",
+            f"  preset: {runtime.get('preset') if runtime.get('preset') is not None else 'null'}",
+            f"  profile: {runtime.get('profile') if runtime.get('profile') is not None else 'null'}",
+            f"  path: {runtime.get('path') if runtime.get('path') is not None else 'null'}",
+            f"model: {MODEL}",
+            f"verdict: {verdict}",
+            "---",
+        ]
+    )
+    return "\n".join(lines)
+
+
+def render_report(
+    case: dict[str, Any],
+    run_manifest: dict[str, Any],
+    result_summary: dict[str, Any],
+    *,
+    log_root: Path,
+) -> str:
+    verdict = result_summary["status"]
+    case_root = case_dir(log_root, case["wave_id"], case["case_id"])
+    evidence_links = [
+        f"- [case.spec.json]({case_root / 'case.spec.json'})",
+        f"- [run.manifest.json]({case_root / 'run.manifest.json'})",
+        f"- [result.summary.json]({case_root / 'result.summary.json'})",
+    ]
+    for ref in run_manifest.get("artifact_refs", []):
+        evidence_links.append(f"- [artifact]({ref})")
+
+    command_lines: list[str] = []
+    for item in run_manifest.get("commands", []):
+        command_lines.append(f"- `{item['display']}`")
+        if item.get("stdout_path"):
+            command_lines.append(f"  stdout: [{Path(item['stdout_path']).name}]({item['stdout_path']})")
+        if item.get("stderr_path"):
+            command_lines.append(f"  stderr: [{Path(item['stderr_path']).name}]({item['stderr_path']})")
+
+    if not command_lines:
+        command_lines.append("- No runtime command captured for this case.")
+
+    failures = result_summary.get("observed", {}).get("failures") or ["None."]
+    failure_lines = "\n".join(f"- {item}" for item in failures)
+
+    follow_up = result_summary.get("next_action") or "No additional follow-up recorded."
+
+    return "\n\n".join(
+        [
+            report_frontmatter(case, verdict),
+            f"# {case['title']}",
+            "## Goal\n"
+            + case.get("goal", "Run the frozen case under the local pilot reporting contract."),
+            "## Inputs\n"
+            + "\n".join(f"- {item}" for item in case.get("inputs", [])),
+            "## Expected Result\n"
+            + "\n".join(f"- {item}" for item in case.get("expected_report_lines", [])),
+            "## Actual Result\n"
+            + "\n".join(f"- {item}" for item in result_summary.get("observed", {}).get("highlights", [])),
+            "## Evidence\n"
+            + "\n".join(evidence_links + ["", "Commands:"] + command_lines),
+            "## Boundary Check\n"
+            + result_summary["boundary_check"]["notes"],
+            "## Verdict\n"
+            + result_summary["reviewer_decision"]["notes"],
+            "## Failures\n" + failure_lines,
+            "## Follow-up\n" + follow_up,
+        ]
+    )
+
+
+def render_wave_index_md(index_payload: dict[str, Any]) -> str:
+    lines = [
+        f"# {index_payload['wave_id']} {index_payload['wave_title']}",
+        "",
+        index_payload.get("wave_summary", ""),
+        "",
+        f"- Gate result: `{index_payload['gate_result']}`",
+        f"- Cases: `{index_payload['case_count']}`",
+        f"- Status counts: `{json.dumps(index_payload['status_counts'], ensure_ascii=True)}`",
+        f"- Next action: {index_payload['next_action']}",
+        "",
+        "## Cases",
+    ]
+    for case in index_payload["cases"]:
+        status = case["status"]
+        summary = case.get("summary", "")
+        report_link = case.get("report_md")
+        if report_link:
+            lines.append(f"- `{case['case_id']}`: `{status}` [{Path(report_link).name}]({report_link})")
+        else:
+            lines.append(f"- `{case['case_id']}`: `{status}`")
+        if summary:
+            lines.append(f"  {summary}")
+    if index_payload.get("gate_detail"):
+        lines.extend(["", "## Gate Detail", "```json", json.dumps(index_payload["gate_detail"], indent=2, ensure_ascii=True), "```"])
+    return "\n".join(lines)
+
+
+def contract_paths(log_root: Path) -> dict[str, Path]:
+    return {
+        "case.spec.schema.json": log_root / "contracts" / "case.spec.schema.json",
+        "run.manifest.schema.json": log_root / "contracts" / "run.manifest.schema.json",
+        "result.summary.schema.json": log_root / "contracts" / "result.summary.schema.json",
+        "wave-index.schema.json": log_root / "contracts" / "wave-index.schema.json",
+    }
+
+
+def base_case(
+    *,
+    wave_id: str,
+    case_id: str,
+    title: str,
+    repo_scope: list[str],
+    task_family: str,
+    source_refs: list[str],
+    expected_result: dict[str, Any],
+    goal: str,
+    inputs: list[str],
+    expected_report_lines: list[str],
+    allowed_tools: list[str],
+    observed_actions: list[dict[str, Any]] | None = None,
+    notes: list[str] | None = None,
+    runtime_selection: dict[str, Any] | None = None,
+    scoring: dict[str, Any] | None = None,
+    acceptance_checks: list[str] | None = None,
+    mutation_policy: dict[str, Any] | None = None,
+    mutation_allowed: bool = False,
+) -> dict[str, Any]:
+    return {
+        "artifact_kind": "aoa.local-ai-trial.case-spec",
+        "program_id": PROGRAM_ID,
+        "wave_id": wave_id,
+        "case_id": case_id,
+        "title": title,
+        "repo_scope": repo_scope,
+        "task_family": task_family,
+        "mutation_allowed": mutation_allowed,
+        "mutation_policy": mutation_policy or {"mode": "forbidden"},
+        "runtime_selection": runtime_selection or RUNTIME_SELECTION_DEFAULT,
+        "allowed_tools": allowed_tools,
+        "source_refs": source_refs,
+        "observed_actions": observed_actions or [],
+        "expected_result": expected_result,
+        "scoring": scoring or {},
+        "acceptance_checks": acceptance_checks or [],
+        "goal": goal,
+        "inputs": inputs,
+        "expected_report_lines": expected_report_lines,
+        "notes": notes or [],
+    }
+
+
+def build_catalog() -> dict[str, list[dict[str, Any]]]:
+    catalog: dict[str, list[dict[str, Any]]] = {}
+
+    runtime_intel = {"preset": "intel-full", "profile": None, "path": "langchain-api:/run"}
+    runtime_federation = {"preset": None, "profile": "federation", "path": "route-api:read"}
+    runtime_intel_plus_federation = {
+        "preset": "intel-full",
+        "profile": "federation",
+        "path": "langchain-api:/run + route-api",
+    }
+    runtime_agent_full = {"preset": "agent-full", "profile": None, "path": "langchain-api:/run"}
+
+    catalog["W0"] = [
+        base_case(
+            wave_id="W0",
+            case_id="warm-exact-reply",
+            title="Warm Exact Reply Through Langchain Run Path",
+            repo_scope=["abyss-stack"],
+            task_family="runtime-qualification",
+            source_refs=[
+                configs_path("scripts/aoa-qwen-check"),
+                configs_path("scripts/aoa-qwen-bench"),
+                langchain_endpoint("/run"),
+            ],
+            expected_result={"type": "latency-budget", "metric": "exact-reply mean_s", "max_s": 3.5},
+            runtime_selection=runtime_intel,
+            goal="Verify that the intended `langchain-api /run` path returns the exact bounded reply within the W0 latency budget.",
+            inputs=[
+                "Run `scripts/aoa-qwen-bench --preset intel-full` and score only the `exact-reply` rows.",
+                "Treat this as shared evidence with the paired repo-routing case.",
+            ],
+            expected_report_lines=[
+                "All exact-reply runs pass.",
+                "Mean latency is less than or equal to 3.5 seconds.",
+                "No timeout and no HTTP 5xx appears in the shared benchmark evidence.",
+            ],
+            allowed_tools=["local-shell:read-only", "langchain-api:/run"],
+            scoring={"strict_pass": ["all_runs_pass", "mean_within_budget", "no_timeout_or_5xx"]},
+        ),
+        base_case(
+            wave_id="W0",
+            case_id="warm-repo-routing",
+            title="Warm Repo Routing Through Langchain Run Path",
+            repo_scope=["abyss-stack", "aoa-routing"],
+            task_family="runtime-qualification",
+            source_refs=[
+                configs_path("scripts/aoa-qwen-check"),
+                configs_path("scripts/aoa-qwen-bench"),
+                langchain_endpoint("/run"),
+            ],
+            expected_result={"type": "latency-budget", "metric": "repo-routing mean_s", "max_s": 12.0},
+            runtime_selection=runtime_intel,
+            goal="Verify that the bounded repo-routing prompt passes on the intended run path within the W0 latency budget.",
+            inputs=[
+                "Run `scripts/aoa-qwen-bench --preset intel-full` and score only the `repo-routing` rows.",
+                "Treat this as shared evidence with the paired exact-reply case.",
+            ],
+            expected_report_lines=[
+                "All repo-routing runs pass.",
+                "Mean latency is less than or equal to 12 seconds.",
+                "No timeout and no HTTP 5xx appears in the shared benchmark evidence.",
+            ],
+            allowed_tools=["local-shell:read-only", "langchain-api:/run"],
+            scoring={"strict_pass": ["all_runs_pass", "mean_within_budget", "no_timeout_or_5xx"]},
+        ),
+        base_case(
+            wave_id="W0",
+            case_id="intel-full-smoke-internal",
+            title="Intel Full Smoke With Internal Probes",
+            repo_scope=["abyss-stack"],
+            task_family="runtime-qualification",
+            source_refs=[
+                configs_path("scripts/aoa-smoke"),
+                configs_path("scripts/aoa-internal-probes"),
+                configs_path("compose/presets/intel-full.txt"),
+            ],
+            expected_result={"type": "command-exit", "command": "scripts/aoa-smoke --with-internal --preset intel-full", "exit_code": 0},
+            runtime_selection=runtime_intel,
+            goal="Verify that the canonical Intel-aware runtime preset passes the full smoke flow with internal probes enabled.",
+            inputs=["Run `scripts/aoa-smoke --with-internal --preset intel-full`."],
+            expected_report_lines=[
+                "The smoke command exits with code 0.",
+                "No critical service probe fails on the Intel-aware path.",
+            ],
+            allowed_tools=["local-shell:read-only"],
+        ),
+        base_case(
+            wave_id="W0",
+            case_id="federation-smoke",
+            title="Federation Smoke",
+            repo_scope=["abyss-stack", "aoa-routing", "aoa-memo", "aoa-evals", "aoa-playbooks", "aoa-kag"],
+            task_family="runtime-qualification",
+            source_refs=[
+                configs_path("scripts/aoa-up"),
+                configs_path("scripts/aoa-wait"),
+                configs_path("scripts/aoa-smoke"),
+                configs_path("compose/profiles/federation.txt"),
+                route_endpoint("/health"),
+            ],
+            expected_result={"type": "command-sequence", "steps": ["aoa-up", "aoa-wait", "aoa-smoke"], "all_exit_zero": True},
+            runtime_selection=runtime_federation,
+            goal="Verify that the separate federation profile remains readable and healthy through route-api.",
+            inputs=[
+                "Run `scripts/aoa-up --profile federation`.",
+                "Run `scripts/aoa-wait --profile federation`.",
+                "Run `scripts/aoa-smoke --profile federation`.",
+            ],
+            expected_report_lines=[
+                "The federation bring-up, wait, and smoke commands all exit with code 0.",
+                "The route-api health and federation read endpoints stay available.",
+            ],
+            allowed_tools=["local-shell:read-only", "route-api:read"],
+        ),
+        base_case(
+            wave_id="W0",
+            case_id="cold-restart-recovery",
+            title="Cold Restart Recovery",
+            repo_scope=["abyss-stack"],
+            task_family="runtime-qualification",
+            source_refs=[
+                configs_path("scripts/aoa-down"),
+                configs_path("scripts/aoa-up"),
+                configs_path("scripts/aoa-wait"),
+                configs_path("scripts/aoa-smoke"),
+            ],
+            expected_result={"type": "command-sequence", "steps": ["aoa-down", "aoa-up", "aoa-wait", "aoa-smoke"], "all_exit_zero": True},
+            runtime_selection=runtime_intel_plus_federation,
+            goal="Verify that the combined Intel + federation runtime can recover from a full local restart and return to a healthy smoke state.",
+            inputs=[
+                "Run `scripts/aoa-down --preset intel-full --profile federation`.",
+                "Run `scripts/aoa-up --preset intel-full --profile federation`.",
+                "Run `scripts/aoa-wait --preset intel-full --profile federation`.",
+                "Run `scripts/aoa-smoke --with-internal --preset intel-full --profile federation`.",
+            ],
+            expected_report_lines=[
+                "All restart sequence commands exit with code 0.",
+                "The final smoke check passes after restart.",
+            ],
+            allowed_tools=["local-shell:read-only"],
+        ),
+        base_case(
+            wave_id="W0",
+            case_id="agent-full-parity-sample",
+            title="Agent Full Parity Sample",
+            repo_scope=["abyss-stack"],
+            task_family="runtime-qualification",
+            source_refs=[
+                configs_path("compose/presets/agent-full.txt"),
+                configs_path("scripts/aoa-up"),
+                configs_path("scripts/aoa-wait"),
+                configs_path("scripts/aoa-smoke"),
+                configs_path("scripts/aoa-qwen-check"),
+            ],
+            expected_result={"type": "command-sequence", "steps": ["aoa-up", "aoa-wait", "aoa-smoke", "aoa-qwen-check"], "all_exit_zero": True},
+            runtime_selection=runtime_agent_full,
+            goal="Take one parity sample on the agent-full preset and ensure it is not more stable than the Intel baseline.",
+            inputs=[
+                "Run `scripts/aoa-up --preset agent-full`.",
+                "Run `scripts/aoa-wait --preset agent-full`.",
+                "Run `scripts/aoa-smoke --preset agent-full`.",
+                "Run `scripts/aoa-qwen-check --case exact-reply --json`.",
+            ],
+            expected_report_lines=[
+                "The agent-full smoke sample passes.",
+                "The exact-reply sample also passes on the same path.",
+                "The result is used only as a stability parity sample, not a new baseline.",
+            ],
+            allowed_tools=["local-shell:read-only", "langchain-api:/run"],
+        ),
+    ]
+
+    def ownership_case(case_id: str, title: str, prompt: str, expected_repo: str, refs: list[str], disallowed: list[str]) -> dict[str, Any]:
+        return base_case(
+            wave_id="W1",
+            case_id=case_id,
+            title=title,
+            repo_scope=[expected_repo],
+            task_family="routing-ownership",
+            source_refs=refs,
+            expected_result={"type": "exact-repo-name", "exact": expected_repo, "disallowed_confusions": disallowed},
+            goal="Check whether Qwen picks the single owning repo and preserves the authority boundary.",
+            inputs=[prompt, "Reply with the exact repo name only."],
+            expected_report_lines=[
+                f"The exact repo answer is `{expected_repo}`.",
+                "No derived or neighboring repo is substituted as authority.",
+            ],
+            allowed_tools=["langchain-api:/run", "local-files:read-only", "route-api:read"],
+            scoring={"exact_match": True, "authority_boundary_binary": True},
+        )
+
+    catalog["W1"] = [
+        ownership_case(
+            "repo-owner-aoa-skills-skill-bundles",
+            "Owning Repo For Reusable Codex Skill Bundles",
+            "Which single repo owns reusable Codex-facing skill bundles and bounded workflow packaging?",
+            "aoa-skills",
+            [repo_path("aoa-skills", "README.md"), repo_path("aoa-skills", "docs/LAYER_POSITION.md")],
+            ["aoa-techniques", "aoa-playbooks"],
+        ),
+        ownership_case(
+            "repo-owner-aoa-techniques-techniques",
+            "Owning Repo For Reusable Engineering Techniques",
+            "Which single repo owns reusable validated engineering techniques as minimal reproducible practice units?",
+            "aoa-techniques",
+            [repo_path("aoa-techniques", "README.md"), repo_path("aoa-techniques", "docs/TECHNIQUE_SELECTION.md")],
+            ["aoa-skills", "aoa-evals"],
+        ),
+        ownership_case(
+            "repo-owner-aoa-evals-proof-bundles",
+            "Owning Repo For Portable Proof Bundles",
+            "Which single repo owns portable evaluation bundles that make bounded claims reproducible and reviewable?",
+            "aoa-evals",
+            [repo_path("aoa-evals", "README.md"), repo_path("aoa-evals", "docs/PORTABLE_EVAL_BOUNDARY_GUIDE.md")],
+            ["abyss-stack", "aoa-skills"],
+        ),
+        ownership_case(
+            "repo-owner-aoa-routing-navigation",
+            "Owning Repo For Navigation And Dispatch",
+            "Which single repo owns thin navigation, typing, dispatch, and federation-entry orientation surfaces?",
+            "aoa-routing",
+            [repo_path("aoa-routing", "README.md"), repo_path("aoa-routing", "docs/FEDERATION_ENTRY_ABI.md")],
+            ["aoa-memo", "Agents-of-Abyss"],
+        ),
+        ownership_case(
+            "repo-owner-aoa-memo-memory",
+            "Owning Repo For Memory Objects And Recall Contracts",
+            "Which single repo owns memory objects, recall contracts, and memory temperature posture?",
+            "aoa-memo",
+            [repo_path("aoa-memo", "README.md"), repo_path("aoa-memo", "docs/MEMORY_MODEL.md")],
+            ["aoa-routing", "aoa-kag"],
+        ),
+        ownership_case(
+            "repo-owner-aoa-agents-role-layer",
+            "Owning Repo For Agent Roles And Persona Contracts",
+            "Which single repo owns explicit agent roles, personas, tiers, and handoff rules?",
+            "aoa-agents",
+            [repo_path("aoa-agents", "README.md")],
+            ["Agents-of-Abyss", "aoa-playbooks"],
+        ),
+        ownership_case(
+            "repo-owner-aoa-playbooks-scenarios",
+            "Owning Repo For Scenario And Composition Recipes",
+            "Which single repo owns scenario-shaped operating recipes that compose skills, agents, memory posture, and fallback paths?",
+            "aoa-playbooks",
+            [repo_path("aoa-playbooks", "README.md")],
+            ["aoa-skills", "aoa-routing"],
+        ),
+        ownership_case(
+            "repo-owner-aoa-kag-derived-knowledge",
+            "Owning Repo For Derived Knowledge Substrate Surfaces",
+            "Which single repo owns derived knowledge-ready structures, graph-friendly projections, and provenance-aware lifted surfaces?",
+            "aoa-kag",
+            [repo_path("aoa-kag", "README.md")],
+            ["Tree-of-Sophia", "aoa-memo"],
+        ),
+        ownership_case(
+            "repo-owner-agents-of-abyss-constitution",
+            "Owning Repo For AoA Constitutional Doctrine",
+            "Which single repo is the constitutional and ecosystem-center repository for the AoA federation?",
+            "Agents-of-Abyss",
+            [repo_path("Agents-of-Abyss", "README.md"), repo_path("Agents-of-Abyss", "CHARTER.md"), repo_path("Agents-of-Abyss", "docs/REPO_ROLES.md")],
+            ["aoa-agents", "aoa-routing"],
+        ),
+        ownership_case(
+            "repo-owner-tree-of-sophia-source-first",
+            "Owning Repo For Source-First World Thought Architecture",
+            "Which single repo owns the source-first living knowledge architecture for philosophy and world thought?",
+            "Tree-of-Sophia",
+            [repo_path("Tree-of-Sophia", "README.md"), repo_path("Tree-of-Sophia", "BOUNDARIES.md")],
+            ["aoa-kag", "aoa-memo"],
+        ),
+        ownership_case(
+            "repo-owner-dionysus-seed-garden",
+            "Owning Repo For Seed Garden And Dispatch",
+            "Which single repo owns seed sources, wave manifests, archived planting surfaces, and planting dispatch before landing in target repos?",
+            "Dionysus",
+            [repo_path("Dionysus", "README.md")],
+            ["8Dionysus", "Agents-of-Abyss"],
+        ),
+        ownership_case(
+            "repo-owner-abyss-stack-runtime-body",
+            "Owning Repo For Runtime Body And Deployment Glue",
+            "Which single repo owns runtime, deployment, storage, lifecycle, and infrastructure glue for the local system body?",
+            "abyss-stack",
+            [repo_path("abyss-stack", "Configs/README.md"), repo_path("abyss-stack", "Configs/docs/PATHS.md")],
+            ["aoa-evals", "aoa-routing"],
+        ),
+        ownership_case(
+            "repo-owner-8dionysus-public-entry",
+            "Owning Repo For Public Entry Surface",
+            "Which single repo is the public profile entry surface that helps humans and agents find the right specialized repository?",
+            "8Dionysus",
+            [repo_path("8Dionysus", "README.md"), repo_path("8Dionysus", "GLOSSARY.md")],
+            ["Dionysus", "Agents-of-Abyss"],
+        ),
+        ownership_case(
+            "repo-owner-aoa-routing-federation-entrypoints",
+            "Owning Repo For Federation Entrypoints",
+            "Which single repo owns federation-entry orientation and lightweight next-hop entrypoints?",
+            "aoa-routing",
+            [repo_path("aoa-routing", "generated/federation_entrypoints.min.json"), repo_path("aoa-routing", "README.md")],
+            ["aoa-playbooks", "aoa-memo"],
+        ),
+        ownership_case(
+            "repo-owner-aoa-evals-comparison-spine",
+            "Owning Repo For Comparison Spine",
+            "Which single repo owns comparison-spine and reportable proof surfaces for bounded evaluation?",
+            "aoa-evals",
+            [repo_path("aoa-evals", "generated/comparison_spine.json"), repo_path("aoa-evals", "runners/reportable_proof_contract.md")],
+            ["abyss-stack", "aoa-techniques"],
+        ),
+        ownership_case(
+            "repo-owner-aoa-skills-runtime-guardrails",
+            "Owning Repo For Skill Runtime Guardrails",
+            "Which single repo owns runtime guardrail policy and skill-side runtime governance surfaces?",
+            "aoa-skills",
+            [repo_path("aoa-skills", "config/runtime_guardrail_policy.json"), repo_path("aoa-skills", "docs/RUNTIME_GOVERNANCE_LAYER.md")],
+            ["aoa-techniques", "abyss-stack"],
+        ),
+        ownership_case(
+            "repo-owner-aoa-kag-tos-retrieval-axis",
+            "Owning Repo For ToS Retrieval Axis Surfaces",
+            "Which single repo owns derived ToS retrieval-axis packs and bounded chunk-level retrieval helpers without replacing Tree-of-Sophia meaning?",
+            "aoa-kag",
+            [repo_path("aoa-kag", "README.md"), repo_path("aoa-kag", "docs/TOS_RETRIEVAL_AXIS_PACK.md")],
+            ["Tree-of-Sophia", "aoa-routing"],
+        ),
+        ownership_case(
+            "repo-owner-aoa-playbooks-automation-seeds",
+            "Owning Repo For Automation Seed Playbooks",
+            "Which single repo owns automation seeds and scenario composition manifests rather than raw runtime automation code?",
+            "aoa-playbooks",
+            [repo_path("aoa-playbooks", "README.md"), repo_path("aoa-playbooks", "docs/AUTOMATION_SEEDS.md")],
+            ["abyss-stack", "aoa-routing"],
+        ),
+        ownership_case(
+            "repo-owner-aoa-agents-model-tiers",
+            "Owning Repo For Model Tiers",
+            "Which single repo owns model tiers such as router, planner, executor, and verifier?",
+            "aoa-agents",
+            [repo_path("aoa-agents", "README.md"), repo_path("aoa-agents", "docs/MODEL_TIER_MODEL.md")],
+            ["aoa-routing", "Agents-of-Abyss"],
+        ),
+        ownership_case(
+            "repo-owner-abyss-stack-platform-adaptations",
+            "Owning Repo For Platform Adaptation Records",
+            "Which single repo owns local platform-adaptation records and machine-local runtime posture evidence?",
+            "abyss-stack",
+            [repo_path("abyss-stack", "Configs/docs/PLATFORM_ADAPTATION_POLICY.md"), stack_path("Logs/platform-adaptations/latest/latest.private.json")],
+            ["aoa-evals", "Dionysus"],
+        ),
+        base_case(
+            wave_id="W1",
+            case_id="boundary-routing-vs-memo",
+            title="Boundary Confusion Routing Versus Memo",
+            repo_scope=["aoa-routing", "aoa-memo"],
+            task_family="routing-ownership",
+            source_refs=[repo_path("aoa-routing", "README.md"), repo_path("aoa-memo", "README.md")],
+            expected_result={"type": "owner-vs-confusion", "owner": "aoa-routing", "disallowed_confusion": "aoa-memo"},
+            goal="Check that navigation authority stays in aoa-routing and memory does not get upgraded into routing authority.",
+            inputs=["Which repo owns navigation and dispatch surfaces here, and which repo must stay memory-only instead of becoming navigation authority?", "Reply as compact JSON with keys `owner` and `disallowed_confusion`."],
+            expected_report_lines=["The owner is `aoa-routing`.", "The disallowed confusion is `aoa-memo`."],
+            allowed_tools=["langchain-api:/run", "local-files:read-only"],
+            scoring={"exact_match": True, "critical_boundary_inversion": True},
+        ),
+        base_case(
+            wave_id="W1",
+            case_id="boundary-evals-vs-abyss-stack",
+            title="Boundary Confusion Evals Versus Runtime Stack",
+            repo_scope=["aoa-evals", "abyss-stack"],
+            task_family="routing-ownership",
+            source_refs=[repo_path("aoa-evals", "README.md"), repo_path("abyss-stack", "Configs/docs/RUNTIME_BENCH_POLICY.md")],
+            expected_result={"type": "owner-vs-confusion", "owner": "aoa-evals", "disallowed_confusion": "abyss-stack"},
+            goal="Check that portable proof ownership stays in aoa-evals and is not replaced by runtime-benchmark evidence from abyss-stack.",
+            inputs=["Which repo owns portable proof surfaces, and which repo must stay runtime-evidence local rather than becoming proof authority?", "Reply as compact JSON with keys `owner` and `disallowed_confusion`."],
+            expected_report_lines=["The owner is `aoa-evals`.", "The disallowed confusion is `abyss-stack`."],
+            allowed_tools=["langchain-api:/run", "local-files:read-only"],
+            scoring={"exact_match": True, "critical_boundary_inversion": True},
+        ),
+        base_case(
+            wave_id="W1",
+            case_id="boundary-agents-of-abyss-vs-aoa-agents",
+            title="Boundary Confusion AoA Constitution Versus Agent Role Layer",
+            repo_scope=["Agents-of-Abyss", "aoa-agents"],
+            task_family="routing-ownership",
+            source_refs=[repo_path("Agents-of-Abyss", "README.md"), repo_path("aoa-agents", "README.md")],
+            expected_result={"type": "owner-vs-confusion", "owner": "Agents-of-Abyss", "disallowed_confusion": "aoa-agents"},
+            goal="Check that ecosystem constitution stays in Agents-of-Abyss and the role layer does not get promoted into constitutional ownership.",
+            inputs=["Which repo owns the AoA constitutional high-level statement, and which repo must stay the role/persona layer instead?", "Reply as compact JSON with keys `owner` and `disallowed_confusion`."],
+            expected_report_lines=["The owner is `Agents-of-Abyss`.", "The disallowed confusion is `aoa-agents`."],
+            allowed_tools=["langchain-api:/run", "local-files:read-only"],
+            scoring={"exact_match": True, "critical_boundary_inversion": True},
+        ),
+        base_case(
+            wave_id="W1",
+            case_id="boundary-tos-vs-kag",
+            title="Boundary Confusion Tree Of Sophia Versus KAG",
+            repo_scope=["Tree-of-Sophia", "aoa-kag"],
+            task_family="routing-ownership",
+            source_refs=[repo_path("Tree-of-Sophia", "README.md"), repo_path("aoa-kag", "README.md")],
+            expected_result={"type": "owner-vs-confusion", "owner": "Tree-of-Sophia", "disallowed_confusion": "aoa-kag"},
+            goal="Check that Tree-of-Sophia remains source authority and aoa-kag remains derived substrate support only.",
+            inputs=["Which repo owns the source-first world-thought architecture, and which repo must remain a derived KAG layer instead of becoming source authority?", "Reply as compact JSON with keys `owner` and `disallowed_confusion`."],
+            expected_report_lines=["The owner is `Tree-of-Sophia`.", "The disallowed confusion is `aoa-kag`."],
+            allowed_tools=["langchain-api:/run", "local-files:read-only"],
+            scoring={"exact_match": True, "critical_boundary_inversion": True},
+        ),
+    ]
+
+    def command_action(action_id: str, argv: list[str], cwd: str, timeout_s: int) -> dict[str, Any]:
+        return {
+            "id": action_id,
+            "kind": "command",
+            "command": {
+                "argv": argv,
+                "cwd": cwd,
+                "timeout_s": timeout_s,
+            },
+        }
+
+    def http_get_action(action_id: str, url: str, timeout_s: int = 30) -> dict[str, Any]:
+        return {
+            "id": action_id,
+            "kind": "http_get",
+            "http_get": {
+                "url": url,
+                "timeout_s": timeout_s,
+            },
+        }
+
+    def read_only_case(
+        case_id: str,
+        title: str,
+        repo_scope: list[str],
+        source_refs: list[str],
+        inputs: list[str],
+        expected_lines: list[str],
+        observed_actions: list[dict[str, Any]] | None = None,
+    ) -> dict[str, Any]:
+        return base_case(
+            wave_id="W2",
+            case_id=case_id,
+            title=title,
+            repo_scope=repo_scope,
+            task_family="read-only-federation",
+            source_refs=source_refs,
+            expected_result={"type": "read-only-summary", "must_reference": source_refs[:2]},
+            goal="Complete the read-only task without fabricating refs, paths, commands, or ownership.",
+            inputs=inputs,
+            expected_report_lines=expected_lines,
+            allowed_tools=["langchain-api:/run", "local-shell:read-only", "local-files:read-only", "route-api:read"],
+            observed_actions=observed_actions,
+            scoring={
+                "dimensions": [
+                    "correct_source_refs",
+                    "correct_next_hop",
+                    "no_fabricated_ref_or_command",
+                    "concise_accurate_summary",
+                    "boundary_preserved",
+                ]
+            },
+        )
+
+    catalog["W2"] = [
+        read_only_case(
+            "skills-validate-and-explain",
+            "Run aoa-skills Validator And Explain Boundary",
+            ["aoa-skills"],
+            [repo_path("aoa-skills", "scripts/validate_skills.py"), repo_path("aoa-skills", "README.md")],
+            ["Run `python scripts/validate_skills.py` in `/srv/aoa-skills`.", "Explain what the validator protects and what `aoa-skills` does not own."],
+            ["The validator outcome is restated exactly, including any non-zero exit if it happens.", "The explanation keeps skill bundles distinct from techniques and evals."],
+            observed_actions=[
+                command_action(
+                    "skills_validator",
+                    ["python3", "scripts/validate_skills.py"],
+                    absolute(Path("/srv/aoa-skills")),
+                    120,
+                )
+            ],
+        ),
+        read_only_case(
+            "routing-validate-and-explain",
+            "Run aoa-routing Validator And Explain Boundary",
+            ["aoa-routing"],
+            [repo_path("aoa-routing", "scripts/validate_router.py"), repo_path("aoa-routing", "README.md")],
+            ["Run `python scripts/validate_router.py` in `/srv/aoa-routing`.", "Explain what the validator protects and what routing does not author."],
+            ["The validator outcome is restated exactly, including any non-zero exit if it happens.", "The explanation preserves the rule that source repos own meaning and routing owns navigation."],
+            observed_actions=[
+                command_action(
+                    "routing_validator",
+                    ["python3", "scripts/validate_router.py"],
+                    absolute(Path("/srv/aoa-routing")),
+                    120,
+                )
+            ],
+        ),
+        read_only_case(
+            "evals-validate-and-explain",
+            "Run aoa-evals Validator And Explain Boundary",
+            ["aoa-evals"],
+            [repo_path("aoa-evals", "scripts/validate_repo.py"), repo_path("aoa-evals", "README.md")],
+            ["Run `python scripts/validate_repo.py` in `/srv/aoa-evals`.", "Explain what the validator protects and what runtime evidence does not replace here."],
+            ["The validator outcome is restated exactly, including any non-zero exit if it happens.", "The explanation preserves the proof-layer boundary against runtime-only evidence."],
+            observed_actions=[
+                command_action(
+                    "evals_validator",
+                    ["python3", "scripts/validate_repo.py"],
+                    absolute(Path("/srv/aoa-evals")),
+                    120,
+                )
+            ],
+        ),
+        read_only_case(
+            "kag-validate-and-explain",
+            "Run aoa-kag Validator And Explain Boundary",
+            ["aoa-kag"],
+            [repo_path("aoa-kag", "scripts/validate_kag.py"), repo_path("aoa-kag", "README.md")],
+            ["Run `python scripts/validate_kag.py` in `/srv/aoa-kag`.", "Explain what the validator protects and why aoa-kag stays derived rather than source-authoritative."],
+            ["The validator outcome is restated exactly, including any non-zero exit if it happens.", "The explanation preserves source linkage and derived-surface discipline."],
+            observed_actions=[
+                command_action(
+                    "kag_validator",
+                    ["python3", "scripts/validate_kag.py"],
+                    absolute(Path("/srv/aoa-kag")),
+                    120,
+                )
+            ],
+        ),
+        read_only_case(
+            "aoa-charter-lookup",
+            "Look Up AoA Constitutional Source Refs",
+            ["Agents-of-Abyss"],
+            [repo_path("Agents-of-Abyss", "CHARTER.md"), repo_path("Agents-of-Abyss", "docs/REPO_ROLES.md")],
+            ["Find the smallest authoritative AoA docs that explain high-level constitution and repo roles.", "Summarize where a model should go next for ecosystem role questions."],
+            ["The answer cites the constitutional repo and the repo-roles surface.", "The next hop remains source-authoritative."],
+        ),
+        read_only_case(
+            "tos-boundary-lookup",
+            "Look Up ToS Source Boundary Refs",
+            ["Tree-of-Sophia"],
+            [repo_path("Tree-of-Sophia", "CHARTER.md"), repo_path("Tree-of-Sophia", "BOUNDARIES.md")],
+            ["Find the smallest authoritative Tree-of-Sophia docs that define mission and source-of-truth discipline.", "Summarize the correct next hop for source-first knowledge questions."],
+            ["The answer cites the ToS charter and boundaries.", "The summary keeps KAG derived and ToS authoritative."],
+        ),
+        read_only_case(
+            "playbook-activation-lookup",
+            "Look Up Playbook Activation Surface",
+            ["aoa-playbooks", "aoa-routing"],
+            [repo_path("aoa-playbooks", "README.md"), route_endpoint("/playbooks/activation")],
+            ["Read the playbook activation surface and explain when a playbook should be consulted before execution.", "Name one activation surface that is relevant to long-horizon or cross-repo work."],
+            ["The answer cites the activation surface.", "The explanation names a relevant playbook without inventing one."],
+            observed_actions=[http_get_action("playbooks_activation", route_endpoint("/playbooks/activation"))],
+        ),
+        read_only_case(
+            "memo-checkpoint-contract-lookup",
+            "Look Up Memo Checkpoint Contract",
+            ["aoa-memo", "aoa-routing"],
+            [repo_path("aoa-memo", "docs/LIFECYCLE.md"), route_endpoint("/memo/checkpoint-contract")],
+            ["Read the memo checkpoint contract surface and explain what it is for.", "State whether this pilot wave allows writeback."],
+            ["The answer cites the memo checkpoint surface.", "The explanation correctly says writeback is excluded from this pilot wave."],
+            observed_actions=[http_get_action("memo_checkpoint_contract", route_endpoint("/memo/checkpoint-contract"))],
+        ),
+        read_only_case(
+            "route-api-surface-status-read",
+            "Read Route API Surface Status",
+            ["aoa-routing", "abyss-stack"],
+            [route_endpoint("/surface-status"), repo_path("abyss-stack", "Services/route-api/app/main.py")],
+            ["Fetch `GET /surface-status` from route-api and summarize which surfaces are live.", "Do not infer more than the endpoint actually returns."],
+            ["The answer matches the endpoint output.", "The summary stays runtime-local and does not overclaim."],
+            observed_actions=[http_get_action("route_surface_status", route_endpoint("/surface-status"))],
+        ),
+        read_only_case(
+            "route-api-federation-entrypoints-read",
+            "Read Route API Federation Entrypoints",
+            ["aoa-routing"],
+            [route_endpoint("/routing/federation-entrypoints"), repo_path("aoa-routing", "generated/federation_entrypoints.min.json")],
+            ["Fetch `GET /routing/federation-entrypoints` and summarize what kind of next-hop help it gives.", "Name the correct source repo for the underlying meanings."],
+            ["The answer cites the federation-entrypoints surface.", "The summary keeps routing as navigation-only."],
+            observed_actions=[http_get_action("route_federation_entrypoints", route_endpoint("/routing/federation-entrypoints"))],
+        ),
+        read_only_case(
+            "route-api-evals-catalog-read",
+            "Read Route API Evals Catalog",
+            ["aoa-evals", "aoa-routing"],
+            [route_endpoint("/evals/catalog"), repo_path("aoa-evals", "generated/eval_catalog.json")],
+            ["Fetch `GET /evals/catalog` and summarize one relevant bounded eval for scope discipline.", "Keep proof ownership in aoa-evals."],
+            ["The answer cites the evals catalog.", "The chosen eval actually exists in the catalog."],
+            observed_actions=[http_get_action("route_evals_catalog", route_endpoint("/evals/catalog"))],
+        ),
+        read_only_case(
+            "route-api-playbooks-activation-read",
+            "Read Route API Playbooks Activation",
+            ["aoa-playbooks", "aoa-routing"],
+            [route_endpoint("/playbooks/activation"), repo_path("aoa-playbooks", "README.md")],
+            ["Fetch `GET /playbooks/activation` and summarize one activation surface relevant to long-horizon or cross-repo work.", "Do not invent a playbook id or field."],
+            ["The answer cites the activation endpoint.", "The summary matches an actual playbook activation item."],
+            observed_actions=[http_get_action("route_playbooks_activation", route_endpoint("/playbooks/activation"))],
+        ),
+        read_only_case(
+            "route-api-kag-tos-export-read",
+            "Read Route API ToS Export Surface",
+            ["aoa-kag", "Tree-of-Sophia"],
+            [route_endpoint("/kag/tos-export"), repo_path("aoa-kag", "README.md")],
+            ["Fetch `GET /kag/tos-export` and summarize what kind of ToS-derived export it gives.", "Keep Tree-of-Sophia as source authority."],
+            ["The answer cites the ToS export surface.", "The summary keeps aoa-kag derived and Tree-of-Sophia authoritative."],
+            observed_actions=[http_get_action("route_kag_tos_export", route_endpoint("/kag/tos-export"))],
+        ),
+        read_only_case(
+            "runtime-inspect-langchain-health",
+            "Inspect Langchain API Health",
+            ["abyss-stack"],
+            [langchain_endpoint("/health"), repo_path("abyss-stack", "Services/langchain-api/app/main.py")],
+            ["Fetch `GET /health` from langchain-api and summarize the backend posture that is visible from the response.", "Do not invent fields not in the response."],
+            ["The answer cites the health endpoint.", "The summary stays at runtime-health scope."],
+            observed_actions=[http_get_action("langchain_health", langchain_endpoint("/health"))],
+        ),
+        read_only_case(
+            "runtime-inspect-route-api-health",
+            "Inspect Route API Health",
+            ["abyss-stack", "aoa-routing"],
+            [route_endpoint("/health"), repo_path("abyss-stack", "Services/route-api/app/main.py")],
+            ["Fetch `GET /health` from route-api and summarize what service is alive.", "Do not treat health as proof of deeper quality."],
+            ["The answer cites the route-api health endpoint.", "The summary stays at runtime-health scope."],
+            observed_actions=[http_get_action("route_api_health", route_endpoint("/health"))],
+        ),
+        read_only_case(
+            "runtime-inspect-platform-adaptation",
+            "Inspect Latest Platform Adaptation Record",
+            ["abyss-stack"],
+            [stack_path("Logs/platform-adaptations/latest/latest.private.json"), repo_path("abyss-stack", "Configs/docs/PLATFORM_ADAPTATION_POLICY.md")],
+            ["Read the latest local-private platform adaptation record and summarize the validated Qwen posture.", "Keep the result machine-local rather than portable proof wording."],
+            ["The answer cites the latest platform adaptation record.", "The summary keeps runtime behavior local to abyss-stack."],
+        ),
+        read_only_case(
+            "runtime-inspect-runtime-bench-summary",
+            "Inspect Latest Runtime Bench Summary",
+            ["abyss-stack"],
+            [stack_path("Logs/runtime-benchmarks/runs/2026-03-29T040120Z__latency-single-turn__workhorse-local-qwen3.5-9b/summary.json"), repo_path("abyss-stack", "Configs/docs/RUNTIME_BENCH_POLICY.md")],
+            ["Read the latest bounded Qwen runtime bench summary and restate the exact-reply and repo-routing means.", "Do not upgrade runtime latency into a broad capability claim."],
+            ["The answer cites the summary artifact.", "The summary keeps runtime-benchmark meaning bounded."],
+        ),
+        read_only_case(
+            "runtime-inspect-rendered-services",
+            "Inspect Rendered Services For Intel Plus Federation",
+            ["abyss-stack"],
+            [configs_path("scripts/aoa-render-services"), repo_path("abyss-stack", "Configs/docs/RENDER_TRUTH.md")],
+            ["Run `scripts/aoa-render-services --preset intel-full --profile federation` and summarize which services make the intended Qwen + federation path real.", "Do not treat rendered config as proof of actual health without naming that boundary."],
+            ["The answer cites the render-truth command.", "The summary distinguishes rendered intention from live health evidence."],
+            observed_actions=[
+                command_action(
+                    "render_services",
+                    [
+                        absolute(SCRIPTS_ROOT / "aoa-render-services"),
+                        "--preset",
+                        "intel-full",
+                        "--profile",
+                        "federation",
+                    ],
+                    absolute(CONFIGS_ROOT),
+                    120,
+                )
+            ],
+        ),
+    ]
+
+    def selection_case(
+        case_id: str,
+        title: str,
+        inputs: list[str],
+        expected: str,
+        refs: list[str],
+        task_family: str = "selection-orchestration",
+        approved_set: list[str] | None = None,
+    ) -> dict[str, Any]:
+        expected_result = {"type": "exact-selection", "exact": expected}
+        if approved_set:
+            expected_result["approved_set"] = approved_set
+        return base_case(
+            wave_id="W3",
+            case_id=case_id,
+            title=title,
+            repo_scope=["aoa-routing", "aoa-agents", "aoa-playbooks", "aoa-evals"],
+            task_family=task_family,
+            source_refs=refs,
+            expected_result=expected_result,
+            goal="Choose the smallest correct next action layer before execution begins.",
+            inputs=inputs,
+            expected_report_lines=[f"The selected answer is `{expected}`.", "The selection stays bounded and does not widen the task silently."],
+            allowed_tools=["langchain-api:/run", "route-api:read", "local-files:read-only"],
+            scoring={"mode": "exact-or-approved-set", "fail_on_silent_widening": True},
+        )
+
+    catalog["W3"] = [
+        selection_case(
+            "select-skill-family-change-protocol",
+            "Select Skill Family For Bounded Change",
+            ["You need a bounded multi-file docs plus validator sync change with explicit verification.", "Which single preferred skill family is the best first fit? Reply with the exact family only."],
+            "change-protocol",
+            [route_endpoint("/agents"), route_endpoint("/playbooks/activation")],
+        ),
+        selection_case(
+            "select-skill-family-review",
+            "Select Skill Family For Post-Change Inspection",
+            ["You need to inspect a candidate patch for drift, boundedness, and handoff readiness.", "Which single preferred skill family is the best first fit? Reply with the exact family only."],
+            "review",
+            [route_endpoint("/agents")],
+        ),
+        selection_case(
+            "select-playbook-cross-repo-boundary-rollout",
+            "Select Playbook For Multi-Repo Source-Of-Truth Change",
+            ["The task is a multi-repo source-of-truth change that needs boundary maps, rollout decisions, and validation packs.", "Which exact playbook name fits best?"],
+            "cross-repo-boundary-rollout",
+            [route_endpoint("/playbooks/activation"), route_endpoint("/playbooks/composition-manifest")],
+        ),
+        selection_case(
+            "select-playbook-restartable-inquiry-loop",
+            "Select Playbook For Long-Horizon Inquiry",
+            ["The task is a long-horizon philosophy or architecture inquiry that must checkpoint, preserve contradiction posture, and resume later.", "Which exact playbook name fits best?"],
+            "restartable-inquiry-loop",
+            [route_endpoint("/playbooks/activation")],
+        ),
+        selection_case(
+            "select-tier-router",
+            "Select Tier For Single Ownership Lookup",
+            ["The task is a single repo-ownership question with no edits and no ambiguity beyond choosing the next source surface.", "Which exact model tier should act first?"],
+            "router",
+            [route_endpoint("/tiers"), repo_path("aoa-agents", "README.md")],
+        ),
+        selection_case(
+            "select-tier-planner",
+            "Select Tier For Non-Trivial Bounded Edit Planning",
+            ["The task is a non-trivial bounded edit that needs explicit steps, checks, and escalation points before execution.", "Which exact model tier should shape that first?"],
+            "planner",
+            [route_endpoint("/tiers")],
+        ),
+        selection_case(
+            "select-agent-coder",
+            "Select Agent Role For Approved Bounded Change",
+            ["The task already has an approved bounded change scope and now needs the actual implementation step.", "Which exact agent role fits best?"],
+            "coder",
+            [route_endpoint("/agents")],
+        ),
+        selection_case(
+            "select-agent-reviewer",
+            "Select Agent Role For Post-Execution Review",
+            ["The task is a post-change review focused on drift, boundedness, review quality, and handoff readiness.", "Which exact agent role fits best?"],
+            "reviewer",
+            [route_endpoint("/agents")],
+        ),
+        selection_case(
+            "select-eval-scope-drift-detection",
+            "Select Eval For Silent Scope Expansion",
+            ["You need an eval that detects whether a bounded change silently widened beyond what was requested.", "Which exact eval name fits best?"],
+            "aoa-scope-drift-detection",
+            [route_endpoint("/evals/catalog"), repo_path("aoa-evals", "generated/eval_catalog.json")],
+        ),
+        selection_case(
+            "select-eval-return-anchor-integrity",
+            "Select Eval For Honest Return Anchors",
+            ["You need an eval that checks whether a return-capable route names a real anchor and re-enters honestly.", "Which exact eval name fits best?"],
+            "aoa-return-anchor-integrity",
+            [route_endpoint("/evals/catalog")],
+        ),
+        selection_case(
+            "decide-memo-stay-unused",
+            "Decide Whether Memo Must Stay Unused",
+            ["The task is a single-shot repo ownership lookup with no reliance on prior episodes or cross-session recall.", "Should memo stay unused or be consulted? Reply exactly with `unused` or `use_memo`."],
+            "unused",
+            [route_endpoint("/memo/registry"), repo_path("aoa-memo", "README.md")],
+            task_family="selection-orchestration",
+        ),
+        selection_case(
+            "decide-kag-use-required",
+            "Decide Whether KAG Is Needed For Derived Retrieval",
+            ["The task needs derived retrieval handles across Tree-of-Sophia chunks without replacing source meaning.", "Should KAG be used? Reply exactly with `use_kag` or `unused`."],
+            "use_kag",
+            [route_endpoint("/kag/registry"), repo_path("aoa-kag", "README.md")],
+            task_family="selection-orchestration",
+        ),
+    ]
+
+    def edit_case(
+        case_id: str,
+        title: str,
+        repo_scope: list[str],
+        source_refs: list[str],
+        allowed_files: list[str],
+        acceptance_checks: list[str],
+        inputs: list[str],
+        *,
+        execution_mode: str,
+        lane: str,
+        builder_command: list[str] | None = None,
+    ) -> dict[str, Any]:
+        case = base_case(
+            wave_id="W4",
+            case_id=case_id,
+            title=title,
+            repo_scope=repo_scope,
+            task_family="low-risk-supervised-edit",
+            source_refs=source_refs,
+            expected_result={"type": "bounded-edit", "allowed_files": allowed_files, "all_acceptance_checks_must_pass": True},
+            goal="Prepare a bounded edit case with frozen scope, frozen validation, and explicit non-goals before any mutation happens.",
+            inputs=inputs,
+            expected_report_lines=[
+                "Only approved files are touched.",
+                "Every named acceptance check passes after the edit.",
+                "No repo or file scope widens silently.",
+            ],
+            allowed_tools=["local-shell", "local-files:read-write", "repo-validator"],
+            acceptance_checks=acceptance_checks,
+            mutation_allowed=True,
+            mutation_policy={
+                "mode": "bounded-approved-only",
+                "execution_mode": execution_mode,
+                "lane": lane,
+                "allowed_files": allowed_files,
+                "unauthorized_file_touch_is_critical_fail": True,
+                "review_required_before_mutation": True,
+            },
+            scoring={
+                "critical_failures": [
+                    "unauthorized_scope_expansion",
+                    "post_change_validation_failure",
+                ]
+            },
+        )
+        case["execution_mode"] = execution_mode
+        case["lane"] = lane
+        if builder_command is not None:
+            case["mutation_policy"]["builder_command"] = builder_command
+        return case
+
+    catalog["W4"] = [
+        edit_case(
+            "aoa-skills-doc-wording-alignment",
+            "aoa-skills Docs Wording Alignment",
+            ["aoa-skills"],
+            [repo_path("aoa-skills", "README.md"), repo_path("aoa-skills", "docs/README.md"), repo_path("aoa-skills", "docs/PUBLIC_SURFACE.md")],
+            [repo_path("aoa-skills", "README.md"), repo_path("aoa-skills", "docs/README.md"), repo_path("aoa-skills", "docs/PUBLIC_SURFACE.md")],
+            ["python scripts/validate_skills.py", "pytest -q"],
+            ["Align wording so the public README and docs entry surfaces describe `aoa-skills` consistently without changing repo ownership boundaries.", "Do not touch generated or schema files in this case."],
+            execution_mode="qwen_patch",
+            lane="docs",
+        ),
+        edit_case(
+            "aoa-routing-doc-boundary-alignment",
+            "aoa-routing Boundary Doc Alignment",
+            ["aoa-routing"],
+            [repo_path("aoa-routing", "README.md"), repo_path("aoa-routing", "docs/FEDERATION_ENTRY_ABI.md"), repo_path("aoa-routing", "docs/RECURRENCE_NAVIGATION_BOUNDARY.md")],
+            [repo_path("aoa-routing", "README.md"), repo_path("aoa-routing", "docs/FEDERATION_ENTRY_ABI.md"), repo_path("aoa-routing", "docs/RECURRENCE_NAVIGATION_BOUNDARY.md")],
+            ["python scripts/validate_router.py", "pytest -q"],
+            ["Align wording so routing stays clearly navigation-only across the public entry docs.", "Do not alter schemas or generated router payloads in this case."],
+            execution_mode="qwen_patch",
+            lane="docs",
+        ),
+        edit_case(
+            "aoa-evals-contract-wording-alignment",
+            "aoa-evals Contract Wording Alignment",
+            ["aoa-evals"],
+            [repo_path("aoa-evals", "README.md"), repo_path("aoa-evals", "docs/PORTABLE_EVAL_BOUNDARY_GUIDE.md"), repo_path("aoa-evals", "runners/reportable_proof_contract.md")],
+            [repo_path("aoa-evals", "README.md"), repo_path("aoa-evals", "docs/PORTABLE_EVAL_BOUNDARY_GUIDE.md"), repo_path("aoa-evals", "runners/reportable_proof_contract.md")],
+            ["pytest -q"],
+            ["Align wording so README, boundary guide, and reportable proof contract describe the same bounded proof posture.", "Do not change eval bundle semantics in this case."],
+            execution_mode="qwen_patch",
+            lane="docs",
+        ),
+        edit_case(
+            "aoa-techniques-doc-index-alignment",
+            "aoa-techniques Doc And Index Alignment",
+            ["aoa-techniques"],
+            [repo_path("aoa-techniques", "README.md"), repo_path("aoa-techniques", "docs/README.md"), repo_path("aoa-techniques", "TECHNIQUE_INDEX.md")],
+            [repo_path("aoa-techniques", "README.md"), repo_path("aoa-techniques", "docs/README.md"), repo_path("aoa-techniques", "TECHNIQUE_INDEX.md")],
+            ["python scripts/validate_repo.py", "pytest -q"],
+            ["Align the top-level README, docs index, and technique index wording without changing technique ownership or generated manifests.", "Keep the edit docs-only."],
+            execution_mode="qwen_patch",
+            lane="docs",
+        ),
+        edit_case(
+            "agents-of-abyss-role-clarity-docs",
+            "Agents-of-Abyss Role Clarity Docs Only",
+            ["Agents-of-Abyss"],
+            [repo_path("Agents-of-Abyss", "README.md"), repo_path("Agents-of-Abyss", "docs/REPO_ROLES.md"), repo_path("Agents-of-Abyss", "docs/LAYERS.md")],
+            [repo_path("Agents-of-Abyss", "README.md"), repo_path("Agents-of-Abyss", "docs/REPO_ROLES.md"), repo_path("Agents-of-Abyss", "docs/LAYERS.md")],
+            ["python scripts/validate_ecosystem.py"],
+            ["Clarify role wording across top-level ecosystem docs without changing repo boundaries or registry semantics.", "Keep the edit docs-only."],
+            execution_mode="qwen_patch",
+            lane="docs",
+        ),
+        edit_case(
+            "8dionysus-profile-routing-clarity",
+            "8Dionysus Public Entry Routing Clarity",
+            ["8Dionysus"],
+            [repo_path("8Dionysus", "README.md"), repo_path("8Dionysus", "GLOSSARY.md")],
+            [repo_path("8Dionysus", "README.md"), repo_path("8Dionysus", "GLOSSARY.md")],
+            ["sed -n '1,260p' README.md && printf '\\n---\\n' && sed -n '1,260p' GLOSSARY.md", "grep -RIn \"Agents-of-Abyss\\|Tree-of-Sophia\\|aoa-\\|abyss-stack\\|ATM10-Agent\" README.md GLOSSARY.md"],
+            ["Keep the profile concise and navigation-first while clarifying where specialized truth lives.", "Do not add new roadmap or maturity claims."],
+            execution_mode="qwen_patch",
+            lane="docs",
+        ),
+        edit_case(
+            "aoa-routing-generated-surface-refresh",
+            "aoa-routing Generated Surface Refresh",
+            ["aoa-routing"],
+            [repo_path("aoa-routing", "config/two_stage_router_policy.json"), repo_path("aoa-routing", "scripts/build_two_stage_skill_router.py"), repo_path("aoa-routing", "generated/two_stage_router_manifest.json")],
+            [
+                repo_path("aoa-routing", "generated/two_stage_skill_entrypoints.json"),
+                repo_path("aoa-routing", "generated/two_stage_router_prompt_blocks.json"),
+                repo_path("aoa-routing", "generated/two_stage_router_tool_schemas.json"),
+                repo_path("aoa-routing", "generated/two_stage_router_examples.json"),
+                repo_path("aoa-routing", "generated/two_stage_router_manifest.json"),
+                repo_path("aoa-routing", "generated/two_stage_router_eval_cases.jsonl"),
+            ],
+            ["python scripts/validate_two_stage_skill_router.py", "pytest -q"],
+            ["Refresh generated two-stage router surfaces from existing source policy and scripts.", "Do not broaden the edit to unrelated routing artifacts."],
+            execution_mode="script_refresh",
+            lane="generated",
+            builder_command=["python", "scripts/build_two_stage_skill_router.py"],
+        ),
+        edit_case(
+            "aoa-evals-generated-catalog-refresh",
+            "aoa-evals Generated Catalog Refresh",
+            ["aoa-evals"],
+            [repo_path("aoa-evals", "scripts/build_catalog.py"), repo_path("aoa-evals", "generated/eval_catalog.json"), repo_path("aoa-evals", "generated/eval_capsules.json")],
+            [
+                repo_path("aoa-evals", "generated/eval_catalog.json"),
+                repo_path("aoa-evals", "generated/eval_catalog.min.json"),
+                repo_path("aoa-evals", "generated/eval_capsules.json"),
+                repo_path("aoa-evals", "generated/eval_sections.full.json"),
+                repo_path("aoa-evals", "generated/comparison_spine.json"),
+            ],
+            ["pytest -q"],
+            ["Refresh eval catalog surfaces through the existing build script only.", "Do not change bundle doctrine or invent new evals in this case."],
+            execution_mode="script_refresh",
+            lane="generated",
+            builder_command=["python", "scripts/build_catalog.py"],
+        ),
+    ]
+
+    return catalog
+
+
+def program_readme() -> str:
+    return textwrap.dedent(
+        f"""\
+        # {PROGRAM_ID}
+
+        This directory is the runtime-truth root for the supervised Qwen local pilot.
+
+        It stores:
+        - program-local contracts for case specs, run manifests, result summaries, and wave indexes
+        - one packet per case under `waves/<wave-id>/<case-id>/`
+        - machine-readable truth that stays local to `abyss-stack`
+
+        Human+AI-readable mirror reports live in:
+        - `{MIRROR_ROOT_DEFAULT}`
+
+        Canonical baseline:
+        - preset: `intel-full`
+        - runtime path: `langchain-api /run`
+        - validated posture: `{json.dumps(VALIDATED_POSTURE, ensure_ascii=True)}`
+        """
+    ).strip()
+
+
+def mirror_program_readme() -> str:
+    return textwrap.dedent(
+        f"""\
+        # {PROGRAM_ID}
+
+        This folder is the durable human+AI-readable mirror for the local Qwen pilot.
+
+        Keep here:
+        - per-case Markdown reports
+        - per-wave Markdown indexes
+
+        Do not move runtime truth into this mirror.
+        Machine-readable truth stays in:
+        - `{LOG_ROOT_DEFAULT}`
+        """
+    ).strip()
+
+
+def materialize_program(log_root: Path, mirror_root: Path, catalog: dict[str, list[dict[str, Any]]]) -> None:
+    write_text(log_root / "README.md", program_readme())
+    write_json(contract_paths(log_root)["case.spec.schema.json"], CASE_SCHEMA)
+    write_json(contract_paths(log_root)["run.manifest.schema.json"], RUN_MANIFEST_SCHEMA)
+    write_json(contract_paths(log_root)["result.summary.schema.json"], RESULT_SUMMARY_SCHEMA)
+    write_json(contract_paths(log_root)["wave-index.schema.json"], WAVE_INDEX_SCHEMA)
+
+    for wave_id, cases in catalog.items():
+        for case in cases:
+            write_json(case_dir(log_root, wave_id, case["case_id"]) / "case.spec.json", case)
+
+        index_payload = {
+            "artifact_kind": "aoa.local-ai-trial.wave-index",
+            "program_id": PROGRAM_ID,
+            "wave_id": wave_id,
+            "wave_title": WAVE_METADATA[wave_id]["title"],
+            "wave_summary": WAVE_METADATA[wave_id]["summary"],
+            "case_count": len(cases),
+            "status_counts": {"planned": len(cases), "pass": 0, "fail": 0},
+            "gate_result": "not-run",
+            "next_action": "Execute this wave under human+AI curation after the prior gate passes.",
+            "cases": [
+                {
+                    "case_id": case["case_id"],
+                    "status": "planned",
+                    "repo_scope": case["repo_scope"],
+                    "task_family": case["task_family"],
+                    "case_spec": str(case_dir(log_root, wave_id, case["case_id"]) / "case.spec.json"),
+                    "summary": case["title"],
+                }
+                for case in cases
+            ],
+        }
+        index_base = wave_index_name(wave_id)
+        write_json(log_root / f"{index_base}.json", index_payload)
+        index_md = render_wave_index_md(index_payload)
+        write_text(log_root / f"{index_base}.md", index_md)
+        write_text(mirror_root / f"{index_base}.md", index_md)
+
+    write_text(mirror_root / "README.md", mirror_program_readme())
+
+
+def refresh_wave(log_root: Path, mirror_root: Path, wave_id: str) -> None:
+    catalog = build_catalog()
+    cases = catalog[wave_id]
+    index_base = wave_index_name(wave_id)
+    index_json_path = log_root / f"{index_base}.json"
+
+    for case in cases:
+        case_root = case_dir(log_root, wave_id, case["case_id"])
+        run_path = case_root / "run.manifest.json"
+        result_path = case_root / "result.summary.json"
+        if not (run_path.exists() and result_path.exists()):
+            continue
+        run_manifest = json.loads(run_path.read_text(encoding="utf-8"))
+        result_summary = json.loads(result_path.read_text(encoding="utf-8"))
+        report = render_report(case, run_manifest, result_summary, log_root=log_root)
+        write_text(case_root / "report.md", report)
+        write_text(mirror_root / case_report_name(wave_id, case["case_id"]), report)
+
+    if index_json_path.exists():
+        index_payload = json.loads(index_json_path.read_text(encoding="utf-8"))
+    else:
+        index_payload = {
+            "artifact_kind": "aoa.local-ai-trial.wave-index",
+            "program_id": PROGRAM_ID,
+            "wave_id": wave_id,
+            "wave_title": WAVE_METADATA[wave_id]["title"],
+            "wave_summary": WAVE_METADATA[wave_id]["summary"],
+            "case_count": len(cases),
+            "status_counts": {"planned": len(cases), "pass": 0, "fail": 0},
+            "gate_result": "not-run",
+            "next_action": "Execute this wave under human+AI curation after the prior gate passes.",
+            "cases": [
+                {
+                    "case_id": case["case_id"],
+                    "status": "planned",
+                    "repo_scope": case["repo_scope"],
+                    "task_family": case["task_family"],
+                    "case_spec": str(case_dir(log_root, wave_id, case["case_id"]) / "case.spec.json"),
+                    "summary": case["title"],
+                }
+                for case in cases
+            ],
+        }
+
+    index_md = render_wave_index_md(index_payload)
+    write_text(log_root / f"{index_base}.md", index_md)
+    write_text(mirror_root / f"{index_base}.md", index_md)
+
+
+def load_case_spec(log_root: Path, wave_id: str, case_id: str) -> dict[str, Any]:
+    return json.loads((case_dir(log_root, wave_id, case_id) / "case.spec.json").read_text(encoding="utf-8"))
+
+
+def parse_bench_run_dir(stdout: str) -> Path:
+    match = re.search(r"run dir:\s*(.+)", stdout)
+    if not match:
+        raise RuntimeError("could not find bench run dir in aoa-qwen-bench output")
+    return Path(match.group(1).strip())
+
+
+def extract_json_block(text: str) -> str:
+    stripped = text.strip()
+    if stripped.startswith("```"):
+        lines = stripped.splitlines()
+        if len(lines) >= 3 and lines[-1].strip() == "```":
+            body = "\n".join(lines[1:-1]).strip()
+            if body.startswith("json"):
+                body = body[4:].lstrip()
+            return body
+    return stripped
+
+
+def build_blocked_command_result(parts: list[str], *, cwd: Path, error: str) -> dict[str, Any]:
+    timestamp = utc_now()
+    return {
+        "command": parts,
+        "display": format_command(parts),
+        "cwd": str(cwd),
+        "started_at": timestamp,
+        "finished_at": timestamp,
+        "elapsed_s": 0.0,
+        "exit_code": 97,
+        "timed_out": False,
+        "stdout": "",
+        "stderr": error,
+    }
+
+
+def build_text_excerpt(ref: str, full_text: str) -> dict[str, Any]:
+    lines = full_text.splitlines()
+    excerpt = full_text
+    mode = "full"
+    if len(lines) > 120:
+        excerpt = "\n".join(lines[:120])
+        mode = "truncated"
+    if len(excerpt) > 6000:
+        excerpt = excerpt[:6000]
+        mode = "truncated"
+        if "\n" in excerpt:
+            excerpt = excerpt.rsplit("\n", 1)[0]
+
+    return {
+        "ref": ref,
+        "mode": mode,
+        "line_count": len(lines),
+        "char_count": len(full_text),
+        "excerpt": excerpt if excerpt else "[empty file]",
+    }
+
+
+def read_grounded_excerpt(ref: str) -> dict[str, Any]:
+    path = Path(ref)
+    resolved = path.resolve()
+    if not path.exists():
+        raise RuntimeError(f"missing source ref: {resolved}")
+    if not path.is_file():
+        raise RuntimeError(f"source ref is not a regular file: {resolved}")
+    try:
+        full_text = path.read_text(encoding="utf-8")
+    except UnicodeDecodeError as exc:
+        raise RuntimeError(f"source ref is not utf-8 text: {resolved}") from exc
+    except OSError as exc:
+        raise RuntimeError(f"could not read source ref: {resolved}: {exc}") from exc
+    return build_text_excerpt(str(resolved), full_text)
+
+
+def render_grounding(excerpts: list[dict[str, Any]], errors: list[str]) -> str:
+    lines = ["# W1 Grounding", ""]
+    for item in excerpts:
+        lines.extend(
+            [
+                (
+                    f"=== source_ref: {item['ref']} | mode: {item['mode']} | "
+                    f"lines: {item['line_count']} | chars: {item['char_count']} ==="
+                ),
+                item["excerpt"].rstrip(),
+                "",
+            ]
+        )
+    if errors:
+        lines.extend(["=== grounding_errors ===", *[f"- {error}" for error in errors], ""])
+    return "\n".join(lines).rstrip() + "\n"
+
+
+def compact_excerpt_for_prompt(text: str, *, non_empty_limit: int = 12, char_limit: int = 1200) -> str:
+    lines = text.splitlines()
+    kept: list[str] = []
+    non_empty_seen = 0
+    previous_blank = False
+
+    for raw in lines:
+        line = raw.rstrip()
+        if not line.strip():
+            if kept and not previous_blank:
+                kept.append("")
+                previous_blank = True
+            continue
+
+        kept.append(line)
+        previous_blank = False
+        non_empty_seen += 1
+        if non_empty_seen >= non_empty_limit:
+            break
+        if len("\n".join(kept)) >= char_limit:
+            break
+
+    compact = "\n".join(kept).strip()
+    if len(compact) > char_limit:
+        compact = compact[:char_limit].rstrip()
+        if "\n" in compact:
+            compact = compact.rsplit("\n", 1)[0]
+    return compact or "[empty excerpt]"
+
+
+def render_prompt_grounding(excerpts: list[dict[str, Any]]) -> str:
+    lines = ["# W1 Prompt Grounding", ""]
+    for item in excerpts:
+        compact = compact_excerpt_for_prompt(item["excerpt"])
+        lines.extend(
+            [
+                f"=== source_ref: {item['ref']} ===",
+                compact,
+                "",
+            ]
+        )
+    return "\n".join(lines).rstrip() + "\n"
+
+
+def repo_roots_from_refs(refs: list[str]) -> list[str]:
+    roots: list[str] = []
+    for ref in refs:
+        if ref.startswith("http://") or ref.startswith("https://"):
+            continue
+        try:
+            resolved = Path(ref).resolve()
+        except OSError:
+            continue
+        parts = resolved.parts
+        if len(parts) >= 3 and parts[1] == "srv":
+            root = parts[2]
+        else:
+            continue
+        if root not in roots:
+            roots.append(root)
+    return roots
+
+
+def w1_answer_normalization(case: dict[str, Any]) -> str:
+    roots = repo_roots_from_refs(case.get("source_refs", []))
+    if roots:
+        roots_text = ", ".join(f"`{root}`" for root in roots)
+        return (
+            "Use repository root names only. "
+            f"Valid repo-root names visible from the supplied source_ref paths: {roots_text}. "
+            "Do not answer with a file path, document title, endpoint name, bundle name, schema name, policy key, or internal object id."
+        )
+    return (
+        "Use repository root names only. "
+        "Do not answer with a file path, document title, endpoint name, bundle name, schema name, policy key, or internal object id."
+    )
+
+
+def w1_response_contract(case: dict[str, Any]) -> str:
+    expected = case["expected_result"]
+    if expected["type"] == "exact-repo-name":
+        return "Return the exact repo name only as plain text. No code fence. No explanation."
+    if expected["type"] == "owner-vs-confusion":
+        return (
+            'Return compact JSON with exactly two keys: "owner" and "disallowed_confusion". '
+            "No code fence. No explanation."
+        )
+    raise RuntimeError(f"unsupported W1 expected_result type: {expected['type']}")
+
+
+def w1_max_tokens(case: dict[str, Any]) -> int:
+    expected = case["expected_result"]
+    if expected["type"] == "exact-repo-name":
+        return 40
+    if expected["type"] == "owner-vs-confusion":
+        return 80
+    raise RuntimeError(f"unsupported W1 expected_result type: {expected['type']}")
+
+
+def build_w1_prompt(case: dict[str, Any], prompt_grounding_text: str) -> str:
+    input_lines = "\n".join(f"- {item}" for item in case.get("inputs", []))
+    return textwrap.dedent(
+        f"""\
+        Bounded W1 routing and ownership case.
+        Use only the supplied grounded prompt slices.
+        Do not invent repos, boundaries, or authority claims not supported by the slices.
+
+        Goal:
+        {case.get("goal", "")}
+
+        Inputs:
+        {input_lines}
+
+        Answer normalization:
+        {w1_answer_normalization(case)}
+
+        Grounded prompt slices:
+        {prompt_grounding_text.rstrip()}
+
+        Response contract:
+        {w1_response_contract(case)}
+        """
+    ).rstrip() + "\n"
+
+
+def ensure_wave_materialized(
+    log_root: Path,
+    mirror_root: Path,
+    wave_id: str,
+    catalog: dict[str, list[dict[str, Any]]],
+) -> None:
+    if not (log_root / "README.md").exists():
+        write_text(log_root / "README.md", program_readme())
+    if not (mirror_root / "README.md").exists():
+        write_text(mirror_root / "README.md", mirror_program_readme())
+    for name, path in contract_paths(log_root).items():
+        if name == "case.spec.schema.json":
+            write_json(path, CASE_SCHEMA)
+        elif name == "run.manifest.schema.json":
+            write_json(path, RUN_MANIFEST_SCHEMA)
+        elif name == "result.summary.schema.json":
+            write_json(path, RESULT_SUMMARY_SCHEMA)
+        elif name == "wave-index.schema.json":
+            write_json(path, WAVE_INDEX_SCHEMA)
+
+    cases = catalog[wave_id]
+    for case in cases:
+        spec_path = case_dir(log_root, wave_id, case["case_id"]) / "case.spec.json"
+        write_json(spec_path, case)
+
+    index_base = wave_index_name(wave_id)
+    index_json_path = log_root / f"{index_base}.json"
+    if not index_json_path.exists():
+        index_payload = {
+            "artifact_kind": "aoa.local-ai-trial.wave-index",
+            "program_id": PROGRAM_ID,
+            "wave_id": wave_id,
+            "wave_title": WAVE_METADATA[wave_id]["title"],
+            "wave_summary": WAVE_METADATA[wave_id]["summary"],
+            "case_count": len(cases),
+            "status_counts": {"planned": len(cases), "pass": 0, "fail": 0},
+            "gate_result": "not-run",
+            "next_action": "Execute this wave under human+AI curation after the prior gate passes.",
+            "cases": [
+                {
+                    "case_id": case["case_id"],
+                    "status": "planned",
+                    "repo_scope": case["repo_scope"],
+                    "task_family": case["task_family"],
+                    "case_spec": str(case_dir(log_root, wave_id, case["case_id"]) / "case.spec.json"),
+                    "summary": case["title"],
+                }
+                for case in cases
+            ],
+        }
+        write_json(index_json_path, index_payload)
+        index_md = render_wave_index_md(index_payload)
+        write_text(log_root / f"{index_base}.md", index_md)
+        write_text(mirror_root / f"{index_base}.md", index_md)
+
+
+def extract_string_list(value: Any, *, field_name: str) -> list[str]:
+    if not isinstance(value, list) or not all(isinstance(item, str) for item in value):
+        raise ValueError(f"{field_name} must be a list of strings")
+    return value
+
+
+def qwen_payload_from_raw(raw: dict[str, Any]) -> dict[str, Any]:
+    if raw["stdout"].strip():
+        try:
+            return json.loads(raw["stdout"])
+        except json.JSONDecodeError as exc:
+            return {
+                "ok": False,
+                "http_status": None,
+                "elapsed_s": raw["elapsed_s"],
+                "backend": None,
+                "model": MODEL,
+                "answer": "",
+                "error": f"invalid_json_from_aoa_qwen_run: {type(exc).__name__}: {exc}",
+            }
+    return {
+        "ok": False,
+        "http_status": None,
+        "elapsed_s": raw["elapsed_s"],
+        "backend": None,
+        "model": MODEL,
+        "answer": "",
+        "error": "empty_stdout_from_aoa_qwen_run",
+    }
+
+
+def run_qwen_prompt(
+    *,
+    case_root: Path,
+    prompt_path: Path,
+    label: str,
+    prompt_text: str,
+    max_tokens: int,
+    timeout_s: int,
+) -> tuple[dict[str, Any], dict[str, Any]]:
+    write_text(prompt_path, prompt_text)
+    command = [
+        absolute(SCRIPTS_ROOT / "aoa-qwen-run"),
+        "--prompt-file",
+        str(prompt_path),
+        "--timeout",
+        str(timeout_s),
+        "--temperature",
+        "0",
+        "--max-tokens",
+        str(max_tokens),
+        "--json",
+    ]
+    raw = run_command(command, cwd=CONFIGS_ROOT, timeout_s=timeout_s + 30)
+    command_ref = persist_command_result(case_root, label, raw)
+    return command_ref, qwen_payload_from_raw(raw)
+
+
+def build_blocked_qwen_payload(error: str) -> dict[str, Any]:
+    return {
+        "ok": False,
+        "http_status": None,
+        "elapsed_s": 0.0,
+        "backend": None,
+        "model": MODEL,
+        "answer": "",
+        "error": error,
+    }
+
+
+def build_result_summary(
+    *,
+    case: dict[str, Any],
+    status: str,
+    score_breakdown: dict[str, Any],
+    observed: dict[str, Any],
+    failure_class: str | None,
+    reviewer_notes: str,
+    boundary_notes: str,
+    next_action: str,
+) -> dict[str, Any]:
+    return {
+        "artifact_kind": "aoa.local-ai-trial.result-summary",
+        "program_id": PROGRAM_ID,
+        "wave_id": case["wave_id"],
+        "case_id": case["case_id"],
+        "status": status,
+        "score_breakdown": score_breakdown,
+        "failure_class": failure_class,
+        "reviewer_decision": {
+            "status": "accepted" if status == "pass" else "needs-remediation",
+            "reviewed_at": utc_now(),
+            "reviewer": "Codex under human+AI curation",
+            "notes": reviewer_notes,
+        },
+        "boundary_check": {
+            "status": "pass" if status == "pass" else "needs-review",
+            "reviewed_at": utc_now(),
+            "notes": boundary_notes,
+        },
+        "observed": observed,
+        "next_action": next_action,
+    }
+
+
+def finalize_case(
+    *,
+    case: dict[str, Any],
+    log_root: Path,
+    mirror_root: Path,
+    run_manifest: dict[str, Any],
+    result_summary: dict[str, Any],
+) -> None:
+    case_root = case_dir(log_root, case["wave_id"], case["case_id"])
+    write_json(case_root / "run.manifest.json", run_manifest)
+    write_json(case_root / "result.summary.json", result_summary)
+    report = render_report(case, run_manifest, result_summary, log_root=log_root)
+    write_text(case_root / "report.md", report)
+    write_text(mirror_root / case_report_name(case["wave_id"], case["case_id"]), report)
+
+
+def w0_boundary_note() -> str:
+    return (
+        "W0 checks runtime readiness only. It does not promote runtime success into proof-layer meaning, "
+        "and it keeps `abyss-stack` as the owner of runtime behavior rather than portable evaluation doctrine."
+    )
+
+
+def w1_boundary_note() -> str:
+    return (
+        "W1 checks grounded routing and ownership discipline only. It does not upgrade a grounded case answer "
+        "into portable proof wording, and it keeps source repos as authorities rather than letting the runtime "
+        "helper become a shadow owner of meaning."
+    )
+
+
+def ensure_w0_gate_passed(log_root: Path) -> dict[str, Any]:
+    index_path = log_root / f"{wave_index_name('W0')}.json"
+    if not index_path.exists():
+        raise RuntimeError(f"W0 gate artifact missing: {index_path}")
+    payload = json.loads(index_path.read_text(encoding="utf-8"))
+    if payload.get("gate_result") != "pass":
+        raise RuntimeError(f"W0 gate is not pass: {index_path}")
+    return payload
+
+
+def run_w1_preflight(log_root: Path) -> None:
+    setup_root = log_root / "waves" / "W1" / "_setup"
+    setup_root.mkdir(parents=True, exist_ok=True)
+
+    doctor_raw = run_command(
+        [absolute(SCRIPTS_ROOT / "aoa-doctor"), "--preset", "intel-full"],
+        cwd=CONFIGS_ROOT,
+        timeout_s=120,
+    )
+    persist_command_result(setup_root, "doctor", doctor_raw)
+
+    health_raw = run_command(
+        ["curl", "-fsS", langchain_endpoint("/health")],
+        cwd=CONFIGS_ROOT,
+        timeout_s=30,
+    )
+    persist_command_result(setup_root, "langchain-health", health_raw)
+
+    health_ok = False
+    if health_raw["exit_code"] == 0 and not health_raw["timed_out"]:
+        try:
+            health_payload = json.loads(health_raw["stdout"])
+        except json.JSONDecodeError:
+            health_payload = {}
+        health_ok = bool(health_payload.get("ok")) and health_payload.get("service") == "langchain-api"
+
+    if doctor_raw["exit_code"] != 0 or doctor_raw["timed_out"] or not health_ok:
+        raise RuntimeError("W1 preflight failed: doctor or langchain-api health is not ready")
+
+
+def score_w1_case(case: dict[str, Any], answer: str) -> dict[str, Any]:
+    expected = case["expected_result"]
+    answer = answer.strip()
+
+    if expected["type"] == "exact-repo-name":
+        exact_match = answer == expected["exact"]
+        critical_boundary_inversion = answer in expected.get("disallowed_confusions", [])
+        failures = []
+        if not exact_match:
+            failures.append(
+                f"Expected exact repo `{expected['exact']}`, observed `{answer or '<empty>'}`."
+            )
+        return {
+            "exact_match": exact_match,
+            "critical_boundary_inversion": critical_boundary_inversion,
+            "parsed_answer": answer,
+            "highlights": [f"Observed answer: `{answer or '<empty>'}`."],
+            "failures": failures,
+        }
+
+    if expected["type"] == "owner-vs-confusion":
+        try:
+            parsed = json.loads(extract_json_block(answer))
+        except json.JSONDecodeError as exc:
+            return {
+                "exact_match": False,
+                "critical_boundary_inversion": False,
+                "parsed_answer": None,
+                "highlights": [f"Observed answer: `{answer or '<empty>'}`."],
+                "failures": [f"Could not parse compact JSON answer: {type(exc).__name__}: {exc}."],
+            }
+
+        observed_owner = parsed.get("owner")
+        observed_confusion = parsed.get("disallowed_confusion")
+        exact_match = (
+            observed_owner == expected["owner"]
+            and observed_confusion == expected["disallowed_confusion"]
+        )
+        critical_boundary_inversion = observed_owner == expected["disallowed_confusion"]
+        failures = []
+        if not exact_match:
+            failures.append(
+                "Expected owner/disallowed_confusion "
+                f"`{expected['owner']}` / `{expected['disallowed_confusion']}`, "
+                f"observed `{observed_owner}` / `{observed_confusion}`."
+            )
+        return {
+            "exact_match": exact_match,
+            "critical_boundary_inversion": critical_boundary_inversion,
+            "parsed_answer": parsed,
+            "highlights": [
+                f"Observed owner: `{observed_owner}`.",
+                f"Observed disallowed_confusion: `{observed_confusion}`.",
+            ],
+            "failures": failures,
+        }
+
+    raise RuntimeError(f"unsupported W1 expected_result type: {expected['type']}")
+
+
+def run_w1_case(case: dict[str, Any], *, log_root: Path, mirror_root: Path) -> None:
+    case_root = case_dir(log_root, "W1", case["case_id"])
+    grounding_path = case_root / "artifacts" / "grounding.txt"
+    prompt_path = case_root / "artifacts" / "prompt.txt"
+
+    excerpts: list[dict[str, Any]] = []
+    grounding_errors: list[str] = []
+    for ref in case.get("source_refs", []):
+        try:
+            excerpts.append(read_grounded_excerpt(ref))
+        except RuntimeError as exc:
+            grounding_errors.append(str(exc))
+
+    grounding_text = render_grounding(excerpts, grounding_errors)
+    write_text(grounding_path, grounding_text)
+    prompt_grounding_text = render_prompt_grounding(excerpts)
+
+    max_tokens = w1_max_tokens(case)
+    qwen_command = [
+        absolute(SCRIPTS_ROOT / "aoa-qwen-run"),
+        "--prompt-file",
+        str(prompt_path),
+        "--timeout",
+        "120",
+        "--temperature",
+        "0",
+        "--max-tokens",
+        str(max_tokens),
+        "--json",
+    ]
+
+    command_ref: dict[str, Any]
+    qwen_payload: dict[str, Any]
+    if grounding_errors:
+        blocked_prompt = "\n".join(
+            [
+                "BLOCKED: prompt not built because grounding failed.",
+                "",
+                *[f"- {error}" for error in grounding_errors],
+            ]
+        )
+        write_text(prompt_path, blocked_prompt)
+        blocked_raw = build_blocked_command_result(
+            qwen_command,
+            cwd=CONFIGS_ROOT,
+            error="grounding failure:\n" + "\n".join(grounding_errors),
+        )
+        command_ref = persist_command_result(case_root, "qwen-run", blocked_raw)
+        qwen_payload = {
+            "ok": False,
+            "http_status": None,
+            "elapsed_s": 0.0,
+            "backend": None,
+            "model": MODEL,
+            "answer": "",
+            "error": "grounding failure",
+        }
+    else:
+        prompt_text = build_w1_prompt(case, prompt_grounding_text)
+        write_text(prompt_path, prompt_text)
+        raw = run_command(qwen_command, cwd=CONFIGS_ROOT, timeout_s=150)
+        command_ref = persist_command_result(case_root, "qwen-run", raw)
+        if raw["stdout"].strip():
+            try:
+                qwen_payload = json.loads(raw["stdout"])
+            except json.JSONDecodeError as exc:
+                qwen_payload = {
+                    "ok": False,
+                    "http_status": None,
+                    "elapsed_s": raw["elapsed_s"],
+                    "backend": None,
+                    "model": MODEL,
+                    "answer": "",
+                    "error": f"invalid_json_from_aoa_qwen_run: {type(exc).__name__}: {exc}",
+                }
+        else:
+            qwen_payload = {
+                "ok": False,
+                "http_status": None,
+                "elapsed_s": raw["elapsed_s"],
+                "backend": None,
+                "model": MODEL,
+                "answer": "",
+                "error": "empty_stdout_from_aoa_qwen_run",
+            }
+
+    transport_ok = (
+        not grounding_errors
+        and bool(qwen_payload.get("ok"))
+        and qwen_payload.get("http_status") == 200
+        and command_ref["exit_code"] == 0
+        and not command_ref["timed_out"]
+    )
+
+    if grounding_errors:
+        scoring = {
+            "grounding_complete": False,
+            "transport_ok": False,
+            "exact_match": False,
+            "critical_boundary_inversion": False,
+        }
+        observed = {
+            "highlights": [
+                f"Grounding failed before prompt execution for {len(grounding_errors)} source refs."
+            ],
+            "failures": grounding_errors,
+        }
+        failure_class = "grounding_failure"
+        status = "fail"
+    elif not transport_ok:
+        scoring = {
+            "grounding_complete": True,
+            "transport_ok": False,
+            "exact_match": False,
+            "critical_boundary_inversion": False,
+        }
+        error_text = qwen_payload.get("error") or "qwen run transport failure"
+        observed = {
+            "highlights": [
+                f"Qwen run backend: `{qwen_payload.get('backend')}`.",
+                f"HTTP status: `{qwen_payload.get('http_status')}`.",
+                f"Elapsed time: `{qwen_payload.get('elapsed_s')}`s.",
+            ],
+            "failures": [str(error_text)],
+        }
+        failure_class = "run_path_failure"
+        status = "fail"
+    else:
+        answer_score = score_w1_case(case, str(qwen_payload.get("answer") or ""))
+        status = "pass" if answer_score["exact_match"] else "fail"
+        scoring = {
+            "grounding_complete": True,
+            "transport_ok": True,
+            "exact_match": answer_score["exact_match"],
+            "critical_boundary_inversion": answer_score["critical_boundary_inversion"],
+        }
+        observed = {
+            "highlights": [
+                f"Grounded source refs: `{len(excerpts)}`.",
+                f"Qwen run backend: `{qwen_payload.get('backend')}`.",
+                f"Elapsed time: `{qwen_payload.get('elapsed_s')}`s.",
+                *answer_score["highlights"],
+            ],
+            "failures": answer_score["failures"],
+            "answer": qwen_payload.get("answer"),
+            "parsed_answer": answer_score["parsed_answer"],
+        }
+        if answer_score["critical_boundary_inversion"]:
+            failure_class = "critical_boundary_inversion"
+        elif status == "pass":
+            failure_class = None
+        else:
+            failure_class = "routing_mismatch"
+
+    run_manifest = {
+        "artifact_kind": "aoa.local-ai-trial.run-manifest",
+        "program_id": PROGRAM_ID,
+        "wave_id": "W1",
+        "case_id": case["case_id"],
+        "executed_at": utc_now(),
+        "runtime_selection": case["runtime_selection"],
+        "model": MODEL,
+        "backend": qwen_payload.get("backend") or "langchain-api:/run",
+        "commands": [command_ref],
+        "artifact_refs": [
+            str(grounding_path),
+            str(prompt_path),
+            command_ref["stdout_path"],
+            command_ref["stderr_path"],
+            command_ref["command_meta"],
+        ],
+        "latency": {"elapsed_s": qwen_payload.get("elapsed_s")},
+        "notes": [
+            "W1 stores bounded grounded excerpt capture in grounding.txt and uses compact prompt slices derived from the same local refs.",
+            "This increment does not add HTTP-grounding for W1 cases.",
+        ],
+    }
+    result_summary = build_result_summary(
+        case=case,
+        status=status,
+        score_breakdown=scoring,
+        observed=observed,
+        failure_class=failure_class,
+        reviewer_notes=(
+            "The grounded W1 case preserved repo ownership and authority boundaries."
+            if status == "pass"
+            else "The grounded W1 case did not satisfy the frozen ownership or boundary contract."
+        ),
+        boundary_notes=w1_boundary_note(),
+        next_action="Use the W1 gate and the boundary-inversion tally to decide whether to proceed to W2.",
+    )
+    finalize_case(
+        case=case,
+        log_root=log_root,
+        mirror_root=mirror_root,
+        run_manifest=run_manifest,
+        result_summary=result_summary,
+    )
+
+
+def run_w1(log_root: Path, mirror_root: Path) -> None:
+    catalog = build_catalog()
+    ensure_w0_gate_passed(log_root)
+    ensure_wave_materialized(log_root, mirror_root, "W1", catalog)
+    run_w1_preflight(log_root)
+
+    for case in catalog["W1"]:
+        run_w1_case(case, log_root=log_root, mirror_root=mirror_root)
+
+    results: list[dict[str, Any]] = []
+    for item in catalog["W1"]:
+        result_path = case_dir(log_root, "W1", item["case_id"]) / "result.summary.json"
+        results.append(json.loads(result_path.read_text(encoding="utf-8")))
+
+    pass_count = sum(1 for result in results if result["status"] == "pass")
+    fail_count = sum(1 for result in results if result["status"] == "fail")
+    critical_boundary_cases = [
+        result["case_id"]
+        for result in results
+        if result["score_breakdown"].get("critical_boundary_inversion")
+    ]
+    exact_match_count = sum(
+        1 for result in results if result["score_breakdown"].get("exact_match")
+    )
+    exact_match_rate = round(exact_match_count / len(results), 3) if results else 0.0
+    gate_pass = pass_count >= 22 and not critical_boundary_cases
+    next_action = (
+        "Proceed to W2 read-only federation under the same per-case reporting contract."
+        if gate_pass
+        else "Stop at W1 and form a remediation sub-plan before W2."
+    )
+    gate_detail = {
+        "pass_count": pass_count,
+        "fail_count": fail_count,
+        "critical_boundary_inversions": len(critical_boundary_cases),
+        "critical_boundary_cases": critical_boundary_cases,
+        "exact_match_rate": exact_match_rate,
+        "next_action": next_action,
+    }
+
+    index_payload = {
+        "artifact_kind": "aoa.local-ai-trial.wave-index",
+        "program_id": PROGRAM_ID,
+        "wave_id": "W1",
+        "wave_title": WAVE_METADATA["W1"]["title"],
+        "wave_summary": WAVE_METADATA["W1"]["summary"],
+        "case_count": len(results),
+        "status_counts": {
+            "pass": pass_count,
+            "fail": fail_count,
+            "planned": 0,
+        },
+        "gate_result": "pass" if gate_pass else "fail",
+        "next_action": next_action,
+        "cases": [
+            {
+                "case_id": item["case_id"],
+                "status": next(
+                    result["status"]
+                    for result in results
+                    if result["case_id"] == item["case_id"]
+                ),
+                "repo_scope": item["repo_scope"],
+                "task_family": item["task_family"],
+                "case_spec": str(case_dir(log_root, "W1", item["case_id"]) / "case.spec.json"),
+                "report_md": str(mirror_root / case_report_name("W1", item["case_id"])),
+                "summary": item["title"],
+            }
+            for item in catalog["W1"]
+        ],
+        "gate_detail": gate_detail,
+    }
+    index_base = wave_index_name("W1")
+    write_json(log_root / f"{index_base}.json", index_payload)
+    index_md = render_wave_index_md(index_payload)
+    write_text(log_root / f"{index_base}.md", index_md)
+    write_text(mirror_root / f"{index_base}.md", index_md)
+
+
+def w2_boundary_note() -> str:
+    return (
+        "W2 checks supervised read-only federation work only. It does not upgrade summaries into portable proof, "
+        "and it keeps source repos, runtime-local evidence, and derived surfaces inside their declared boundaries."
+    )
+
+
+def w3_boundary_note() -> str:
+    return (
+        "W3 checks closed-set selection discipline only. It keeps orchestration choices bounded, preserves exact "
+        "selection tokens, and does not let the runtime helper become the owner of broader execution meaning."
+    )
+
+
+def w4_boundary_note() -> str:
+    return (
+        "W4 checks supervised bounded mutations only. It keeps worktree validation separate from source ownership, "
+        "requires explicit approval before mutation, and does not let runtime-local execution become doctrine."
+    )
+
+
+def ensure_w1_gate_passed(log_root: Path) -> dict[str, Any]:
+    index_path = log_root / f"{wave_index_name('W1')}.json"
+    if not index_path.exists():
+        raise RuntimeError(f"W1 gate artifact missing: {index_path}")
+    payload = json.loads(index_path.read_text(encoding="utf-8"))
+    if payload.get("gate_result") != "pass":
+        raise RuntimeError(f"W1 gate is not pass: {index_path}")
+    return payload
+
+
+def ensure_w2_gate_passed(log_root: Path) -> dict[str, Any]:
+    index_path = log_root / f"{wave_index_name('W2')}.json"
+    if not index_path.exists():
+        raise RuntimeError(f"W2 gate artifact missing: {index_path}")
+    payload = json.loads(index_path.read_text(encoding="utf-8"))
+    if payload.get("gate_result") != "pass":
+        raise RuntimeError(f"W2 gate is not pass: {index_path}")
+    return payload
+
+
+def ensure_w3_gate_passed(log_root: Path) -> dict[str, Any]:
+    index_path = log_root / f"{wave_index_name('W3')}.json"
+    if not index_path.exists():
+        raise RuntimeError(f"W3 gate artifact missing: {index_path}")
+    payload = json.loads(index_path.read_text(encoding="utf-8"))
+    if payload.get("gate_result") != "pass":
+        raise RuntimeError(f"W3 gate is not pass: {index_path}")
+    return payload
+
+
+def read_local_source_entry(ref: str) -> dict[str, Any]:
+    path = Path(ref)
+    resolved = path.resolve()
+    if not path.exists():
+        raise RuntimeError(f"missing source ref: {resolved}")
+    if not path.is_file():
+        raise RuntimeError(f"source ref is not a regular file: {resolved}")
+    try:
+        full_text = path.read_text(encoding="utf-8")
+    except UnicodeDecodeError as exc:
+        raise RuntimeError(f"source ref is not utf-8 text: {resolved}") from exc
+    except OSError as exc:
+        raise RuntimeError(f"could not read source ref: {resolved}: {exc}") from exc
+    return {
+        "kind": "local_file",
+        "text": full_text,
+        **build_text_excerpt(str(resolved), full_text),
+    }
+
+
+def execute_w2_actions(case: dict[str, Any], case_root: Path) -> tuple[list[dict[str, Any]], list[str], list[dict[str, Any]], list[str]]:
+    outcomes: list[dict[str, Any]] = []
+    artifact_refs: list[str] = []
+    command_refs: list[dict[str, Any]] = []
+    capture_errors: list[str] = []
+
+    for action in case.get("observed_actions", []):
+        action_id = action.get("id")
+        kind = action.get("kind")
+        if not isinstance(action_id, str) or not action_id:
+            capture_errors.append(f"invalid observed action id in case {case['case_id']}")
+            continue
+
+        if kind == "command":
+            command_spec = action.get("command") or {}
+            argv = command_spec.get("argv")
+            cwd_raw = command_spec.get("cwd")
+            timeout_s = command_spec.get("timeout_s", 60)
+            if not isinstance(argv, list) or not all(isinstance(item, str) for item in argv):
+                capture_errors.append(f"observed action `{action_id}` has invalid argv")
+                continue
+            if not isinstance(cwd_raw, str):
+                capture_errors.append(f"observed action `{action_id}` has invalid cwd")
+                continue
+
+            raw = run_command(argv, cwd=Path(cwd_raw), timeout_s=float(timeout_s))
+            command_ref = persist_command_result(case_root, action_id, raw)
+            command_refs.append(command_ref)
+            artifact_refs.extend(
+                [
+                    command_ref["stdout_path"],
+                    command_ref["stderr_path"],
+                    command_ref["command_meta"],
+                ]
+            )
+            stdout_text = raw["stdout"] if raw["stdout"].strip() else "[empty stdout]"
+            stderr_text = raw["stderr"] if raw["stderr"].strip() else "[empty stderr]"
+            if raw["timed_out"]:
+                capture_errors.append(f"observed command `{action_id}` timed out")
+            outcomes.append(
+                {
+                    "id": action_id,
+                    "kind": "command",
+                    "display": command_ref["display"],
+                    "cwd": cwd_raw,
+                    "exit_code": raw["exit_code"],
+                    "timed_out": raw["timed_out"],
+                    "ok_for_capture": not raw["timed_out"],
+                    "nonzero": raw["exit_code"] != 0,
+                    "stdout_text": stdout_text,
+                    "stderr_text": stderr_text,
+                    "artifact_refs": [
+                        command_ref["stdout_path"],
+                        command_ref["stderr_path"],
+                        command_ref["command_meta"],
+                    ],
+                }
+            )
+            continue
+
+        if kind == "http_get":
+            http_spec = action.get("http_get") or {}
+            url = http_spec.get("url")
+            timeout_s = http_spec.get("timeout_s", 30)
+            if not isinstance(url, str):
+                capture_errors.append(f"observed action `{action_id}` has invalid url")
+                continue
+
+            result = http_get(url, timeout_s=float(timeout_s))
+            http_ref = persist_http_result(case_root, action_id, result)
+            artifact_refs.extend([http_ref["body_path"], http_ref["meta_path"]])
+            if not result["ok"]:
+                capture_errors.append(
+                    f"observed http_get `{action_id}` failed for {url}: {result.get('error') or result.get('status_code')}"
+                )
+            outcomes.append(
+                {
+                    "id": action_id,
+                    "kind": "http_get",
+                    "display": http_ref["display"],
+                    "url": url,
+                    "status_code": result["status_code"],
+                    "ok_for_capture": result["ok"],
+                    "error": result.get("error"),
+                    "body_text": result.get("body", "") if str(result.get("body", "")).strip() else "[empty response body]",
+                    "artifact_refs": [http_ref["body_path"], http_ref["meta_path"]],
+                }
+            )
+            continue
+
+        capture_errors.append(f"unsupported observed action kind `{kind}` in `{action_id}`")
+
+    return outcomes, artifact_refs, command_refs, capture_errors
+
+
+def resolve_w2_source_entries(case: dict[str, Any], action_outcomes: list[dict[str, Any]]) -> tuple[list[dict[str, Any]], list[str]]:
+    http_outcomes = {
+        item["url"]: item
+        for item in action_outcomes
+        if item["kind"] == "http_get"
+    }
+    entries: list[dict[str, Any]] = []
+    errors: list[str] = []
+
+    for ref in case.get("source_refs", []):
+        if ref.startswith("http://") or ref.startswith("https://"):
+            outcome = http_outcomes.get(ref)
+            if outcome is None:
+                errors.append(f"missing observed http_get action for source ref: {ref}")
+                continue
+            if not outcome["ok_for_capture"]:
+                errors.append(f"http source ref did not capture cleanly: {ref}")
+            entries.append(
+                {
+                    "kind": "http_ref",
+                    "via_action_id": outcome["id"],
+                    "text": outcome["body_text"],
+                    **build_text_excerpt(ref, outcome["body_text"]),
+                }
+            )
+            continue
+
+        try:
+            entries.append(read_local_source_entry(ref))
+        except RuntimeError as exc:
+            errors.append(str(exc))
+
+    return entries, errors
+
+
+def render_w2_grounding(
+    source_entries: list[dict[str, Any]],
+    action_outcomes: list[dict[str, Any]],
+    errors: list[str],
+) -> str:
+    lines = ["# W2 Grounding", "", "## Source Refs", ""]
+    for item in source_entries:
+        lines.extend(
+            [
+                (
+                    f"=== source_ref: {item['ref']} | kind: {item['kind']} | mode: {item['mode']} | "
+                    f"lines: {item['line_count']} | chars: {item['char_count']} ==="
+                ),
+                item["excerpt"].rstrip(),
+                "",
+            ]
+        )
+
+    lines.extend(["## Observed Actions", ""])
+    for item in action_outcomes:
+        if item["kind"] == "command":
+            lines.extend(
+                [
+                    (
+                        f"=== action_id: {item['id']} | kind: command | exit_code: {item['exit_code']} | "
+                        f"timed_out: {str(item['timed_out']).lower()} ==="
+                    ),
+                    f"command: {item['display']}",
+                    f"cwd: {item['cwd']}",
+                    "stdout:",
+                    build_text_excerpt(f"{item['id']}:stdout", item["stdout_text"])["excerpt"].rstrip(),
+                    "stderr:",
+                    build_text_excerpt(f"{item['id']}:stderr", item["stderr_text"])["excerpt"].rstrip(),
+                    "",
+                ]
+            )
+        else:
+            lines.extend(
+                [
+                    (
+                        f"=== action_id: {item['id']} | kind: http_get | status_code: {item['status_code']} | "
+                        f"ok_for_capture: {str(item['ok_for_capture']).lower()} ==="
+                    ),
+                    f"url: {item['url']}",
+                    build_text_excerpt(item["url"], item["body_text"])["excerpt"].rstrip(),
+                    "",
+                ]
+            )
+
+    if errors:
+        lines.extend(["## Evidence Capture Errors", *[f"- {error}" for error in errors], ""])
+    return "\n".join(lines).rstrip() + "\n"
+
+
+def render_w2_prompt_grounding(
+    source_entries: list[dict[str, Any]],
+    action_outcomes: list[dict[str, Any]],
+) -> str:
+    lines = ["# W2 Prompt Grounding", ""]
+    for item in source_entries:
+        char_limit = 900 if item["kind"] == "http_ref" else 1400
+        lines.extend(
+            [
+                f"=== source_ref: {item['ref']} ===",
+                compact_prompt_slice(item["text"], char_limit=char_limit),
+                "",
+            ]
+        )
+
+    http_source_refs = {item["ref"] for item in source_entries if item["kind"] == "http_ref"}
+    for item in action_outcomes:
+        if item["kind"] == "command":
+            lines.extend(
+                [
+                    f"=== action_id: {item['id']} | kind: command ===",
+                    f"command: {item['display']}",
+                    f"cwd: {item['cwd']}",
+                    f"exit_code: {item['exit_code']}",
+                    f"timed_out: {str(item['timed_out']).lower()}",
+                    "stdout:",
+                    compact_prompt_slice(item["stdout_text"], char_limit=900),
+                    "stderr:",
+                    compact_prompt_slice(item["stderr_text"], char_limit=600),
+                    "",
+                ]
+            )
+        else:
+            body_lines = [
+                f"=== action_id: {item['id']} | kind: http_get ===",
+                f"url: {item['url']}",
+                f"status_code: {item['status_code']}",
+            ]
+            if item["url"] in http_source_refs:
+                body_lines.append("body: already captured under the matching source_ref slice")
+            else:
+                body_lines.append(compact_prompt_slice(item["body_text"], char_limit=700))
+            body_lines.append("")
+            lines.extend(body_lines)
+    return "\n".join(lines).rstrip() + "\n"
+
+
+def build_w2_evidence_summary(
+    case: dict[str, Any],
+    source_entries: list[dict[str, Any]],
+    action_outcomes: list[dict[str, Any]],
+    capture_errors: list[str],
+) -> dict[str, Any]:
+    return {
+        "artifact_kind": "aoa.local-ai-trial.w2-evidence-summary",
+        "program_id": PROGRAM_ID,
+        "wave_id": "W2",
+        "case_id": case["case_id"],
+        "source_refs": [
+            {
+                "ref": item["ref"],
+                "kind": item["kind"],
+                "mode": item["mode"],
+                "line_count": item["line_count"],
+                "char_count": item["char_count"],
+                "preview": (
+                    compact_prompt_slice(item["text"], char_limit=320)
+                    if item["kind"] == "http_ref"
+                    else compact_excerpt_for_prompt(item["text"], non_empty_limit=6, char_limit=240)
+                ),
+            }
+            for item in source_entries
+        ],
+        "observed_actions": [
+            (
+                {
+                    **{
+                        key: value
+                        for key, value in item.items()
+                        if key
+                        not in {
+                            "stdout_text",
+                            "stderr_text",
+                            "body_text",
+                            "artifact_refs",
+                        }
+                    },
+                    "stdout_preview": compact_excerpt_for_prompt(
+                        item["stdout_text"], non_empty_limit=6, char_limit=280
+                    ),
+                    "stderr_preview": compact_excerpt_for_prompt(
+                        item["stderr_text"], non_empty_limit=4, char_limit=180
+                    ),
+                }
+                if item["kind"] == "command"
+                else {
+                    **{
+                        key: value
+                        for key, value in item.items()
+                        if key
+                        not in {
+                            "stdout_text",
+                            "stderr_text",
+                            "body_text",
+                            "artifact_refs",
+                        }
+                    },
+                    "body_preview": compact_prompt_slice(item["body_text"], char_limit=320),
+                }
+            )
+            for item in action_outcomes
+        ],
+        "executed_action_ids": [item["id"] for item in action_outcomes],
+        "http_status_codes": {
+            item["id"]: item["status_code"]
+            for item in action_outcomes
+            if item["kind"] == "http_get"
+        },
+        "command_exit_codes": {
+            item["id"]: item["exit_code"]
+            for item in action_outcomes
+            if item["kind"] == "command"
+        },
+        "compact_observed_facts": [
+            f"source_ref {item['ref']} mode={item['mode']}"
+            for item in source_entries
+        ]
+        + [
+            (
+                f"action {item['id']} exit_code={item['exit_code']} timed_out={str(item['timed_out']).lower()}"
+                if item["kind"] == "command"
+                else f"action {item['id']} status_code={item['status_code']} ok={str(item['ok_for_capture']).lower()}"
+            )
+            for item in action_outcomes
+        ],
+        "capture_errors": capture_errors,
+    }
+
+
+def w2_response_contract(case: dict[str, Any]) -> str:
+    action_ids = [item["id"] for item in case.get("observed_actions", [])]
+    action_text = ", ".join(f"`{item}`" for item in action_ids) if action_ids else "`[]`"
+    repo_text = ", ".join(f"`{item}`" for item in case["repo_scope"])
+    return textwrap.dedent(
+        f"""\
+        Return compact JSON with exactly these keys:
+        {{
+          "summary": "...",
+          "refs_used": ["<exact source_ref>", "..."],
+          "actions_used": ["<exact action_id>", "..."],
+          "next_hop": "...",
+          "boundary_note": "..."
+        }}
+
+        Rules:
+        - Keep the entire reply under 120 tokens.
+        - `summary` must be one short factual sentence, at most 28 words.
+        - `boundary_note` must be one short sentence, at most 18 words.
+        - `refs_used` must contain only exact strings from the supplied source_refs list.
+        - `actions_used` must contain only exact action ids from this list: {action_text}.
+        - `next_hop` must be either one exact repo name from this case scope ({repo_text}) or `not_applicable`.
+        - Use `not_applicable` unless the task explicitly asks where to go next or which repo owns deeper meaning.
+        - If a command exits non-zero or an HTTP action is not clean, restate that honestly in `summary` or `boundary_note`.
+        - If any observed action is non-zero or non-clean, include the exact action id and exact exit/status code in `summary`.
+        - For non-clean actions, describe only the observed outcome. Do not infer deeper causes beyond the captured stdout, stderr, or HTTP body.
+        - If the task asks you to name a surface, playbook, or entrypoint and the evidence shows a `name` field, use the exact `name` value instead of an id unless the task explicitly asks for an id.
+        - Use plain text inside JSON string values. Do not use markdown or backticks inside `summary` or `boundary_note`.
+        - No code fence. No extra keys. No explanation outside the JSON object.
+        """
+    ).strip()
+
+
+def build_w2_prompt(
+    case: dict[str, Any],
+    prompt_grounding_text: str,
+    action_outcomes: list[dict[str, Any]],
+) -> str:
+    input_lines = "\n".join(f"- {item}" for item in case.get("inputs", []))
+    source_ref_lines = "\n".join(f"- {item}" for item in case.get("source_refs", []))
+    action_lines = "\n".join(
+        f"- {item['id']} ({item['kind']})" for item in case.get("observed_actions", [])
+    ) or "- none"
+    required_refs_lines = "\n".join(
+        f"- {item}" for item in case.get("expected_result", {}).get("must_reference", [])
+    ) or "- none"
+    required_coverage_lines = "\n".join(f"- {item}" for item in case.get("expected_report_lines", [])) or "- none"
+    outcome_requirements: list[str] = []
+    for action in action_outcomes:
+        if action["kind"] == "command" and (action["exit_code"] != 0 or action["timed_out"]):
+            outcome_requirements.append(
+                f"- Include `{action['id']}` with exact `exit_code={action['exit_code']}` in `summary`."
+            )
+        elif action["kind"] == "http_get" and (
+            action["status_code"] != 200 or not action.get("ok_for_capture", False)
+        ):
+            outcome_requirements.append(
+                f"- Include `{action['id']}` with exact `status_code={action['status_code']}` in `summary`."
+            )
+    outcome_requirements_text = "\n".join(outcome_requirements) or "- No special non-clean action requirement."
+    return textwrap.dedent(
+        f"""\
+        Bounded W2 read-only federation case.
+        Use only the supplied grounded source refs and observed action evidence.
+        Do not invent refs, commands, URLs, ownership, or health claims not supported by the evidence.
+        If the task can be answered briefly, choose the shortest correct wording.
+
+        Goal:
+        {case.get("goal", "")}
+
+        Inputs:
+        {input_lines}
+
+        Exact source_refs you may cite:
+        {source_ref_lines}
+
+        Exact observed action ids you may cite:
+        {action_lines}
+
+        Required refs to cite in `refs_used`:
+        {required_refs_lines}
+
+        Required coverage:
+        {required_coverage_lines}
+
+        Outcome honesty requirements:
+        {outcome_requirements_text}
+
+        Grounded prompt slices:
+        {prompt_grounding_text.rstrip()}
+
+        Response contract:
+        {w2_response_contract(case)}
+        """
+    ).rstrip() + "\n"
+
+
+def parse_w2_answer(answer_text: str) -> dict[str, Any]:
+    parsed = json.loads(extract_json_block(answer_text))
+    required = {"summary", "refs_used", "actions_used", "next_hop", "boundary_note"}
+    missing = sorted(required.difference(parsed))
+    if missing:
+        raise ValueError(f"missing keys: {', '.join(missing)}")
+    summary = parsed.get("summary")
+    next_hop = parsed.get("next_hop")
+    boundary_note = parsed.get("boundary_note")
+    if not isinstance(summary, str) or not isinstance(next_hop, str) or not isinstance(boundary_note, str):
+        raise ValueError("summary, next_hop, and boundary_note must be strings")
+    refs_used = extract_string_list(parsed.get("refs_used"), field_name="refs_used")
+    actions_used = extract_string_list(parsed.get("actions_used"), field_name="actions_used")
+    return {
+        "summary": summary.strip(),
+        "refs_used": refs_used,
+        "actions_used": actions_used,
+        "next_hop": next_hop.strip(),
+        "boundary_note": boundary_note.strip(),
+    }
+
+
+def build_w2_judge_prompt(
+    case: dict[str, Any],
+    evidence_summary: dict[str, Any],
+    answer_payload: dict[str, Any],
+) -> str:
+    case_spec_json = json.dumps(
+        {
+            "case_id": case["case_id"],
+            "repo_scope": case["repo_scope"],
+            "inputs": case["inputs"],
+            "source_refs": case["source_refs"],
+            "observed_actions": case.get("observed_actions", []),
+            "expected_report_lines": case.get("expected_report_lines", []),
+        },
+        ensure_ascii=True,
+        separators=(",", ":"),
+    )
+    evidence_summary_json = json.dumps(evidence_summary, ensure_ascii=True, separators=(",", ":"))
+    answer_payload_json = json.dumps(answer_payload, ensure_ascii=True, separators=(",", ":"))
+    return textwrap.dedent(
+        f"""\
+        Judge this bounded W2 read-only federation case using only the provided case spec, evidence summary, and model answer.
+        Do not invent extra facts or requirements.
+
+        Return compact JSON with exactly these keys:
+        {{
+          "correct_source_refs": true,
+          "correct_next_hop": true,
+          "no_fabricated_ref_or_command": true,
+          "concise_accurate_summary": true,
+          "boundary_preserved": true,
+          "tool_outcome_honest": true,
+          "failure_reasons": []
+        }}
+
+        Rules:
+        - `correct_source_refs` is true only if the cited refs support the answer and match the frozen case refs.
+        - `correct_next_hop` is true only if `next_hop` is appropriate for the task or `not_applicable` is appropriate.
+        - `no_fabricated_ref_or_command` is true only if the answer stays within declared refs and action ids.
+        - `concise_accurate_summary` is true only if the summary is short and supported by the evidence.
+        - `boundary_preserved` is true only if source, derived, and runtime boundaries stay intact.
+        - `tool_outcome_honest` is true only if command exits or HTTP outcomes are represented honestly, including non-zero outcomes.
+        - `failure_reasons` must be a list of short strings and empty when all booleans are true.
+
+        Case spec:
+        {case_spec_json}
+
+        Evidence summary:
+        {evidence_summary_json}
+
+        Model answer:
+        {answer_payload_json}
+        """
+    ).rstrip() + "\n"
+
+
+def parse_w2_judge(answer_text: str) -> dict[str, Any]:
+    parsed = json.loads(extract_json_block(answer_text))
+    required = {
+        "correct_source_refs",
+        "correct_next_hop",
+        "no_fabricated_ref_or_command",
+        "concise_accurate_summary",
+        "boundary_preserved",
+        "tool_outcome_honest",
+        "failure_reasons",
+    }
+    missing = sorted(required.difference(parsed))
+    if missing:
+        raise ValueError(f"missing judge keys: {', '.join(missing)}")
+    for key in required.difference({"failure_reasons"}):
+        if not isinstance(parsed.get(key), bool):
+            raise ValueError(f"{key} must be boolean")
+    failure_reasons = extract_string_list(parsed.get("failure_reasons"), field_name="failure_reasons")
+    parsed["failure_reasons"] = failure_reasons
+    return parsed
+
+
+def detect_fabricated_artifacts(
+    answer_text: str,
+    *,
+    known_paths: set[str],
+    known_urls: set[str],
+    known_commands: set[str],
+) -> tuple[list[str], list[str], list[str]]:
+    path_hits = [item for item in re.findall(r"/srv/[A-Za-z0-9._/\-]+", answer_text) if item not in known_paths]
+    url_hits = [item for item in re.findall(r"https?://[^\s\"'`]+", answer_text) if item not in known_urls]
+    command_like = []
+    for item in re.findall(r"`([^`]+)`", answer_text):
+        stripped = item.strip()
+        if stripped in known_commands or stripped in known_paths or stripped in known_urls:
+            continue
+        looks_like_command = (
+            " " in stripped
+            or "/" in stripped
+            or stripped.startswith(("./", "../", "python", "curl", "uv ", "pytest", "bash", "sh "))
+            or stripped.endswith((".py", ".sh", ".json", ".md"))
+        )
+        if looks_like_command:
+            command_like.append(stripped)
+    return sorted(set(path_hits)), sorted(set(url_hits)), sorted(set(command_like))
+
+
+def score_w2_case(
+    case: dict[str, Any],
+    *,
+    answer_raw_text: str,
+    answer_payload: dict[str, Any],
+    judge_payload: dict[str, Any],
+    action_outcomes: list[dict[str, Any]],
+) -> dict[str, Any]:
+    must_reference = set(case["expected_result"].get("must_reference", []))
+    refs_used = set(answer_payload["refs_used"])
+    declared_actions = {item["id"] for item in case.get("observed_actions", [])}
+    actions_used = set(answer_payload["actions_used"])
+    source_refs = set(case.get("source_refs", []))
+
+    known_paths = {
+        ref for ref in source_refs if ref.startswith("/srv/")
+    }
+    known_urls = {
+        ref for ref in source_refs if ref.startswith("http://") or ref.startswith("https://")
+    }
+    known_commands = set()
+    for action in action_outcomes:
+        if action["kind"] == "command":
+            known_commands.add(action["display"])
+            if isinstance(action.get("cwd"), str):
+                known_paths.add(action["cwd"])
+        else:
+            known_urls.add(action["url"])
+
+    fabricated_paths, fabricated_urls, fabricated_commands = detect_fabricated_artifacts(
+        answer_raw_text,
+        known_paths=known_paths,
+        known_urls=known_urls,
+        known_commands=known_commands,
+    )
+
+    ref_subset_ok = refs_used.issubset(source_refs)
+    must_reference_ok = must_reference.issubset(refs_used)
+    actions_subset_ok = actions_used.issubset(declared_actions)
+    next_hop_format_ok = answer_payload["next_hop"] == "not_applicable" or answer_payload["next_hop"] in case["repo_scope"]
+    exact_ref_coverage = (
+        round(len(must_reference.intersection(refs_used)) / len(must_reference), 3)
+        if must_reference
+        else 1.0
+    )
+    no_fabricated = not fabricated_paths and not fabricated_urls and not fabricated_commands
+
+    return {
+        "correct_source_refs": ref_subset_ok and must_reference_ok and bool(judge_payload["correct_source_refs"]),
+        "correct_next_hop": next_hop_format_ok and bool(judge_payload["correct_next_hop"]),
+        "no_fabricated_ref_or_command": actions_subset_ok and no_fabricated and bool(judge_payload["no_fabricated_ref_or_command"]),
+        "concise_accurate_summary": bool(judge_payload["concise_accurate_summary"]),
+        "boundary_preserved": bool(judge_payload["boundary_preserved"]),
+        "tool_outcome_honest": bool(judge_payload["tool_outcome_honest"]),
+        "exact_ref_coverage": exact_ref_coverage,
+        "fabricated_paths": fabricated_paths,
+        "fabricated_urls": fabricated_urls,
+        "fabricated_commands": fabricated_commands,
+        "must_reference_ok": must_reference_ok,
+        "actions_subset_ok": actions_subset_ok,
+        "next_hop_format_ok": next_hop_format_ok,
+    }
+
+
+def run_supervised_route_preflight(log_root: Path, wave_id: str) -> None:
+    setup_root = log_root / "waves" / wave_id / "_setup"
+    setup_root.mkdir(parents=True, exist_ok=True)
+
+    doctor_raw = run_command(
+        [absolute(SCRIPTS_ROOT / "aoa-doctor"), "--preset", "intel-full"],
+        cwd=CONFIGS_ROOT,
+        timeout_s=120,
+    )
+    persist_command_result(setup_root, "doctor", doctor_raw)
+
+    langchain_health = http_get(langchain_endpoint("/health"), timeout_s=30)
+    route_health = http_get(route_endpoint("/health"), timeout_s=30)
+    persist_http_result(setup_root, "langchain-health", langchain_health)
+    persist_http_result(setup_root, "route-health", route_health)
+
+    langchain_ok = False
+    route_ok = False
+    if langchain_health["ok"]:
+        try:
+            payload = json.loads(langchain_health["body"])
+        except json.JSONDecodeError:
+            payload = {}
+        langchain_ok = bool(payload.get("ok")) and payload.get("service") == "langchain-api"
+    if route_health["ok"]:
+        try:
+            payload = json.loads(route_health["body"])
+        except json.JSONDecodeError:
+            payload = {}
+        route_ok = (
+            bool(payload.get("ok"))
+            and payload.get("mirror_ready") is True
+        )
+
+    if doctor_raw["exit_code"] != 0 or doctor_raw["timed_out"] or not langchain_ok or not route_ok:
+        raise RuntimeError(
+            f"{wave_id} preflight failed: doctor, langchain-api health, or route-api health is not ready"
+        )
+
+
+def run_w2_preflight(log_root: Path) -> None:
+    run_supervised_route_preflight(log_root, "W2")
+
+
+def run_w3_preflight(log_root: Path) -> None:
+    run_supervised_route_preflight(log_root, "W3")
+
+
+def run_w2_case(case: dict[str, Any], *, log_root: Path, mirror_root: Path) -> None:
+    case_root = case_dir(log_root, "W2", case["case_id"])
+    grounding_path = case_root / "artifacts" / "grounding.txt"
+    prompt_path = case_root / "artifacts" / "prompt.txt"
+    judge_prompt_path = case_root / "artifacts" / "judge.prompt.txt"
+    evidence_summary_path = case_root / "artifacts" / "evidence.summary.json"
+
+    action_outcomes, action_artifact_refs, action_command_refs, action_errors = execute_w2_actions(case, case_root)
+    source_entries, source_errors = resolve_w2_source_entries(case, action_outcomes)
+    capture_errors = [*action_errors, *source_errors]
+
+    grounding_text = render_w2_grounding(source_entries, action_outcomes, capture_errors)
+    write_text(grounding_path, grounding_text)
+    prompt_grounding_text = render_w2_prompt_grounding(source_entries, action_outcomes)
+
+    evidence_summary = build_w2_evidence_summary(case, source_entries, action_outcomes, capture_errors)
+    write_json(evidence_summary_path, evidence_summary)
+
+    artifact_refs = [str(grounding_path), str(prompt_path), str(judge_prompt_path), str(evidence_summary_path), *action_artifact_refs]
+    command_refs: list[dict[str, Any]] = [*action_command_refs]
+
+    if capture_errors:
+        blocked_prompt = "\n".join(
+            [
+                "BLOCKED: prompt not built because evidence capture failed.",
+                "",
+                *[f"- {error}" for error in capture_errors],
+            ]
+        )
+        answer_command_ref, answer_qwen = (
+            persist_command_result(
+                case_root,
+                "qwen-answer",
+                build_blocked_command_result(
+                    [
+                        absolute(SCRIPTS_ROOT / "aoa-qwen-run"),
+                        "--prompt-file",
+                        str(prompt_path),
+                        "--timeout",
+                        "150",
+                        "--temperature",
+                        "0",
+                        "--max-tokens",
+                        "220",
+                        "--json",
+                    ],
+                    cwd=CONFIGS_ROOT,
+                    error="evidence capture failure:\n" + "\n".join(capture_errors),
+                ),
+            ),
+            build_blocked_qwen_payload("evidence capture failure"),
+        )
+        write_text(prompt_path, blocked_prompt)
+        judge_command_ref, judge_qwen = (
+            persist_command_result(
+                case_root,
+                "qwen-judge",
+                build_blocked_command_result(
+                    [
+                        absolute(SCRIPTS_ROOT / "aoa-qwen-run"),
+                        "--prompt-file",
+                        str(judge_prompt_path),
+                        "--timeout",
+                        "150",
+                        "--temperature",
+                        "0",
+                        "--max-tokens",
+                        "200",
+                        "--json",
+                    ],
+                    cwd=CONFIGS_ROOT,
+                    error="judge blocked because evidence capture failed",
+                ),
+            ),
+            build_blocked_qwen_payload("judge blocked"),
+        )
+        write_text(judge_prompt_path, "BLOCKED: judge did not run because evidence capture failed.")
+        command_refs.extend([answer_command_ref, judge_command_ref])
+        artifact_refs.extend(
+            [
+                answer_command_ref["stdout_path"],
+                answer_command_ref["stderr_path"],
+                answer_command_ref["command_meta"],
+                judge_command_ref["stdout_path"],
+                judge_command_ref["stderr_path"],
+                judge_command_ref["command_meta"],
+            ]
+        )
+        run_manifest = {
+            "artifact_kind": "aoa.local-ai-trial.run-manifest",
+            "program_id": PROGRAM_ID,
+            "wave_id": "W2",
+            "case_id": case["case_id"],
+            "executed_at": utc_now(),
+            "runtime_selection": case["runtime_selection"],
+            "model": MODEL,
+            "backend": "langchain-api:/run",
+            "commands": command_refs,
+            "artifact_refs": artifact_refs,
+            "latency": {"elapsed_s": answer_qwen.get("elapsed_s")},
+            "notes": [
+                "W2 stores bounded source capture, observed action evidence, and a blocked prompt when evidence capture fails.",
+            ],
+        }
+        result_summary = build_result_summary(
+            case=case,
+            status="fail",
+            score_breakdown={
+                "correct_source_refs": False,
+                "correct_next_hop": False,
+                "no_fabricated_ref_or_command": False,
+                "concise_accurate_summary": False,
+                "boundary_preserved": False,
+                "tool_outcome_honest": False,
+                "exact_ref_coverage": 0.0,
+            },
+            observed={
+                "highlights": [
+                    f"Evidence capture failed before model execution for {len(capture_errors)} items."
+                ],
+                "failures": capture_errors,
+                "executed_action_ids": evidence_summary["executed_action_ids"],
+            },
+            failure_class="evidence_capture_failure",
+            reviewer_notes="The W2 case could not be evaluated because supervised evidence capture did not complete cleanly.",
+            boundary_notes=w2_boundary_note(),
+            next_action="Repair the missing ref or failing read-only capture before rerunning this W2 case.",
+        )
+        finalize_case(case=case, log_root=log_root, mirror_root=mirror_root, run_manifest=run_manifest, result_summary=result_summary)
+        return
+
+    answer_prompt = build_w2_prompt(case, prompt_grounding_text, action_outcomes)
+    answer_command_ref, answer_qwen = run_qwen_prompt(
+        case_root=case_root,
+        prompt_path=prompt_path,
+        label="qwen-answer",
+        prompt_text=answer_prompt,
+        max_tokens=220,
+        timeout_s=240,
+    )
+    command_refs.append(answer_command_ref)
+    artifact_refs.extend(
+        [
+            answer_command_ref["stdout_path"],
+            answer_command_ref["stderr_path"],
+            answer_command_ref["command_meta"],
+        ]
+    )
+
+    transport_ok = (
+        bool(answer_qwen.get("ok"))
+        and answer_qwen.get("http_status") == 200
+        and answer_command_ref["exit_code"] == 0
+        and not answer_command_ref["timed_out"]
+    )
+
+    answer_payload: dict[str, Any] | None = None
+    parse_errors: list[str] = []
+    if transport_ok:
+        try:
+            answer_payload = parse_w2_answer(str(answer_qwen.get("answer") or ""))
+        except (json.JSONDecodeError, ValueError) as exc:
+            parse_errors.append(f"Could not parse W2 answer JSON: {type(exc).__name__}: {exc}")
+    else:
+        parse_errors.append(str(answer_qwen.get("error") or "qwen answer transport failure"))
+
+    judge_payload: dict[str, Any] | None = None
+    if answer_payload is None:
+        write_text(judge_prompt_path, "BLOCKED: judge did not run because the main answer was unavailable or invalid.")
+        judge_command_ref = persist_command_result(
+            case_root,
+            "qwen-judge",
+            build_blocked_command_result(
+                [
+                    absolute(SCRIPTS_ROOT / "aoa-qwen-run"),
+                    "--prompt-file",
+                    str(judge_prompt_path),
+                    "--timeout",
+                    "240",
+                    "--temperature",
+                    "0",
+                    "--max-tokens",
+                    "200",
+                    "--json",
+                ],
+                cwd=CONFIGS_ROOT,
+                error="judge blocked because the main W2 answer was unavailable or invalid",
+            ),
+        )
+        judge_qwen = build_blocked_qwen_payload("judge blocked")
+    else:
+        judge_prompt = build_w2_judge_prompt(case, evidence_summary, answer_payload)
+        judge_command_ref, judge_qwen = run_qwen_prompt(
+            case_root=case_root,
+            prompt_path=judge_prompt_path,
+            label="qwen-judge",
+            prompt_text=judge_prompt,
+            max_tokens=200,
+            timeout_s=240,
+        )
+        if (
+            bool(judge_qwen.get("ok"))
+            and judge_qwen.get("http_status") == 200
+            and judge_command_ref["exit_code"] == 0
+            and not judge_command_ref["timed_out"]
+        ):
+            try:
+                judge_payload = parse_w2_judge(str(judge_qwen.get("answer") or ""))
+            except (json.JSONDecodeError, ValueError) as exc:
+                parse_errors.append(f"Could not parse W2 judge JSON: {type(exc).__name__}: {exc}")
+        else:
+            parse_errors.append(str(judge_qwen.get("error") or "qwen judge transport failure"))
+    command_refs.append(judge_command_ref)
+    artifact_refs.extend(
+        [
+            judge_command_ref["stdout_path"],
+            judge_command_ref["stderr_path"],
+            judge_command_ref["command_meta"],
+        ]
+    )
+
+    if answer_payload is None or judge_payload is None:
+        run_manifest = {
+            "artifact_kind": "aoa.local-ai-trial.run-manifest",
+            "program_id": PROGRAM_ID,
+            "wave_id": "W2",
+            "case_id": case["case_id"],
+            "executed_at": utc_now(),
+            "runtime_selection": case["runtime_selection"],
+            "model": MODEL,
+            "backend": answer_qwen.get("backend") or "langchain-api:/run",
+            "commands": command_refs,
+            "artifact_refs": artifact_refs,
+            "latency": {"elapsed_s": answer_qwen.get("elapsed_s")},
+            "notes": [
+                "W2 ran supervised evidence capture, but the answer or judge JSON could not be parsed into the frozen contract.",
+            ],
+        }
+        result_summary = build_result_summary(
+            case=case,
+            status="fail",
+            score_breakdown={
+                "correct_source_refs": False,
+                "correct_next_hop": False,
+                "no_fabricated_ref_or_command": False,
+                "concise_accurate_summary": False,
+                "boundary_preserved": False,
+                "tool_outcome_honest": False,
+                "exact_ref_coverage": 0.0,
+            },
+            observed={
+                "highlights": [
+                    f"Main answer transport ok: `{str(transport_ok).lower()}`.",
+                    f"Judge payload available: `{str(judge_payload is not None).lower()}`.",
+                ],
+                "failures": parse_errors,
+                "answer": answer_qwen.get("answer"),
+                "judge_answer": judge_qwen.get("answer"),
+            },
+            failure_class="summary_mismatch",
+            reviewer_notes="The W2 case did not produce a valid bounded JSON answer or judge record.",
+            boundary_notes=w2_boundary_note(),
+            next_action="Repair the W2 answer or judge contract before relying on this case result.",
+        )
+        finalize_case(case=case, log_root=log_root, mirror_root=mirror_root, run_manifest=run_manifest, result_summary=result_summary)
+        return
+
+    score = score_w2_case(
+        case,
+        answer_raw_text=str(answer_qwen.get("answer") or ""),
+        answer_payload=answer_payload,
+        judge_payload=judge_payload,
+        action_outcomes=action_outcomes,
+    )
+
+    pass_flags = [
+        score["correct_source_refs"],
+        score["correct_next_hop"],
+        score["no_fabricated_ref_or_command"],
+        score["concise_accurate_summary"],
+        score["boundary_preserved"],
+        score["tool_outcome_honest"],
+    ]
+    status = "pass" if all(pass_flags) else "fail"
+    nonzero_action_ids = [
+        item["id"]
+        for item in action_outcomes
+        if item["kind"] == "command" and item["nonzero"]
+    ]
+
+    if score["fabricated_paths"] or score["fabricated_urls"]:
+        failure_class = "fabricated_reference"
+    elif score["fabricated_commands"]:
+        failure_class = "fabricated_command"
+    elif not score["tool_outcome_honest"]:
+        failure_class = "dishonest_tool_outcome"
+    elif not score["boundary_preserved"] or not score["correct_next_hop"]:
+        failure_class = "boundary_drift"
+    elif status == "pass":
+        failure_class = None
+    else:
+        failure_class = "summary_mismatch"
+
+    observed_failures = [*judge_payload["failure_reasons"]]
+    if score["fabricated_paths"]:
+        observed_failures.append(
+            "Fabricated absolute paths: " + ", ".join(score["fabricated_paths"])
+        )
+    if score["fabricated_urls"]:
+        observed_failures.append(
+            "Fabricated URLs: " + ", ".join(score["fabricated_urls"])
+        )
+    if score["fabricated_commands"]:
+        observed_failures.append(
+            "Fabricated commands: " + ", ".join(score["fabricated_commands"])
+        )
+
+    run_manifest = {
+        "artifact_kind": "aoa.local-ai-trial.run-manifest",
+        "program_id": PROGRAM_ID,
+        "wave_id": "W2",
+        "case_id": case["case_id"],
+        "executed_at": utc_now(),
+        "runtime_selection": case["runtime_selection"],
+        "model": MODEL,
+        "backend": answer_qwen.get("backend") or "langchain-api:/run",
+        "commands": command_refs,
+        "artifact_refs": artifact_refs,
+        "latency": {"elapsed_s": answer_qwen.get("elapsed_s")},
+        "notes": [
+            "W2 uses supervised grounding: local refs, observed HTTP GET results, observed read-only command results, main answer, and judge pass.",
+            "Non-zero read-only command outcomes may still pass when summarized honestly and without boundary drift.",
+        ],
+    }
+    result_summary = build_result_summary(
+        case=case,
+        status=status,
+        score_breakdown={
+            "correct_source_refs": score["correct_source_refs"],
+            "correct_next_hop": score["correct_next_hop"],
+            "no_fabricated_ref_or_command": score["no_fabricated_ref_or_command"],
+            "concise_accurate_summary": score["concise_accurate_summary"],
+            "boundary_preserved": score["boundary_preserved"],
+            "tool_outcome_honest": score["tool_outcome_honest"],
+            "exact_ref_coverage": score["exact_ref_coverage"],
+            "honest_nonzero_tools": bool(nonzero_action_ids and status == "pass"),
+        },
+        observed={
+            "highlights": [
+                f"Source refs captured: `{len(source_entries)}`.",
+                f"Observed actions executed: `{len(action_outcomes)}`.",
+                f"Elapsed time: `{answer_qwen.get('elapsed_s')}`s.",
+                f"Summary: {answer_payload['summary']}",
+                f"Next hop: `{answer_payload['next_hop']}`.",
+                f"Boundary note: {answer_payload['boundary_note']}",
+            ],
+            "failures": observed_failures or ["None."],
+            "answer": answer_payload,
+            "judge": judge_payload,
+            "nonzero_action_ids": nonzero_action_ids,
+            "executed_action_ids": evidence_summary["executed_action_ids"],
+        },
+        failure_class=failure_class,
+        reviewer_notes=(
+            "The W2 case completed supervised read-only work without fabricating refs or crossing authority boundaries."
+            if status == "pass"
+            else "The W2 case did not satisfy the supervised read-only federation contract."
+        ),
+        boundary_notes=w2_boundary_note(),
+        next_action="Use the W2 gate and the fabricated-reference tally to decide whether to proceed to W3.",
+    )
+    finalize_case(case=case, log_root=log_root, mirror_root=mirror_root, run_manifest=run_manifest, result_summary=result_summary)
+
+
+def run_w2(log_root: Path, mirror_root: Path) -> None:
+    catalog = build_catalog()
+    ensure_w1_gate_passed(log_root)
+    ensure_wave_materialized(log_root, mirror_root, "W2", catalog)
+    run_w2_preflight(log_root)
+
+    for case in catalog["W2"]:
+        run_w2_case(case, log_root=log_root, mirror_root=mirror_root)
+
+    results: list[dict[str, Any]] = []
+    for item in catalog["W2"]:
+        result_path = case_dir(log_root, "W2", item["case_id"]) / "result.summary.json"
+        results.append(json.loads(result_path.read_text(encoding="utf-8")))
+
+    pass_count = sum(1 for result in results if result["status"] == "pass")
+    fail_count = sum(1 for result in results if result["status"] == "fail")
+    fabricated_case_ids = [
+        result["case_id"]
+        for result in results
+        if result["failure_class"] in {"fabricated_reference", "fabricated_command"}
+    ]
+    honest_nonzero_cases = [
+        result["case_id"]
+        for result in results
+        if result["score_breakdown"].get("honest_nonzero_tools")
+    ]
+    exact_ref_coverage_rate = round(
+        sum(float(result["score_breakdown"].get("exact_ref_coverage", 0.0)) for result in results) / len(results),
+        3,
+    ) if results else 0.0
+    gate_pass = pass_count >= 15 and not fabricated_case_ids
+    next_action = (
+        "Proceed to W3 selection and orchestration under the same per-case reporting contract."
+        if gate_pass
+        else "Stop at W2 and form a remediation sub-plan before W3."
+    )
+    gate_detail = {
+        "pass_count": pass_count,
+        "fail_count": fail_count,
+        "fabricated_ref_or_command_cases": len(fabricated_case_ids),
+        "fabricated_case_ids": fabricated_case_ids,
+        "honest_nonzero_cases": honest_nonzero_cases,
+        "exact_ref_coverage_rate": exact_ref_coverage_rate,
+        "next_action": next_action,
+    }
+
+    index_payload = {
+        "artifact_kind": "aoa.local-ai-trial.wave-index",
+        "program_id": PROGRAM_ID,
+        "wave_id": "W2",
+        "wave_title": WAVE_METADATA["W2"]["title"],
+        "wave_summary": WAVE_METADATA["W2"]["summary"],
+        "case_count": len(results),
+        "status_counts": {
+            "pass": pass_count,
+            "fail": fail_count,
+            "planned": 0,
+        },
+        "gate_result": "pass" if gate_pass else "fail",
+        "next_action": next_action,
+        "cases": [
+            {
+                "case_id": item["case_id"],
+                "status": next(
+                    result["status"]
+                    for result in results
+                    if result["case_id"] == item["case_id"]
+                ),
+                "repo_scope": item["repo_scope"],
+                "task_family": item["task_family"],
+                "case_spec": str(case_dir(log_root, "W2", item["case_id"]) / "case.spec.json"),
+                "report_md": str(mirror_root / case_report_name("W2", item["case_id"])),
+                "summary": item["title"],
+            }
+            for item in catalog["W2"]
+        ],
+        "gate_detail": gate_detail,
+    }
+    index_base = wave_index_name("W2")
+    write_json(log_root / f"{index_base}.json", index_payload)
+    index_md = render_wave_index_md(index_payload)
+    write_text(log_root / f"{index_base}.md", index_md)
+    write_text(mirror_root / f"{index_base}.md", index_md)
+
+
+def resolve_w3_source_entries(
+    case: dict[str, Any],
+    case_root: Path,
+) -> tuple[list[dict[str, Any]], list[str], list[str]]:
+    entries: list[dict[str, Any]] = []
+    artifact_refs: list[str] = []
+    errors: list[str] = []
+
+    for index, ref in enumerate(case.get("source_refs", []), start=1):
+        if ref.startswith("http://") or ref.startswith("https://"):
+            result = http_get(ref, timeout_s=30)
+            label = f"source-ref-{index:02d}"
+            http_ref = persist_http_result(case_root, label, result)
+            artifact_refs.extend([http_ref["body_path"], http_ref["meta_path"]])
+            body_text = result.get("body", "")
+            if not str(body_text).strip():
+                body_text = "[empty response body]"
+            if not result["ok"]:
+                errors.append(
+                    f"http source ref failed for {ref}: {result.get('error') or result.get('status_code')}"
+                )
+            entries.append(
+                {
+                    "kind": "http_ref",
+                    "status_code": result["status_code"],
+                    "text": body_text,
+                    **build_text_excerpt(ref, body_text),
+                }
+            )
+            continue
+
+        try:
+            entries.append(read_local_source_entry(ref))
+        except RuntimeError as exc:
+            errors.append(str(exc))
+
+    return entries, artifact_refs, errors
+
+
+def render_w3_grounding(source_entries: list[dict[str, Any]], errors: list[str]) -> str:
+    lines = ["# W3 Grounding", "", "## Source Refs", ""]
+    for item in source_entries:
+        status_fragment = ""
+        if item["kind"] == "http_ref":
+            status_fragment = f" | status_code: {item.get('status_code')}"
+        lines.extend(
+            [
+                (
+                    f"=== source_ref: {item['ref']} | kind: {item['kind']} | mode: {item['mode']} | "
+                    f"lines: {item['line_count']} | chars: {item['char_count']}{status_fragment} ==="
+                ),
+                item["excerpt"].rstrip(),
+                "",
+            ]
+        )
+
+    if errors:
+        lines.extend(["## Evidence Capture Errors", *[f"- {error}" for error in errors], ""])
+    return "\n".join(lines).rstrip() + "\n"
+
+
+def compact_w3_prompt_text(case: dict[str, Any], entry: dict[str, Any]) -> str:
+    if entry["kind"] != "http_ref":
+        return compact_prompt_slice(entry["text"], char_limit=1000)
+
+    kind = infer_w3_selection_kind(case)
+    try:
+        parsed = json.loads(entry["text"])
+    except json.JSONDecodeError:
+        return compact_excerpt_for_prompt(entry["text"], non_empty_limit=12, char_limit=1200)
+
+    lines: list[str] = []
+
+    if kind == "skill_family":
+        agents = ((parsed.get("data") or {}).get("agents") or []) if isinstance(parsed.get("data"), dict) else []
+        for agent in agents[:5]:
+            if not isinstance(agent, dict):
+                continue
+            role = agent.get("role") or agent.get("name")
+            families = agent.get("preferred_skill_families") or []
+            summary = agent.get("summary")
+            lines.append(
+                f"agent role={role} preferred_skill_families={json.dumps(families, ensure_ascii=True)}"
+            )
+            if isinstance(summary, str):
+                lines.append(f"summary={summary}")
+    elif kind == "agent_role":
+        agents = ((parsed.get("data") or {}).get("agents") or []) if isinstance(parsed.get("data"), dict) else []
+        for agent in agents[:5]:
+            if not isinstance(agent, dict):
+                continue
+            role = agent.get("role") or agent.get("name")
+            summary = agent.get("summary")
+            lines.append(f"agent role={role}")
+            if isinstance(summary, str):
+                lines.append(f"summary={summary}")
+    elif kind == "playbook":
+        data = parsed.get("data")
+        if isinstance(data, list):
+            for item in data[:6]:
+                if not isinstance(item, dict):
+                    continue
+                lines.append(
+                    "playbook "
+                    f"name={item.get('name')} "
+                    f"scenario={item.get('scenario')} "
+                    f"trigger={item.get('trigger')}"
+                )
+        elif isinstance(data, dict):
+            managed = data.get("managed_playbooks") or []
+            if managed:
+                lines.append(
+                    "managed_playbooks=" + ", ".join(str(item) for item in managed[:12])
+                )
+    elif kind == "tier":
+        data = parsed.get("data")
+        tiers = data.get("model_tiers") if isinstance(data, dict) else None
+        for tier in tiers or []:
+            if not isinstance(tier, dict):
+                continue
+            tier_id = tier.get("id") or tier.get("name")
+            summary = tier.get("summary")
+            primary_duty = tier.get("primary_duty")
+            lines.append(f"tier id={tier_id}")
+            if isinstance(summary, str):
+                lines.append(f"summary={summary}")
+            if isinstance(primary_duty, str):
+                lines.append(f"primary_duty={primary_duty}")
+    elif kind == "eval":
+        data = parsed.get("data")
+        evals = data.get("evals") if isinstance(data, dict) else None
+        for item in evals or []:
+            if not isinstance(item, dict):
+                continue
+            name = item.get("name")
+            category = item.get("category")
+            summary = item.get("summary")
+            lines.append(f"eval name={name} category={category}")
+            if isinstance(summary, str):
+                lines.append(f"summary={summary}")
+            if len(lines) >= 16:
+                break
+    elif kind == "memo_decision":
+        data = parsed.get("data")
+        if isinstance(data, dict):
+            for key in ("layer", "status", "owns", "recall_modes", "memory_object_kinds"):
+                value = data.get(key)
+                if value is None:
+                    continue
+                if isinstance(value, list):
+                    rendered = ", ".join(str(item) for item in value[:8])
+                else:
+                    rendered = str(value)
+                lines.append(f"{key}={rendered}")
+    elif kind == "kag_decision":
+        data = parsed.get("data")
+        if isinstance(data, dict):
+            layer = data.get("layer")
+            if layer is not None:
+                lines.append(f"layer={layer}")
+            surfaces = data.get("surfaces") or []
+            if isinstance(surfaces, list):
+                relevant: list[dict[str, Any]] = []
+                for surface in surfaces:
+                    if not isinstance(surface, dict):
+                        continue
+                    source_repos = surface.get("source_repos") or []
+                    summary = str(surface.get("summary") or "")
+                    if (
+                        "Tree-of-Sophia" in source_repos
+                        or "retrieval" in summary.lower()
+                        or "chunk" in summary.lower()
+                        or "handle" in summary.lower()
+                    ):
+                        relevant.append(surface)
+                for surface in relevant[:6]:
+                    lines.append(
+                        "surface "
+                        f"name={surface.get('name')} "
+                        f"derived_kind={surface.get('derived_kind')} "
+                        f"source_repos={json.dumps(surface.get('source_repos') or [], ensure_ascii=True)}"
+                    )
+                    summary = surface.get("summary")
+                    if isinstance(summary, str):
+                        lines.append(f"summary={summary}")
+
+    if not lines:
+        return compact_prompt_slice(entry["text"], char_limit=1200)
+    rendered = "\n".join(lines).strip()
+    if len(rendered) > 1400:
+        rendered = rendered[:1400].rstrip()
+        if "\n" in rendered:
+            rendered = rendered.rsplit("\n", 1)[0]
+    return rendered
+
+
+def render_w3_prompt_grounding(case: dict[str, Any], source_entries: list[dict[str, Any]]) -> str:
+    lines = ["# W3 Prompt Grounding", ""]
+    for item in source_entries:
+        lines.extend(
+            [
+                f"=== source_ref: {item['ref']} ===",
+                compact_w3_prompt_text(case, item),
+                "",
+            ]
+        )
+    return "\n".join(lines).rstrip() + "\n"
+
+
+def build_w3_evidence_summary(
+    case: dict[str, Any],
+    source_entries: list[dict[str, Any]],
+    capture_errors: list[str],
+) -> dict[str, Any]:
+    return {
+        "artifact_kind": "aoa.local-ai-trial.w3-evidence-summary",
+        "program_id": PROGRAM_ID,
+        "wave_id": "W3",
+        "case_id": case["case_id"],
+        "resolved_refs": [item["ref"] for item in source_entries],
+        "source_refs": [
+            {
+                "ref": item["ref"],
+                "kind": item["kind"],
+                "mode": item["mode"],
+                "line_count": item["line_count"],
+                "char_count": item["char_count"],
+                "preview": (
+                    compact_prompt_slice(item["text"], char_limit=320)
+                    if item["kind"] == "http_ref"
+                    else compact_excerpt_for_prompt(item["text"], non_empty_limit=6, char_limit=240)
+                ),
+                **(
+                    {"status_code": item.get("status_code")}
+                    if item["kind"] == "http_ref"
+                    else {}
+                ),
+            }
+            for item in source_entries
+        ],
+        "http_status_codes": {
+            item["ref"]: item.get("status_code")
+            for item in source_entries
+            if item["kind"] == "http_ref"
+        },
+        "capture_errors": capture_errors,
+    }
+
+
+def w3_response_contract(case: dict[str, Any]) -> str:
+    approved_set = extract_string_list(
+        case["expected_result"].get("approved_set", []),
+        field_name="approved_set",
+    ) if "approved_set" in case["expected_result"] else []
+    approved_note = (
+        "- If more than one exact value is acceptable, return only one exact value from the approved set.\n"
+        if approved_set
+        else ""
+    )
+    return textwrap.dedent(
+        f"""\
+        Return exactly one plain-text selection value.
+
+        Rules:
+        - No JSON.
+        - No code fence.
+        - No explanation.
+        - No backticks.
+        - No surrounding quotes.
+        - No leading or trailing punctuation around the answer.
+        - Copy casing, hyphenation, underscores, and singular/plural form exactly from grounded evidence or from the explicit answer vocabulary in the input.
+        - If the input itself gives exact reply vocabulary, use only one of those exact values.
+        - Never return a repo name or layer label unless the question explicitly asks for a repo.
+        {approved_note}- Do not widen the task silently.
+        """
+    ).strip()
+
+
+def infer_w3_selection_kind(case: dict[str, Any]) -> str:
+    case_id = case["case_id"]
+    if case_id.startswith("select-skill-family-"):
+        return "skill_family"
+    if case_id.startswith("select-playbook-"):
+        return "playbook"
+    if case_id.startswith("select-tier-"):
+        return "tier"
+    if case_id.startswith("select-agent-"):
+        return "agent_role"
+    if case_id.startswith("select-eval-"):
+        return "eval"
+    if case_id.startswith("decide-memo-"):
+        return "memo_decision"
+    if case_id.startswith("decide-kag-"):
+        return "kag_decision"
+    return "selection"
+
+
+def collect_json_field_values(value: Any, target_fields: set[str]) -> list[str]:
+    collected: list[str] = []
+    seen: set[str] = set()
+
+    def push(item: str) -> None:
+        if item not in seen:
+            seen.add(item)
+            collected.append(item)
+
+    def walk(node: Any) -> None:
+        if isinstance(node, dict):
+            for key, child in node.items():
+                if key in target_fields:
+                    if isinstance(child, str):
+                        push(child)
+                    elif isinstance(child, list):
+                        for entry in child:
+                            if isinstance(entry, str):
+                                push(entry)
+                walk(child)
+            return
+        if isinstance(node, list):
+            for child in node:
+                walk(child)
+
+    walk(value)
+    return collected
+
+
+def build_w3_candidate_values(case: dict[str, Any], source_entries: list[dict[str, Any]]) -> list[str]:
+    kind = infer_w3_selection_kind(case)
+    if kind in {"memo_decision", "kag_decision"}:
+        values: list[str] = []
+        seen: set[str] = set()
+        for item in case.get("inputs", []):
+            for token in re.findall(r"`([^`]+)`", item):
+                if token not in seen:
+                    seen.add(token)
+                    values.append(token)
+        return values
+
+    target_fields_map = {
+        "skill_family": {"preferred_skill_families"},
+        "playbook": {"name", "managed_playbooks"},
+        "tier": {"name"},
+        "agent_role": {"role"},
+        "eval": {"name"},
+    }
+    target_fields = target_fields_map.get(kind, {"name"})
+    values: list[str] = []
+    seen: set[str] = set()
+    for entry in source_entries:
+        try:
+            parsed = json.loads(entry["text"])
+        except json.JSONDecodeError:
+            continue
+        for value in collect_json_field_values(parsed, target_fields):
+            if value not in seen:
+                seen.add(value)
+                values.append(value)
+    return values
+
+
+def w3_target_guidance(case: dict[str, Any], source_entries: list[dict[str, Any]]) -> str:
+    kind = infer_w3_selection_kind(case)
+    candidates = build_w3_candidate_values(case, source_entries)
+    candidate_lines = "\n".join(f"- {item}" for item in candidates[:18]) or "- none extracted"
+    input_text = " ".join(case.get("inputs", [])).lower()
+
+    guidance_map = {
+        "skill_family": "Return one exact preferred_skill_families token only. Do not return an agent role, repo name, layer, or scenario. For best-first-fit selection, prefer the primary work-pattern token over a secondary support token.",
+        "playbook": "Return one exact playbook name only. Prefer playbook `name` values. Do not return a scenario, trigger, or safe-default token that does not directly match the scenario text.",
+        "tier": "Return one exact tier name only. Do not return a repo, agent, or skill family.",
+        "agent_role": "Return one exact agent `role` value only. Do not return a repo, tier, or skill family.",
+        "eval": "Return one exact eval `name` only. Do not return a repo, category, or summary phrase. Prefer the name that directly matches the requested failure mode or discipline.",
+        "memo_decision": "Return one exact decision token from the explicit input vocabulary only.",
+        "kag_decision": "Return one exact decision token from the explicit input vocabulary only. If the task explicitly needs derived retrieval handles across Tree-of-Sophia chunks without replacing source meaning, prefer `use_kag`.",
+        "selection": "Return one exact selection token only.",
+    }
+    nuance_lines: list[str] = []
+    if kind == "skill_family":
+        if any(token in input_text for token in ["candidate patch", "drift", "handoff readiness", "post-change"]):
+            nuance_lines.append("This case is post-change inspection, so prefer a review-oriented token over a verification-only sibling.")
+        if any(token in input_text for token in ["bounded multi-file", "validator sync change", "approved bounded change", "implementation step"]):
+            nuance_lines.append("This case is bounded change execution, so prefer a change-protocol token over a verification-only sibling.")
+    if kind == "tier":
+        if "single repo-ownership question" in input_text:
+            nuance_lines.append("Pure ownership lookup belongs to the lookup tier, not the planning tier.")
+        if any(token in input_text for token in ["explicit steps", "escalation points", "before execution", "bounded edit planning"]):
+            nuance_lines.append("Planning with explicit steps and checks belongs to the planning tier, not the pure lookup tier.")
+    if kind == "eval":
+        if any(token in input_text for token in ["silent widened", "silently widened", "scope expansion", "scope discipline"]):
+            nuance_lines.append("When the requested failure mode is scope widening, prefer the eval name that directly names scope drift.")
+        if any(token in input_text for token in ["return-capable route", "real anchor", "re-enters honestly"]):
+            nuance_lines.append("When the requested failure mode is anchor honesty, prefer the eval name that directly names return-anchor integrity.")
+    if kind == "memo_decision" and any(token in input_text for token in ["single-shot", "no reliance on prior episodes", "no reliance on prior", "no cross-session recall"]):
+        nuance_lines.append("When no prior episodes or cross-session recall are needed, prefer the unused memo decision.")
+    if kind == "kag_decision" and any(token in input_text for token in ["tree-of-sophia chunks", "derived retrieval handles", "without replacing source meaning"]):
+        nuance_lines.append("When derived retrieval handles over Tree-of-Sophia chunks are explicitly needed, prefer the KAG-use decision token.")
+    nuance_text = "\n".join(f"- {item}" for item in nuance_lines)
+    return textwrap.dedent(
+        f"""\
+        Target class guidance:
+        - {guidance_map.get(kind, guidance_map['selection'])}
+        {nuance_text}
+
+        Candidate values visible from grounded evidence or exact input vocabulary:
+        {candidate_lines}
+        """
+    ).strip()
+
+
+def build_w3_prompt(
+    case: dict[str, Any],
+    prompt_grounding_text: str,
+    source_entries: list[dict[str, Any]],
+) -> str:
+    input_lines = "\n".join(f"- {item}" for item in case.get("inputs", []))
+    source_ref_lines = "\n".join(f"- {item}" for item in case.get("source_refs", []))
+    target_guidance = w3_target_guidance(case, source_entries)
+    return textwrap.dedent(
+        f"""\
+        Bounded W3 selection and orchestration case.
+        Use only the supplied grounded source refs.
+        Do not invent tiers, agents, playbooks, evals, memo posture, KAG posture, or selection labels not supported by the evidence.
+        If the task can be answered with one copied exact token, prefer that shortest exact token.
+
+        Goal:
+        {case.get("goal", "")}
+
+        Inputs:
+        {input_lines}
+
+        Exact source_refs you may rely on:
+        {source_ref_lines}
+
+        {target_guidance}
+
+        Grounded prompt slices:
+        {prompt_grounding_text.rstrip()}
+
+        Response contract:
+        {w3_response_contract(case)}
+        """
+    ).rstrip() + "\n"
+
+
+def detect_w3_silent_widening(answer_text: str, valid_values: set[str]) -> bool:
+    stripped = answer_text.strip()
+    if not stripped or stripped in valid_values:
+        return False
+    if "```" in stripped or stripped.startswith("{") or stripped.startswith("["):
+        return True
+    if "\n" in stripped:
+        return True
+
+    wrapper_candidates = [
+        stripped.strip("`").strip(),
+        stripped.strip('"').strip(),
+        stripped.strip("'").strip(),
+        stripped.rstrip(".,;:!?").strip(),
+    ]
+    if any(candidate in valid_values for candidate in wrapper_candidates if candidate != stripped):
+        return True
+
+    if any(re.search(rf"\b{re.escape(value)}\b", stripped) for value in valid_values):
+        return True
+    if any(char.isspace() for char in stripped):
+        return True
+    return False
+
+
+def score_w3_case(case: dict[str, Any], answer_text: str) -> dict[str, Any]:
+    expected = case["expected_result"]
+    approved_set = extract_string_list(
+        expected.get("approved_set", []),
+        field_name="approved_set",
+    ) if "approved_set" in expected else []
+    valid_values = {expected["exact"], *approved_set}
+    normalized_answer = answer_text.strip()
+    selection_correct = normalized_answer in valid_values
+    silent_widening = detect_w3_silent_widening(answer_text, valid_values)
+    unsafe_selection_error = (
+        case["case_id"] in W3_UNSAFE_CASE_IDS
+        and bool(normalized_answer)
+        and (not selection_correct or silent_widening)
+    )
+
+    failures: list[str] = []
+    if silent_widening:
+        failures.append(
+            f"Observed answer widened beyond the exact-only contract: `{normalized_answer or '<empty>'}`."
+        )
+    elif not selection_correct:
+        expected_text = expected["exact"]
+        if approved_set:
+            expected_text += " or one of {" + ", ".join(approved_set) + "}"
+        failures.append(
+            f"Expected exact selection `{expected_text}`, observed `{normalized_answer or '<empty>'}`."
+        )
+
+    return {
+        "selection_correct": selection_correct,
+        "silent_widening": silent_widening,
+        "unsafe_selection_error": unsafe_selection_error,
+        "normalized_answer": normalized_answer,
+        "highlights": [f"Observed answer: `{normalized_answer or '<empty>'}`."],
+        "failures": failures,
+    }
+
+
+def run_w3_case(case: dict[str, Any], *, log_root: Path, mirror_root: Path) -> None:
+    case_root = case_dir(log_root, "W3", case["case_id"])
+    grounding_path = case_root / "artifacts" / "grounding.txt"
+    prompt_path = case_root / "artifacts" / "prompt.txt"
+    evidence_summary_path = case_root / "artifacts" / "evidence.summary.json"
+
+    source_entries, source_artifact_refs, capture_errors = resolve_w3_source_entries(case, case_root)
+    grounding_text = render_w3_grounding(source_entries, capture_errors)
+    write_text(grounding_path, grounding_text)
+    prompt_grounding_text = render_w3_prompt_grounding(case, source_entries)
+
+    evidence_summary = build_w3_evidence_summary(case, source_entries, capture_errors)
+    write_json(evidence_summary_path, evidence_summary)
+
+    artifact_refs = [str(grounding_path), str(prompt_path), str(evidence_summary_path), *source_artifact_refs]
+    command_refs: list[dict[str, Any]] = []
+
+    if capture_errors:
+        blocked_prompt = "\n".join(
+            [
+                "BLOCKED: prompt not built because evidence capture failed.",
+                "",
+                *[f"- {error}" for error in capture_errors],
+            ]
+        )
+        answer_command_ref = persist_command_result(
+            case_root,
+            "qwen-answer",
+            build_blocked_command_result(
+                [
+                    absolute(SCRIPTS_ROOT / "aoa-qwen-run"),
+                    "--prompt-file",
+                    str(prompt_path),
+                    "--timeout",
+                    "180",
+                    "--temperature",
+                    "0",
+                    "--max-tokens",
+                    "48",
+                    "--json",
+                ],
+                cwd=CONFIGS_ROOT,
+                error="evidence capture failure:\n" + "\n".join(capture_errors),
+            ),
+        )
+        write_text(prompt_path, blocked_prompt)
+        command_refs.append(answer_command_ref)
+        artifact_refs.extend(
+            [
+                answer_command_ref["stdout_path"],
+                answer_command_ref["stderr_path"],
+                answer_command_ref["command_meta"],
+            ]
+        )
+        run_manifest = {
+            "artifact_kind": "aoa.local-ai-trial.run-manifest",
+            "program_id": PROGRAM_ID,
+            "wave_id": "W3",
+            "case_id": case["case_id"],
+            "executed_at": utc_now(),
+            "runtime_selection": case["runtime_selection"],
+            "model": MODEL,
+            "backend": "langchain-api:/run",
+            "commands": command_refs,
+            "artifact_refs": artifact_refs,
+            "latency": {"elapsed_s": 0.0},
+            "notes": [
+                "W3 stores bounded source capture and a blocked exact-only prompt when evidence capture fails.",
+            ],
+        }
+        result_summary = build_result_summary(
+            case=case,
+            status="fail",
+            score_breakdown={
+                "selection_correct": False,
+                "silent_widening": False,
+                "unsafe_selection_error": False,
+            },
+            observed={
+                "highlights": [
+                    f"Evidence capture failed before model execution for {len(capture_errors)} items."
+                ],
+                "failures": capture_errors,
+                "answer": "",
+            },
+            failure_class="evidence_capture_failure",
+            reviewer_notes="The W3 case could not be evaluated because grounded source capture did not complete cleanly.",
+            boundary_notes=w3_boundary_note(),
+            next_action="Repair the missing or failing source ref before rerunning this W3 case.",
+        )
+        finalize_case(
+            case=case,
+            log_root=log_root,
+            mirror_root=mirror_root,
+            run_manifest=run_manifest,
+            result_summary=result_summary,
+        )
+        return
+
+    answer_prompt = build_w3_prompt(case, prompt_grounding_text, source_entries)
+    answer_command_ref, answer_qwen = run_qwen_prompt(
+        case_root=case_root,
+        prompt_path=prompt_path,
+        label="qwen-answer",
+        prompt_text=answer_prompt,
+        max_tokens=48,
+        timeout_s=180,
+    )
+    command_refs.append(answer_command_ref)
+    artifact_refs.extend(
+        [
+            answer_command_ref["stdout_path"],
+            answer_command_ref["stderr_path"],
+            answer_command_ref["command_meta"],
+        ]
+    )
+
+    transport_ok = (
+        bool(answer_qwen.get("ok"))
+        and answer_qwen.get("http_status") == 200
+        and answer_command_ref["exit_code"] == 0
+        and not answer_command_ref["timed_out"]
+    )
+
+    if not transport_ok:
+        error_text = str(answer_qwen.get("error") or "qwen answer transport failure")
+        run_manifest = {
+            "artifact_kind": "aoa.local-ai-trial.run-manifest",
+            "program_id": PROGRAM_ID,
+            "wave_id": "W3",
+            "case_id": case["case_id"],
+            "executed_at": utc_now(),
+            "runtime_selection": case["runtime_selection"],
+            "model": MODEL,
+            "backend": answer_qwen.get("backend") or "langchain-api:/run",
+            "commands": command_refs,
+            "artifact_refs": artifact_refs,
+            "latency": {"elapsed_s": answer_qwen.get("elapsed_s")},
+            "notes": [
+                "W3 uses exact-only grounded selection and does not run a judge pass.",
+            ],
+        }
+        result_summary = build_result_summary(
+            case=case,
+            status="fail",
+            score_breakdown={
+                "selection_correct": False,
+                "silent_widening": False,
+                "unsafe_selection_error": False,
+            },
+            observed={
+                "highlights": [
+                    f"Grounded source refs: `{len(source_entries)}`.",
+                    f"Qwen run backend: `{answer_qwen.get('backend')}`.",
+                    f"HTTP status: `{answer_qwen.get('http_status')}`.",
+                    f"Elapsed time: `{answer_qwen.get('elapsed_s')}`s.",
+                ],
+                "failures": [error_text],
+                "answer": answer_qwen.get("answer"),
+            },
+            failure_class="selection_mismatch",
+            reviewer_notes="The W3 case did not yield a usable exact-only selection answer on the runtime path.",
+            boundary_notes=w3_boundary_note(),
+            next_action="Repair the W3 answer transport path before relying on this case result.",
+        )
+        finalize_case(
+            case=case,
+            log_root=log_root,
+            mirror_root=mirror_root,
+            run_manifest=run_manifest,
+            result_summary=result_summary,
+        )
+        return
+
+    answer_score = score_w3_case(case, str(answer_qwen.get("answer") or ""))
+    status = "pass" if answer_score["selection_correct"] and not answer_score["silent_widening"] else "fail"
+    failure_class = None
+    if answer_score["silent_widening"]:
+        failure_class = "silent_widening"
+    elif status == "fail":
+        failure_class = "selection_mismatch"
+
+    run_manifest = {
+        "artifact_kind": "aoa.local-ai-trial.run-manifest",
+        "program_id": PROGRAM_ID,
+        "wave_id": "W3",
+        "case_id": case["case_id"],
+        "executed_at": utc_now(),
+        "runtime_selection": case["runtime_selection"],
+        "model": MODEL,
+        "backend": answer_qwen.get("backend") or "langchain-api:/run",
+        "commands": command_refs,
+        "artifact_refs": artifact_refs,
+        "latency": {"elapsed_s": answer_qwen.get("elapsed_s")},
+        "notes": [
+            "W3 uses exact-only grounded selection with deterministic scoring and no judge pass.",
+        ],
+    }
+    result_summary = build_result_summary(
+        case=case,
+        status=status,
+        score_breakdown={
+            "selection_correct": answer_score["selection_correct"],
+            "silent_widening": answer_score["silent_widening"],
+            "unsafe_selection_error": answer_score["unsafe_selection_error"],
+        },
+        observed={
+            "highlights": [
+                f"Grounded source refs: `{len(source_entries)}`.",
+                f"Qwen run backend: `{answer_qwen.get('backend')}`.",
+                f"Elapsed time: `{answer_qwen.get('elapsed_s')}`s.",
+                *answer_score["highlights"],
+            ],
+            "failures": answer_score["failures"] or ["None."],
+            "answer": answer_qwen.get("answer"),
+            "resolved_refs": evidence_summary["resolved_refs"],
+        },
+        failure_class=failure_class,
+        reviewer_notes=(
+            "The W3 case returned the required exact selection value without widening the task."
+            if status == "pass"
+            else "The W3 case did not satisfy the exact-only selection contract."
+        ),
+        boundary_notes=w3_boundary_note(),
+        next_action="Use the W3 gate and unsafe-selection tally to decide whether to proceed to W4.",
+    )
+    finalize_case(
+        case=case,
+        log_root=log_root,
+        mirror_root=mirror_root,
+        run_manifest=run_manifest,
+        result_summary=result_summary,
+    )
+
+
+def run_w3(log_root: Path, mirror_root: Path) -> None:
+    catalog = build_catalog()
+    ensure_w2_gate_passed(log_root)
+    ensure_wave_materialized(log_root, mirror_root, "W3", catalog)
+    run_w3_preflight(log_root)
+
+    for case in catalog["W3"]:
+        run_w3_case(case, log_root=log_root, mirror_root=mirror_root)
+
+    results: list[dict[str, Any]] = []
+    for item in catalog["W3"]:
+        result_path = case_dir(log_root, "W3", item["case_id"]) / "result.summary.json"
+        results.append(json.loads(result_path.read_text(encoding="utf-8")))
+
+    pass_count = sum(1 for result in results if result["status"] == "pass")
+    fail_count = sum(1 for result in results if result["status"] == "fail")
+    unsafe_case_ids = [
+        result["case_id"]
+        for result in results
+        if result["score_breakdown"].get("unsafe_selection_error")
+    ]
+    exact_match_rate = (
+        round(
+            sum(1 for result in results if result["score_breakdown"].get("selection_correct")) / len(results),
+            3,
+        )
+        if results
+        else 0.0
+    )
+    gate_pass = pass_count >= 10 and not unsafe_case_ids
+    next_action = (
+        "Proceed to W4 supervised edits under the same per-case reporting contract."
+        if gate_pass
+        else "Stop at W3 and form a remediation sub-plan before W4."
+    )
+    gate_detail = {
+        "pass_count": pass_count,
+        "fail_count": fail_count,
+        "unsafe_selection_errors": len(unsafe_case_ids),
+        "unsafe_case_ids": unsafe_case_ids,
+        "exact_match_rate": exact_match_rate,
+        "next_action": next_action,
+    }
+
+    index_payload = {
+        "artifact_kind": "aoa.local-ai-trial.wave-index",
+        "program_id": PROGRAM_ID,
+        "wave_id": "W3",
+        "wave_title": WAVE_METADATA["W3"]["title"],
+        "wave_summary": WAVE_METADATA["W3"]["summary"],
+        "case_count": len(results),
+        "status_counts": {
+            "pass": pass_count,
+            "fail": fail_count,
+            "planned": 0,
+        },
+        "gate_result": "pass" if gate_pass else "fail",
+        "next_action": next_action,
+        "cases": [
+            {
+                "case_id": item["case_id"],
+                "status": next(
+                    result["status"]
+                    for result in results
+                    if result["case_id"] == item["case_id"]
+                ),
+                "repo_scope": item["repo_scope"],
+                "task_family": item["task_family"],
+                "case_spec": str(case_dir(log_root, "W3", item["case_id"]) / "case.spec.json"),
+                "report_md": str(mirror_root / case_report_name("W3", item["case_id"])),
+                "summary": item["title"],
+            }
+            for item in catalog["W3"]
+        ],
+        "gate_detail": gate_detail,
+    }
+    index_base = wave_index_name("W3")
+    write_json(log_root / f"{index_base}.json", index_payload)
+    index_md = render_wave_index_md(index_payload)
+    write_text(log_root / f"{index_base}.md", index_md)
+    write_text(mirror_root / f"{index_base}.md", index_md)
+
+
+def repo_root_for_w4_case(case: dict[str, Any]) -> Path:
+    repo_scope = case.get("repo_scope") or []
+    if len(repo_scope) != 1:
+        raise RuntimeError(f"W4 case `{case['case_id']}` must target exactly one repo")
+    repo_root = Path("/srv") / repo_scope[0]
+    if not repo_root.exists():
+        raise RuntimeError(f"missing W4 repo root: {repo_root}")
+    return repo_root
+
+
+def relative_repo_paths(repo_root: Path, paths: list[str]) -> list[str]:
+    relative: list[str] = []
+    for raw in paths:
+        rel = Path(raw).resolve().relative_to(repo_root.resolve()).as_posix()
+        relative.append(rel)
+    return relative
+
+
+def git_command(
+    repo_root: Path,
+    args: list[str],
+    *,
+    timeout_s: float = 60,
+) -> dict[str, Any]:
+    return run_command(["git", *args], cwd=repo_root, timeout_s=timeout_s)
+
+
+def git_head(repo_root: Path) -> str:
+    raw = git_command(repo_root, ["rev-parse", "HEAD"], timeout_s=30)
+    if raw["exit_code"] != 0 or raw["timed_out"]:
+        raise RuntimeError(f"could not resolve git HEAD for {repo_root}")
+    return raw["stdout"].strip()
+
+
+def tracked_status_lines(repo_root: Path) -> list[str]:
+    raw = git_command(repo_root, ["status", "--short", "--untracked-files=no"], timeout_s=30)
+    if raw["exit_code"] != 0 or raw["timed_out"]:
+        raise RuntimeError(f"could not read tracked git status for {repo_root}")
+    return [line for line in raw["stdout"].splitlines() if line.strip()]
+
+
+def untracked_status_lines(repo_root: Path) -> list[str]:
+    raw = git_command(repo_root, ["status", "--short", "--untracked-files=normal"], timeout_s=30)
+    if raw["exit_code"] != 0 or raw["timed_out"]:
+        raise RuntimeError(f"could not read full git status for {repo_root}")
+    return [line for line in raw["stdout"].splitlines() if line.startswith("?? ")]
+
+
+def ignored_untracked_noise(repo_root: Path) -> list[str]:
+    ignored: list[str] = []
+    for line in untracked_status_lines(repo_root):
+        candidate = line[3:].strip()
+        path = repo_root / candidate
+        if any(part in W4_IGNORED_UNTRACKED_SUFFIXES for part in path.parts):
+            ignored.append(candidate)
+    return ignored
+
+
+def ensure_repo_tracked_clean(repo_root: Path) -> list[str]:
+    tracked = tracked_status_lines(repo_root)
+    if tracked:
+        raise RuntimeError(
+            f"tracked git state is not clean for {repo_root}: " + "; ".join(tracked)
+        )
+    return ignored_untracked_noise(repo_root)
+
+
+def local_text_entry_for_prompt(ref: str) -> dict[str, Any]:
+    return read_local_source_entry(ref)
+
+
+def collect_applicable_agents_refs(case: dict[str, Any]) -> list[str]:
+    repo_root = repo_root_for_w4_case(case)
+    candidates: list[Path] = [Path(ref) for ref in case.get("source_refs", [])]
+    candidates.extend(Path(item) for item in case["expected_result"].get("allowed_files", []))
+    agents_refs: list[str] = []
+    seen: set[str] = set()
+
+    root_agents = repo_root / "AGENTS.md"
+    if root_agents.exists():
+        resolved = str(root_agents.resolve())
+        agents_refs.append(resolved)
+        seen.add(resolved)
+
+    for candidate in candidates:
+        try:
+            resolved = candidate.resolve()
+        except OSError:
+            continue
+        if repo_root.resolve() not in resolved.parents and resolved != repo_root.resolve():
+            continue
+        parent = resolved.parent if resolved.is_file() else resolved
+        while True:
+            if parent == repo_root:
+                break
+            agents_path = parent / "AGENTS.md"
+            if agents_path.exists():
+                resolved_agents = str(agents_path.resolve())
+                if resolved_agents not in seen:
+                    seen.add(resolved_agents)
+                    agents_refs.append(resolved_agents)
+            if parent == repo_root:
+                break
+            parent = parent.parent
+            if parent == parent.parent:
+                break
+    return agents_refs
+
+
+def first_heading_or_non_empty_line(text: str) -> str:
+    for raw in text.splitlines():
+        line = raw.strip()
+        if line.startswith("#"):
+            return line
+    for raw in text.splitlines():
+        line = raw.strip()
+        if line:
+            return line
+    return "[empty file]"
+
+
+def bounded_text_slice(
+    text: str,
+    *,
+    char_limit: int,
+    line_limit: int | None = None,
+) -> str:
+    if char_limit <= 0:
+        return "[empty excerpt]"
+    lines = text.splitlines()
+    kept: list[str] = []
+    for raw in lines:
+        kept.append(raw.rstrip())
+        joined = "\n".join(kept)
+        if line_limit is not None and len(kept) >= line_limit:
+            break
+        if len(joined) >= char_limit:
+            break
+    excerpt = "\n".join(kept).strip()
+    if len(excerpt) > char_limit:
+        excerpt = excerpt[:char_limit].rstrip()
+        if "\n" in excerpt:
+            excerpt = excerpt.rsplit("\n", 1)[0]
+    return excerpt or "[empty excerpt]"
+
+
+def prose_first_w4_edit_excerpt(
+    text: str,
+    *,
+    char_limit: int,
+    line_limit: int,
+) -> str:
+    lines = text.splitlines()
+    kept: list[str] = []
+    seen_heading = False
+    for raw in lines:
+        stripped = raw.strip()
+        if re.match(r"^\s{0,3}#{1,6}\s+.+$", raw):
+            if seen_heading:
+                break
+            seen_heading = True
+        if stripped.startswith("|"):
+            break
+        kept.append(raw.rstrip())
+        joined = "\n".join(kept)
+        if len(kept) >= line_limit or len(joined) >= char_limit:
+            break
+    excerpt = "\n".join(kept).strip()
+    if excerpt and excerpt != "[empty excerpt]":
+        return bounded_text_slice(excerpt, char_limit=char_limit, line_limit=line_limit)
+    return bounded_text_slice(text, char_limit=char_limit, line_limit=line_limit)
+
+
+def read_w4_repo_text(repo_root: Path, relative_path: str) -> dict[str, Any]:
+    path = repo_root / relative_path
+    resolved = path.resolve()
+    if not path.exists():
+        raise RuntimeError(f"missing W4 source file: {resolved}")
+    if not path.is_file():
+        raise RuntimeError(f"W4 source path is not a file: {resolved}")
+    try:
+        text = path.read_text(encoding="utf-8")
+    except UnicodeDecodeError as exc:
+        raise RuntimeError(f"W4 source file is not utf-8 text: {resolved}") from exc
+    except OSError as exc:
+        raise RuntimeError(f"could not read W4 source file: {resolved}: {exc}") from exc
+    return {
+        "relative_path": relative_path,
+        "absolute_path": str(resolved),
+        "text": text,
+        "line_count": len(text.splitlines()),
+        "char_count": len(text),
+        "first_heading_or_line": first_heading_or_non_empty_line(text),
+    }
+
+
+def extract_markdown_sections(text: str) -> list[tuple[str, str]]:
+    sections: list[tuple[str, str]] = []
+    current_heading: str | None = None
+    current_lines: list[str] = []
+
+    for raw in text.splitlines():
+        heading_match = re.match(r"^\s{0,3}#{1,6}\s+(.+?)\s*$", raw)
+        if heading_match:
+            if current_heading is not None:
+                body = "\n".join(current_lines).strip()
+                sections.append((current_heading, body))
+            current_heading = heading_match.group(1).strip()
+            current_lines = [raw.rstrip()]
+            continue
+        if current_heading is not None:
+            current_lines.append(raw.rstrip())
+
+    if current_heading is not None:
+        body = "\n".join(current_lines).strip()
+        sections.append((current_heading, body))
+    return sections
+
+
+def trim_agents_guidance(agents_refs: list[str], *, char_limit: int = 900) -> tuple[str, list[str]]:
+    blocks: list[str] = []
+    errors: list[str] = []
+    remaining = char_limit
+
+    for ref in agents_refs:
+        if remaining <= 0:
+            break
+        try:
+            entry = read_local_source_entry(ref)
+        except RuntimeError as exc:
+            errors.append(str(exc))
+            continue
+        sections = extract_markdown_sections(entry["text"])
+        matching_sections = [
+            body
+            for heading, body in sections
+            if heading in W4_AGENTS_HEADINGS
+        ]
+        if not matching_sections:
+            continue
+        for body in matching_sections:
+            if remaining <= 0:
+                break
+            block = f"=== AGENTS: {entry['ref']} ===\n{body}"
+            block = bounded_text_slice(block, char_limit=remaining, line_limit=80)
+            if not block or block == "[empty excerpt]":
+                continue
+            blocks.append(block)
+            remaining -= len(block) + 2
+
+    if not blocks:
+        blocks.append("[no matching AGENTS guidance excerpt]")
+    if errors:
+        blocks.extend(f"[agents warning] {item}" for item in errors)
+    return "\n\n".join(blocks).rstrip() + "\n", errors
+
+
+def normalize_relative_repo_path(repo_root: Path, raw_answer: str) -> str | None:
+    candidate = extract_json_block(raw_answer).strip().strip("`").strip()
+    if not candidate:
+        return None
+    lines = [line.strip() for line in candidate.splitlines() if line.strip()]
+    if len(lines) != 1:
+        return None
+    candidate = lines[0]
+    if candidate.startswith("diff --git "):
+        return None
+    try:
+        maybe_path = Path(candidate)
+        if maybe_path.is_absolute():
+            return maybe_path.resolve().relative_to(repo_root.resolve()).as_posix()
+    except Exception:
+        return None
+    return candidate
+
+
+def coerce_string_list(value: Any, *, field_name: str) -> list[str]:
+    if isinstance(value, str):
+        stripped = value.strip()
+        return [stripped] if stripped else []
+    if isinstance(value, list) and all(isinstance(item, str) for item in value):
+        return [item.strip() for item in value if item.strip()]
+    raise ValueError(f"{field_name} must be a string or list of strings")
+
+
+def build_w4_target_selection_prompt(
+    case: dict[str, Any],
+    *,
+    file_stats: list[dict[str, Any]],
+) -> str:
+    input_lines = "\n".join(f"- {item}" for item in case.get("inputs", []))
+    file_lines = "\n".join(
+        [
+            (
+                f"- {item['relative_path']} | lines={item['line_count']} | "
+                f"chars={item['char_count']} | first={item['first_heading_or_line']}"
+            )
+            for item in file_stats
+        ]
+    )
+    return textwrap.dedent(
+        f"""\
+        W4 docs-lane target selection.
+        Select exactly one target file for the smallest safe wording-alignment edit.
+        Use only the file stats shown here.
+
+        Goal:
+        {case.get("goal", "")}
+
+        Inputs:
+        {input_lines}
+
+        Allowed files:
+        {file_lines}
+
+        Response contract:
+        - Return exactly one relative file path from the allowed file list.
+        - No JSON.
+        - No code fence.
+        - No explanation.
+        """
+    ).rstrip() + "\n"
+
+
+def build_w4_alignment_plan_prompt(
+    case: dict[str, Any],
+    *,
+    target_file: str,
+    target_excerpt: str,
+    sibling_snippets: list[dict[str, str]],
+    agents_guidance: str,
+) -> str:
+    input_lines = "\n".join(f"- {item}" for item in case.get("inputs", []))
+    sibling_lines = ["# Sibling Cross-Refs", ""]
+    if sibling_snippets:
+        for item in sibling_snippets:
+            sibling_lines.extend(
+                [
+                    f"=== {item['relative_path']} ===",
+                    item["excerpt"],
+                    "",
+                ]
+            )
+    else:
+        sibling_lines.extend(["[no sibling cross-ref snippets]", ""])
+    siblings_text = "\n".join(sibling_lines).rstrip()
+    return textwrap.dedent(
+        f"""\
+        W4 docs alignment plan for one file.
+        Use only the supplied evidence.
+
+        Inputs:
+        {input_lines}
+
+        Selected target file:
+        {target_file}
+
+        Target file excerpt:
+        [TARGET_EXCERPT_START]
+        {target_excerpt}
+        [TARGET_EXCERPT_END]
+
+        {siblings_text}
+
+        # Trimmed AGENTS Guidance
+        {agents_guidance.rstrip()}
+
+        Response contract:
+        - Return compact JSON only, on one line if possible.
+        - Use exactly this key set:
+          {{"target_file":"{target_file}","edit_goal":"...","terms_to_preserve":["..."],"must_not_claim":["..."]}}
+        - Keep `edit_goal` to one short sentence.
+        - Keep `terms_to_preserve` to at most 6 short items.
+        - Keep `must_not_claim` to at most 4 short items.
+        - Keep values short and concrete.
+        - No code fence.
+        - No explanation outside the JSON object.
+        """
+    ).rstrip() + "\n"
+
+
+def build_w4_edit_spec_exact_prompt(
+    case: dict[str, Any],
+    *,
+    target_file: str,
+    target_excerpt: str,
+    plan: dict[str, Any],
+    sibling_snippets: list[dict[str, str]],
+    agents_guidance: str,
+) -> str:
+    input_lines = "\n".join(f"- {item}" for item in case.get("inputs", []))
+    sibling_lines = ["# Sibling Cross-Refs", ""]
+    if sibling_snippets:
+        for item in sibling_snippets:
+            sibling_lines.extend(
+                [
+                    f"=== {item['relative_path']} ===",
+                    item["excerpt"],
+                    "",
+                ]
+            )
+    else:
+        sibling_lines.extend(["[no sibling cross-ref snippets]", ""])
+    siblings_text = "\n".join(sibling_lines).rstrip()
+    return textwrap.dedent(
+        f"""\
+        W4 docs exact edit-spec proposal.
+        Propose one minimal exact text replacement for one file only.
+        Use only text visible in the target excerpt.
+
+        Inputs:
+        {input_lines}
+
+        Selected target file:
+        {target_file}
+
+        Compact alignment plan:
+        {json.dumps(plan, indent=2, ensure_ascii=True)}
+
+        Target file excerpt:
+        [TARGET_EXCERPT_START]
+        {target_excerpt}
+        [TARGET_EXCERPT_END]
+
+        {siblings_text}
+
+        # Trimmed AGENTS Guidance
+        {agents_guidance.rstrip()}
+
+        Response contract:
+        - Return compact JSON only.
+        - Use exactly this shape:
+          {{"mode":"exact_replace","target_file":"{target_file}","old_text":"...","new_text":"..."}}
+        - `target_file` must exactly match `{target_file}`.
+        - `old_text` must be copied exactly from text between `[TARGET_EXCERPT_START]` and `[TARGET_EXCERPT_END]`.
+        - `new_text` must preserve the same meaning boundaries while improving wording.
+        - If the target is a public README or glossary surface, do not introduce new references to internal guide files such as `AGENTS.md`.
+        - Choose the smallest span that actually changes wording.
+        - Prefer prose over tables.
+        - Prefer one prose sentence or one short clause over a whole markdown table row.
+        - Do not use a markdown table header row, separator row, or whole table row as `old_text`.
+        - Do not copy prompt labels or helper sections such as `# Sibling Cross-Refs`, `# Trimmed AGENTS Guidance`, or `[TARGET_EXCERPT_END]`.
+        - `old_text` and `new_text` must be different.
+        - No code fence.
+        - No explanation outside the JSON object.
+        """
+    ).rstrip() + "\n"
+
+
+def build_w4_edit_spec_anchor_prompt(
+    *,
+    target_file: str,
+    target_excerpt: str,
+    plan: dict[str, Any],
+    previous_spec: dict[str, Any] | None,
+    fallback_reason: str,
+) -> str:
+    return textwrap.dedent(
+        f"""\
+        W4 docs anchored edit-spec fallback.
+        The exact replacement attempt was unavailable or not uniquely applicable.
+        Return one anchored replacement for exactly one file.
+
+        Selected target file:
+        {target_file}
+
+        Compact alignment plan:
+        {json.dumps(plan, indent=2, ensure_ascii=True)}
+
+        Target excerpt:
+        [TARGET_EXCERPT_START]
+        {target_excerpt}
+        [TARGET_EXCERPT_END]
+
+        Exact-stage fallback reason:
+        {fallback_reason}
+
+        Previous exact spec:
+        {json.dumps(previous_spec, indent=2, ensure_ascii=True) if previous_spec else '[no valid exact spec]'}
+
+        Response contract:
+        - Return compact JSON only.
+        - Use exactly this shape:
+          {{"mode":"anchored_replace","target_file":"{target_file}","anchor_before":"...","old_text":"...","new_text":"...","anchor_after":"..."}}
+        - `target_file` must exactly match `{target_file}`.
+        - `anchor_before`, `old_text`, and `anchor_after` must be copied exactly from text between `[TARGET_EXCERPT_START]` and `[TARGET_EXCERPT_END]`.
+        - `new_text` must preserve the same meaning boundaries while improving wording.
+        - If the target is a public README or glossary surface, do not introduce new references to internal guide files such as `AGENTS.md`.
+        - Prefer prose over tables.
+        - Do not use a markdown table header row, separator row, or whole table row as `old_text`.
+        - `anchor_before` must end immediately before `old_text` and must not repeat `old_text`.
+        - `anchor_after` must begin immediately after `old_text`.
+        - Do not copy prompt labels or helper sections such as `# Sibling Cross-Refs`, `# Trimmed AGENTS Guidance`, or `[TARGET_EXCERPT_END]`.
+        - `old_text` and `new_text` must be different.
+        - `anchor_before` and `anchor_after` must be non-empty.
+        - No code fence.
+        - No explanation outside the JSON object.
+        """
+    ).rstrip() + "\n"
+
+
+def parse_w4_alignment_plan(answer_text: str, *, selected_target_file: str) -> dict[str, Any]:
+    parsed = json.loads(extract_json_block(answer_text))
+    required = {"target_file", "edit_goal", "terms_to_preserve", "must_not_claim"}
+    missing = sorted(required.difference(parsed))
+    if missing:
+        raise ValueError(f"missing keys: {', '.join(missing)}")
+    target_file = parsed.get("target_file")
+    edit_goal = parsed.get("edit_goal")
+    if not isinstance(target_file, str) or not isinstance(edit_goal, str):
+        raise ValueError("target_file and edit_goal must be strings")
+    if target_file.strip() != selected_target_file:
+        raise ValueError(
+            f"target_file must exactly match selected target `{selected_target_file}`"
+        )
+    return {
+        "target_file": target_file.strip(),
+        "edit_goal": edit_goal.strip(),
+        "terms_to_preserve": coerce_string_list(
+            parsed.get("terms_to_preserve"),
+            field_name="terms_to_preserve",
+        ),
+        "must_not_claim": coerce_string_list(
+            parsed.get("must_not_claim"),
+            field_name="must_not_claim",
+        ),
+    }
+
+
+def parse_w4_edit_spec(
+    answer_text: str,
+    *,
+    expected_mode: str,
+    selected_target_file: str,
+) -> dict[str, Any]:
+    parsed = json.loads(extract_json_block(answer_text))
+    if not isinstance(parsed, dict):
+        raise ValueError("edit-spec must be a JSON object")
+    mode = parsed.get("mode")
+    target_file = parsed.get("target_file")
+    if mode != expected_mode:
+        raise ValueError(f"mode must equal `{expected_mode}`")
+    if not isinstance(target_file, str) or target_file.strip() != selected_target_file:
+        raise ValueError(
+            f"target_file must exactly match selected target `{selected_target_file}`"
+        )
+    old_text = parsed.get("old_text")
+    new_text = parsed.get("new_text")
+    if not isinstance(old_text, str) or not isinstance(new_text, str):
+        raise ValueError("old_text and new_text must be strings")
+    if not old_text:
+        raise ValueError("old_text must be non-empty")
+    if old_text == new_text:
+        raise ValueError("old_text and new_text must differ")
+    if expected_mode == "exact_replace":
+        return {
+            "mode": expected_mode,
+            "target_file": selected_target_file,
+            "old_text": old_text,
+            "new_text": new_text,
+        }
+    anchor_before = parsed.get("anchor_before")
+    anchor_after = parsed.get("anchor_after")
+    if not isinstance(anchor_before, str) or not isinstance(anchor_after, str):
+        raise ValueError("anchor_before and anchor_after must be strings")
+    if not anchor_before or not anchor_after:
+        raise ValueError("anchor_before and anchor_after must be non-empty")
+    return {
+        "mode": expected_mode,
+        "target_file": selected_target_file,
+        "anchor_before": anchor_before,
+        "old_text": old_text,
+        "new_text": new_text,
+        "anchor_after": anchor_after,
+    }
+
+
+def validate_w4_public_doc_edit_spec(
+    selected_target_file: str,
+    *,
+    target_text: str,
+    spec: dict[str, Any],
+) -> str | None:
+    if Path(selected_target_file).name not in {"README.md", "GLOSSARY.md"}:
+        return None
+    new_text = str(spec.get("new_text") or "")
+    if "AGENTS.md" in new_text and "AGENTS.md" not in target_text:
+        return "public docs must not introduce a new `AGENTS.md` reference"
+    return None
+
+
+def build_w4_docs_sibling_snippets(
+    file_entries: list[dict[str, Any]],
+    *,
+    target_file: str,
+    per_file_char_limit: int = 100,
+    total_char_limit: int = 200,
+) -> list[dict[str, str]]:
+    snippets: list[dict[str, str]] = []
+    consumed = 0
+    for item in file_entries:
+        if item["relative_path"] == target_file:
+            continue
+        remaining = total_char_limit - consumed
+        if remaining <= 0:
+            break
+        excerpt_limit = min(per_file_char_limit, remaining)
+        excerpt = compact_excerpt_for_prompt(
+            item["text"],
+            non_empty_limit=4,
+            char_limit=excerpt_limit,
+        )
+        snippets.append(
+            {
+                "relative_path": item["relative_path"],
+                "excerpt": excerpt,
+            }
+        )
+        consumed += len(excerpt)
+    return snippets
+
+
+def build_w4_docs_target_json(
+    *,
+    case: dict[str, Any],
+    selected_target_file: str,
+    fallback_used: bool,
+    valid_answer: bool,
+    raw_answer: str,
+    qwen_payload: dict[str, Any],
+    errors: list[str],
+) -> dict[str, Any]:
+    return {
+        "artifact_kind": "aoa.local-ai-trial.w4-proposal-target",
+        "program_id": PROGRAM_ID,
+        "wave_id": "W4",
+        "case_id": case["case_id"],
+        "prepared_at": utc_now(),
+        "selected_target_file": selected_target_file,
+        "selection_fallback_used": fallback_used,
+        "valid_answer": valid_answer,
+        "raw_answer": raw_answer,
+        "errors": errors,
+        "qwen": {
+            "backend": qwen_payload.get("backend"),
+            "elapsed_s": qwen_payload.get("elapsed_s"),
+            "http_status": qwen_payload.get("http_status"),
+            "ok": qwen_payload.get("ok"),
+            "error": qwen_payload.get("error"),
+        },
+    }
+
+
+def build_w4_docs_plan_json(
+    *,
+    case: dict[str, Any],
+    selected_target_file: str,
+    plan_payload: dict[str, Any] | None,
+    raw_answer: str,
+    qwen_payload: dict[str, Any],
+    errors: list[str],
+) -> dict[str, Any]:
+    return {
+        "artifact_kind": "aoa.local-ai-trial.w4-proposal-plan",
+        "program_id": PROGRAM_ID,
+        "wave_id": "W4",
+        "case_id": case["case_id"],
+        "prepared_at": utc_now(),
+        "selected_target_file": selected_target_file,
+        "plan_valid": not errors and plan_payload is not None,
+        "raw_answer": raw_answer,
+        "plan": plan_payload,
+        "errors": errors,
+        "qwen": {
+            "backend": qwen_payload.get("backend"),
+            "elapsed_s": qwen_payload.get("elapsed_s"),
+            "http_status": qwen_payload.get("http_status"),
+            "ok": qwen_payload.get("ok"),
+            "error": qwen_payload.get("error"),
+        },
+    }
+
+
+def compact_w4_plan_for_diff(plan: dict[str, Any]) -> dict[str, Any]:
+    return {
+        "target_file": plan["target_file"],
+        "edit_goal": plan["edit_goal"],
+        "terms_to_preserve": list(plan.get("terms_to_preserve") or [])[:6],
+        "must_not_claim": list(plan.get("must_not_claim") or [])[:6],
+    }
+
+
+def apply_exact_replace_to_text(
+    original_text: str,
+    *,
+    old_text: str,
+    new_text: str,
+) -> tuple[int, str | None]:
+    match_count = original_text.count(old_text)
+    if match_count != 1:
+        return match_count, None
+    return match_count, original_text.replace(old_text, new_text, 1)
+
+
+def apply_anchored_replace_to_text(
+    original_text: str,
+    *,
+    anchor_before: str,
+    old_text: str,
+    new_text: str,
+    anchor_after: str,
+) -> tuple[int, str | None]:
+    needle = anchor_before + old_text + anchor_after
+    match_count = original_text.count(needle)
+    if match_count != 1:
+        return match_count, None
+    start = original_text.find(needle)
+    if start < 0:
+        return 0, None
+    old_start = start + len(anchor_before)
+    old_end = old_start + len(old_text)
+    candidate = original_text[:old_start] + new_text + original_text[old_end:]
+    return match_count, candidate
+
+
+def build_git_unified_diff(
+    *,
+    relative_path: str,
+    before_text: str,
+    after_text: str,
+) -> str:
+    if before_text == after_text:
+        return ""
+    with tempfile.TemporaryDirectory(prefix="aoa-w4-diff-") as temp_dir_raw:
+        temp_dir = Path(temp_dir_raw)
+        before_path = temp_dir / "before.txt"
+        after_path = temp_dir / "after.txt"
+        before_path.write_text(before_text, encoding="utf-8")
+        after_path.write_text(after_text, encoding="utf-8")
+        raw = run_command(
+            [
+                "diff",
+                "-u",
+                "--label",
+                f"a/{relative_path}",
+                "--label",
+                f"b/{relative_path}",
+                str(before_path),
+                str(after_path),
+            ],
+            cwd=CONFIGS_ROOT,
+            timeout_s=30,
+        )
+    if raw["timed_out"]:
+        raise RuntimeError("deterministic diff builder timed out")
+    if raw["exit_code"] not in {0, 1}:
+        error_text = raw["stderr"].strip() or raw["stdout"].strip() or "diff command failed"
+        raise RuntimeError(f"deterministic diff builder failed: {error_text}")
+    rendered = raw["stdout"]
+    if raw["exit_code"] == 0 or not rendered.strip():
+        return ""
+    if not rendered.startswith("diff --git "):
+        rendered = f"diff --git a/{relative_path} b/{relative_path}\n" + rendered
+    return rendered if rendered.endswith("\n") else rendered + "\n"
+
+
+def build_w4_edit_spec_json(
+    *,
+    case_id: str,
+    selected_target_file: str,
+    mode: str | None,
+    valid: bool,
+    attempt_order: list[str],
+    spec: dict[str, Any] | None,
+    errors: list[str],
+    attempts: list[dict[str, Any]],
+) -> dict[str, Any]:
+    return {
+        "artifact_kind": "aoa.local-ai-trial.w4-proposal-edit-spec",
+        "program_id": PROGRAM_ID,
+        "wave_id": "W4",
+        "case_id": case_id,
+        "prepared_at": utc_now(),
+        "selected_target_file": selected_target_file,
+        "mode": mode,
+        "valid": valid,
+        "attempt_order": attempt_order,
+        "spec": spec,
+        "errors": errors,
+        "attempts": attempts,
+    }
+
+
+def prepare_w4_docs_case(
+    case: dict[str, Any],
+    *,
+    case_root: Path,
+    repo_root: Path,
+    repo_head: str,
+    allowed_relative_files: list[str],
+    agents_refs: list[str],
+) -> tuple[dict[str, Any], list[dict[str, Any]], list[str]]:
+    command_refs: list[dict[str, Any]] = []
+    proposal_failure_reasons: list[str] = []
+    proposal_prompt_path = case_root / "artifacts" / "proposal.prompt.txt"
+    proposal_retry_prompt_path = case_root / "artifacts" / "proposal.retry.prompt.txt"
+    target_prompt_path = case_root / "artifacts" / "proposal.target.prompt.txt"
+    plan_prompt_path = case_root / "artifacts" / "proposal.plan.prompt.txt"
+    proposal_edit_spec_path = case_root / "artifacts" / "proposal.edit-spec.json"
+    proposal_diff_path = case_root / "artifacts" / "proposal.diff"
+    proposal_target_path = case_root / "artifacts" / "proposal.target.json"
+    proposal_plan_path = case_root / "artifacts" / "proposal.plan.json"
+    proposal_summary_path = case_root / "artifacts" / "proposal.summary.json"
+
+    file_entries: list[dict[str, Any]] = []
+    file_errors: list[str] = []
+    for relative_path in allowed_relative_files:
+        try:
+            file_entries.append(read_w4_repo_text(repo_root, relative_path))
+        except RuntimeError as exc:
+            file_errors.append(str(exc))
+    proposal_failure_reasons.extend(file_errors)
+
+    selected_target_file = W4_DOC_TARGET_FALLBACKS[case["case_id"]]
+    selection_fallback_used = False
+    target_stage_errors: list[str] = []
+
+    if file_entries:
+        target_prompt = build_w4_target_selection_prompt(case, file_stats=file_entries)
+        target_command_ref, target_qwen = run_qwen_prompt(
+            case_root=case_root,
+            prompt_path=target_prompt_path,
+            label="proposal-target-selection",
+            prompt_text=target_prompt,
+            max_tokens=40,
+            timeout_s=45,
+        )
+        command_refs.append(target_command_ref)
+        raw_target_answer = str(target_qwen.get("answer") or "")
+        candidate_target = None
+        if (
+            bool(target_qwen.get("ok"))
+            and target_qwen.get("http_status") == 200
+            and target_command_ref["exit_code"] == 0
+            and not target_command_ref["timed_out"]
+        ):
+            candidate_target = normalize_relative_repo_path(repo_root, raw_target_answer)
+        else:
+            target_stage_errors.append(
+                str(target_qwen.get("error") or "target selection transport failure")
+            )
+        if candidate_target in allowed_relative_files:
+            selected_target_file = str(candidate_target)
+        else:
+            selection_fallback_used = True
+            if candidate_target:
+                target_stage_errors.append(
+                    f"target selection returned invalid path `{candidate_target}`"
+                )
+        write_json(
+            proposal_target_path,
+            build_w4_docs_target_json(
+                case=case,
+                selected_target_file=selected_target_file,
+                fallback_used=selection_fallback_used,
+                valid_answer=not selection_fallback_used,
+                raw_answer=raw_target_answer,
+                qwen_payload=target_qwen,
+                errors=target_stage_errors,
+            ),
+        )
+    else:
+        selection_fallback_used = True
+        target_stage_errors.append("could not read allowed files for target selection")
+        write_text(target_prompt_path, "BLOCKED: target-selection did not run because allowed files could not be read.")
+        blocked_ref = persist_command_result(
+            case_root,
+            "proposal-target-selection",
+            build_blocked_command_result(
+                [
+                    absolute(SCRIPTS_ROOT / "aoa-qwen-run"),
+                    "--prompt-file",
+                    str(target_prompt_path),
+                    "--timeout",
+                    "45",
+                    "--temperature",
+                    "0",
+                    "--max-tokens",
+                    "40",
+                    "--json",
+                ],
+                cwd=CONFIGS_ROOT,
+                error="target selection blocked because file capture failed",
+            ),
+        )
+        command_refs.append(blocked_ref)
+        write_json(
+            proposal_target_path,
+            build_w4_docs_target_json(
+                case=case,
+                selected_target_file=selected_target_file,
+                fallback_used=True,
+                valid_answer=False,
+                raw_answer="",
+                qwen_payload=build_blocked_qwen_payload("target selection blocked"),
+                errors=target_stage_errors,
+            ),
+        )
+
+    by_relative = {item["relative_path"]: item for item in file_entries}
+    target_entry = by_relative.get(selected_target_file)
+    if target_entry is None:
+        proposal_failure_reasons.append(
+            f"selected target file `{selected_target_file}` could not be loaded"
+        )
+        write_text(plan_prompt_path, "BLOCKED: alignment-plan did not run because the selected target file was unavailable.")
+        blocked_ref = persist_command_result(
+            case_root,
+            "proposal-alignment-plan",
+            build_blocked_command_result(
+                [
+                    absolute(SCRIPTS_ROOT / "aoa-qwen-run"),
+                    "--prompt-file",
+                    str(plan_prompt_path),
+                    "--timeout",
+                    "60",
+                    "--temperature",
+                    "0",
+                    "--max-tokens",
+                    "180",
+                    "--json",
+                ],
+                cwd=CONFIGS_ROOT,
+                error="alignment plan blocked because selected target file was unavailable",
+            ),
+        )
+        command_refs.append(blocked_ref)
+        write_json(
+            proposal_plan_path,
+            build_w4_docs_plan_json(
+                case=case,
+                selected_target_file=selected_target_file,
+                plan_payload=None,
+                raw_answer="",
+                qwen_payload=build_blocked_qwen_payload("alignment plan blocked"),
+                errors=[f"selected target file `{selected_target_file}` could not be loaded"],
+            ),
+        )
+        write_text(proposal_prompt_path, "BLOCKED: exact edit-spec did not run because the selected target file was unavailable.")
+        write_text(proposal_retry_prompt_path, "BLOCKED: anchor fallback did not run because the selected target file was unavailable.")
+        write_json(
+            proposal_edit_spec_path,
+            build_w4_edit_spec_json(
+                case_id=case["case_id"],
+                selected_target_file=selected_target_file,
+                mode=None,
+                valid=False,
+                attempt_order=[],
+                spec=None,
+                errors=[f"selected target file `{selected_target_file}` could not be loaded"],
+                attempts=[],
+            ),
+        )
+        write_text_exact(proposal_diff_path, "")
+        proposal_summary = {
+            "artifact_kind": "aoa.local-ai-trial.w4-proposal-summary",
+            "program_id": PROGRAM_ID,
+            "wave_id": "W4",
+            "case_id": case["case_id"],
+            "prepared_at": utc_now(),
+            "execution_mode": case["execution_mode"],
+            "lane": case["lane"],
+            "repo_root": str(repo_root),
+            "base_head": repo_head,
+            "allowed_files": allowed_relative_files,
+            "source_refs": case.get("source_refs", []),
+            "agents_refs": agents_refs,
+            "selected_target_file": selected_target_file,
+            "selection_fallback_used": selection_fallback_used,
+            "edit_contract": "hybrid-exact-then-anchor",
+            "edit_spec_mode": None,
+            "edit_spec_valid": False,
+            "builder_match_count": 0,
+            "rendered_diff_valid": False,
+            "proposal_valid": False,
+            "proposal_failure_reasons": proposal_failure_reasons,
+            "touched_files": [],
+            "command_artifacts": [
+                path
+                for ref in command_refs
+                for path in (ref["stdout_path"], ref["stderr_path"], ref["command_meta"])
+            ],
+        }
+        write_json(proposal_summary_path, proposal_summary)
+        return proposal_summary, command_refs, proposal_failure_reasons
+
+    target_excerpt_for_plan = bounded_text_slice(
+        target_entry["text"],
+        char_limit=900,
+        line_limit=40,
+    )
+    sibling_snippets = build_w4_docs_sibling_snippets(
+        file_entries,
+        target_file=selected_target_file,
+    )
+    agents_guidance, agents_errors = trim_agents_guidance(agents_refs, char_limit=350)
+    plan_errors: list[str] = []
+    if agents_errors:
+        plan_errors.extend(agents_errors)
+    plan_prompt = build_w4_alignment_plan_prompt(
+        case,
+        target_file=selected_target_file,
+        target_excerpt=target_excerpt_for_plan,
+        sibling_snippets=sibling_snippets,
+        agents_guidance=agents_guidance,
+    )
+    plan_command_ref, plan_qwen = run_qwen_prompt(
+        case_root=case_root,
+        prompt_path=plan_prompt_path,
+        label="proposal-alignment-plan",
+        prompt_text=plan_prompt,
+        max_tokens=180,
+        timeout_s=60,
+    )
+    command_refs.append(plan_command_ref)
+    raw_plan_answer = str(plan_qwen.get("answer") or "")
+    plan_payload: dict[str, Any] | None = None
+    if (
+        bool(plan_qwen.get("ok"))
+        and plan_qwen.get("http_status") == 200
+        and plan_command_ref["exit_code"] == 0
+        and not plan_command_ref["timed_out"]
+    ):
+        try:
+            plan_payload = parse_w4_alignment_plan(
+                raw_plan_answer,
+                selected_target_file=selected_target_file,
+            )
+        except (json.JSONDecodeError, ValueError) as exc:
+            plan_errors.append(
+                f"alignment-plan parse failure: {type(exc).__name__}: {exc}"
+            )
+    else:
+        plan_errors.append(str(plan_qwen.get("error") or "alignment-plan transport failure"))
+
+    write_json(
+        proposal_plan_path,
+        build_w4_docs_plan_json(
+            case=case,
+            selected_target_file=selected_target_file,
+            plan_payload=plan_payload,
+            raw_answer=raw_plan_answer,
+            qwen_payload=plan_qwen,
+            errors=plan_errors,
+        ),
+    )
+    proposal_failure_reasons.extend(plan_errors)
+
+    touched_files: list[str] = []
+    final_edit_spec: dict[str, Any] | None = None
+    final_edit_spec_mode: str | None = None
+    edit_spec_valid = False
+    builder_match_count = 0
+    rendered_diff_valid = False
+    attempt_order: list[str] = []
+    edit_spec_attempts: list[dict[str, Any]] = []
+    if plan_payload is None:
+        write_text(
+            proposal_prompt_path,
+            "BLOCKED: exact edit-spec did not run because the alignment plan was unavailable or invalid.",
+        )
+        blocked_ref = persist_command_result(
+            case_root,
+            "proposal-edit-spec-exact",
+            build_blocked_command_result(
+                [
+                    absolute(SCRIPTS_ROOT / "aoa-qwen-run"),
+                    "--prompt-file",
+                    str(proposal_prompt_path),
+                    "--timeout",
+                    "90",
+                    "--temperature",
+                    "0",
+                    "--max-tokens",
+                    "220",
+                    "--json",
+                ],
+                cwd=CONFIGS_ROOT,
+                error="exact edit-spec blocked because alignment plan was unavailable or invalid",
+            ),
+        )
+        command_refs.append(blocked_ref)
+        write_text(
+            proposal_retry_prompt_path,
+            "BLOCKED: anchor fallback did not run because the alignment plan was unavailable or invalid.",
+        )
+        write_json(
+            proposal_edit_spec_path,
+            build_w4_edit_spec_json(
+                case_id=case["case_id"],
+                selected_target_file=selected_target_file,
+                mode=None,
+                valid=False,
+                attempt_order=[],
+                spec=None,
+                errors=["alignment plan unavailable or invalid"],
+                attempts=[],
+            ),
+        )
+        write_text_exact(proposal_diff_path, "")
+    else:
+        target_excerpt_for_edit = prose_first_w4_edit_excerpt(
+            target_entry["text"],
+            char_limit=350,
+            line_limit=12,
+        )
+        edit_plan = compact_w4_plan_for_diff(plan_payload)
+        exact_prompt = build_w4_edit_spec_exact_prompt(
+            case,
+            target_file=selected_target_file,
+            target_excerpt=target_excerpt_for_edit,
+            plan=edit_plan,
+            sibling_snippets=sibling_snippets[:1],
+            agents_guidance=agents_guidance,
+        )
+        exact_command_ref, exact_qwen = run_qwen_prompt(
+            case_root=case_root,
+            prompt_path=proposal_prompt_path,
+            label="proposal-edit-spec-exact",
+            prompt_text=exact_prompt,
+            max_tokens=220,
+            timeout_s=90,
+        )
+        command_refs.append(exact_command_ref)
+        attempt_order.append("exact_replace")
+        exact_errors: list[str] = []
+        exact_raw_answer = str(exact_qwen.get("answer") or "")
+        exact_spec: dict[str, Any] | None = None
+        if (
+            bool(exact_qwen.get("ok"))
+            and exact_qwen.get("http_status") == 200
+            and exact_command_ref["exit_code"] == 0
+            and not exact_command_ref["timed_out"]
+        ):
+            try:
+                exact_spec = parse_w4_edit_spec(
+                    exact_raw_answer,
+                    expected_mode="exact_replace",
+                    selected_target_file=selected_target_file,
+                )
+                policy_error = validate_w4_public_doc_edit_spec(
+                    selected_target_file,
+                    target_text=target_entry["text"],
+                    spec=exact_spec,
+                )
+                if policy_error:
+                    exact_errors.append(
+                        f"exact edit-spec policy failure: {policy_error}"
+                    )
+                    exact_spec = None
+            except (json.JSONDecodeError, ValueError) as exc:
+                exact_errors.append(
+                    f"exact edit-spec parse failure: {type(exc).__name__}: {exc}"
+                )
+        else:
+            exact_errors.append(
+                str(exact_qwen.get("error") or "exact edit-spec transport failure")
+            )
+
+        exact_match_count = 0
+        exact_candidate_text: str | None = None
+        if exact_spec is not None:
+            exact_match_count, exact_candidate_text = apply_exact_replace_to_text(
+                target_entry["text"],
+                old_text=exact_spec["old_text"],
+                new_text=exact_spec["new_text"],
+            )
+            if exact_match_count != 1:
+                exact_errors.append(
+                    f"exact_replace old_text match count must equal 1, observed {exact_match_count}"
+                )
+
+        edit_spec_attempts.append(
+            {
+                "mode": "exact_replace",
+                "raw_answer": exact_raw_answer,
+                "valid": not exact_errors and exact_candidate_text is not None,
+                "errors": exact_errors,
+                "match_count": exact_match_count,
+                "spec": exact_spec,
+            }
+        )
+
+        candidate_text: str | None = None
+        if exact_candidate_text is not None and not exact_errors:
+            final_edit_spec = exact_spec
+            final_edit_spec_mode = "exact_replace"
+            edit_spec_valid = True
+            builder_match_count = exact_match_count
+            candidate_text = exact_candidate_text
+        else:
+            anchor_prompt = build_w4_edit_spec_anchor_prompt(
+                target_file=selected_target_file,
+                target_excerpt=target_excerpt_for_edit,
+                plan=edit_plan,
+                previous_spec=exact_spec,
+                fallback_reason="\n".join(exact_errors or ["exact_replace was not uniquely applicable"]),
+            )
+            anchor_command_ref, anchor_qwen = run_qwen_prompt(
+                case_root=case_root,
+                prompt_path=proposal_retry_prompt_path,
+                label="proposal-edit-spec-anchor",
+                prompt_text=anchor_prompt,
+                max_tokens=260,
+                timeout_s=90,
+            )
+            command_refs.append(anchor_command_ref)
+            attempt_order.append("anchored_replace")
+            anchor_errors: list[str] = []
+            anchor_raw_answer = str(anchor_qwen.get("answer") or "")
+            anchor_spec: dict[str, Any] | None = None
+            if (
+                bool(anchor_qwen.get("ok"))
+                and anchor_qwen.get("http_status") == 200
+                and anchor_command_ref["exit_code"] == 0
+                and not anchor_command_ref["timed_out"]
+            ):
+                try:
+                    anchor_spec = parse_w4_edit_spec(
+                        anchor_raw_answer,
+                        expected_mode="anchored_replace",
+                        selected_target_file=selected_target_file,
+                    )
+                    policy_error = validate_w4_public_doc_edit_spec(
+                        selected_target_file,
+                        target_text=target_entry["text"],
+                        spec=anchor_spec,
+                    )
+                    if policy_error:
+                        anchor_errors.append(
+                            f"anchor edit-spec policy failure: {policy_error}"
+                        )
+                        anchor_spec = None
+                except (json.JSONDecodeError, ValueError) as exc:
+                    anchor_errors.append(
+                        f"anchor edit-spec parse failure: {type(exc).__name__}: {exc}"
+                    )
+            else:
+                anchor_errors.append(
+                    str(anchor_qwen.get("error") or "anchor edit-spec transport failure")
+                )
+
+            anchor_match_count = 0
+            anchor_candidate_text: str | None = None
+            if anchor_spec is not None:
+                anchor_match_count, anchor_candidate_text = apply_anchored_replace_to_text(
+                    target_entry["text"],
+                    anchor_before=anchor_spec["anchor_before"],
+                    old_text=anchor_spec["old_text"],
+                    new_text=anchor_spec["new_text"],
+                    anchor_after=anchor_spec["anchor_after"],
+                )
+                if anchor_match_count != 1:
+                    anchor_errors.append(
+                        f"anchored_replace match count must equal 1, observed {anchor_match_count}"
+                    )
+
+            edit_spec_attempts.append(
+                {
+                    "mode": "anchored_replace",
+                    "raw_answer": anchor_raw_answer,
+                    "valid": not anchor_errors and anchor_candidate_text is not None,
+                    "errors": anchor_errors,
+                    "match_count": anchor_match_count,
+                    "spec": anchor_spec,
+                }
+            )
+
+            if anchor_candidate_text is not None and not anchor_errors:
+                final_edit_spec = anchor_spec
+                final_edit_spec_mode = "anchored_replace"
+                edit_spec_valid = True
+                builder_match_count = anchor_match_count
+                candidate_text = anchor_candidate_text
+            else:
+                proposal_failure_reasons.extend(exact_errors)
+                proposal_failure_reasons.extend(anchor_errors)
+
+        if final_edit_spec is not None and candidate_text is not None:
+            diff_text = build_git_unified_diff(
+                relative_path=selected_target_file,
+                before_text=target_entry["text"],
+                after_text=candidate_text,
+            )
+            write_text_exact(proposal_diff_path, diff_text)
+            if not diff_text.strip():
+                proposal_failure_reasons.append(
+                    "deterministic diff builder produced an empty diff"
+                )
+            else:
+                diff_inspection = inspect_w4_diff_text(
+                    diff_text,
+                    allowed_relative_files=allowed_relative_files,
+                )
+                touched_files = diff_inspection["touched_files"]
+                if diff_inspection["failure_reasons"]:
+                    proposal_failure_reasons.extend(diff_inspection["failure_reasons"])
+                elif touched_files != [selected_target_file]:
+                    proposal_failure_reasons.append(
+                        "deterministic diff builder must touch exactly the selected target file"
+                    )
+                else:
+                    apply_check_raw = git_command(
+                        repo_root,
+                        ["apply", "--check", str(proposal_diff_path)],
+                        timeout_s=60,
+                    )
+                    apply_check_ref = persist_command_result(
+                        case_root,
+                        "proposal-apply-check",
+                        apply_check_raw,
+                    )
+                    command_refs.append(apply_check_ref)
+                    if apply_check_raw["exit_code"] != 0 or apply_check_raw["timed_out"]:
+                        proposal_failure_reasons.append(
+                            "git apply --check failed against the current repo HEAD"
+                        )
+                        apply_stderr = apply_check_raw.get("stderr", "").strip()
+                        if apply_stderr:
+                            proposal_failure_reasons.append(apply_stderr)
+                    else:
+                        rendered_diff_valid = True
+        else:
+            write_text_exact(proposal_diff_path, "")
+
+        write_json(
+            proposal_edit_spec_path,
+            build_w4_edit_spec_json(
+                case_id=case["case_id"],
+                selected_target_file=selected_target_file,
+                mode=final_edit_spec_mode,
+                valid=edit_spec_valid,
+                attempt_order=attempt_order,
+                spec=final_edit_spec,
+                errors=proposal_failure_reasons.copy(),
+                attempts=edit_spec_attempts,
+            ),
+        )
+
+    proposal_valid = not proposal_failure_reasons
+    proposal_summary = {
+        "artifact_kind": "aoa.local-ai-trial.w4-proposal-summary",
+        "program_id": PROGRAM_ID,
+        "wave_id": "W4",
+        "case_id": case["case_id"],
+        "prepared_at": utc_now(),
+        "execution_mode": case["execution_mode"],
+        "lane": case["lane"],
+        "repo_root": str(repo_root),
+        "base_head": repo_head,
+        "allowed_files": allowed_relative_files,
+        "source_refs": case.get("source_refs", []),
+        "agents_refs": agents_refs,
+        "selected_target_file": selected_target_file,
+        "selection_fallback_used": selection_fallback_used,
+        "edit_contract": "hybrid-exact-then-anchor",
+        "edit_spec_mode": final_edit_spec_mode,
+        "edit_spec_valid": edit_spec_valid,
+        "builder_match_count": builder_match_count,
+        "rendered_diff_valid": rendered_diff_valid,
+        "proposal_valid": proposal_valid,
+        "proposal_failure_reasons": proposal_failure_reasons,
+        "touched_files": touched_files,
+        "command_artifacts": [
+            path
+            for ref in command_refs
+            for path in (ref["stdout_path"], ref["stderr_path"], ref["command_meta"])
+        ],
+    }
+    write_json(proposal_summary_path, proposal_summary)
+    return proposal_summary, command_refs, proposal_failure_reasons
+
+
+def render_w4_source_bundle(refs: list[str]) -> tuple[str, list[str]]:
+    entries: list[dict[str, Any]] = []
+    errors: list[str] = []
+    for ref in refs:
+        try:
+            entries.append(local_text_entry_for_prompt(ref))
+        except RuntimeError as exc:
+            errors.append(str(exc))
+    lines = ["# W4 Source Bundle", ""]
+    for item in entries:
+        lines.extend(
+            [
+                f"=== source_ref: {item['ref']} ===",
+                compact_excerpt_for_prompt(item["text"], non_empty_limit=12, char_limit=1400),
+                "",
+            ]
+        )
+    if errors:
+        lines.extend(["# Source Errors", *[f"- {item}" for item in errors], ""])
+    return "\n".join(lines).rstrip() + "\n", errors
+
+
+def build_w4_patch_prompt(
+    case: dict[str, Any],
+    *,
+    source_bundle: str,
+    allowed_relative_files: list[str],
+) -> str:
+    input_lines = "\n".join(f"- {item}" for item in case.get("inputs", []))
+    source_ref_lines = "\n".join(f"- {item}" for item in case.get("source_refs", []))
+    allowed_lines = "\n".join(f"- {item}" for item in allowed_relative_files)
+    acceptance_lines = "\n".join(f"- {item}" for item in case.get("acceptance_checks", []))
+    return textwrap.dedent(
+        f"""\
+        Bounded W4 supervised edit proposal.
+        Use only the supplied source refs and AGENTS guidance.
+        Keep the edit compact, source-of-truth-safe, and strictly inside the approved file scope.
+        Prefer wording alignment over semantic invention.
+
+        Goal:
+        {case.get("goal", "")}
+
+        Inputs:
+        {input_lines}
+
+        Exact source refs:
+        {source_ref_lines}
+
+        Allowed files relative to repo root:
+        {allowed_lines}
+
+        Acceptance checks after mutation:
+        {acceptance_lines}
+
+        Response contract:
+        - Return only a git-style unified diff.
+        - Modify only existing files from the allowed file list.
+        - Use paths relative to the repo root in `a/...` and `b/...` diff headers.
+        - No rename or delete.
+        - No binary patch.
+        - No prose outside the diff.
+        - If no safe change is possible, still return the smallest valid diff that keeps wording aligned.
+
+        Grounded source bundle:
+        {source_bundle.rstrip()}
+        """
+    ).rstrip() + "\n"
+
+
+def build_w4_script_refresh_plan(case: dict[str, Any], *, allowed_relative_files: list[str]) -> str:
+    builder_command = case.get("mutation_policy", {}).get("builder_command") or []
+    acceptance_lines = "\n".join(f"- {item}" for item in case.get("acceptance_checks", []))
+    allowed_lines = "\n".join(f"- {item}" for item in allowed_relative_files)
+    return textwrap.dedent(
+        f"""\
+        W4 script-refresh proposal for `{case['case_id']}`.
+
+        Execution mode:
+        - script_refresh
+
+        Repo:
+        - {repo_root_for_w4_case(case)}
+
+        Builder command:
+        - {format_command(builder_command) if builder_command else '<missing builder command>'}
+
+        Allowed files relative to repo root:
+        {allowed_lines}
+
+        Acceptance checks:
+        {acceptance_lines}
+
+        Notes:
+        - No model-written diff is used for this case.
+        - The builder command runs only after explicit approval and only inside an isolated worktree first.
+        """
+    ).rstrip() + "\n"
+
+
+def inspect_w4_diff_text(
+    diff_text: str,
+    *,
+    allowed_relative_files: list[str],
+) -> dict[str, Any]:
+    failures: list[str] = []
+    touched: list[str] = []
+    allowed = set(allowed_relative_files)
+    stripped = diff_text.strip()
+    if not stripped:
+        failures.append("empty diff")
+    if "```" in diff_text:
+        failures.append("code fence is not allowed in unified diff output")
+    if re.search(r"^rename (from|to) ", diff_text, flags=re.MULTILINE):
+        failures.append("rename headers are not allowed")
+    if re.search(r"^(deleted file mode|new file mode) ", diff_text, flags=re.MULTILINE):
+        failures.append("new/delete file headers are not allowed")
+    if "Binary files " in diff_text or "GIT binary patch" in diff_text:
+        failures.append("binary hunks are not allowed")
+
+    for match in re.finditer(r"^diff --git a/(.+?) b/(.+?)$", diff_text, flags=re.MULTILINE):
+        left = match.group(1).strip()
+        right = match.group(2).strip()
+        if left != right:
+            failures.append(f"rename-style diff header is not allowed: {left} -> {right}")
+        if right not in touched:
+            touched.append(right)
+
+    if not touched:
+        for match in re.finditer(r"^\+\+\+ b/(.+?)$", diff_text, flags=re.MULTILINE):
+            right = match.group(1).strip()
+            if right != "/dev/null" and right not in touched:
+                touched.append(right)
+
+    if re.search(r"^(---|\+\+\+) /dev/null$", diff_text, flags=re.MULTILINE):
+        failures.append("new/delete file patches are not allowed")
+
+    if not touched:
+        failures.append("could not identify touched files from unified diff")
+
+    unauthorized = sorted(path for path in touched if path not in allowed)
+    if unauthorized:
+        failures.append(
+            "touched files outside allowed scope: " + ", ".join(unauthorized)
+        )
+
+    return {
+        "proposal_valid": not failures,
+        "failure_reasons": failures,
+        "touched_files": touched,
+    }
+
+
+def write_w4_approval_status(
+    case_root: Path,
+    *,
+    case: dict[str, Any],
+    repo_head: str,
+) -> Path:
+    approval_path = case_root / "artifacts" / "approval.status.json"
+    payload = {
+        "artifact_kind": "aoa.local-ai-trial.w4-approval-status",
+        "program_id": PROGRAM_ID,
+        "wave_id": "W4",
+        "case_id": case["case_id"],
+        "status": "pending",
+        "approved": False,
+        "prepared_at": utc_now(),
+        "base_head": repo_head,
+        "notes": "Set `status` to `approved` after reviewing the proposal before running apply-case.",
+    }
+    write_json(approval_path, payload)
+    return approval_path
+
+
+def load_json_file(path: Path) -> dict[str, Any]:
+    return json.loads(path.read_text(encoding="utf-8"))
+
+
+def w4_proposal_artifact_refs(case_root: Path) -> list[str]:
+    ordered_names = [
+        "proposal.target.prompt.txt",
+        "proposal.plan.prompt.txt",
+        "proposal.target.json",
+        "proposal.plan.json",
+        "proposal.edit-spec.json",
+        "proposal.prompt.txt",
+        "proposal.retry.prompt.txt",
+        "proposal.diff",
+        "proposal.summary.json",
+        "approval.status.json",
+        "worktree.manifest.json",
+    ]
+    refs: list[str] = []
+    for name in ordered_names:
+        path = case_root / "artifacts" / name
+        if path.exists():
+            refs.append(str(path))
+    for path in sorted((case_root / "artifacts").glob("proposal-*.stdout.txt")):
+        refs.append(str(path))
+    for path in sorted((case_root / "artifacts").glob("proposal-*.stderr.txt")):
+        refs.append(str(path))
+    for path in sorted((case_root / "artifacts").glob("proposal-*.command.json")):
+        refs.append(str(path))
+    for path in sorted((case_root / "artifacts").glob("landing.diff")):
+        refs.append(str(path))
+    return refs
+
+
+def prepare_w4_case(case: dict[str, Any], *, log_root: Path) -> dict[str, Any]:
+    case_root = case_dir(log_root, "W4", case["case_id"])
+    repo_root = repo_root_for_w4_case(case)
+    repo_head = git_head(repo_root)
+    allowed_relative_files = relative_repo_paths(
+        repo_root,
+        case["expected_result"]["allowed_files"],
+    )
+    agents_refs = collect_applicable_agents_refs(case)
+    proposal_prompt_path = case_root / "artifacts" / "proposal.prompt.txt"
+    proposal_diff_path = case_root / "artifacts" / "proposal.diff"
+    proposal_summary_path = case_root / "artifacts" / "proposal.summary.json"
+    approval_path = write_w4_approval_status(case_root, case=case, repo_head=repo_head)
+
+    command_refs: list[dict[str, Any]] = []
+    proposal_valid = False
+    proposal_failure_reasons: list[str] = []
+    touched_files: list[str] = []
+
+    if case["execution_mode"] == "qwen_patch":
+        try:
+            proposal_summary, command_refs, proposal_failure_reasons = prepare_w4_docs_case(
+                case,
+                case_root=case_root,
+                repo_root=repo_root,
+                repo_head=repo_head,
+                allowed_relative_files=allowed_relative_files,
+                agents_refs=agents_refs,
+            )
+        except Exception as exc:
+            proposal_failure_reasons = [
+                f"docs-lane staged preparation failed: {type(exc).__name__}: {exc}"
+            ]
+            proposal_summary = {
+                "artifact_kind": "aoa.local-ai-trial.w4-proposal-summary",
+                "program_id": PROGRAM_ID,
+                "wave_id": "W4",
+                "case_id": case["case_id"],
+                "prepared_at": utc_now(),
+                "execution_mode": case["execution_mode"],
+                "lane": case["lane"],
+                "repo_root": str(repo_root),
+                "base_head": repo_head,
+                "allowed_files": allowed_relative_files,
+                "source_refs": case.get("source_refs", []),
+                "agents_refs": agents_refs,
+                "selected_target_file": W4_DOC_TARGET_FALLBACKS.get(case["case_id"]),
+                "selection_fallback_used": True,
+                "edit_contract": "hybrid-exact-then-anchor",
+                "edit_spec_mode": None,
+                "edit_spec_valid": False,
+                "builder_match_count": 0,
+                "rendered_diff_valid": False,
+                "proposal_valid": False,
+                "proposal_failure_reasons": proposal_failure_reasons,
+                "touched_files": [],
+                "command_artifacts": [
+                    path
+                    for ref in command_refs
+                    for path in (ref["stdout_path"], ref["stderr_path"], ref["command_meta"])
+                ],
+            }
+            write_json(proposal_summary_path, proposal_summary)
+        proposal_valid = bool(proposal_summary.get("proposal_valid"))
+        touched_files = list(proposal_summary.get("touched_files") or [])
+    else:
+        prompt_text = build_w4_script_refresh_plan(
+            case,
+            allowed_relative_files=allowed_relative_files,
+        )
+        write_text(proposal_prompt_path, prompt_text)
+        write_text_exact(
+            proposal_diff_path,
+            "# script_refresh case\n# diff is produced only after approved worktree execution\n",
+        )
+        builder_command = case.get("mutation_policy", {}).get("builder_command") or []
+        proposal_valid = bool(builder_command)
+        if not proposal_valid:
+            proposal_failure_reasons.append("missing builder command for script_refresh case")
+        proposal_summary = {
+            "artifact_kind": "aoa.local-ai-trial.w4-proposal-summary",
+            "program_id": PROGRAM_ID,
+            "wave_id": "W4",
+            "case_id": case["case_id"],
+            "prepared_at": utc_now(),
+            "execution_mode": case["execution_mode"],
+            "lane": case["lane"],
+            "repo_root": str(repo_root),
+            "base_head": repo_head,
+            "allowed_files": allowed_relative_files,
+            "source_refs": case.get("source_refs", []),
+            "agents_refs": agents_refs,
+            "edit_contract": "script_refresh",
+            "edit_spec_mode": None,
+            "edit_spec_valid": False,
+            "builder_match_count": 0,
+            "rendered_diff_valid": False,
+            "proposal_valid": proposal_valid,
+            "proposal_failure_reasons": proposal_failure_reasons,
+            "touched_files": [],
+            "builder_command": builder_command,
+            "command_artifacts": [],
+        }
+
+    write_json(proposal_summary_path, proposal_summary)
+    return {
+        "case_id": case["case_id"],
+        "proposal_valid": proposal_valid,
+        "proposal_summary_path": str(proposal_summary_path),
+        "approval_path": str(approval_path),
+        "command_refs": command_refs,
+        "failure_reasons": proposal_failure_reasons,
+    }
+
+
+def run_w4_preflight(log_root: Path) -> None:
+    run_supervised_route_preflight(log_root, "W4")
+
+
+def w4_cases_for_lane(catalog: dict[str, list[dict[str, Any]]], lane: str) -> list[dict[str, Any]]:
+    all_by_id = {
+        case["case_id"]: case
+        for case in catalog["W4"]
+    }
+    if lane == "all":
+        return [
+            all_by_id[case_id]
+            for case_id in [*W4_DOC_PREPARE_ORDER, *W4_GENERATED_PREPARE_ORDER]
+            if case_id in all_by_id
+        ]
+    if lane == "docs":
+        return [
+            all_by_id[case_id]
+            for case_id in W4_DOC_PREPARE_ORDER
+            if case_id in all_by_id
+        ]
+    if lane == "generated":
+        return [
+            all_by_id[case_id]
+            for case_id in W4_GENERATED_PREPARE_ORDER
+            if case_id in all_by_id
+        ]
+    return []
+
+
+def load_w4_results(log_root: Path, catalog: dict[str, list[dict[str, Any]]]) -> list[dict[str, Any]]:
+    results: list[dict[str, Any]] = []
+    for case in catalog["W4"]:
+        result_path = case_dir(log_root, "W4", case["case_id"]) / "result.summary.json"
+        if result_path.exists():
+            results.append(load_json_file(result_path))
+    return results
+
+
+def w4_pass_changed_files_by_repo(
+    log_root: Path,
+    catalog: dict[str, list[dict[str, Any]]],
+) -> dict[str, set[str]]:
+    case_by_id = {case["case_id"]: case for case in catalog["W4"]}
+    changed_by_repo: dict[str, set[str]] = {}
+    for result in load_w4_results(log_root, catalog):
+        if result.get("status") != "pass":
+            continue
+        case = case_by_id.get(result.get("case_id"))
+        if case is None:
+            continue
+        repo_key = str(repo_root_for_w4_case(case))
+        changed = result.get("observed", {}).get("changed_files") or []
+        bucket = changed_by_repo.setdefault(repo_key, set())
+        bucket.update(path for path in changed if isinstance(path, str) and path)
+    return changed_by_repo
+
+
+def ensure_repo_ready_for_w4_case(
+    repo_root: Path,
+    *,
+    case: dict[str, Any],
+    log_root: Path,
+    catalog: dict[str, list[dict[str, Any]]],
+) -> list[str]:
+    tracked = tracked_status_lines(repo_root)
+    if not tracked:
+        return ignored_untracked_noise(repo_root)
+
+    changed_files = set(list_changed_files(repo_root))
+    allowed_for_case = set(
+        relative_repo_paths(repo_root, case["expected_result"].get("allowed_files", []))
+    )
+    prior_pass_paths = w4_pass_changed_files_by_repo(log_root, catalog).get(str(repo_root), set())
+    unexpected = sorted(
+        path for path in changed_files if path in allowed_for_case or path not in prior_pass_paths
+    )
+    if unexpected:
+        raise RuntimeError(
+            f"tracked git state is not clean for {repo_root}: "
+            + "; ".join(tracked)
+        )
+    return ignored_untracked_noise(repo_root)
+
+
+def w4_docs_lane_state(log_root: Path, catalog: dict[str, list[dict[str, Any]]]) -> dict[str, Any]:
+    results_by_id = {
+        result["case_id"]: result
+        for result in load_w4_results(log_root, catalog)
+    }
+    docs_results = [
+        results_by_id[case_id]
+        for case_id in W4_DOC_CASE_IDS
+        if case_id in results_by_id
+    ]
+    docs_pass = sum(1 for item in docs_results if item["status"] == "pass")
+    docs_criticals = [
+        item["case_id"]
+        for item in docs_results
+        if item.get("failure_class") in W4_CRITICAL_FAILURES
+    ]
+    return {
+        "pass_count": docs_pass,
+        "critical_case_ids": docs_criticals,
+        "unlock_generated_lane": docs_pass >= 5 and not docs_criticals,
+    }
+
+
+def update_w4_index(log_root: Path, mirror_root: Path, catalog: dict[str, list[dict[str, Any]]]) -> None:
+    results_by_id: dict[str, dict[str, Any]] = {}
+    proposal_summaries: dict[str, dict[str, Any]] = {}
+    approval_statuses: dict[str, dict[str, Any]] = {}
+    for case in catalog["W4"]:
+        result_path = case_dir(log_root, "W4", case["case_id"]) / "result.summary.json"
+        proposal_path = case_dir(log_root, "W4", case["case_id"]) / "artifacts" / "proposal.summary.json"
+        approval_path = case_dir(log_root, "W4", case["case_id"]) / "artifacts" / "approval.status.json"
+        if result_path.exists():
+            results_by_id[case["case_id"]] = load_json_file(result_path)
+        if proposal_path.exists():
+            proposal_summaries[case["case_id"]] = load_json_file(proposal_path)
+        if approval_path.exists():
+            approval_statuses[case["case_id"]] = load_json_file(approval_path)
+
+    pass_count = sum(1 for item in results_by_id.values() if item["status"] == "pass")
+    fail_count = sum(1 for item in results_by_id.values() if item["status"] == "fail")
+    planned_count = len(catalog["W4"]) - len(results_by_id)
+    critical_case_ids = [
+        item["case_id"]
+        for item in results_by_id.values()
+        if item.get("failure_class") in W4_CRITICAL_FAILURES
+    ]
+    docs_state = w4_docs_lane_state(log_root, catalog)
+    prepared_docs_cases = sum(1 for case_id in W4_DOC_CASE_IDS if case_id in proposal_summaries)
+    valid_docs_proposals = sum(
+        1
+        for case_id in W4_DOC_CASE_IDS
+        if proposal_summaries.get(case_id, {}).get("proposal_valid")
+    )
+    pending_approvals = sum(
+        1
+        for case_id in W4_DOC_CASE_IDS
+        if proposal_summaries.get(case_id, {}).get("proposal_valid")
+        and approval_statuses.get(case_id, {}).get("status") == "pending"
+    )
+
+    if not results_by_id:
+        gate_result = "not-run"
+        if prepared_docs_cases:
+            next_action = "Review prepared docs-lane proposals, approve the first live cases, and keep generated apply blocked until docs unlock."
+        else:
+            next_action = "Prepare docs-lane proposals, review them, then approve one case at a time."
+    elif critical_case_ids:
+        gate_result = "fail"
+        next_action = "Stop W4 and remediate the critical unauthorized-scope or validation failure before any further apply-case."
+    elif planned_count > 0:
+        gate_result = "in-progress"
+        if docs_state["unlock_generated_lane"]:
+            next_action = "Docs lane is unlocked. Continue approved W4 cases, including generated refresh if needed."
+        else:
+            next_action = "Continue docs-lane W4 cases until the generated lane unlock rule is satisfied."
+    elif pass_count >= 6:
+        gate_result = "pass"
+        next_action = "W4 gate passed. Review landed edits and decide whether a broader autonomous pilot is warranted."
+    else:
+        gate_result = "fail"
+        next_action = "Stop at W4 and form a remediation sub-plan before any broader autonomy claims."
+
+    index_payload = {
+        "artifact_kind": "aoa.local-ai-trial.wave-index",
+        "program_id": PROGRAM_ID,
+        "wave_id": "W4",
+        "wave_title": WAVE_METADATA["W4"]["title"],
+        "wave_summary": WAVE_METADATA["W4"]["summary"],
+        "case_count": len(catalog["W4"]),
+        "status_counts": {
+            "pass": pass_count,
+            "fail": fail_count,
+            "planned": planned_count,
+        },
+        "gate_result": gate_result,
+        "next_action": next_action,
+        "cases": [
+            {
+                "case_id": case["case_id"],
+                "status": results_by_id.get(case["case_id"], {}).get("status", "planned"),
+                "repo_scope": case["repo_scope"],
+                "task_family": case["task_family"],
+                "case_spec": str(case_dir(log_root, "W4", case["case_id"]) / "case.spec.json"),
+                **(
+                    {
+                        "report_md": str(
+                            mirror_root / case_report_name("W4", case["case_id"])
+                        )
+                    }
+                    if case["case_id"] in results_by_id
+                    else {}
+                ),
+                "summary": case["title"],
+            }
+            for case in catalog["W4"]
+        ],
+        "gate_detail": {
+            "pass_count": pass_count,
+            "fail_count": fail_count,
+            "critical_failures": critical_case_ids,
+            "docs_lane_pass_count": docs_state["pass_count"],
+            "docs_lane_critical_case_ids": docs_state["critical_case_ids"],
+            "prepared_docs_cases": prepared_docs_cases,
+            "valid_docs_proposals": valid_docs_proposals,
+            "pending_approvals": pending_approvals,
+            "generated_lane_unlocked": docs_state["unlock_generated_lane"],
+            "next_action": next_action,
+        },
+    }
+    index_base = wave_index_name("W4")
+    write_json(log_root / f"{index_base}.json", index_payload)
+    index_md = render_wave_index_md(index_payload)
+    write_text(log_root / f"{index_base}.md", index_md)
+    write_text(mirror_root / f"{index_base}.md", index_md)
+
+
+def prepare_w4(log_root: Path, mirror_root: Path, lane: str) -> None:
+    catalog = build_catalog()
+    ensure_w3_gate_passed(log_root)
+    ensure_wave_materialized(log_root, mirror_root, "W4", catalog)
+    run_w4_preflight(log_root)
+
+    cases = []
+    for case in w4_cases_for_lane(catalog, lane):
+        result_path = case_dir(log_root, "W4", case["case_id"]) / "result.summary.json"
+        if result_path.exists():
+            existing = load_json_file(result_path)
+            if existing.get("status") == "pass":
+                continue
+        cases.append(case)
+    for case in cases:
+        repo_root = repo_root_for_w4_case(case)
+        ensure_repo_ready_for_w4_case(
+            repo_root,
+            case=case,
+            log_root=log_root,
+            catalog=catalog,
+        )
+        prepare_w4_case(case, log_root=log_root)
+
+    update_w4_index(log_root, mirror_root, catalog)
+
+
+def parse_approval_status(case_root: Path) -> dict[str, Any]:
+    approval_path = case_root / "artifacts" / "approval.status.json"
+    if not approval_path.exists():
+        raise RuntimeError(f"missing approval artifact: {approval_path}")
+    payload = load_json_file(approval_path)
+    if payload.get("status") != "approved":
+        raise RuntimeError(f"approval is not granted for this case: {approval_path}")
+    return payload
+
+
+def list_changed_files(repo_root: Path) -> list[str]:
+    raw = git_command(repo_root, ["diff", "--name-only", "--relative", "HEAD"], timeout_s=60)
+    if raw["exit_code"] != 0 or raw["timed_out"]:
+        raise RuntimeError(f"could not list changed files for {repo_root}")
+    return [line.strip() for line in raw["stdout"].splitlines() if line.strip()]
+
+
+def build_landing_diff(repo_root: Path, *, diff_path: Path) -> dict[str, Any]:
+    raw = git_command(repo_root, ["diff", "--binary", "--relative", "HEAD"], timeout_s=60)
+    if raw["exit_code"] != 0 or raw["timed_out"]:
+        raise RuntimeError(f"could not build landing diff for {repo_root}")
+    write_text_exact(diff_path, raw["stdout"])
+    return raw
+
+
+def run_acceptance_checks(
+    case_root: Path,
+    *,
+    repo_root: Path,
+    checks: list[str],
+    label_prefix: str,
+) -> tuple[list[dict[str, Any]], bool]:
+    refs: list[dict[str, Any]] = []
+    all_ok = True
+    for index, command in enumerate(checks, start=1):
+        wrapped = (
+            'export PATH="$HOME/.local/bin:$PATH"; '
+            'export PYTHONPATH="$PWD${PYTHONPATH:+:$PYTHONPATH}"; '
+            f"{command}"
+        )
+        raw = run_command(["bash", "-lc", wrapped], cwd=repo_root, timeout_s=600)
+        ref = persist_command_result(case_root, f"{label_prefix}-{index:02d}", raw)
+        refs.append(ref)
+        if raw["exit_code"] != 0 or raw["timed_out"]:
+            all_ok = False
+    return refs, all_ok
+
+
+def with_temp_worktree(
+    repo_root: Path,
+    *,
+    case_id: str,
+    log_root: Path,
+) -> tuple[Path, dict[str, Any]]:
+    parent = log_root / "_worktrees"
+    parent.mkdir(parents=True, exist_ok=True)
+    worktree_path = Path(tempfile.mkdtemp(prefix=f"{case_id}-", dir=str(parent)))
+    add_raw = git_command(
+        repo_root,
+        ["worktree", "add", "--detach", str(worktree_path), "HEAD"],
+        timeout_s=120,
+    )
+    return worktree_path, add_raw
+
+
+def ensure_w4_worktree_neighbor_links(worktree_path: Path) -> list[str]:
+    parent = worktree_path.parent
+    created: list[str] = []
+    for name in W4_WORKTREE_NEIGHBOR_REPOS:
+        target = Path("/srv") / name
+        link_path = parent / name
+        if not target.exists() or link_path.exists():
+            continue
+        link_path.symlink_to(target, target_is_directory=True)
+        created.append(str(link_path))
+    return created
+
+
+def remove_temp_worktree(repo_root: Path, worktree_path: Path) -> dict[str, Any]:
+    remove_raw = git_command(
+        repo_root,
+        ["worktree", "remove", "--force", str(worktree_path)],
+        timeout_s=120,
+    )
+    if worktree_path.exists():
+        shutil.rmtree(worktree_path, ignore_errors=True)
+    return remove_raw
+
+
+def w4_failure_summary(
+    case: dict[str, Any],
+    *,
+    log_root: Path,
+    mirror_root: Path,
+    failure_class: str,
+    reviewer_notes: str,
+    boundary_notes: str,
+    highlights: list[str],
+    failures: list[str],
+    command_refs: list[dict[str, Any]],
+    artifact_refs: list[str],
+    next_action: str,
+) -> None:
+    run_manifest = {
+        "artifact_kind": "aoa.local-ai-trial.run-manifest",
+        "program_id": PROGRAM_ID,
+        "wave_id": "W4",
+        "case_id": case["case_id"],
+        "executed_at": utc_now(),
+        "runtime_selection": case["runtime_selection"],
+        "model": MODEL,
+        "backend": case["execution_mode"],
+        "commands": command_refs,
+        "artifact_refs": artifact_refs,
+        "notes": [
+            "W4 uses staged execution with explicit approval, isolated worktrees, and scoped landing back to the main repo only after validation.",
+        ],
+    }
+    result_summary = build_result_summary(
+        case=case,
+        status="fail",
+        score_breakdown={
+            "proposal_valid": failure_class != "proposal_invalid",
+            "approval_present": failure_class != "approval_missing",
+            "unauthorized_scope_expansion": failure_class == "unauthorized_scope_expansion",
+            "post_change_validation_failure": failure_class == "post_change_validation_failure",
+        },
+        observed={
+            "highlights": highlights,
+            "failures": failures,
+        },
+        failure_class=failure_class,
+        reviewer_notes=reviewer_notes,
+        boundary_notes=boundary_notes,
+        next_action=next_action,
+    )
+    finalize_case(
+        case=case,
+        log_root=log_root,
+        mirror_root=mirror_root,
+        run_manifest=run_manifest,
+        result_summary=result_summary,
+    )
+
+
+def apply_w4_case(case: dict[str, Any], *, log_root: Path, mirror_root: Path) -> None:
+    catalog = build_catalog()
+    case_root = case_dir(log_root, "W4", case["case_id"])
+    repo_root = repo_root_for_w4_case(case)
+    proposal_summary_path = case_root / "artifacts" / "proposal.summary.json"
+    proposal_diff_path = case_root / "artifacts" / "proposal.diff"
+    worktree_manifest_path = case_root / "artifacts" / "worktree.manifest.json"
+    landing_diff_path = case_root / "artifacts" / "landing.diff"
+    artifact_refs = [*w4_proposal_artifact_refs(case_root)]
+    command_refs: list[dict[str, Any]] = []
+
+    try:
+        ensure_repo_ready_for_w4_case(
+            repo_root,
+            case=case,
+            log_root=log_root,
+            catalog=catalog,
+        )
+    except RuntimeError as exc:
+        w4_failure_summary(
+            case,
+            log_root=log_root,
+            mirror_root=mirror_root,
+            failure_class="dirty_repo_block",
+            reviewer_notes="W4 apply-case stopped before mutation because the target repo had tracked changes.",
+            boundary_notes=w4_boundary_note(),
+            highlights=[f"Repo root: `{repo_root}`."],
+            failures=[str(exc)],
+            command_refs=command_refs,
+            artifact_refs=artifact_refs,
+            next_action="Restore a clean tracked state before rerunning this W4 apply-case.",
+        )
+        return
+
+    if not proposal_summary_path.exists():
+        w4_failure_summary(
+            case,
+            log_root=log_root,
+            mirror_root=mirror_root,
+            failure_class="proposal_invalid",
+            reviewer_notes="W4 apply-case stopped because no prepared proposal packet was present.",
+            boundary_notes=w4_boundary_note(),
+            highlights=[f"Missing prepared proposal for `{case['case_id']}`."],
+            failures=[f"missing proposal artifact: {proposal_summary_path}"],
+            command_refs=command_refs,
+            artifact_refs=artifact_refs,
+            next_action="Run prepare-wave W4 for this lane before attempting apply-case.",
+        )
+        return
+
+    proposal_summary = load_json_file(proposal_summary_path)
+    if not proposal_summary.get("proposal_valid"):
+        w4_failure_summary(
+            case,
+            log_root=log_root,
+            mirror_root=mirror_root,
+            failure_class="proposal_invalid",
+            reviewer_notes="W4 apply-case stopped because the prepared proposal was not valid for landing.",
+            boundary_notes=w4_boundary_note(),
+            highlights=[f"Prepared proposal loaded for `{case['case_id']}`."],
+            failures=proposal_summary.get("proposal_failure_reasons") or ["proposal marked invalid"],
+            command_refs=command_refs,
+            artifact_refs=artifact_refs,
+            next_action="Repair the proposal and re-approve before rerunning apply-case.",
+        )
+        return
+
+    try:
+        approval_status = parse_approval_status(case_root)
+    except RuntimeError as exc:
+        w4_failure_summary(
+            case,
+            log_root=log_root,
+            mirror_root=mirror_root,
+            failure_class="approval_missing",
+            reviewer_notes="W4 apply-case stopped because explicit approval was missing.",
+            boundary_notes=w4_boundary_note(),
+            highlights=[f"Prepared proposal exists for `{case['case_id']}`."],
+            failures=[str(exc)],
+            command_refs=command_refs,
+            artifact_refs=artifact_refs,
+            next_action="Review the proposal and set approval.status.json to approved before rerunning apply-case.",
+        )
+        return
+
+    if case["lane"] == "generated":
+        docs_state = w4_docs_lane_state(log_root, build_catalog())
+        if not docs_state["unlock_generated_lane"]:
+            w4_failure_summary(
+                case,
+                log_root=log_root,
+                mirror_root=mirror_root,
+                failure_class="preflight_failure",
+                reviewer_notes="Generated-lane W4 apply-case is blocked until the docs lane unlock rule is satisfied.",
+                boundary_notes=w4_boundary_note(),
+                highlights=[f"Docs lane pass count: `{docs_state['pass_count']}`."],
+                failures=[
+                    "generated lane is locked until docs lane has at least 5 passes and zero critical failures"
+                ],
+                command_refs=command_refs,
+                artifact_refs=artifact_refs,
+                next_action="Complete or remediate docs-lane W4 cases before applying generated refresh cases.",
+            )
+            return
+
+    base_head = str(proposal_summary.get("base_head") or "")
+    current_head = git_head(repo_root)
+    if base_head and current_head != base_head:
+        w4_failure_summary(
+            case,
+            log_root=log_root,
+            mirror_root=mirror_root,
+            failure_class="landing_reapply_failure",
+            reviewer_notes="W4 apply-case stopped because the repo HEAD drifted after proposal preparation.",
+            boundary_notes=w4_boundary_note(),
+            highlights=[f"Prepared base HEAD: `{base_head}`.", f"Current HEAD: `{current_head}`."],
+            failures=["repo HEAD drifted between prepare-wave and apply-case"],
+            command_refs=command_refs,
+            artifact_refs=artifact_refs,
+            next_action="Re-run prepare-wave for this case and review the refreshed proposal before applying again.",
+        )
+        return
+
+    worktree_path, add_raw = with_temp_worktree(
+        repo_root,
+        case_id=case["case_id"],
+        log_root=log_root,
+    )
+    add_ref = persist_command_result(case_root, "worktree-add", add_raw)
+    command_refs.append(add_ref)
+    artifact_refs.extend([add_ref["stdout_path"], add_ref["stderr_path"], add_ref["command_meta"]])
+    if add_raw["exit_code"] != 0 or add_raw["timed_out"]:
+        shutil.rmtree(worktree_path, ignore_errors=True)
+        w4_failure_summary(
+            case,
+            log_root=log_root,
+            mirror_root=mirror_root,
+            failure_class="preflight_failure",
+            reviewer_notes="W4 apply-case could not create an isolated git worktree.",
+            boundary_notes=w4_boundary_note(),
+            highlights=[f"Repo root: `{repo_root}`."],
+            failures=["git worktree add failed"],
+            command_refs=command_refs,
+            artifact_refs=artifact_refs,
+            next_action="Repair git worktree readiness before retrying W4 apply-case.",
+        )
+        return
+
+    worktree_repo_root = worktree_path
+    neighbor_links = ensure_w4_worktree_neighbor_links(worktree_path)
+    worktree_manifest = {
+        "artifact_kind": "aoa.local-ai-trial.w4-worktree-manifest",
+        "program_id": PROGRAM_ID,
+        "wave_id": "W4",
+        "case_id": case["case_id"],
+        "created_at": utc_now(),
+        "repo_root": str(repo_root),
+        "worktree_path": str(worktree_path),
+        "base_head": base_head,
+        "execution_mode": case["execution_mode"],
+        "neighbor_links": neighbor_links,
+    }
+    write_json(worktree_manifest_path, worktree_manifest)
+    artifact_refs.append(str(worktree_manifest_path))
+
+    allowed_relative = set(proposal_summary.get("allowed_files") or [])
+    changed_files: list[str] = []
+    acceptance_refs: list[dict[str, Any]] = []
+    failure_class: str | None = None
+    failures: list[str] = []
+    highlights: list[str] = [f"Worktree path: `{worktree_path}`."]
+    if neighbor_links:
+        highlights.append(f"Worktree neighbor links: `{len(neighbor_links)}`.")
+
+    try:
+        if case["execution_mode"] == "qwen_patch":
+            apply_check_raw = git_command(
+                worktree_repo_root,
+                ["apply", "--check", str(proposal_diff_path)],
+                timeout_s=60,
+            )
+            apply_check_ref = persist_command_result(case_root, "worktree-apply-check", apply_check_raw)
+            command_refs.append(apply_check_ref)
+            artifact_refs.extend(
+                [apply_check_ref["stdout_path"], apply_check_ref["stderr_path"], apply_check_ref["command_meta"]]
+            )
+            if apply_check_raw["exit_code"] != 0 or apply_check_raw["timed_out"]:
+                failure_class = "proposal_invalid"
+                failures.append("git apply --check failed in isolated worktree")
+                raise RuntimeError("worktree apply check failed")
+
+            apply_raw = git_command(
+                worktree_repo_root,
+                ["apply", str(proposal_diff_path)],
+                timeout_s=60,
+            )
+            apply_ref = persist_command_result(case_root, "worktree-apply", apply_raw)
+            command_refs.append(apply_ref)
+            artifact_refs.extend(
+                [apply_ref["stdout_path"], apply_ref["stderr_path"], apply_ref["command_meta"]]
+            )
+            if apply_raw["exit_code"] != 0 or apply_raw["timed_out"]:
+                failure_class = "proposal_invalid"
+                failures.append("git apply failed in isolated worktree")
+                raise RuntimeError("worktree apply failed")
+        else:
+            builder_command = case.get("mutation_policy", {}).get("builder_command") or []
+            builder_raw = run_command(builder_command, cwd=worktree_repo_root, timeout_s=600)
+            builder_ref = persist_command_result(case_root, "worktree-builder", builder_raw)
+            command_refs.append(builder_ref)
+            artifact_refs.extend(
+                [builder_ref["stdout_path"], builder_ref["stderr_path"], builder_ref["command_meta"]]
+            )
+            if builder_raw["exit_code"] != 0 or builder_raw["timed_out"]:
+                failure_class = "post_change_validation_failure"
+                failures.append("approved builder command failed inside isolated worktree")
+                raise RuntimeError("builder command failed")
+
+        changed_files = list_changed_files(worktree_repo_root)
+        unauthorized = sorted(item for item in changed_files if item not in allowed_relative)
+        if unauthorized:
+            failure_class = "unauthorized_scope_expansion"
+            failures.append(
+                "changed files outside allowed scope: " + ", ".join(unauthorized)
+            )
+            raise RuntimeError("unauthorized changed files")
+
+        landing_raw = build_landing_diff(worktree_repo_root, diff_path=landing_diff_path)
+        landing_ref = persist_command_result(case_root, "worktree-landing-diff", landing_raw)
+        command_refs.append(landing_ref)
+        artifact_refs.extend(
+            [landing_ref["stdout_path"], landing_ref["stderr_path"], landing_ref["command_meta"], str(landing_diff_path)]
+        )
+
+        acceptance_refs, acceptance_ok = run_acceptance_checks(
+            case_root,
+            repo_root=worktree_repo_root,
+            checks=case.get("acceptance_checks", []),
+            label_prefix="worktree-acceptance",
+        )
+        command_refs.extend(acceptance_refs)
+        for ref in acceptance_refs:
+            artifact_refs.extend([ref["stdout_path"], ref["stderr_path"], ref["command_meta"]])
+        if not acceptance_ok:
+            failure_class = "post_change_validation_failure"
+            failures.append("one or more acceptance checks failed in isolated worktree")
+            raise RuntimeError("worktree acceptance failed")
+
+        ensure_repo_ready_for_w4_case(
+            repo_root,
+            case=case,
+            log_root=log_root,
+            catalog=catalog,
+        )
+        if git_head(repo_root) != base_head:
+            failure_class = "landing_reapply_failure"
+            failures.append("repo HEAD drifted before landing validated diff back to main repo")
+            raise RuntimeError("main repo head drifted")
+
+        landing_diff_text = landing_diff_path.read_text(encoding="utf-8")
+        if landing_diff_text.strip():
+            main_check_raw = git_command(
+                repo_root,
+                ["apply", "--check", str(landing_diff_path)],
+                timeout_s=60,
+            )
+            main_check_ref = persist_command_result(case_root, "landing-apply-check", main_check_raw)
+            command_refs.append(main_check_ref)
+            artifact_refs.extend(
+                [main_check_ref["stdout_path"], main_check_ref["stderr_path"], main_check_ref["command_meta"]]
+            )
+            if main_check_raw["exit_code"] != 0 or main_check_raw["timed_out"]:
+                failure_class = "landing_reapply_failure"
+                failures.append("validated diff could not be applied cleanly back to the main repo")
+                raise RuntimeError("main repo apply check failed")
+
+            main_apply_raw = git_command(
+                repo_root,
+                ["apply", str(landing_diff_path)],
+                timeout_s=60,
+            )
+            main_apply_ref = persist_command_result(case_root, "landing-apply", main_apply_raw)
+            command_refs.append(main_apply_ref)
+            artifact_refs.extend(
+                [main_apply_ref["stdout_path"], main_apply_ref["stderr_path"], main_apply_ref["command_meta"]]
+            )
+            if main_apply_raw["exit_code"] != 0 or main_apply_raw["timed_out"]:
+                failure_class = "landing_reapply_failure"
+                failures.append("validated diff failed during landing apply in the main repo")
+                raise RuntimeError("main repo apply failed")
+
+        main_acceptance_refs, main_acceptance_ok = run_acceptance_checks(
+            case_root,
+            repo_root=repo_root,
+            checks=case.get("acceptance_checks", []),
+            label_prefix="landing-acceptance",
+        )
+        command_refs.extend(main_acceptance_refs)
+        for ref in main_acceptance_refs:
+            artifact_refs.extend([ref["stdout_path"], ref["stderr_path"], ref["command_meta"]])
+        if not main_acceptance_ok:
+            reverse_diff_text = landing_diff_path.read_text(encoding="utf-8")
+            if reverse_diff_text.strip():
+                git_command(repo_root, ["apply", "-R", str(landing_diff_path)], timeout_s=60)
+            failure_class = "post_change_validation_failure"
+            failures.append("one or more acceptance checks failed after landing diff back to the main repo")
+            raise RuntimeError("main repo acceptance failed")
+
+        run_manifest = {
+            "artifact_kind": "aoa.local-ai-trial.run-manifest",
+            "program_id": PROGRAM_ID,
+            "wave_id": "W4",
+            "case_id": case["case_id"],
+            "executed_at": utc_now(),
+            "runtime_selection": case["runtime_selection"],
+            "model": MODEL,
+            "backend": case["execution_mode"],
+            "commands": command_refs,
+            "artifact_refs": artifact_refs,
+            "notes": [
+                "W4 landed only after isolated worktree mutation, scoped diff validation, and repeated acceptance checks in the main repo.",
+            ],
+        }
+        result_summary = build_result_summary(
+            case=case,
+            status="pass",
+            score_breakdown={
+                "proposal_valid": True,
+                "approval_present": True,
+                "unauthorized_scope_expansion": False,
+                "post_change_validation_failure": False,
+            },
+            observed={
+                "highlights": [
+                    *highlights,
+                    f"Changed files: `{json.dumps(changed_files, ensure_ascii=True)}`.",
+                    "All worktree and main-repo acceptance checks passed.",
+                ],
+                "failures": ["None."],
+                "changed_files": changed_files,
+            },
+            failure_class=None,
+            reviewer_notes="The W4 case stayed inside approved scope, passed isolated validation, and landed cleanly back to the main repo.",
+            boundary_notes=w4_boundary_note(),
+            next_action="Review the landed diff and decide whether to approve the next W4 case.",
+        )
+        finalize_case(
+            case=case,
+            log_root=log_root,
+            mirror_root=mirror_root,
+            run_manifest=run_manifest,
+            result_summary=result_summary,
+        )
+    except RuntimeError:
+        w4_failure_summary(
+            case,
+            log_root=log_root,
+            mirror_root=mirror_root,
+            failure_class=failure_class or "proposal_invalid",
+            reviewer_notes="The W4 apply-case did not satisfy the staged bounded-mutation contract.",
+            boundary_notes=w4_boundary_note(),
+            highlights=highlights,
+            failures=failures or ["unknown W4 apply failure"],
+            command_refs=command_refs,
+            artifact_refs=artifact_refs,
+            next_action="Inspect the proposal, worktree artifacts, and acceptance logs before retrying this W4 case.",
+        )
+    finally:
+        remove_raw = remove_temp_worktree(repo_root, worktree_path)
+        remove_ref = persist_command_result(case_root, "worktree-remove", remove_raw)
+        command_refs.append(remove_ref)
+        write_json(
+            worktree_manifest_path,
+            {
+                **worktree_manifest,
+                "removed_at": utc_now(),
+                "remove_exit_code": remove_raw["exit_code"],
+                "remove_timed_out": remove_raw["timed_out"],
+            },
+        )
+
+
+def apply_w4(log_root: Path, mirror_root: Path, case_id: str) -> None:
+    catalog = build_catalog()
+    ensure_w3_gate_passed(log_root)
+    ensure_wave_materialized(log_root, mirror_root, "W4", catalog)
+    run_w4_preflight(log_root)
+    case = next((item for item in catalog["W4"] if item["case_id"] == case_id), None)
+    if case is None:
+        raise RuntimeError(f"unknown W4 case_id: {case_id}")
+    apply_w4_case(case, log_root=log_root, mirror_root=mirror_root)
+    update_w4_index(log_root, mirror_root, catalog)
+
+
+def run_w0(log_root: Path, mirror_root: Path) -> None:
+    materialize_program(log_root, mirror_root, build_catalog())
+
+    up_intel = [absolute(SCRIPTS_ROOT / "aoa-up"), "--preset", "intel-full"]
+    wait_intel = [absolute(SCRIPTS_ROOT / "aoa-wait"), "--preset", "intel-full"]
+    up_baseline = [absolute(SCRIPTS_ROOT / "aoa-up"), "--preset", "intel-full", "--profile", "federation"]
+    wait_baseline = [absolute(SCRIPTS_ROOT / "aoa-wait"), "--preset", "intel-full", "--profile", "federation"]
+    setup_case_root = log_root / "waves" / "W0" / "_setup"
+    setup_case_root.mkdir(parents=True, exist_ok=True)
+    setup_up = persist_command_result(setup_case_root, "intel-up", run_command(up_intel, cwd=CONFIGS_ROOT))
+    setup_wait = persist_command_result(setup_case_root, "intel-wait", run_command(wait_intel, cwd=CONFIGS_ROOT, timeout_s=180))
+
+    if setup_up["exit_code"] != 0 or setup_wait["exit_code"] != 0:
+        raise RuntimeError("intel-full runtime did not come up cleanly before W0")
+
+    # Shared benchmark evidence for case 1 and 2.
+    bench_cmd = [absolute(SCRIPTS_ROOT / "aoa-qwen-bench"), "--preset", "intel-full"]
+    bench_raw = run_command(bench_cmd, cwd=CONFIGS_ROOT, timeout_s=240)
+    bench_dir = parse_bench_run_dir(bench_raw["stdout"])
+    bench_summary = json.loads((bench_dir / "summary.json").read_text(encoding="utf-8"))
+    bench_manifest = json.loads((bench_dir / "benchmark.manifest.json").read_text(encoding="utf-8"))
+    raw_results = json.loads((bench_dir / "raw" / "results.json").read_text(encoding="utf-8"))
+    warmup_results_path = bench_dir / "raw" / "warmup_results.json"
+    warmup_results = (
+        json.loads(warmup_results_path.read_text(encoding="utf-8"))
+        if warmup_results_path.exists()
+        else []
+    )
+    no_5xx_or_timeout = all(
+        bool(row.get("ok"))
+        and row.get("http_status") == 200
+        and row.get("elapsed_s") is not None
+        and "error" not in row
+        for row in [*warmup_results, *raw_results]
+    )
+
+    for case_id, metric_name, budget in [
+        ("warm-exact-reply", "exact-reply", 3.5),
+        ("warm-repo-routing", "repo-routing", 12.0),
+    ]:
+        case = load_case_spec(log_root, "W0", case_id)
+        case_root = case_dir(log_root, "W0", case_id)
+        command_ref = persist_command_result(case_root, "shared-bench", bench_raw)
+        breakdown = bench_summary["case_breakdown"][metric_name]
+        mean_s = breakdown["mean_s"]
+        runs = breakdown["runs"]
+        passed = breakdown["passed"]
+        status = "pass" if passed == runs and mean_s is not None and mean_s <= budget and no_5xx_or_timeout and bench_raw["exit_code"] == 0 else "fail"
+        observed = {
+            "highlights": [
+                f"Shared bench run dir: {bench_dir}",
+                f"{metric_name} passed {passed}/{runs} runs with mean {mean_s}s.",
+                f"Benchmark all_passed={bench_summary['all_passed']}.",
+            ],
+            "failures": [] if status == "pass" else [
+                "Shared benchmark evidence did not satisfy all W0 latency and success requirements."
+            ],
+            "benchmark_summary": bench_summary,
+        }
+        run_manifest = {
+            "artifact_kind": "aoa.local-ai-trial.run-manifest",
+            "program_id": PROGRAM_ID,
+            "wave_id": "W0",
+            "case_id": case_id,
+            "executed_at": utc_now(),
+            "runtime_selection": case["runtime_selection"],
+            "model": MODEL,
+            "backend": "langchain-api -> ollama-native",
+            "commands": [command_ref],
+            "artifact_refs": [
+                str(bench_dir / "benchmark.manifest.json"),
+                str(bench_dir / "summary.json"),
+                str(bench_dir / "raw" / "results.json"),
+                str(bench_dir / "raw" / "warmup_results.json"),
+                str(bench_dir / "notes.md"),
+            ],
+            "latency": {
+                "metric": metric_name,
+                "mean_s": mean_s,
+                "best_s": breakdown["best_s"],
+                "worst_s": breakdown["worst_s"],
+            },
+            "shared_evidence": [str(bench_dir)],
+            "notes": ["This case uses shared bench evidence with the paired W0 run-path case."],
+        }
+        result_summary = build_result_summary(
+            case=case,
+            status=status,
+            score_breakdown={
+                "all_runs_pass": passed == runs,
+                "mean_within_budget": mean_s is not None and mean_s <= budget,
+                "no_timeout_or_5xx": no_5xx_or_timeout,
+            },
+            observed=observed,
+            failure_class=None if status == "pass" else "latency_or_run_path_failure",
+            reviewer_notes=(
+                f"The shared benchmark evidence satisfied the W0 {metric_name} gate."
+                if status == "pass"
+                else f"The shared benchmark evidence did not satisfy the W0 {metric_name} gate."
+            ),
+            boundary_notes=w0_boundary_note(),
+            next_action=(
+                "Use the paired benchmark case and the broader W0 gate to decide whether to proceed to routing trials."
+            ),
+        )
+        finalize_case(case=case, log_root=log_root, mirror_root=mirror_root, run_manifest=run_manifest, result_summary=result_summary)
+
+    single_command_cases = [
+        ("intel-full-smoke-internal", [absolute(SCRIPTS_ROOT / "aoa-smoke"), "--with-internal", "--preset", "intel-full"], 240, "intel_full_smoke_passed"),
+    ]
+    for case_id, command, timeout_s, score_key in single_command_cases:
+        case = load_case_spec(log_root, "W0", case_id)
+        case_root = case_dir(log_root, "W0", case_id)
+        raw = run_command(command, cwd=CONFIGS_ROOT, timeout_s=timeout_s)
+        command_ref = persist_command_result(case_root, "primary", raw)
+        status = "pass" if raw["exit_code"] == 0 and not raw["timed_out"] else "fail"
+        observed = {
+            "highlights": [
+                f"Command exited with code {raw['exit_code']}.",
+                f"Elapsed time: {raw['elapsed_s']}s.",
+            ],
+            "failures": [] if status == "pass" else [f"{command_ref['display']} failed or timed out."],
+        }
+        run_manifest = {
+            "artifact_kind": "aoa.local-ai-trial.run-manifest",
+            "program_id": PROGRAM_ID,
+            "wave_id": "W0",
+            "case_id": case_id,
+            "executed_at": utc_now(),
+            "runtime_selection": case["runtime_selection"],
+            "model": MODEL,
+            "backend": "service-smoke",
+            "commands": [command_ref],
+            "artifact_refs": [command_ref["stdout_path"], command_ref["stderr_path"]],
+            "notes": ["This case checks runtime service health rather than model quality."],
+        }
+        result_summary = build_result_summary(
+            case=case,
+            status=status,
+            score_breakdown={score_key: status == "pass"},
+            observed=observed,
+            failure_class=None if status == "pass" else "service_smoke_failure",
+            reviewer_notes=(
+                "The service-level smoke case passed and did not show runtime-path instability."
+                if status == "pass"
+                else "The service-level smoke case failed and blocks promotion to higher pilot waves."
+            ),
+            boundary_notes=w0_boundary_note(),
+            next_action="Proceed only if the rest of W0 stays green.",
+        )
+        finalize_case(case=case, log_root=log_root, mirror_root=mirror_root, run_manifest=run_manifest, result_summary=result_summary)
+
+    # Bring up federation only after the pure intel-full latency and smoke cases are captured.
+    federation_case = load_case_spec(log_root, "W0", "federation-smoke")
+    federation_case_root = case_dir(log_root, "W0", "federation-smoke")
+    federation_up_raw = run_command([absolute(SCRIPTS_ROOT / "aoa-up"), "--profile", "federation"], cwd=CONFIGS_ROOT, timeout_s=180)
+    federation_up_ref = persist_command_result(federation_case_root, "federation-up", federation_up_raw)
+    federation_wait_raw = run_command([absolute(SCRIPTS_ROOT / "aoa-wait"), "--profile", "federation"], cwd=CONFIGS_ROOT, timeout_s=180)
+    federation_wait_ref = persist_command_result(federation_case_root, "federation-wait", federation_wait_raw)
+    federation_smoke_raw = run_command([absolute(SCRIPTS_ROOT / "aoa-smoke"), "--profile", "federation"], cwd=CONFIGS_ROOT, timeout_s=180)
+    federation_smoke_ref = persist_command_result(federation_case_root, "primary", federation_smoke_raw)
+    federation_status = (
+        "pass"
+        if federation_up_raw["exit_code"] == 0
+        and not federation_up_raw["timed_out"]
+        and federation_wait_raw["exit_code"] == 0
+        and not federation_wait_raw["timed_out"]
+        and federation_smoke_raw["exit_code"] == 0
+        and not federation_smoke_raw["timed_out"]
+        else "fail"
+    )
+    federation_manifest = {
+        "artifact_kind": "aoa.local-ai-trial.run-manifest",
+        "program_id": PROGRAM_ID,
+        "wave_id": "W0",
+        "case_id": federation_case["case_id"],
+        "executed_at": utc_now(),
+        "runtime_selection": federation_case["runtime_selection"],
+        "model": MODEL,
+        "backend": "route-api",
+        "commands": [federation_up_ref, federation_wait_ref, federation_smoke_ref],
+        "artifact_refs": [
+            federation_up_ref["stdout_path"],
+            federation_up_ref["stderr_path"],
+            federation_wait_ref["stdout_path"],
+            federation_wait_ref["stderr_path"],
+            federation_smoke_ref["stdout_path"],
+            federation_smoke_ref["stderr_path"],
+        ],
+        "notes": [
+            "Federation is brought up after the pure intel-full latency and smoke cases.",
+            "This keeps the latency cases aligned with their frozen runtime selection.",
+        ],
+    }
+    federation_summary = build_result_summary(
+        case=federation_case,
+        status=federation_status,
+        score_breakdown={
+            "federation_up_passed": federation_up_raw["exit_code"] == 0 and not federation_up_raw["timed_out"],
+            "federation_wait_passed": federation_wait_raw["exit_code"] == 0 and not federation_wait_raw["timed_out"],
+            "federation_smoke_passed": federation_smoke_raw["exit_code"] == 0 and not federation_smoke_raw["timed_out"],
+        },
+        observed={
+            "highlights": [
+                f"Federation bring-up command exited with code {federation_up_raw['exit_code']}.",
+                f"Federation wait command exited with code {federation_wait_raw['exit_code']}.",
+                f"Federation smoke command exited with code {federation_smoke_raw['exit_code']}.",
+            ],
+            "failures": [] if federation_status == "pass" else ["Federation bring-up or federation smoke failed."],
+        },
+        failure_class=None if federation_status == "pass" else "service_smoke_failure",
+        reviewer_notes=(
+            "The federation-only runtime surface came up cleanly and passed smoke."
+            if federation_status == "pass"
+            else "The federation-only runtime surface did not come up cleanly."
+        ),
+        boundary_notes=w0_boundary_note(),
+        next_action="Proceed to the combined restart case only if federation stays healthy.",
+    )
+    finalize_case(
+        case=federation_case,
+        log_root=log_root,
+        mirror_root=mirror_root,
+        run_manifest=federation_manifest,
+        result_summary=federation_summary,
+    )
+
+    # Cold restart recovery.
+    case = load_case_spec(log_root, "W0", "cold-restart-recovery")
+    case_root = case_dir(log_root, "W0", "cold-restart-recovery")
+    cold_steps = [
+        ("down", [absolute(SCRIPTS_ROOT / "aoa-down"), "--preset", "intel-full", "--profile", "federation"], 180),
+        ("up", [absolute(SCRIPTS_ROOT / "aoa-up"), "--preset", "intel-full", "--profile", "federation"], 240),
+        ("wait", [absolute(SCRIPTS_ROOT / "aoa-wait"), "--preset", "intel-full", "--profile", "federation"], 240),
+        ("smoke", [absolute(SCRIPTS_ROOT / "aoa-smoke"), "--with-internal", "--preset", "intel-full", "--profile", "federation"], 240),
+    ]
+    cold_refs: list[dict[str, Any]] = []
+    cold_all_ok = True
+    for label, command, timeout_s in cold_steps:
+        raw = run_command(command, cwd=CONFIGS_ROOT, timeout_s=timeout_s)
+        ref = persist_command_result(case_root, label, raw)
+        cold_refs.append(ref)
+        if raw["exit_code"] != 0 or raw["timed_out"]:
+            cold_all_ok = False
+            break
+    cold_status = "pass" if cold_all_ok else "fail"
+    cold_manifest = {
+        "artifact_kind": "aoa.local-ai-trial.run-manifest",
+        "program_id": PROGRAM_ID,
+        "wave_id": "W0",
+        "case_id": case["case_id"],
+        "executed_at": utc_now(),
+        "runtime_selection": {"preset": "intel-full", "profile": "federation", "path": "service-restart"},
+        "model": MODEL,
+        "backend": "compose restart + smoke",
+        "commands": cold_refs,
+        "artifact_refs": [item["stdout_path"] for item in cold_refs] + [item["stderr_path"] for item in cold_refs],
+        "notes": ["This is the disruptive W0 recovery case. It restores the Intel + federation selection."],
+    }
+    cold_summary = build_result_summary(
+        case=case,
+        status=cold_status,
+        score_breakdown={"all_restart_steps_passed": cold_all_ok},
+        observed={
+            "highlights": [
+                f"Recorded {len(cold_refs)} restart steps.",
+                "The final smoke step is included in the same recovery sequence.",
+            ],
+            "failures": [] if cold_status == "pass" else ["One or more restart sequence steps failed or timed out."],
+        },
+        failure_class=None if cold_status == "pass" else "restart_recovery_failure",
+        reviewer_notes=(
+            "The runtime recovered cleanly from a full local restart."
+            if cold_status == "pass"
+            else "The runtime did not recover cleanly from a full local restart."
+        ),
+        boundary_notes=w0_boundary_note(),
+        next_action="If this case fails, hold the pilot at W0 and remediate recovery posture first.",
+    )
+    finalize_case(case=case, log_root=log_root, mirror_root=mirror_root, run_manifest=cold_manifest, result_summary=cold_summary)
+
+    # Agent-full parity sample, then restore baseline.
+    case = load_case_spec(log_root, "W0", "agent-full-parity-sample")
+    case_root = case_dir(log_root, "W0", "agent-full-parity-sample")
+    parity_steps = [
+        ("up", [absolute(SCRIPTS_ROOT / "aoa-up"), "--preset", "agent-full"], 240),
+        ("wait", [absolute(SCRIPTS_ROOT / "aoa-wait"), "--preset", "agent-full"], 240),
+        ("smoke", [absolute(SCRIPTS_ROOT / "aoa-smoke"), "--preset", "agent-full"], 240),
+        ("exact-reply", [absolute(SCRIPTS_ROOT / "aoa-qwen-check"), "--case", "exact-reply", "--json"], 120),
+    ]
+    parity_refs: list[dict[str, Any]] = []
+    parity_ok = True
+    exact_reply_payload: dict[str, Any] | None = None
+    for label, command, timeout_s in parity_steps:
+        raw = run_command(command, cwd=CONFIGS_ROOT, timeout_s=timeout_s)
+        ref = persist_command_result(case_root, label, raw)
+        parity_refs.append(ref)
+        if label == "exact-reply" and raw["stdout"].strip():
+            try:
+                exact_reply_payload = json.loads(raw["stdout"])
+            except json.JSONDecodeError:
+                exact_reply_payload = {"ok": False, "error": "invalid_json"}
+        if raw["exit_code"] != 0 or raw["timed_out"]:
+            parity_ok = False
+            break
+    # Restore baseline after parity sampling.
+    restore_case_root = log_root / "waves" / "W0" / "_restore"
+    restore_case_root.mkdir(parents=True, exist_ok=True)
+    restore_up = persist_command_result(restore_case_root, "restore-up", run_command(up_baseline, cwd=CONFIGS_ROOT, timeout_s=240))
+    restore_wait = persist_command_result(restore_case_root, "restore-wait", run_command(wait_baseline, cwd=CONFIGS_ROOT, timeout_s=240))
+    if restore_up["exit_code"] != 0 or restore_wait["exit_code"] != 0:
+        parity_ok = False
+
+    parity_status = "pass" if parity_ok and exact_reply_payload and exact_reply_payload.get("ok") else "fail"
+    parity_manifest = {
+        "artifact_kind": "aoa.local-ai-trial.run-manifest",
+        "program_id": PROGRAM_ID,
+        "wave_id": "W0",
+        "case_id": case["case_id"],
+        "executed_at": utc_now(),
+        "runtime_selection": {"preset": "agent-full", "profile": None, "path": "langchain-api:/run"},
+        "model": MODEL,
+        "backend": "agent-full parity sample",
+        "commands": parity_refs,
+        "artifact_refs": [item["stdout_path"] for item in parity_refs] + [item["stderr_path"] for item in parity_refs],
+        "notes": [
+            "This is a parity sample only.",
+            "The baseline Intel + federation runtime is restored immediately after this case.",
+        ],
+    }
+    parity_summary = build_result_summary(
+        case=case,
+        status=parity_status,
+        score_breakdown={
+            "agent_full_smoke_passed": parity_ok,
+            "agent_full_exact_reply_passed": bool(exact_reply_payload and exact_reply_payload.get("ok")),
+            "baseline_restored": restore_up["exit_code"] == 0 and restore_wait["exit_code"] == 0,
+        },
+        observed={
+            "highlights": [
+                "One parity sample was taken on `agent-full`.",
+                f"Exact-reply payload: {json.dumps(exact_reply_payload, ensure_ascii=True) if exact_reply_payload else 'none'}",
+            ],
+            "failures": [] if parity_status == "pass" else ["The agent-full parity sample or baseline restoration failed."],
+        },
+        failure_class=None if parity_status == "pass" else "parity_sample_failure",
+        reviewer_notes=(
+            "The parity sample passed and did not show `agent-full` outperforming the Intel baseline on stability."
+            if parity_status == "pass"
+            else "The parity sample failed or baseline restoration failed."
+        ),
+        boundary_notes=w0_boundary_note(),
+        next_action="If this case passes, W0 can gate on the shared runtime metrics plus the service and restart cases.",
+    )
+    finalize_case(case=case, log_root=log_root, mirror_root=mirror_root, run_manifest=parity_manifest, result_summary=parity_summary)
+
+    # Final W0 indexes.
+    results: list[dict[str, Any]] = []
+    for item in build_catalog()["W0"]:
+        result_path = case_dir(log_root, "W0", item["case_id"]) / "result.summary.json"
+        results.append(json.loads(result_path.read_text(encoding="utf-8")))
+
+    exact_result = next(result for result in results if result["case_id"] == "warm-exact-reply")
+    repo_result = next(result for result in results if result["case_id"] == "warm-repo-routing")
+    exact_manifest = json.loads((case_dir(log_root, "W0", "warm-exact-reply") / "run.manifest.json").read_text(encoding="utf-8"))
+    repo_manifest = json.loads((case_dir(log_root, "W0", "warm-repo-routing") / "run.manifest.json").read_text(encoding="utf-8"))
+
+    exact_mean = exact_manifest["latency"]["mean_s"]
+    repo_mean = repo_manifest["latency"]["mean_s"]
+
+    all_pass = all(result["status"] == "pass" for result in results)
+    gate_detail = {
+        "all_cases_passed": all_pass,
+        "exact_reply_mean_s": exact_mean,
+        "repo_routing_mean_s": repo_mean,
+        "exact_reply_budget_s": 3.5,
+        "repo_routing_budget_s": 12.0,
+        "no_timeout_or_5xx_shared_bench": exact_result["score_breakdown"]["no_timeout_or_5xx"] and repo_result["score_breakdown"]["no_timeout_or_5xx"],
+        "intel_not_worse_than_agent_full_by_stability": all(
+            result["status"] == "pass"
+            for result in results
+            if result["case_id"] in {"intel-full-smoke-internal", "agent-full-parity-sample"}
+        ),
+    }
+    gate_pass = (
+        gate_detail["all_cases_passed"]
+        and exact_mean is not None
+        and exact_mean <= 3.5
+        and repo_mean is not None
+        and repo_mean <= 12.0
+        and gate_detail["no_timeout_or_5xx_shared_bench"]
+        and gate_detail["intel_not_worse_than_agent_full_by_stability"]
+    )
+
+    index_payload = {
+        "artifact_kind": "aoa.local-ai-trial.wave-index",
+        "program_id": PROGRAM_ID,
+        "wave_id": "W0",
+        "wave_title": WAVE_METADATA["W0"]["title"],
+        "wave_summary": WAVE_METADATA["W0"]["summary"],
+        "case_count": len(results),
+        "status_counts": {
+            "pass": sum(1 for result in results if result["status"] == "pass"),
+            "fail": sum(1 for result in results if result["status"] == "fail"),
+            "planned": 0,
+        },
+        "gate_result": "pass" if gate_pass else "fail",
+        "next_action": (
+            "Proceed to W1 routing and ownership under the same per-case reporting contract."
+            if gate_pass
+            else "Stop the pilot here and form a remediation sub-plan before any higher wave."
+        ),
+        "cases": [
+            {
+                "case_id": item["case_id"],
+                "status": next(result["status"] for result in results if result["case_id"] == item["case_id"]),
+                "repo_scope": item["repo_scope"],
+                "task_family": item["task_family"],
+                "case_spec": str(case_dir(log_root, "W0", item["case_id"]) / "case.spec.json"),
+                "report_md": str(mirror_root / case_report_name("W0", item["case_id"])),
+                "summary": item["title"],
+            }
+            for item in build_catalog()["W0"]
+        ],
+        "gate_detail": gate_detail,
+    }
+    index_base = wave_index_name("W0")
+    write_json(log_root / f"{index_base}.json", index_payload)
+    index_md = render_wave_index_md(index_payload)
+    write_text(log_root / f"{index_base}.md", index_md)
+    write_text(mirror_root / f"{index_base}.md", index_md)
+
+
+def build_parser() -> argparse.ArgumentParser:
+    parser = argparse.ArgumentParser(description="Materialize and run the supervised local Qwen pilot.")
+    parser.add_argument("--log-root", default=str(LOG_ROOT_DEFAULT))
+    parser.add_argument("--mirror-root", default=str(MIRROR_ROOT_DEFAULT))
+    sub = parser.add_subparsers(dest="command", required=True)
+
+    sub.add_parser("materialize", help="Materialize contracts, case specs, and planned wave indexes.")
+    run_wave = sub.add_parser("run-wave", help="Run a wave that already has materialized case specs.")
+    run_wave.add_argument("wave_id", choices=sorted(WAVE_METADATA))
+    refresh_wave_parser = sub.add_parser(
+        "refresh-wave",
+        help="Regenerate Markdown reports and wave index Markdown from stored JSON artifacts.",
+    )
+    refresh_wave_parser.add_argument("wave_id", choices=sorted(WAVE_METADATA))
+    prepare_wave = sub.add_parser(
+        "prepare-wave",
+        help="Prepare a staged wave without mutating target repos.",
+    )
+    prepare_wave.add_argument("wave_id", choices=["W4"])
+    prepare_wave.add_argument("--lane", choices=["docs", "generated", "all"], default="all")
+
+    apply_case = sub.add_parser(
+        "apply-case",
+        help="Apply one approved staged case through isolated worktree validation.",
+    )
+    apply_case.add_argument("wave_id", choices=["W4"])
+    apply_case.add_argument("case_id")
+    return parser
+
+
+def main() -> int:
+    parser = build_parser()
+    args = parser.parse_args()
+
+    log_root = Path(args.log_root)
+    mirror_root = Path(args.mirror_root)
+    catalog = build_catalog()
+
+    if args.command == "materialize":
+        materialize_program(log_root, mirror_root, catalog)
+        print(f"materialized {PROGRAM_ID} at {log_root}")
+        return 0
+
+    if args.command == "run-wave":
+        if args.wave_id == "W0":
+            run_w0(log_root, mirror_root)
+            print(f"executed {PROGRAM_ID} {args.wave_id} at {log_root}")
+            return 0
+        if args.wave_id == "W1":
+            run_w1(log_root, mirror_root)
+            print(f"executed {PROGRAM_ID} {args.wave_id} at {log_root}")
+            return 0
+        if args.wave_id == "W2":
+            run_w2(log_root, mirror_root)
+            print(f"executed {PROGRAM_ID} {args.wave_id} at {log_root}")
+            return 0
+        if args.wave_id == "W3":
+            run_w3(log_root, mirror_root)
+            print(f"executed {PROGRAM_ID} {args.wave_id} at {log_root}")
+            return 0
+        if args.wave_id != "W4":
+            parser.error(f"unsupported wave_id for run-wave: {args.wave_id}")
+            return 2
+        if args.wave_id == "W4":
+            materialize_program(log_root, mirror_root, catalog)
+            print(
+                f"{args.wave_id} specs and indexes are materialized. "
+                "Use `prepare-wave W4 --lane ...` and `apply-case W4 <case_id>` for the staged supervised-edit flow."
+            )
+            return 0
+
+    if args.command == "prepare-wave":
+        if args.wave_id != "W4":
+            parser.error(f"unsupported wave_id for prepare-wave: {args.wave_id}")
+            return 2
+        prepare_w4(log_root, mirror_root, args.lane)
+        print(
+            f"prepared {PROGRAM_ID} {args.wave_id} lane={args.lane} proposals at {log_root}"
+        )
+        return 0
+
+    if args.command == "apply-case":
+        if args.wave_id != "W4":
+            parser.error(f"unsupported wave_id for apply-case: {args.wave_id}")
+            return 2
+        apply_w4(log_root, mirror_root, args.case_id)
+        print(f"applied {PROGRAM_ID} {args.wave_id} case={args.case_id} at {log_root}")
+        return 0
+
+    if args.command == "refresh-wave":
+        if not log_root.exists():
+            materialize_program(log_root, mirror_root, catalog)
+        refresh_wave(log_root, mirror_root, args.wave_id)
+        print(f"refreshed {PROGRAM_ID} {args.wave_id} markdown artifacts")
+        return 0
+
+    parser.error("unsupported command")
+    return 2
+
+
+if __name__ == "__main__":
+    raise SystemExit(main())
diff --git a/scripts/aoa-machine-fit b/scripts/aoa-machine-fit
new file mode 100755
index 0000000..33a95e7
--- /dev/null
+++ b/scripts/aoa-machine-fit
@@ -0,0 +1,627 @@
+#!/usr/bin/env python3
+from __future__ import annotations
+
+import argparse
+import json
+import os
+import platform
+import re
+import shutil
+import subprocess
+import sys
+from datetime import datetime, timezone
+from pathlib import Path
+from typing import Any
+
+SCRIPT_ROOT = Path(__file__).resolve().parents[1]
+DEFAULT_STACK_ROOT = Path(os.environ.get("AOA_STACK_ROOT", "/srv/abyss-stack"))
+DEFAULT_CONFIGS_ROOT = Path(
+    os.environ.get("AOA_CONFIGS_ROOT", str(DEFAULT_STACK_ROOT / "Configs"))
+)
+DEFAULT_MACHINE_FIT_ROOT = DEFAULT_STACK_ROOT / "Logs" / "machine-fit"
+DEFAULT_PACKAGE_NAMES = [
+    "kernel-core",
+    "linux-firmware",
+    "fwupd",
+    "podman",
+    "podman-compose",
+    "mesa-dri-drivers",
+    "mesa-vulkan-drivers",
+    "intel-media-driver",
+    "libva-intel-media-driver",
+    "intel-compute-runtime",
+]
+
+
+def run_command(*args: str) -> tuple[int, str, str]:
+    try:
+        completed = subprocess.run(
+            list(args),
+            check=False,
+            capture_output=True,
+            text=True,
+        )
+    except FileNotFoundError:
+        return 127, "", ""
+    return completed.returncode, completed.stdout.strip(), completed.stderr.strip()
+
+
+def read_json(path: Path) -> dict[str, Any] | None:
+    if not path.exists():
+        return None
+    try:
+        return json.loads(path.read_text(encoding="utf-8"))
+    except (OSError, json.JSONDecodeError):
+        return None
+
+
+def read_os_release() -> dict[str, str]:
+    data: dict[str, str] = {}
+    path = Path("/etc/os-release")
+    if not path.exists():
+        return data
+    for raw_line in path.read_text(encoding="utf-8").splitlines():
+        line = raw_line.strip()
+        if not line or line.startswith("#") or "=" not in line:
+            continue
+        key, value = line.split("=", 1)
+        data[key] = value.strip().strip('"')
+    return data
+
+
+def parse_lscpu() -> dict[str, str]:
+    returncode, output, _ = run_command("lscpu")
+    if returncode != 0:
+        return {}
+    data: dict[str, str] = {}
+    for line in output.splitlines():
+        if ":" not in line:
+            continue
+        key, value = line.split(":", 1)
+        data[key.strip()] = value.strip()
+    return data
+
+
+def int_or_none(value: str | None) -> int | None:
+    if value is None or value == "":
+        return None
+    try:
+        return int(value)
+    except ValueError:
+        try:
+            return int(float(value))
+        except ValueError:
+            return None
+
+
+def read_meminfo() -> dict[str, int | None]:
+    result = {
+        "MemTotal": None,
+        "MemAvailable": None,
+        "SwapTotal": None,
+    }
+    path = Path("/proc/meminfo")
+    if not path.exists():
+        return result
+
+    for line in path.read_text(encoding="utf-8").splitlines():
+        parts = line.split()
+        if len(parts) < 2:
+            continue
+        key = parts[0].rstrip(":")
+        if key in result:
+            value = int_or_none(parts[1])
+            result[key] = None if value is None else value * 1024
+    return result
+
+
+def read_loadavg() -> tuple[float | None, float | None, float | None]:
+    path = Path("/proc/loadavg")
+    if not path.exists():
+        return None, None, None
+    try:
+        raw = path.read_text(encoding="utf-8").strip().split()
+    except OSError:
+        return None, None, None
+    if len(raw) < 3:
+        return None, None, None
+    try:
+        return float(raw[0]), float(raw[1]), float(raw[2])
+    except ValueError:
+        return None, None, None
+
+
+def read_drm_nodes() -> dict[str, Any]:
+    dri = Path("/dev/dri")
+    accel = Path("/dev/accel")
+    render_nodes: list[str] = []
+    accel_nodes: list[str] = []
+    if dri.exists():
+        render_nodes = sorted(path.name for path in dri.glob("renderD*"))
+    if accel.exists():
+        accel_nodes = sorted(path.name for path in accel.glob("accel*"))
+    return {
+        "dev_dri_present": dri.exists(),
+        "render_nodes": render_nodes,
+        "dev_accel_present": accel.exists(),
+        "accel_nodes": accel_nodes,
+    }
+
+
+def read_loaded_modules() -> list[str]:
+    returncode, output, _ = run_command("lsmod")
+    if returncode != 0:
+        return []
+    modules: list[str] = []
+    for index, line in enumerate(output.splitlines()):
+        if index == 0:
+            continue
+        parts = line.split()
+        if parts:
+            modules.append(parts[0])
+    return modules
+
+
+def slugify(text: str) -> str:
+    value = re.sub(r"[^a-z0-9]+", "-", text.strip().lower())
+    value = value.strip("-")
+    return value or "unknown-machine"
+
+
+def detect_hardware_class(cpu_model: str | None, intel_drm_present: bool) -> str | None:
+    if not cpu_model:
+        return None
+    normalized = cpu_model.strip()
+    if intel_drm_present and "intel" not in normalized.lower():
+        normalized = f"intel {normalized}"
+    normalized = normalized.replace("(R)", "").replace("(TM)", "")
+    return slugify(normalized)
+
+
+def parse_pci_devices() -> tuple[list[dict[str, Any]], list[dict[str, Any]]]:
+    returncode, output, _ = run_command("lspci", "-nnk")
+    if returncode != 0:
+        return [], []
+
+    blocks: list[list[str]] = []
+    current: list[str] = []
+    for raw_line in output.splitlines():
+        if not raw_line.strip():
+            continue
+        if re.match(r"^[0-9a-fA-F]{2}:[0-9a-fA-F]{2}\.[0-9a-fA-F]\s", raw_line):
+            if current:
+                blocks.append(current)
+            current = [raw_line.rstrip()]
+        else:
+            current.append(raw_line.rstrip())
+    if current:
+        blocks.append(current)
+
+    display_devices: list[dict[str, Any]] = []
+    ai_devices: list[dict[str, Any]] = []
+
+    for block in blocks:
+        lines = [line.rstrip() for line in block if line.strip()]
+        if not lines:
+            continue
+        header = lines[0]
+        lower = header.lower()
+        if "[8086:" not in lower:
+            continue
+
+        driver_in_use = None
+        kernel_modules: list[str] = []
+        for line in lines[1:]:
+            stripped = line.strip()
+            if stripped.lower().startswith("kernel driver in use:"):
+                driver_in_use = stripped.split(":", 1)[1].strip()
+            elif stripped.lower().startswith("kernel modules:"):
+                raw_modules = stripped.split(":", 1)[1].strip()
+                kernel_modules = [item.strip() for item in raw_modules.split(",") if item.strip()]
+
+        device = {
+            "header": header,
+            "driver_in_use": driver_in_use,
+            "kernel_modules": kernel_modules,
+        }
+
+        if "vga compatible controller" in lower or "display controller" in lower:
+            display_devices.append(device)
+            continue
+        if (
+            "neural accelerator" in lower
+            or "gaussian" in lower
+            or "processing accelerators" in lower
+            or "vpu" in lower
+        ):
+            ai_devices.append(device)
+    return display_devices, ai_devices
+
+
+def parse_group_membership() -> dict[str, bool]:
+    returncode, output, _ = run_command("id", "-nG")
+    groups = set(output.split()) if returncode == 0 else set()
+    return {
+        "in_render_group": "render" in groups,
+        "in_video_group": "video" in groups,
+    }
+
+
+def parse_overlays(value: str) -> list[str]:
+    if not value.strip():
+        return []
+    items = re.split(r"[,:\n]+", value.strip())
+    return [item.strip() for item in items if item.strip()]
+
+
+def package_record(name: str) -> dict[str, Any]:
+    returncode, output, _ = run_command(
+        "rpm",
+        "-q",
+        "--qf",
+        "%{NAME} %{VERSION}-%{RELEASE}.%{ARCH}\n",
+        name,
+    )
+    if returncode != 0:
+        return {
+            "name": name,
+            "installed": False,
+            "version": None,
+        }
+
+    running_kernel = platform.release()
+    entries: list[str] = []
+    for line in output.splitlines():
+        parts = line.strip().split(maxsplit=1)
+        if len(parts) != 2:
+            continue
+        entries.append(parts[1])
+
+    version = None
+    if name == "kernel-core":
+        target = running_kernel
+        for entry in entries:
+            if entry == target:
+                version = entry
+                break
+    if version is None:
+        for suffix in [".x86_64", ".noarch", ".aarch64"]:
+            preferred = [entry for entry in entries if entry.endswith(suffix)]
+            if preferred:
+                version = preferred[0]
+                break
+    if version is None and entries:
+        version = entries[0]
+    return {
+        "name": name,
+        "installed": True,
+        "version": version,
+    }
+
+
+def check_package_freshness(installed_names: list[str]) -> tuple[str, list[str], str | None]:
+    if not installed_names:
+        return "unknown", [], None
+    if shutil.which("dnf") is None:
+        return "unknown", [], None
+
+    command = ["dnf", "-q", "check-update", *installed_names]
+    returncode, output, stderr = run_command(*command)
+    updates: list[str] = []
+    if returncode == 0:
+        return "up-to-date", updates, " ".join(command)
+    if returncode == 100:
+        for line in output.splitlines():
+            stripped = line.strip()
+            if not stripped or stripped.startswith("Last metadata expiration check"):
+                continue
+            if stripped.startswith("Obsoleting Packages"):
+                continue
+            parts = stripped.split()
+            if len(parts) < 3:
+                continue
+            name_arch = parts[0]
+            name = re.sub(r"\.[^.]+$", "", name_arch)
+            updates.append(name)
+        return "updates-available", sorted(set(updates)), " ".join(command)
+    if stderr:
+        return "unknown", [], " ".join(command)
+    return "unknown", [], " ".join(command)
+
+
+def load_profile_names(preset_name: str) -> list[str]:
+    preset_path = DEFAULT_CONFIGS_ROOT / "compose" / "presets" / f"{preset_name}.txt"
+    if not preset_path.exists():
+        return []
+    names: list[str] = []
+    for raw in preset_path.read_text(encoding="utf-8").splitlines():
+        line = raw.split("#", 1)[0].strip()
+        if line:
+            names.append(line)
+    return names
+
+
+def default_ref(mode: str, relative_path: str) -> str | None:
+    private_path = DEFAULT_STACK_ROOT / relative_path
+    if mode == "private" and private_path.exists():
+        return f"local:{private_path}"
+    return None
+
+
+def public_ref(relative_path: str) -> str | None:
+    repo_path = SCRIPT_ROOT / relative_path
+    if repo_path.exists():
+        return f"repo:{relative_path}"
+    return None
+
+
+def build_parser() -> argparse.ArgumentParser:
+    parser = argparse.ArgumentParser(
+        description="Capture a bounded machine-fit assessment for abyss-stack."
+    )
+    parser.add_argument("--mode", choices=["public", "private"], default="private")
+    parser.add_argument("--write", help="Optional output path. Defaults to stdout.")
+    parser.add_argument("--assessment-id", help="Optional explicit assessment id.")
+    parser.add_argument(
+        "--noise-threshold-ratio",
+        type=float,
+        default=0.50,
+        help="Warn when 1m loadavg exceeds this fraction of logical CPUs.",
+    )
+    parser.add_argument(
+        "--min-available-memory-bytes",
+        type=int,
+        default=8 * 1024 * 1024 * 1024,
+        help="Warn when available memory drops below this floor.",
+    )
+    parser.add_argument(
+        "--package",
+        action="append",
+        default=[],
+        help="Extra package to inspect for installed version and freshness.",
+    )
+    parser.add_argument("--host-facts-ref", default=None)
+    parser.add_argument("--platform-adaptation-ref", default=None)
+    parser.add_argument("--evidence-ref", action="append", default=[])
+    return parser
+
+
+def main() -> int:
+    parser = build_parser()
+    args = parser.parse_args()
+
+    now = datetime.now(timezone.utc)
+    captured_at = now.replace(microsecond=0).isoformat().replace("+00:00", "Z")
+    timestamp = now.strftime("%Y-%m-%dT%H%M%SZ")
+
+    os_release = read_os_release()
+    lscpu = parse_lscpu()
+    meminfo = read_meminfo()
+    load_1m, load_5m, load_15m = read_loadavg()
+    drm_nodes = read_drm_nodes()
+    loaded_modules = read_loaded_modules()
+    display_devices, ai_devices = parse_pci_devices()
+    group_membership = parse_group_membership()
+
+    cpu_model = lscpu.get("Model name")
+    logical_cpus = int_or_none(lscpu.get("CPU(s)"))
+    hardware_class = detect_hardware_class(cpu_model, drm_nodes["dev_dri_present"])
+    assessment_id = args.assessment_id or f"{timestamp}__machine-fit__{hardware_class or 'unknown-host'}"
+
+    package_names = sorted(set(DEFAULT_PACKAGE_NAMES + args.package))
+    packages = [package_record(name) for name in package_names]
+    installed_names = [record["name"] for record in packages if record["installed"]]
+    freshness_state, updates_available, freshness_command = check_package_freshness(installed_names)
+    missing_packages = [record["name"] for record in packages if not record["installed"]]
+
+    preferred_preset = "intel-full" if drm_nodes["dev_dri_present"] else "agent-full"
+    preferred_profiles = load_profile_names(preferred_preset)
+    current_overlays = parse_overlays(os.environ.get("AOA_EXTRA_COMPOSE_FILES", ""))
+    if drm_nodes["dev_dri_present"] and ai_devices:
+        validated_acceleration_posture = (
+            "OVMS embeddings on Intel GPU; Qwen chat via Ollama; Intel NPU is visible but not yet part of the validated canonical path."
+        )
+    elif drm_nodes["dev_dri_present"]:
+        validated_acceleration_posture = (
+            "Intel GPU is available for OVMS-side acceleration; Qwen chat remains on Ollama."
+        )
+    else:
+        validated_acceleration_posture = (
+            "No Intel accelerator path is assumed; use the generic local inference posture."
+        )
+
+    latest_adaptation = DEFAULT_STACK_ROOT / "Logs" / "platform-adaptations" / "latest" / "latest.private.json"
+    adaptation_record = read_json(latest_adaptation)
+    validated_settings: dict[str, str] = {}
+    if adaptation_record:
+        adaptation_hardware_class = (
+            adaptation_record.get("platform_scope", {}).get("hardware_class")
+            if isinstance(adaptation_record.get("platform_scope"), dict)
+            else None
+        )
+        raw_settings = (
+            adaptation_record.get("adaptation", {}).get("settings", {})
+            if isinstance(adaptation_record.get("adaptation"), dict)
+            else {}
+        )
+        if isinstance(raw_settings, dict) and (
+            adaptation_hardware_class is None or adaptation_hardware_class == hardware_class
+        ):
+            validated_settings = {
+                str(key): str(value)
+                for key, value in raw_settings.items()
+                if value is not None
+            }
+
+    host_facts_ref = args.host_facts_ref
+    if host_facts_ref is None:
+        host_facts_ref = (
+            default_ref(args.mode, "Logs/host-facts/latest.private.json")
+            if args.mode == "private"
+            else public_ref("docs/reference-platform/reference-host.public.json")
+            or public_ref("docs/reference-platform/reference-host.public.json.example")
+        )
+
+    platform_adaptation_ref = args.platform_adaptation_ref
+    if platform_adaptation_ref is None:
+        platform_adaptation_ref = (
+            default_ref(args.mode, "Logs/platform-adaptations/latest/latest.private.json")
+            if args.mode == "private"
+            else public_ref("docs/platform-adaptations/platform-adaptation.public.json.example")
+        )
+
+    envelope_notes: list[str] = []
+    latency_trial_ready = True
+    if logical_cpus and load_1m is not None and load_1m > logical_cpus * args.noise_threshold_ratio:
+        latency_trial_ready = False
+        envelope_notes.append(
+            f"1m loadavg {load_1m:.2f} is above the configured noise threshold for {logical_cpus} logical CPUs."
+        )
+    available_memory = meminfo.get("MemAvailable")
+    if (
+        available_memory is not None
+        and available_memory < args.min_available_memory_bytes
+    ):
+        latency_trial_ready = False
+        envelope_notes.append(
+            "Available memory is below the configured latency-trial floor."
+        )
+    if not group_membership["in_render_group"] and drm_nodes["dev_dri_present"]:
+        envelope_notes.append(
+            "Current user is not in the render group; Intel accelerator containers may need extra attention."
+        )
+    if ai_devices and not drm_nodes["dev_accel_present"]:
+        envelope_notes.append(
+            "Intel AI accelerator PCI device is present but /dev/accel is not exposed on the host."
+        )
+
+    status = "qualified"
+    if not drm_nodes["dev_dri_present"] and preferred_preset == "intel-full":
+        status = "needs-attention"
+    elif freshness_state == "updates-available":
+        status = "needs-attention"
+    elif not latency_trial_ready:
+        status = "qualified-noisy-host"
+
+    summary_parts = [
+        f"Preferred preset is {preferred_preset}.",
+        "Qwen chat should stay on langchain-api /run through the validated local path.",
+    ]
+    if freshness_state == "up-to-date":
+        summary_parts.append("Relevant host packages are current in the configured Fedora repositories.")
+    elif freshness_state == "updates-available":
+        summary_parts.append("Relevant host packages have updates pending in the configured Fedora repositories.")
+    else:
+        summary_parts.append("Package freshness could not be confirmed from the current package-manager context.")
+    if not latency_trial_ready:
+        summary_parts.append("Current host load is noisy for latency-sensitive trials.")
+
+    record = {
+        "artifact_kind": "aoa.machine-fit",
+        "schema_version": "1",
+        "capture_mode": args.mode,
+        "captured_at": captured_at,
+        "captured_by": "scripts/aoa-machine-fit",
+        "assessment_id": assessment_id,
+        "machine": {
+            "os_id": os_release.get("ID"),
+            "os_version_id": os_release.get("VERSION_ID"),
+            "kernel_release": platform.release(),
+            "arch": platform.machine() or None,
+            "cpu_model": cpu_model,
+            "logical_cpus": logical_cpus,
+            "memory_total_bytes": meminfo.get("MemTotal"),
+            "hardware_class": hardware_class,
+        },
+        "driver_posture": {
+            "kernel_modules_loaded": [
+                module
+                for module in ["i915", "xe", "intel_vpu"]
+                if module in loaded_modules
+            ],
+            "dri": {
+                "dev_dri_present": drm_nodes["dev_dri_present"],
+                "render_nodes": drm_nodes["render_nodes"],
+                "current_user_in_render_group": group_membership["in_render_group"],
+                "current_user_in_video_group": group_membership["in_video_group"],
+            },
+            "accel": {
+                "dev_accel_present": drm_nodes["dev_accel_present"],
+                "accel_nodes": drm_nodes["accel_nodes"],
+            },
+            "display_devices": display_devices,
+            "ai_devices": ai_devices,
+        },
+        "package_freshness": {
+            "package_manager": "dnf" if shutil.which("dnf") else None,
+            "state": freshness_state,
+            "packages": packages,
+            "updates_available": updates_available,
+            "missing_packages": missing_packages,
+            "checked_command": freshness_command,
+        },
+        "runtime_recommendation": {
+            "preferred_preset": preferred_preset,
+            "preferred_profile_set": preferred_profiles,
+            "preferred_runtime_path": "intel-full -> langchain-api /run -> litellm/ollama + route-api"
+            if preferred_preset == "intel-full"
+            else "agent-full -> langchain-api /run -> litellm/ollama + route-api",
+            "validated_acceleration_posture": validated_acceleration_posture,
+            "validated_settings": validated_settings,
+            "recommended_overlays": [],
+            "current_overlays": current_overlays,
+            "host_facts_ref": host_facts_ref,
+            "platform_adaptation_ref": platform_adaptation_ref,
+        },
+        "host_envelope": {
+            "loadavg_1m": load_1m,
+            "loadavg_5m": load_5m,
+            "loadavg_15m": load_15m,
+            "available_memory_bytes": available_memory,
+            "latency_trial_ready": latency_trial_ready,
+            "notes": envelope_notes,
+        },
+        "fit_verdict": {
+            "status": status,
+            "summary": " ".join(summary_parts),
+            "next_actions": [
+                f"Run scripts/aoa-doctor --preset {preferred_preset} before launch.",
+                "Refresh host facts when the host or kernel changes.",
+                "Re-run machine-fit after driver, kernel, container-runtime, or benchmark drift.",
+            ],
+            "retest_on": [
+                "kernel update",
+                "linux-firmware update",
+                "mesa or Intel runtime update",
+                "Ollama or langchain-api runtime change",
+                "host load envelope change before latency-sensitive trials",
+            ],
+        },
+        "evidence_refs": args.evidence_ref,
+        "non_claims": [
+            "This record does not claim global model quality.",
+            "This record does not replace bounded runtime benchmarks.",
+            "This record does not prove latency budgets under arbitrary concurrent desktop load.",
+        ],
+    }
+
+    if args.mode == "public":
+        record["redaction"] = {
+            "redacted_fields": [
+                "local-only hostnames",
+                "exact local paths outside repo refs",
+            ]
+        }
+
+    rendered = json.dumps(record, indent=2, ensure_ascii=True) + "\n"
+    if args.write:
+        output_path = Path(args.write)
+        output_path.parent.mkdir(parents=True, exist_ok=True)
+        output_path.write_text(rendered, encoding="utf-8")
+    else:
+        sys.stdout.write(rendered)
+    return 0
+
+
+if __name__ == "__main__":
+    raise SystemExit(main())
diff --git a/scripts/aoa-qwen-bench b/scripts/aoa-qwen-bench
new file mode 100755
index 0000000..7db5767
--- /dev/null
+++ b/scripts/aoa-qwen-bench
@@ -0,0 +1,310 @@
+#!/usr/bin/env bash
+set -euo pipefail
+
+SCRIPT_DIR="$(cd -- "$(dirname -- "${BASH_SOURCE[0]}")" && pwd)"
+# shellcheck source=scripts/aoa-lib.sh
+source "${SCRIPT_DIR}/aoa-lib.sh"
+
+repeat=2
+timeout_s=90
+write_root="${AOA_STACK_ROOT}/Logs/runtime-benchmarks"
+run_url="http://127.0.0.1:5401/run"
+selector_args=()
+
+while (($#)); do
+  case "$1" in
+    --repeat)
+      shift || true
+      (($#)) || aoa_die "missing value after --repeat"
+      repeat="$1"
+      ;;
+    --repeat=*)
+      repeat="${1#*=}"
+      ;;
+    --timeout)
+      shift || true
+      (($#)) || aoa_die "missing value after --timeout"
+      timeout_s="$1"
+      ;;
+    --timeout=*)
+      timeout_s="${1#*=}"
+      ;;
+    --write-root)
+      shift || true
+      (($#)) || aoa_die "missing value after --write-root"
+      write_root="$1"
+      ;;
+    --write-root=*)
+      write_root="${1#*=}"
+      ;;
+    --url)
+      shift || true
+      (($#)) || aoa_die "missing value after --url"
+      run_url="$1"
+      ;;
+    --url=*)
+      run_url="${1#*=}"
+      ;;
+    *)
+      selector_args+=("$1")
+      ;;
+  esac
+  shift || true
+done
+
+aoa_parse_profile_args "${selector_args[@]}"
+aoa_resolve_modules
+aoa_print_profile_summary
+
+has_module() {
+  local target="$1"
+  local module
+  for module in "${AOA_PROFILE_MODULE_NAMES[@]}"; do
+    [[ "$module" == "$target" ]] && return 0
+  done
+  return 1
+}
+
+has_module "41-agent-api.yml" || aoa_die "qwen bench requires 41-agent-api.yml in the selected runtime"
+
+timestamp="$(date -u +%Y-%m-%dT%H%M%SZ)"
+run_dir="${write_root}/runs/${timestamp}__latency-single-turn__workhorse-local-qwen3.5-9b"
+mkdir -p "${run_dir}/raw"
+
+export AOA_QWEN_BENCH_REPEAT="$repeat"
+export AOA_QWEN_BENCH_TIMEOUT_S="$timeout_s"
+export AOA_QWEN_BENCH_URL="$run_url"
+export AOA_QWEN_BENCH_PRESET="$AOA_STACK_PRESET"
+export AOA_QWEN_BENCH_PROFILE="$AOA_STACK_PROFILE"
+export AOA_QWEN_BENCH_RUN_DIR="$run_dir"
+export AOA_QWEN_CHECK_PATH="${SCRIPT_DIR}/aoa-qwen-check"
+
+python3 - <<'PY'
+from __future__ import annotations
+
+import json
+import os
+import platform
+import statistics
+import subprocess
+import sys
+from datetime import datetime, timezone
+from pathlib import Path
+
+repeat = int(os.environ["AOA_QWEN_BENCH_REPEAT"])
+timeout_s = float(os.environ["AOA_QWEN_BENCH_TIMEOUT_S"])
+run_url = os.environ["AOA_QWEN_BENCH_URL"]
+preset = os.environ.get("AOA_QWEN_BENCH_PRESET", "")
+profile = os.environ.get("AOA_QWEN_BENCH_PROFILE", "")
+run_dir = Path(os.environ["AOA_QWEN_BENCH_RUN_DIR"])
+check_path = os.environ["AOA_QWEN_CHECK_PATH"]
+cases = ["exact-reply", "repo-routing"]
+warmup_runs_per_case = 1
+
+
+def maybe_cpu_model() -> str | None:
+    try:
+        output = subprocess.run(
+            ["lscpu"],
+            check=True,
+            text=True,
+            capture_output=True,
+        ).stdout.splitlines()
+    except Exception:
+        return None
+
+    for line in output:
+        if line.startswith("Model name:"):
+            return line.split(":", 1)[1].strip()
+    return None
+
+
+raw_results: list[dict[str, object]] = []
+warmup_results: list[dict[str, object]] = []
+all_passed = True
+
+for case in cases:
+    for warmup_index in range(1, warmup_runs_per_case + 1):
+        proc = subprocess.run(
+            [
+                check_path,
+                "--case",
+                case,
+                "--url",
+                run_url,
+                "--timeout",
+                str(timeout_s),
+                "--json",
+            ],
+            check=False,
+            text=True,
+            capture_output=True,
+        )
+        stdout = proc.stdout.strip()
+        if not stdout:
+            result = {
+                "ok": False,
+                "case": case,
+                "error": f"empty_stdout exit={proc.returncode}",
+            }
+        else:
+            result = json.loads(stdout)
+        result["warmup_index"] = warmup_index
+        result["phase"] = "warmup"
+        warmup_results.append(result)
+        if not result.get("ok"):
+            all_passed = False
+
+    for run_index in range(1, repeat + 1):
+        proc = subprocess.run(
+            [
+                check_path,
+                "--case",
+                case,
+                "--url",
+                run_url,
+                "--timeout",
+                str(timeout_s),
+                "--json",
+            ],
+            check=False,
+            text=True,
+            capture_output=True,
+        )
+        stdout = proc.stdout.strip()
+        if not stdout:
+            result: dict[str, object] = {
+                "ok": False,
+                "case": case,
+                "error": f"empty_stdout exit={proc.returncode}",
+            }
+        else:
+            result = json.loads(stdout)
+        result["run_index"] = run_index
+        result["phase"] = "measured"
+        raw_results.append(result)
+        if not result.get("ok"):
+            all_passed = False
+
+summary_cases: dict[str, object] = {}
+elapsed_all: list[float] = []
+for case in cases:
+    case_rows = [row for row in raw_results if row.get("case") == case]
+    elapsed_values = [
+        float(row["elapsed_s"])
+        for row in case_rows
+        if row.get("ok") and row.get("elapsed_s") is not None
+    ]
+    elapsed_all.extend(elapsed_values)
+    summary_cases[case] = {
+        "runs": len(case_rows),
+        "passed": sum(1 for row in case_rows if row.get("ok")),
+        "mean_s": round(statistics.mean(elapsed_values), 3) if elapsed_values else None,
+        "best_s": round(min(elapsed_values), 3) if elapsed_values else None,
+        "worst_s": round(max(elapsed_values), 3) if elapsed_values else None,
+    }
+
+captured_at = datetime.now(timezone.utc).replace(microsecond=0).isoformat().replace("+00:00", "Z")
+benchmark_id = "qwen3.5-9b-langchain-latency-single-turn"
+selection = {"preset": preset or None, "profile": profile or None}
+truth_refs = []
+if preset:
+    truth_refs.append(f"scripts/aoa-render-services --preset {preset}")
+    truth_refs.append(f"scripts/aoa-smoke --with-internal --preset {preset}")
+elif profile:
+    truth_refs.append(f"scripts/aoa-render-services --profile {profile}")
+    truth_refs.append(f"scripts/aoa-smoke --profile {profile}")
+
+manifest = {
+    "artifact_kind": "aoa.runtime-benchmark",
+    "schema_version": "1",
+    "captured_at": captured_at,
+    "benchmark_id": benchmark_id,
+    "benchmark_family": "latency-single-turn",
+    "runtime_selection": selection,
+    "system_under_test": {
+        "backend": "langchain-api -> ollama-native",
+        "model": "qwen3.5:9b",
+        "profile_class": "workhorse",
+        "context_budget_class": "bounded-local",
+        "quantization_or_runtime_variant": "Q4_K_M via Ollama",
+    },
+    "host_surface": {
+        "os_family": platform.system().lower(),
+        "cpu_model": maybe_cpu_model(),
+    },
+    "runtime_truth_refs": truth_refs,
+    "fixture_surface": {
+        "fixture_family": "qwen-run-path-smoke",
+        "case_count": len(raw_results),
+        "cases": cases,
+        "warmup_runs_per_case": warmup_runs_per_case,
+        "token_budgeting": {
+            "exact-reply": 8,
+            "repo-routing": 120,
+        },
+    },
+    "metrics": {
+        "units": "seconds",
+        "summary_semantics": "end-to-end POST /run latency through langchain-api",
+    },
+    "warmup_results": warmup_results,
+    "results": raw_results,
+    "summary": {
+        "all_passed": all_passed,
+        "warmup_all_passed": all(row.get("ok") for row in warmup_results),
+        "case_breakdown": summary_cases,
+        "overall_mean_s": round(statistics.mean(elapsed_all), 3) if elapsed_all else None,
+        "overall_best_s": round(min(elapsed_all), 3) if elapsed_all else None,
+        "overall_worst_s": round(max(elapsed_all), 3) if elapsed_all else None,
+    },
+    "non_claims": [
+        "This is a runtime latency check, not a reasoning-quality verdict.",
+        "This does not rank Qwen against other models.",
+        "This does not prove long-context behavior or multi-turn stability.",
+    ],
+}
+
+summary = {
+    "benchmark_id": benchmark_id,
+    "captured_at": captured_at,
+    "all_passed": all_passed,
+    "runtime_selection": selection,
+    "case_breakdown": summary_cases,
+    "overall_mean_s": manifest["summary"]["overall_mean_s"],
+    "overall_best_s": manifest["summary"]["overall_best_s"],
+    "overall_worst_s": manifest["summary"]["overall_worst_s"],
+}
+
+notes = [
+    "# Qwen Runtime Notes",
+    "",
+    "- Bench path: `langchain-api /run`.",
+    "- Fixture family: `exact-reply` and `repo-routing`.",
+    "- One uncounted warmup run is executed per case before measured repeats.",
+    "- This is runtime-local evidence for `abyss-stack`, not a portable proof verdict.",
+    "- The check stays on the intended chat path instead of raw `ollama` probing.",
+]
+
+(run_dir / "benchmark.manifest.json").write_text(
+    json.dumps(manifest, indent=2, ensure_ascii=True) + "\n",
+    encoding="utf-8",
+)
+(run_dir / "summary.json").write_text(
+    json.dumps(summary, indent=2, ensure_ascii=True) + "\n",
+    encoding="utf-8",
+)
+(run_dir / "raw" / "results.json").write_text(
+    json.dumps(raw_results, indent=2, ensure_ascii=True) + "\n",
+    encoding="utf-8",
+)
+(run_dir / "raw" / "warmup_results.json").write_text(
+    json.dumps(warmup_results, indent=2, ensure_ascii=True) + "\n",
+    encoding="utf-8",
+)
+(run_dir / "notes.md").write_text("\n".join(notes) + "\n", encoding="utf-8")
+
+print(f"run dir: {run_dir}")
+print(json.dumps(summary, ensure_ascii=True))
+sys.exit(0 if all_passed else 1)
+PY
diff --git a/scripts/aoa-qwen-check b/scripts/aoa-qwen-check
new file mode 100755
index 0000000..a422b90
--- /dev/null
+++ b/scripts/aoa-qwen-check
@@ -0,0 +1,166 @@
+#!/usr/bin/env python3
+from __future__ import annotations
+
+import argparse
+import json
+import sys
+import time
+import urllib.error
+import urllib.request
+
+
+EXACT_REPLY = "Qwen local OK."
+ROUTING_EXPECTED = {
+    "task1": "aoa-evals",
+    "task2": "abyss-stack",
+    "task3": "Tree-of-Sophia",
+}
+
+
+def build_prompt(case: str) -> tuple[str, int]:
+    if case == "exact-reply":
+        return f"Reply exactly with: {EXACT_REPLY}", 8
+
+    if case == "repo-routing":
+        prompt = """Return compact JSON {"task1":"...","task2":"...","task3":"..."}.
+Use exact repo names only.
+aoa-evals = portable proof surfaces for bounded claims.
+abyss-stack = runtime, deployment, storage, lifecycle, and infra glue.
+Tree-of-Sophia = source-first philosophy and world-thought knowledge architecture.
+task1 = portable proof surfaces for bounded claims.
+task2 = the runtime body the system runs on.
+task3 = the source-first knowledge world for philosophy and world thought.
+"""
+        return prompt, 120
+
+    raise ValueError(f"unsupported case: {case}")
+
+
+def extract_json_block(text: str) -> str:
+    stripped = text.strip()
+    if stripped.startswith("```"):
+        lines = stripped.splitlines()
+        if len(lines) >= 3 and lines[-1].strip() == "```":
+            body = "\n".join(lines[1:-1]).strip()
+            if body.startswith("json"):
+                body = body[4:].lstrip()
+            return body
+    return stripped
+
+
+def run_check(url: str, case: str, timeout_s: float, temperature: float, max_tokens: int | None) -> dict[str, object]:
+    prompt, default_max_tokens = build_prompt(case)
+    payload = {
+        "user_text": prompt,
+        "temperature": float(temperature),
+        "max_tokens": int(max_tokens or default_max_tokens),
+    }
+
+    req = urllib.request.Request(
+        url=url,
+        data=json.dumps(payload).encode("utf-8"),
+        headers={"Content-Type": "application/json"},
+        method="POST",
+    )
+
+    start = time.perf_counter()
+    with urllib.request.urlopen(req, timeout=timeout_s) as resp:
+        raw = resp.read().decode("utf-8", errors="ignore")
+        status = resp.status
+    elapsed_s = round(time.perf_counter() - start, 3)
+
+    body = json.loads(raw)
+    answer = str(body.get("answer") or "").strip()
+    backend = body.get("backend")
+    model = body.get("model")
+
+    validation: dict[str, object] = {}
+    ok = False
+
+    if case == "exact-reply":
+        validation["expected"] = EXACT_REPLY
+        validation["observed"] = answer
+        ok = answer == EXACT_REPLY
+    elif case == "repo-routing":
+        parsed = json.loads(extract_json_block(answer))
+        validation["expected"] = ROUTING_EXPECTED
+        validation["observed"] = parsed
+        ok = parsed == ROUTING_EXPECTED
+
+    return {
+        "ok": ok,
+        "case": case,
+        "url": url,
+        "http_status": status,
+        "elapsed_s": elapsed_s,
+        "backend": backend,
+        "model": model,
+        "answer": answer,
+        "validation": validation,
+    }
+
+
+def build_parser() -> argparse.ArgumentParser:
+    parser = argparse.ArgumentParser(
+        description="Run a bounded Qwen chat-path check through langchain-api."
+    )
+    parser.add_argument(
+        "--case",
+        choices=["exact-reply", "repo-routing"],
+        required=True,
+    )
+    parser.add_argument("--url", default="http://127.0.0.1:5401/run")
+    parser.add_argument("--timeout", type=float, default=70.0)
+    parser.add_argument("--temperature", type=float, default=0.0)
+    parser.add_argument("--max-tokens", type=int, default=None)
+    parser.add_argument("--json", action="store_true")
+    return parser
+
+
+def main() -> int:
+    parser = build_parser()
+    args = parser.parse_args()
+
+    try:
+        result = run_check(
+            url=args.url,
+            case=args.case,
+            timeout_s=args.timeout,
+            temperature=args.temperature,
+            max_tokens=args.max_tokens,
+        )
+    except urllib.error.HTTPError as exc:
+        payload = exc.read().decode("utf-8", errors="ignore")
+        result = {
+            "ok": False,
+            "case": args.case,
+            "url": args.url,
+            "http_status": exc.code,
+            "elapsed_s": None,
+            "error": f"http_error {exc.code}: {payload[:300]}",
+        }
+    except Exception as exc:
+        result = {
+            "ok": False,
+            "case": args.case,
+            "url": args.url,
+            "http_status": None,
+            "elapsed_s": None,
+            "error": f"{type(exc).__name__}: {exc}",
+        }
+
+    if args.json:
+        sys.stdout.write(json.dumps(result, ensure_ascii=True) + "\n")
+    else:
+        if result.get("ok"):
+            elapsed = result.get("elapsed_s")
+            print(f"ok   qwen {args.case} {args.url} {elapsed}s")
+        else:
+            detail = result.get("error") or result.get("validation")
+            print(f"fail qwen {args.case} {args.url} {detail}")
+
+    return 0 if result.get("ok") else 1
+
+
+if __name__ == "__main__":
+    raise SystemExit(main())
diff --git a/scripts/aoa-qwen-run b/scripts/aoa-qwen-run
new file mode 100755
index 0000000..4d15dbc
--- /dev/null
+++ b/scripts/aoa-qwen-run
@@ -0,0 +1,121 @@
+#!/usr/bin/env python3
+from __future__ import annotations
+
+import argparse
+import json
+import sys
+import time
+import urllib.error
+import urllib.request
+from pathlib import Path
+
+
+def run_prompt(
+    *,
+    prompt_file: Path,
+    url: str,
+    timeout_s: float,
+    temperature: float,
+    max_tokens: int | None,
+) -> dict[str, object]:
+    prompt = prompt_file.read_text(encoding="utf-8")
+    payload = {
+        "user_text": prompt,
+        "temperature": float(temperature),
+    }
+    if max_tokens is not None:
+        payload["max_tokens"] = int(max_tokens)
+
+    req = urllib.request.Request(
+        url=url,
+        data=json.dumps(payload).encode("utf-8"),
+        headers={"Content-Type": "application/json"},
+        method="POST",
+    )
+
+    start = time.perf_counter()
+    with urllib.request.urlopen(req, timeout=timeout_s) as resp:
+        raw = resp.read().decode("utf-8", errors="ignore")
+        status = resp.status
+    elapsed_s = round(time.perf_counter() - start, 3)
+
+    body = json.loads(raw)
+    return {
+        "ok": True,
+        "url": url,
+        "http_status": status,
+        "elapsed_s": elapsed_s,
+        "backend": body.get("backend"),
+        "model": body.get("model"),
+        "answer": str(body.get("answer") or "").strip(),
+        "prompt_file": str(prompt_file),
+    }
+
+
+def build_parser() -> argparse.ArgumentParser:
+    parser = argparse.ArgumentParser(
+        description="Run a bounded prompt file through langchain-api /run."
+    )
+    parser.add_argument("--prompt-file", required=True)
+    parser.add_argument("--url", default="http://127.0.0.1:5401/run")
+    parser.add_argument("--timeout", type=float, default=70.0)
+    parser.add_argument("--temperature", type=float, default=0.0)
+    parser.add_argument("--max-tokens", type=int, default=None)
+    parser.add_argument("--json", action="store_true")
+    return parser
+
+
+def main() -> int:
+    parser = build_parser()
+    args = parser.parse_args()
+    prompt_file = Path(args.prompt_file)
+
+    try:
+        result = run_prompt(
+            prompt_file=prompt_file,
+            url=args.url,
+            timeout_s=args.timeout,
+            temperature=args.temperature,
+            max_tokens=args.max_tokens,
+        )
+    except urllib.error.HTTPError as exc:
+        payload = exc.read().decode("utf-8", errors="ignore")
+        result = {
+            "ok": False,
+            "url": args.url,
+            "http_status": exc.code,
+            "elapsed_s": None,
+            "backend": None,
+            "model": None,
+            "answer": "",
+            "prompt_file": str(prompt_file),
+            "error": f"http_error {exc.code}: {payload[:300]}",
+        }
+    except Exception as exc:
+        result = {
+            "ok": False,
+            "url": args.url,
+            "http_status": None,
+            "elapsed_s": None,
+            "backend": None,
+            "model": None,
+            "answer": "",
+            "prompt_file": str(prompt_file),
+            "error": f"{type(exc).__name__}: {exc}",
+        }
+
+    if args.json:
+        sys.stdout.write(json.dumps(result, ensure_ascii=True) + "\n")
+    else:
+        if result.get("ok"):
+            print(
+                f"ok   qwen run {result['url']} {result['elapsed_s']}s {result['prompt_file']}"
+            )
+        else:
+            print(f"fail qwen run {result['url']} {result.get('error')}")
+
+    return 0 if result.get("ok") else 1
+
+
+if __name__ == "__main__":
+    raise SystemExit(main())
diff --git a/scripts/aoa-smoke b/scripts/aoa-smoke
index ddc5e2e..517ed60 100755
--- a/scripts/aoa-smoke
+++ b/scripts/aoa-smoke
@@ -63,6 +63,7 @@ fi
 
 if has_module "41-agent-api.yml"; then
   aoa_probe_http "langchain-api" "http://127.0.0.1:5401/health" || failures=$((failures + 1))
+  "${SCRIPT_DIR}/aoa-qwen-check" --case exact-reply || failures=$((failures + 1))
 fi
 
 if has_module "43-federation-router.yml"; then
diff --git a/scripts/validate_stack.py b/scripts/validate_stack.py
index c67e979..6651900 100644
--- a/scripts/validate_stack.py
+++ b/scripts/validate_stack.py
@@ -26,7 +26,12 @@
 REQUIRED_SCRIPTS = {
     "aoa-doctor",
     "aoa-host-facts",
+    "aoa-machine-fit",
     "aoa-platform-adaptation",
+    "aoa-local-ai-trials",
+    "aoa-qwen-check",
+    "aoa-qwen-run",
+    "aoa-qwen-bench",
     "aoa-export-memo-candidate",
     "aoa-export-runtime-evidence-selection",
     "aoa-export-artifact-hook-candidate",
@@ -68,6 +73,7 @@
     ROOT / "docs" / "PROFILE_RECIPES.md",
     ROOT / "docs" / "RENDER_TRUTH.md",
     ROOT / "docs" / "RUNTIME_BENCH_POLICY.md",
+    ROOT / "docs" / "LOCAL_AI_TRIALS.md",
     ROOT / "docs" / "PLATFORM_ADAPTATION_POLICY.md",
     ROOT / "docs" / "BRANCH_POLICY.md",
     ROOT / "docs" / "MEMO_RUNTIME_SEAM.md",
@@ -77,6 +83,7 @@
     ROOT / "docs" / "INTERNAL_PROBES.md",
     ROOT / "docs" / "REFERENCE_PLATFORM.md",
     ROOT / "docs" / "REFERENCE_PLATFORM_SPEC.md",
+    ROOT / "docs" / "MACHINE_FIT_POLICY.md",
     ROOT / "docs" / "SECRETS_BOOTSTRAP.md",
     ROOT / "docs" / "WINDOWS_BRIDGE.md",
     ROOT / "docs" / "WINDOWS_SETUP.md",
@@ -84,6 +91,9 @@
     ROOT / "docs" / "reference-platform" / "README.md",
     ROOT / "docs" / "reference-platform" / "schema.v1.json",
     ROOT / "docs" / "reference-platform" / "reference-host.public.json.example",
+    ROOT / "docs" / "machine-fit" / "README.md",
+    ROOT / "docs" / "machine-fit" / "schema.v1.json",
+    ROOT / "docs" / "machine-fit" / "machine-fit.public.json.example",
     ROOT / "docs" / "platform-adaptations" / "README.md",
     ROOT / "docs" / "platform-adaptations" / "schema.v1.json",
     ROOT / "docs" / "platform-adaptations" / "platform-adaptation.public.json.example",
@@ -235,6 +245,8 @@ def validate_paths(errors: list[str]) -> None:
         errors.append("README.md must route readers to docs/REFERENCE_PLATFORM.md")
     if "docs/REFERENCE_PLATFORM_SPEC.md" not in readme:
         errors.append("README.md must route readers to docs/REFERENCE_PLATFORM_SPEC.md")
+    if "docs/MACHINE_FIT_POLICY.md" not in readme:
+        errors.append("README.md must route readers to docs/MACHINE_FIT_POLICY.md")
     if "docs/PLATFORM_ADAPTATION_POLICY.md" not in readme:
         errors.append("README.md must route readers to docs/PLATFORM_ADAPTATION_POLICY.md")
     if "docs/BRANCH_POLICY.md" not in readme:
@@ -248,6 +260,23 @@ def validate_paths(errors: list[str]) -> None:
     if "docs/KAG_RUNTIME_SEAM.md" not in readme:
         errors.append("README.md must route readers to docs/KAG_RUNTIME_SEAM.md")
 
+    local_ai_trials = (ROOT / "docs" / "LOCAL_AI_TRIALS.md").read_text(encoding="utf-8")
+    for required_snippet in (
+        "prepare-wave W4 --lane docs",
+        "apply-case W4 <case-id>",
+        "proposal.edit-spec.json",
+        "exact_replace",
+        "anchored_replace",
+        "deterministically inside the runner",
+        "script_refresh",
+        "approval.status.json",
+        "isolated git worktree",
+    ):
+        if required_snippet not in local_ai_trials:
+            errors.append(
+                f"docs/LOCAL_AI_TRIALS.md must mention `{required_snippet}`"
+            )
+
     paths_doc = (ROOT / "docs" / "PATHS.md").read_text(encoding="utf-8")
     if "/srv/abyss-stack" not in paths_doc:
         errors.append("docs/PATHS.md must mention /srv/abyss-stack")