docs: add context optimization design spec, implementation plan, and pi-layer research by trek-e · Pull Request #3474 · gsd-build/gsd-2

trek-e · 2026-04-03T21:05:52Z

Spec: 6-change design for GSD extension context optimization
Plan: 9-task TDD implementation plan with exact file paths and code
Pi-layer doc: 10 infrastructure opportunities (research only, not planned)

Co-Authored-By: Claude Opus 4.6 noreply@anthropic.com

…pi-layer research - Spec: 6-change design for GSD extension context optimization - Plan: 9-task TDD implementation plan with exact file paths and code - Pi-layer doc: 10 infrastructure opportunities (research only, not planned) Part of #3171, #3406, #3452, #3433. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Introduces PhaseAnchor read/write utilities so downstream agents can inherit decisions, blockers, and intent written at phase boundaries without re-inferring from conversation history. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

…ent preferences Implement ADR-004 Phase 2 capability scoring with 7-dimension model profiles, task requirement vectors, and weighted scoring. Add ContextManagementConfig preferences for observation masking thresholds. Wire capability scoring into auto-model-selection dispatch path. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…cation Register observation masker in before_provider_request hook to replace old tool results with placeholders during auto-mode. Add tool result truncation (configurable via context_management.tool_result_max_chars). Inject phase handoff anchors into prompt builders so downstream phases inherit decisions from research/planning. Write anchors after successful phase completion. Update ADR-004 status to Implemented. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…ment Update dynamic-model-routing.md with capability-aware scoring section. Update token-optimization.md with observation masking, tool truncation, and phase handoff anchor documentation. Update configuration.md with context_management preference block and capability_routing flag. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

- Fix slice anchor collisions: key anchors by (phase, sliceId) so research-slice/plan-slice anchors from different slices no longer overwrite each other within the same milestone. - Fix payload shape mismatch: context-masker and tool result truncation now handle both internal message format (type field) and provider API formats (role=tool, content arrays with tool_result blocks). - Plumb context_management into KNOWN_PREFERENCE_KEYS and mergePreferences() so the config is properly recognized and merged. - Remove dead compaction_threshold_percent config that was validated and documented but never read at runtime. - Populate structured handoff data in phase anchors by extracting decisions, blockers, and next steps from the artifact files produced by each phase. https://claude.ai/code/session_012ysgpj3kKCNcZdEL7W5eRe

github-actions · 2026-04-03T21:06:14Z

🔴 PR Risk Report — CRITICAL


Files changed	21
Systems affected	4
Overall risk	🔴 CRITICAL

Affected Systems

Risk	System
🔴 critical	Auto Engine
🟠 high	GSD Workflow
🟡 medium	Model System
🟢 low	Loader/Bootstrap

File Breakdown

Risk	File	Systems
🔴	`src/resources/extensions/gsd/auto-model-selection.ts`	Auto Engine, Model System
🔴	`src/resources/extensions/gsd/auto/phases.ts`	Auto Engine
🟠	`src/resources/extensions/gsd/bootstrap/register-hooks.ts`	GSD Workflow, Loader/Bootstrap
🟡	`src/resources/extensions/gsd/model-router.ts`	Model System
🟠	`src/resources/extensions/gsd/prompts/execute-task.md`	GSD Workflow
⚪	`docs/ADR-004-capability-aware-model-routing.md`	(unclassified)
⚪	`docs/configuration.md`	(unclassified)
⚪	`docs/dynamic-model-routing.md`	(unclassified)
⚪	`docs/pi-context-optimization-opportunities.md`	(unclassified)
⚪	`docs/token-optimization.md`	(unclassified)
⚪	`src/resources/extensions/gsd/auto-prompts.ts`	(unclassified)
⚪	`src/resources/extensions/gsd/complexity-classifier.ts`	(unclassified)
⚪	`src/resources/extensions/gsd/context-masker.ts`	(unclassified)
⚪	`src/resources/extensions/gsd/docs/preferences-reference.md`	(unclassified)
⚪	`src/resources/extensions/gsd/phase-anchor.ts`	(unclassified)
⚪	`src/resources/extensions/gsd/preferences-types.ts`	(unclassified)
⚪	`src/resources/extensions/gsd/preferences-validation.ts`	(unclassified)
⚪	`src/resources/extensions/gsd/preferences.ts`	(unclassified)
⚪	`src/resources/extensions/gsd/tests/context-masker.test.ts`	(unclassified)
⚪	`src/resources/extensions/gsd/tests/model-router.test.ts`	(unclassified)
⚪	`src/resources/extensions/gsd/tests/phase-anchor.test.ts`	(unclassified)

⚠️ Critical risk — please verify: state persistence, auth token lifecycle, agent loop race conditions, RPC protocol compatibility.

jeremymcs · 2026-04-04T17:07:42Z

Adversarial Review — Context Management & Handoff Behaviors

Verdict: needs-attention

This change introduces context-management and handoff behaviors that can silently fail or propagate stale state under realistic runtime conditions.

Findings

[high] Tool-result truncation misses canonical tool result messages (register-hooks.ts:289-299)

before_provider_request truncation only treats messages as tool output when msg.type is toolResult/tool_result or msg.role === "tool". In this codebase, canonical tool results are role: "toolResult" messages (per shared message contract), so this path is skipped for normal tool outputs. Under long sessions, large tool results are repeatedly re-sent, defeating the intended guard and increasing risk of context blowouts/compaction churn.

Recommendation: Handle the canonical role === "toolResult" shape explicitly in truncation/masking logic, and add integration tests using real Message objects from convertToLlm() rather than synthetic type fields.

[high] Observation masking can become a no-op for valid Anthropic-style user turns (context-masker.ts:17-33)

Turn detection treats a user turn as type === "user" or role === "user" with non-array content. Valid user messages with block arrays (e.g., text/tool-result block formats) are not counted as turns, causing findTurnBoundary() to return 0 and skip masking entirely. This creates a silent failure mode where old tool outputs remain unmasked in precisely the long, block-heavy conversations this feature targets.

Recommendation: Count role: "user" messages as turns regardless of string-vs-array content, then exclude only pure tool-result carrier messages explicitly. Add tests with real mixed block arrays (text + tool blocks) and array-only user text messages.

[medium] Phase anchors written based on artifact existence, not successful unit completion (auto/phases.ts:1191-1217)

Anchor writing is documented as happening after successful research/planning completion, but the guard is artifactVerified && mid && phase. artifactVerified is file-existence based and can be true from stale prior artifacts even when the current unit run failed/was partial. This can persist outdated decisions/blockers into anchor files and then inject stale guidance into downstream prompts.

Recommendation: Gate anchor writes on explicit successful unit status (and preferably artifact freshness tied to current run timestamp/session) before calling extractHandoffData/writePhaseAnchor.

Next Steps

Patch context masking/truncation to support canonical toolResult and array-content user messages, then add provider-shape integration tests.
Require successful unit completion + freshness checks before writing phase anchors to prevent stale handoff propagation.

Review generated via Codex adversarial review

trek-e · 2026-04-04T19:21:45Z

think the bot went rogue last night on this, was supposed to be a doc update

trek-e and others added 8 commits April 3, 2026 15:45

feat(context): add observation masking for auto-mode sessions

1272438

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

chore: remove internal planning artifacts from PR

6ea9776

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

github-actions bot added the documentation Improvements or additions to documentation label Apr 3, 2026

trek-e closed this Apr 4, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

docs: add context optimization design spec, implementation plan, and pi-layer research#3474

docs: add context optimization design spec, implementation plan, and pi-layer research#3474
trek-e wants to merge 8 commits intomainfrom
claude/address-pr-comments-zE6OZ

trek-e commented Apr 3, 2026

Uh oh!

github-actions bot commented Apr 3, 2026

Uh oh!

jeremymcs commented Apr 4, 2026

Uh oh!

trek-e commented Apr 4, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Uh oh!

Conversation

trek-e commented Apr 3, 2026

Uh oh!

github-actions bot commented Apr 3, 2026

🔴 PR Risk Report — CRITICAL

Affected Systems

Uh oh!

jeremymcs commented Apr 4, 2026

Adversarial Review — Context Management & Handoff Behaviors

Findings

Next Steps

Uh oh!

trek-e commented Apr 4, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants