
docs: add context optimization design spec, implementation plan, and pi-layer research#3474

Closed
trek-e wants to merge 8 commits into main from claude/address-pr-comments-zE6OZ

Conversation

@trek-e
Collaborator

@trek-e trek-e commented Apr 3, 2026

  • Spec: 6-change design for GSD extension context optimization
  • Plan: 9-task TDD implementation plan with exact file paths and code
  • Pi-layer doc: 10 infrastructure opportunities (research only, not planned)

Part of #3171, #3406, #3452, #3433.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

trek-e and others added 8 commits April 3, 2026 15:45
…pi-layer research

- Spec: 6-change design for GSD extension context optimization
- Plan: 9-task TDD implementation plan with exact file paths and code
- Pi-layer doc: 10 infrastructure opportunities (research only, not planned)

Part of #3171, #3406, #3452, #3433.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Introduces PhaseAnchor read/write utilities so downstream agents can
inherit decisions, blockers, and intent written at phase boundaries
without re-inferring from conversation history.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ent preferences

Implement ADR-004 Phase 2 capability scoring with 7-dimension model
profiles, task requirement vectors, and weighted scoring. Add
ContextManagementConfig preferences for observation masking thresholds.
Wire capability scoring into auto-model-selection dispatch path.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…cation

Register observation masker in before_provider_request hook to replace
old tool results with placeholders during auto-mode. Add tool result
truncation (configurable via context_management.tool_result_max_chars).
Inject phase handoff anchors into prompt builders so downstream phases
inherit decisions from research/planning. Write anchors after successful
phase completion. Update ADR-004 status to Implemented.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ment

Update dynamic-model-routing.md with capability-aware scoring section.
Update token-optimization.md with observation masking, tool truncation,
and phase handoff anchor documentation. Update configuration.md with
context_management preference block and capability_routing flag.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Fix slice anchor collisions: key anchors by (phase, sliceId) so
  research-slice/plan-slice anchors from different slices no longer
  overwrite each other within the same milestone.
- Fix payload shape mismatch: context-masker and tool result truncation
  now handle both internal message format (type field) and provider API
  formats (role=tool, content arrays with tool_result blocks).
- Plumb context_management into KNOWN_PREFERENCE_KEYS and
  mergePreferences() so the config is properly recognized and merged.
- Remove dead compaction_threshold_percent config that was validated
  and documented but never read at runtime.
- Populate structured handoff data in phase anchors by extracting
  decisions, blockers, and next steps from the artifact files
  produced by each phase.

https://claude.ai/code/session_012ysgpj3kKCNcZdEL7W5eRe
@github-actions github-actions bot added the documentation label Apr 3, 2026
@github-actions
Contributor

github-actions bot commented Apr 3, 2026

🔴 PR Risk Report — CRITICAL

Files changed: 21
Systems affected: 4
Overall risk: 🔴 CRITICAL

Affected Systems

| Risk | System |
|---|---|
| 🔴 critical | Auto Engine |
| 🟠 high | GSD Workflow |
| 🟡 medium | Model System |
| 🟢 low | Loader/Bootstrap |

File Breakdown

| Risk | File | Systems |
|---|---|---|
| 🔴 | src/resources/extensions/gsd/auto-model-selection.ts | Auto Engine, Model System |
| 🔴 | src/resources/extensions/gsd/auto/phases.ts | Auto Engine |
| 🟠 | src/resources/extensions/gsd/bootstrap/register-hooks.ts | GSD Workflow, Loader/Bootstrap |
| 🟡 | src/resources/extensions/gsd/model-router.ts | Model System |
| 🟠 | src/resources/extensions/gsd/prompts/execute-task.md | GSD Workflow |
| | docs/ADR-004-capability-aware-model-routing.md | (unclassified) |
| | docs/configuration.md | (unclassified) |
| | docs/dynamic-model-routing.md | (unclassified) |
| | docs/pi-context-optimization-opportunities.md | (unclassified) |
| | docs/token-optimization.md | (unclassified) |
| | src/resources/extensions/gsd/auto-prompts.ts | (unclassified) |
| | src/resources/extensions/gsd/complexity-classifier.ts | (unclassified) |
| | src/resources/extensions/gsd/context-masker.ts | (unclassified) |
| | src/resources/extensions/gsd/docs/preferences-reference.md | (unclassified) |
| | src/resources/extensions/gsd/phase-anchor.ts | (unclassified) |
| | src/resources/extensions/gsd/preferences-types.ts | (unclassified) |
| | src/resources/extensions/gsd/preferences-validation.ts | (unclassified) |
| | src/resources/extensions/gsd/preferences.ts | (unclassified) |
| | src/resources/extensions/gsd/tests/context-masker.test.ts | (unclassified) |
| | src/resources/extensions/gsd/tests/model-router.test.ts | (unclassified) |
| | src/resources/extensions/gsd/tests/phase-anchor.test.ts | (unclassified) |

⚠️ Critical risk — please verify: state persistence, auth token lifecycle, agent loop race conditions, RPC protocol compatibility.

@jeremymcs
Collaborator

Adversarial Review — Context Management & Handoff Behaviors

Verdict: needs-attention

This change introduces context-management and handoff behaviors that can silently fail or propagate stale state under realistic runtime conditions.

Findings

[high] Tool-result truncation misses canonical tool result messages (register-hooks.ts:289-299)

before_provider_request truncation only treats messages as tool output when msg.type is toolResult/tool_result or msg.role === "tool". In this codebase, canonical tool results are role: "toolResult" messages (per shared message contract), so this path is skipped for normal tool outputs. Under long sessions, large tool results are repeatedly re-sent, defeating the intended guard and increasing risk of context blowouts/compaction churn.

Recommendation: Handle the canonical role === "toolResult" shape explicitly in truncation/masking logic, and add integration tests using real Message objects from convertToLlm() rather than synthetic type fields.
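
A rough sketch of what this recommendation could look like, not the actual code in register-hooks.ts. The `Msg` interface, `isToolResult`, `truncateToolResults`, and the `[truncated]` marker are all hypothetical names chosen for illustration; the real shared message contract and hook signature may differ. The only point is that the `role === "toolResult"` shape is checked alongside the shapes the hook already handles.

```typescript
// Hypothetical, simplified message shape (assumption: the real contract has more fields).
interface Msg {
  role?: string;
  type?: string;
  content?: unknown;
}

// Recognize every tool-output shape named in the review, including the canonical one.
function isToolResult(msg: Msg): boolean {
  return (
    msg.role === "toolResult" || // canonical shape, per the review
    msg.role === "tool" || // provider API shape
    msg.type === "toolResult" ||
    msg.type === "tool_result"
  );
}

// Truncate string content of tool results to maxChars; all other messages pass through unchanged.
function truncateToolResults(messages: Msg[], maxChars: number): Msg[] {
  return messages.map((msg) => {
    if (!isToolResult(msg) || typeof msg.content !== "string") return msg;
    if (msg.content.length <= maxChars) return msg;
    return { ...msg, content: msg.content.slice(0, maxChars) + "\n[truncated]" };
  });
}
```

This sketch only covers string content; tool results carried as content arrays with tool_result blocks would need a separate branch.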


[high] Observation masking can become a no-op for valid Anthropic-style user turns (context-masker.ts:17-33)

Turn detection treats a user turn as type === "user" or role === "user" with non-array content. Valid user messages with block arrays (e.g., text/tool-result block formats) are not counted as turns, causing findTurnBoundary() to return 0 and skip masking entirely. This creates a silent failure mode where old tool outputs remain unmasked in precisely the long, block-heavy conversations this feature targets.

Recommendation: Count role: "user" messages as turns regardless of string-vs-array content, then exclude only pure tool-result carrier messages explicitly. Add tests with real mixed block arrays (text + tool blocks) and array-only user text messages.
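
One possible reading of this recommendation, sketched under assumptions: the `Msg` and `Block` types are hypothetical, and `findTurnBoundary` here is a stand-in that borrows the name from the review, since the real signature in context-masker.ts is not shown. It assumes `keepRecentTurns >= 1` and keeps the "return 0 when there are too few turns" behavior the review describes.

```typescript
// Hypothetical block and message shapes, for illustration only.
interface Block {
  type: string;
  [key: string]: unknown;
}
interface Msg {
  role?: string;
  type?: string;
  content?: string | Block[];
}

// A user message whose non-empty content array holds nothing but tool_result blocks.
function isPureToolResultCarrier(msg: Msg): boolean {
  return (
    Array.isArray(msg.content) &&
    msg.content.length > 0 &&
    msg.content.every((b) => b.type === "tool_result")
  );
}

// Count user messages as turns whether content is a string or an array,
// excluding only pure tool-result carriers.
function isUserTurn(msg: Msg): boolean {
  const isUser = msg.role === "user" || msg.type === "user";
  return isUser && !isPureToolResultCarrier(msg);
}

// Walk backwards and return the index where the last keepRecentTurns user turns begin;
// returns 0 when the conversation has fewer turns than that.
function findTurnBoundary(messages: Msg[], keepRecentTurns: number): number {
  let seen = 0;
  for (let i = messages.length - 1; i >= 0; i--) {
    if (isUserTurn(messages[i])) {
      seen++;
      if (seen === keepRecentTurns) return i;
    }
  }
  return 0;
}
```

With this rule, a mixed text-plus-tool-result user message still counts as a turn, so masking no longer silently disables itself in block-heavy conversations.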


[medium] Phase anchors written based on artifact existence, not successful unit completion (auto/phases.ts:1191-1217)

Anchor writing is documented as happening after successful research/planning completion, but the guard is artifactVerified && mid && phase. artifactVerified is file-existence based and can be true from stale prior artifacts even when the current unit run failed/was partial. This can persist outdated decisions/blockers into anchor files and then inject stale guidance into downstream prompts.

Recommendation: Gate anchor writes on explicit successful unit status (and preferably artifact freshness tied to current run timestamp/session) before calling extractHandoffData/writePhaseAnchor.
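
A minimal sketch of such a gate, assuming the caller has a unit status and a run start time available. `UnitResult`, its `status` values, and `shouldWriteAnchor` are hypothetical and are not taken from auto/phases.ts; the artifact modification time could come from something like `fs.statSync(path).mtimeMs`.

```typescript
// Hypothetical summary of one unit run (assumption: the real engine tracks something similar).
interface UnitResult {
  status: "success" | "failed" | "partial";
  startedAtMs: number; // when the current run began, in epoch milliseconds
}

// Decide whether an anchor may be written for this run.
// artifactMtimeMs is undefined when the artifact file does not exist.
function shouldWriteAnchor(
  result: UnitResult,
  artifactMtimeMs: number | undefined,
): boolean {
  if (result.status !== "success") return false; // failed or partial runs never write anchors
  if (artifactMtimeMs === undefined) return false; // no artifact at all
  // Freshness check: an artifact last modified before this run started is stale.
  return artifactMtimeMs >= result.startedAtMs;
}
```

The existing `artifactVerified && mid && phase` guard would then be combined with this check before calling extractHandoffData/writePhaseAnchor, so a stale artifact left over from a prior run cannot produce an anchor.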

Next Steps

  • Patch context masking/truncation to support canonical toolResult and array-content user messages, then add provider-shape integration tests.
  • Require successful unit completion + freshness checks before writing phase anchors to prevent stale handoff propagation.

Review generated via Codex adversarial review

@trek-e
Collaborator Author

trek-e commented Apr 4, 2026

think the bot went rogue last night on this, was supposed to be a doc update

@trek-e trek-e closed this Apr 4, 2026
