Add: Dynamic prompt cache breakpoints for Anthropic prefix caching #621

Merged

FL4TLiN3 merged 7 commits into main from feat/prompt-cache-breakpoints on Feb 25, 2026

Conversation

@FL4TLiN3 (Contributor)

Summary

  • Add applyCacheBreakpoints() pure function that places optimal cache breakpoints before each LLM call
    • Breakpoint 1: Instruction message (existing, preserved)
    • Breakpoint 2: Last message in conversation (dynamic frontier, moves each turn)
  • Integrate in generatingToolCallLogic — the only LLM call site in the state machine
  • Add cache usage verification to existing E2E tests (providers.test.ts, continue.test.ts)

Impact: On multi-turn Anthropic runs, Turn 2+ gets prefix cache hits on the entire previous conversation. Verified with 3-turn test: Turn 2 cached 1,164 tokens (40.8%), Turn 3 cached 2,328 tokens (36.2%).

No changes needed for OpenAI (automatic prefix caching) or Google Gemini (automatic implicit caching).
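The pure function described above can be sketched as follows. The message shape and field names (`ChatMessage`, `cacheBreakpoint`) are illustrative, not the project's actual types; the sketch only shows the behavior the summary and unit tests describe: instruction breakpoint preserved, frontier breakpoint on the last message, stale flags cleared, input never mutated, and untouched messages returned by reference.

```typescript
// Hypothetical message shape — illustrative, not the project's real types.
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
  cacheBreakpoint?: boolean;
}

// Pure function: returns new objects where needed, never mutates the input.
// Breakpoint 1: the instruction (system) message — a stable prefix.
// Breakpoint 2: the last message — the "frontier" that advances each turn,
// so turn N+1 gets a prefix-cache hit on everything up through turn N.
function applyCacheBreakpoints(messages: ChatMessage[]): ChatMessage[] {
  if (messages.length === 0) return messages;
  return messages.map((m, i) => {
    const isInstruction = i === 0 && m.role === "system";
    const isFrontier = i === messages.length - 1;
    if (isInstruction || isFrontier) {
      return { ...m, cacheBreakpoint: true };
    }
    // Clear a stale frontier flag left over from the previous turn;
    // otherwise reuse the original reference.
    return m.cacheBreakpoint ? { ...m, cacheBreakpoint: undefined } : m;
  });
}
```

Calling this immediately before the provider request keeps breakpoint placement out of conversation state, which is what makes a single integration point (the one LLM call site) sufficient.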

Test plan

  • Unit tests: 8 new tests for applyCacheBreakpoints() — empty array, single message, multi-message, instruction preservation, stale flag clearing, immutability, reference reuse, multi-turn simulation
  • Typecheck passes
  • Format/lint clean
  • E2E providers.test.ts: cache token metric fields verified across all 3 providers
  • E2E continue.test.ts: cache usage flows through multi-turn conversations
  • Manual verification: 3-turn run with claude-sonnet-4-5 confirms cachedInputTokens > 0 on Turn 2+

🤖 Generated with Claude Code

FL4TLiN3 and others added 7 commits February 25, 2026 02:00
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ching

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
BP1 on system message covers tools+system (prefix order: tools→system→messages).
BP2-4 distributed every ~20 content blocks across conversation messages,
working backwards from the last message. Removes redundant tool-level BP.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
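The distribution this commit describes could be sketched as below. The types, the 20-block stride, and the three-extra-breakpoint cap (BP2-4, with BP1 reserved for the system message) are taken from the commit text; everything else is illustrative.

```typescript
// Illustrative message shape: `blocks` counts content blocks per message.
interface Msg {
  blocks: number;
  cacheBreakpoint?: boolean;
}

// Walk backwards from the last message, placing a breakpoint there and then
// one roughly every `stride` content blocks, up to `maxExtra` breakpoints
// (BP2-4; BP1 on the system message is handled separately).
function distributeBreakpoints(messages: Msg[], maxExtra = 3, stride = 20): Msg[] {
  const out = messages.map((m) => ({ ...m, cacheBreakpoint: false }));
  let blocksSinceLast = 0;
  let placed = 0;
  for (let i = out.length - 1; i >= 0 && placed < maxExtra; i--) {
    blocksSinceLast += out[i].blocks;
    // The last message always gets a breakpoint; later ones wait for the stride.
    if (placed === 0 || blocksSinceLast >= stride) {
      out[i].cacheBreakpoint = true;
      placed++;
      blocksSinceLast = 0;
    }
  }
  return out;
}
```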
Replace explicit applyCacheBreakpoints() with Anthropic's request-level
automatic caching (cache_control: {type: "ephemeral"}). Auto-places
breakpoints on last system, tool, and message blocks optimally.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
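For reference, an explicit breakpoint in the Anthropic Messages API is expressed as a `cache_control: {type: "ephemeral"}` marker on a content block, which caches the prefix up to and including that block (in the tools → system → messages prefix order noted above). The model name and prompt text below are illustrative, not from this PR:

```typescript
// Sketch of a Messages API request body with one explicit cache breakpoint
// on the system block. Field names follow Anthropic's documented API;
// the model and text are placeholders.
const requestBody = {
  model: "claude-sonnet-4-5",
  max_tokens: 1024,
  system: [
    {
      type: "text",
      text: "You are a coding agent. <long instructions>",
      cache_control: { type: "ephemeral" }, // caches tools + system prefix
    },
  ],
  messages: [{ role: "user", content: "Continue the task." }],
};
```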
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
FL4TLiN3 merged commit 5cbcda9 into main on Feb 25, 2026
11 checks passed
FL4TLiN3 deleted the feat/prompt-cache-breakpoints branch on February 25, 2026 at 03:10
FL4TLiN3 mentioned this pull request on Feb 25, 2026