agent/context/README.md
Lines changed: 5 additions & 5 deletions
@@ -37,9 +37,9 @@ Both [Anthropic](https://docs.anthropic.com/en/docs/build-with-claude/compaction

- **Keep summaries intact** — every summarization pass discards detail. We never re-summarize existing summaries; they're preserved as-is and new summaries are added alongside them ([LangChain `moving_summary_buffer`](https://langchain-doc.readthedocs.io/en/latest/modules/memory/types/summary_buffer.html)).
- **Keep recent turns verbatim** — the most recent exchanges carry the highest signal. Summarize the older prefix, never the tail ([Anthropic](https://www.anthropic.com/engineering/effective-context-engineering-for-ai-agents), [LangChain summary-buffer](https://langchain-doc.readthedocs.io/en/latest/modules/memory/types/summary_buffer.html)). Anthropic's Claude Code uses the same shape: compressed context + the N most recently accessed items ([source](https://www.anthropic.com/engineering/effective-context-engineering-for-ai-agents)).
-- **Use cost as a secondary signal** — large cache write costs indicate a bloated context even if token counts look fine. We use cost thresholds alongside token thresholds ([Anthropic prompt caching](https://docs.anthropic.com/en/docs/build-with-claude/prompt-caching)).
-- **Compact before hitting the hard limit** — model recall drops well before the hard context window. Compact proactively at a soft threshold, not at the edge ([OpenAI cookbook](https://cookbook.openai.com/examples/context_summarization_with_realtime_api), [Anthropic post on compaction](https://docs.anthropic.com/en/docs/build-with-claude/compaction), [Anthropic post on context engineering](https://www.anthropic.com/engineering/effective-context-engineering-for-ai-agents)).
+- **Compact before hitting the hard limit** — model recall drops well before the hard context window. We compact at a configurable soft token limit (defaults to 60% of the context window), matching the industry-standard approach used by [Claude Code](https://docs.anthropic.com/en/docs/build-with-claude/compaction), [OpenAI](https://developers.openai.com/api/docs/guides/context-management), and [LangChain](https://langchain-doc.readthedocs.io/en/latest/modules/memory/types/summary_buffer.html).
- **Don't summarize old tool call results** — raw tool output is useful when fresh but redundant once acted on. Clearing old results is the lightest-touch compaction step ([Anthropic](https://www.anthropic.com/engineering/effective-context-engineering-for-ai-agents)).
+- **Compress tool results in-place** — whether or not we compact the messages, we run tool result compression, which truncates tool results older than the last 3 user turns (see the sketch below).
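
As an illustration of the last point, here is a minimal sketch of in-place tool result compression. It assumes a simplified message shape; `compressToolResults`, `KEEP_RECENT_USER_TURNS`, and the placeholder text are illustrative names, not the library's actual API.

```ts
// Minimal sketch (assumed message shape and names, not the actual implementation):
// truncate tool results older than the last 3 user turns, leaving recent ones verbatim.
type Message =
  | { role: 'user' | 'assistant'; content: string }
  | { role: 'toolResult'; content: string };

const KEEP_RECENT_USER_TURNS = 3;
const TRUNCATION_PLACEHOLDER = '[tool result truncated]';

function compressToolResults(messages: Message[]): Message[] {
  // Walk backwards to find where the 3rd-most-recent user turn starts.
  let userTurns = 0;
  let cutoff = 0; // with fewer than 3 user turns, nothing is truncated
  for (let i = messages.length - 1; i >= 0; i--) {
    if (messages[i].role === 'user' && ++userTurns === KEEP_RECENT_USER_TURNS) {
      cutoff = i;
      break;
    }
  }
  // Blank out tool results older than the cutoff; everything else stays as-is.
  return messages.map((m, i) =>
    i < cutoff && m.role === 'toolResult' ? { ...m, content: TRUNCATION_PLACEHOLDER } : m
  );
}
```

Because this rewrites tool results in place rather than dropping messages, the conversation structure stays intact even when no summarization pass runs.
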
## What you should know
@@ -49,7 +49,7 @@ Both [Anthropic](https://docs.anthropic.com/en/docs/build-with-claude/compaction
When you call `compact()`:
-1. **`checkLimit()`** — checks thresholds (cache write cost, cache write tokens, soft token limit). First match fires a `reached: true`. This mirrors Anthropic's token-threshold trigger and OpenAI's `compact_threshold`.
+1. **`checkLimit()`** — checks whether context tokens exceed the soft token limit (defaults to 60% of the context window). Returns `reached: true` when the limit is crossed.
2. **`split()`** — finds the boundary between "prefix to summarize" and "tail to preserve." We try to keep 5 recent user turns, then 3, then 1, each checked against a token budget (40% of the context window). Falls back to the largest token-bounded suffix that fits. Token-bounded retention follows the LangChain `max_token_limit` pattern.
3. **`summarize()`** — sends the messages to compact to the LLM with a summarization prompt. Custom instructions replace the default prompt (when `instructions.strategy` is set to `replace`), matching Anthropic's `instructions` parameter behavior.
4. **Reassemble** — combines the messages preserved by `split()` with the new `summary` into a single new array of messages (a sketch of this flow follows the list).
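
The four steps above can be read as a small orchestration pipeline. Below is a minimal sketch of that flow, assuming a simplified message shape, a crude token counter, and ratio constants for the 60% soft limit and 40% tail budget; none of these names are the module's actual API, and `summarize` is only declared, not implemented.

```ts
// Illustrative sketch of the compact() flow; all types, constants, and helpers are assumptions.
type Msg = { role: 'user' | 'assistant' | 'toolResult'; content: string };

const SOFT_LIMIT_RATIO = 0.6;  // compact once context tokens exceed 60% of the context window
const TAIL_BUDGET_RATIO = 0.4; // the preserved tail must fit within 40% of the context window

// Crude stand-in for a real tokenizer (~4 characters per token).
const countTokens = (msgs: Msg[]): number =>
  Math.ceil(msgs.reduce((n, m) => n + m.content.length, 0) / 4);

// 1. checkLimit(): has the soft token limit been crossed?
function checkLimit(msgs: Msg[], contextWindow: number): { reached: boolean } {
  return { reached: countTokens(msgs) > contextWindow * SOFT_LIMIT_RATIO };
}

// Index where the last `turns` user turns begin (0 if there are fewer than that many).
function startOfLastUserTurns(msgs: Msg[], turns: number): number {
  let seen = 0;
  for (let i = msgs.length - 1; i >= 0; i--) {
    if (msgs[i].role === 'user' && ++seen === turns) return i;
  }
  return 0;
}

// 2. split(): keep the last 5 user turns if they fit the tail budget, else 3, else 1,
//    otherwise fall back to the largest suffix that fits.
function split(msgs: Msg[], contextWindow: number): { prefix: Msg[]; tail: Msg[] } {
  const budget = contextWindow * TAIL_BUDGET_RATIO;
  for (const turns of [5, 3, 1]) {
    const start = startOfLastUserTurns(msgs, turns);
    if (start > 0 && countTokens(msgs.slice(start)) <= budget) {
      return { prefix: msgs.slice(0, start), tail: msgs.slice(start) };
    }
  }
  let start = msgs.length;
  while (start > 0 && countTokens(msgs.slice(start - 1)) <= budget) start--;
  return { prefix: msgs.slice(0, start), tail: msgs.slice(start) };
}

// 3. summarize(): assumed to call the LLM with the summarization prompt and return summary text.
declare function summarize(prefix: Msg[]): Promise<string>;

// 4. Reassemble: the summary of the prefix followed by the preserved tail.
async function compact(msgs: Msg[], contextWindow: number): Promise<Msg[]> {
  if (!checkLimit(msgs, contextWindow).reached) return msgs;
  const { prefix, tail } = split(msgs, contextWindow);
  if (prefix.length === 0) return msgs; // nothing old enough to summarize
  const summary = await summarize(prefix);
  return [{ role: 'assistant', content: `Summary of earlier conversation:\n${summary}` }, ...tail];
}
```

Because `split()` runs before `summarize()`, the preserved tail is never summarized, which is the "keep recent turns verbatim" principle in code.
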
@@ -58,7 +58,7 @@ When you call `compact()`:
```ts
import type { Model, Api } from '@mariozechner/pi-ai';

const SUMMARIZE_SYSTEM = `You are a summarizer. Given a conversation history, produce a concise summary that preserves key facts, decisions, topics, and context needed to continue the conversation. Output only the summary, no preamble.`;
```
@@ -61,24 +62,15 @@ export async function summarize(