Conversation

@HansPeterRadtke HansPeterRadtke commented Jan 9, 2026

Summary

Type of Change

  • Feature
  • Bug fix
  • Refactor / Code quality
  • Performance improvement
  • Documentation
  • Tests
  • Security fix
  • Build / Release
  • Other (specify below)

AI Assistance

  • This PR was created or reviewed with AI assistance

Testing

Related Issues

Relates to #ISSUE_ID
Discussion: LINK (if any)

Screenshots/Demos (for UX changes)

Before:

After:

@HansPeterRadtke HansPeterRadtke force-pushed the fix/compaction-counts-system-prompt-and-tools branch from 0574f19 to 2c2c438 Compare January 9, 2026 10:05
Collaborator

@DOsinga DOsinga left a comment


I don't think this is how token counting works /cc @katzdave

- Some(tokens) => (tokens as usize, "session metadata"),
+ Some(tokens) => {
+     // Session metadata only tracks message tokens, so we still need to
+     // add the system_prompt + tools overhead
Collaborator


I don't think that is true; if you start a new session, we immediately hit 6.6K tokens. If that was just the first message and the reply, it would be way less.

Author


Yes, exactly — that ~6.6K is the baseline overhead from system prompt + tools (and any default Goose framing). The previous compaction check only counted message tokens (or session metadata), so it could underestimate and trigger compaction too late. This change makes the check include that overhead so we decide correctly even at session start.

Collaborator


Yeah, so I don't think that is true. We get the tokens that were used by the provider, and that includes the tools and the system prompt, so adding that again should not be needed.

Collaborator


Yeah I don't think so either. We are using the provider's token count to determine if we need to compact.

So if we fail with a context-exceeded error from the provider, on compaction we'll remove the system prompt, and the compaction prompt is much shorter than that.

We have some other defense mechanisms too for large inputs (removing old tool responses).

@HansPeterRadtke HansPeterRadtke force-pushed the fix/compaction-counts-system-prompt-and-tools branch from 2c2c438 to 40c4784 Compare January 9, 2026 15:49
@katzdave
Collaborator

Sorry for the delay, was OOO; closing this. Auto-compact uses the counts returned from the provider.

@katzdave katzdave closed this Jan 20, 2026
@HansPeterRadtke
Author

Why this fix is correct

Original code (BEFORE):

let (current_tokens, token_source) = match session.total_tokens {
    Some(tokens) => (tokens as usize, "session metadata"),
    None => {
        let token_counts: Vec<_> = messages
            .iter()
            .filter(|m| m.is_agent_visible())
            .map(|msg| token_counter.count_chat_tokens("", std::slice::from_ref(msg), &[]))
            .collect();
        (token_counts.iter().sum(), "estimated")
    }
};

Problems:

  1. Some(tokens) - uses session tokens directly, no overhead added
  2. None - calls count_chat_tokens("", ..., &[]) with empty system_prompt and empty tools array

Fixed code (AFTER):

Some(tokens) => {
    let overhead = token_counter.count_chat_tokens(system_prompt, &[], tools);
    (tokens as usize + overhead, "session metadata + overhead")
}
None => {
    let total_tokens =
        token_counter.count_chat_tokens(system_prompt, &agent_visible_messages, tools);
    (total_tokens, "estimated with full context")
}

Why the objection is wrong:

"we get the tokens that were used by the provider and that includes the tools and the system prompt"

session.total_tokens stores tokens from the last response. But check_if_compaction_needed predicts if the next request will exceed context.

Next request = system_prompt + tools + messages

Example:

  • session.total_tokens = 5000
  • system_prompt + tools = 6600
  • Old code sees: 5000 tokens → no compaction
  • Reality: next request = 11600 tokens

This explains issue #5255 ("minimum message size is 8k") - the overhead alone fills small contexts but the old code never counted it.
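The arithmetic above can be sketched as a minimal check. The numbers (5000, 6600, an ~8K window) come from the discussion; the 80% trigger threshold is an illustrative assumption, not the actual Goose value:

```rust
// Hypothetical compaction check; the 80% trigger is an assumption.
fn needs_compaction(current_tokens: usize, context_limit: usize) -> bool {
    current_tokens > context_limit * 8 / 10
}

fn main() {
    let session_total_tokens = 5_000; // provider count from the last response
    let overhead = 6_600;             // system_prompt + tools baseline

    // Old check: only the session count is considered.
    let old_estimate = session_total_tokens;
    // New check: overhead is added, matching what the next request sends.
    let new_estimate = session_total_tokens + overhead;

    let context_limit = 8_192; // small context window, as in issue #5255
    println!("old: {} -> compact? {}", old_estimate, needs_compaction(old_estimate, context_limit));
    println!("new: {} -> compact? {}", new_estimate, needs_compaction(new_estimate, context_limit));
}
```

With these numbers the old estimate (5000) stays under the trigger while the real next request (11600) is already past the window.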

@HansPeterRadtke
Copy link
Author

Timing issue: check_if_compaction_needed runs BEFORE the provider call:

  1. User sends new message
  2. check_if_compaction_needed() runs ← uses session.total_tokens from LAST request
  3. Provider call happens
  4. session.total_tokens updated with new count

So session.total_tokens is always ONE message behind - it doesn't include the new user message about to be sent.
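The one-message-behind ordering can be sketched like this. The struct and token numbers are illustrative, not the real Goose types:

```rust
// Illustrative stand-in for the session state.
struct Session {
    total_tokens: Option<usize>,
}

// What check_if_compaction_needed() sees at step 2: the stale count only.
fn tokens_seen_by_check(session: &Session) -> usize {
    session.total_tokens.unwrap_or(0)
}

fn main() {
    // Count left over from the last request/response cycle.
    let mut session = Session { total_tokens: Some(5_000) };

    // Step 1: user sends a new message (say ~400 tokens) ...
    let new_message_tokens = 400;

    // Step 2: ... but the check still sees only the old total.
    assert_eq!(tokens_seen_by_check(&session), 5_000);

    // Steps 3-4: provider call happens, then the count is updated.
    session.total_tokens = Some(tokens_seen_by_check(&session) + new_message_tokens);
    assert_eq!(session.total_tokens, Some(5_400));
}
```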

But the real killer is the None case:

.map(|msg| token_counter.count_chat_tokens("", std::slice::from_ref(msg), &[]))
//         first argument "" = empty system_prompt, last argument &[] = empty tools

When is session.total_tokens None?

  • New session (first message)
  • Session restored from disk
  • Any state where provider hasn't responded yet

In these cases, the old code estimated tokens with zero overhead - that's clearly a bug regardless of what the provider returns later.
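The zero-overhead fallback can be demonstrated with a stub counter that charges one token per whitespace-separated word. The counter, prompt, and tool strings here are stand-ins, not the real Goose token_counter; only the shape of the two calls matters:

```rust
// Stub counter: one token per word across prompt, messages, and tools.
fn count_chat_tokens(system_prompt: &str, messages: &[&str], tools: &[&str]) -> usize {
    let words = |s: &str| s.split_whitespace().count();
    words(system_prompt)
        + messages.iter().map(|m| words(m)).sum::<usize>()
        + tools.iter().map(|t| words(t)).sum::<usize>()
}

fn main() {
    let system_prompt = "You are a helpful assistant with many detailed instructions";
    let tools = ["read_file tool schema", "write_file tool schema"];
    let messages = ["hi"];

    // Old fallback: empty system prompt, empty tools -> only the message counts.
    let old_estimate: usize = messages
        .iter()
        .map(|m| count_chat_tokens("", std::slice::from_ref(m), &[]))
        .sum();

    // Fixed fallback: count the full next request.
    let new_estimate = count_chat_tokens(system_prompt, &messages, &tools);

    println!("old = {old_estimate}, new = {new_estimate}");
    assert!(old_estimate < new_estimate);
}
```

Even with this toy counter the old path sees only the single message token, while the fixed path counts the whole request the provider will actually receive.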

The statement "Auto-compact uses the counts returned from the provider" is only partially true, and completely ignores the fallback path.
