
fix(telemetry): use max instead of sum for streaming token usage aggregation#1098

Open
LearningGp wants to merge 1 commit into agentscope-ai:main from LearningGp:fix/fix-token

Conversation

@LearningGp
Collaborator

AgentScope-Java Version

1.0.11

Description

Use max instead of sum when aggregating token usage across streaming chunks. Providers that report cumulative token totals per chunk are otherwise overcounted.

Checklist

Please check the following items before submitting the code for review.

  • Code has been formatted with mvn spotless:apply
  • All tests are passing (mvn test)
  • Javadoc comments are complete and follow project conventions
  • Related documentation has been updated (e.g. links, examples, etc.)
  • Code is ready for review

Contributor

Copilot AI left a comment


Pull request overview

Adjusts telemetry aggregation for streaming ChatResponse so that token usage is treated as cumulative across chunks (taking the maximum observed value) rather than summing per-chunk values, which can overcount for providers that report cumulative totals.

Changes:

  • Update StreamChatResponseAggregator to aggregate inputTokens/outputTokens via Math.max(...) instead of summation.
  • Add unit tests covering cumulative-usage (max) behavior and “usage only in last chunk” behavior.
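The cumulative-max aggregation described above can be sketched as a standalone example. This is a minimal illustration, not the actual `StreamChatResponseAggregator` code; the `Usage` record and method names here are hypothetical stand-ins for the real `ChatUsage` API.

```java
import java.util.List;

// Minimal sketch of cumulative-max token aggregation for streaming chunks.
// Names are illustrative, not the actual AgentScope-Java API.
public class UsageAggregatorSketch {

    // Hypothetical per-chunk usage: providers that report cumulative totals
    // repeat the running total in every chunk.
    record Usage(long inputTokens, long outputTokens) {}

    static Usage aggregate(List<Usage> chunks) {
        long in = 0, out = 0;
        for (Usage u : chunks) {
            if (u == null) {
                continue; // some chunks carry no usage at all
            }
            // Take the max, not the sum: each chunk already carries the
            // running total, so summing would overcount.
            in = Math.max(in, u.inputTokens());
            out = Math.max(out, u.outputTokens());
        }
        return new Usage(in, out);
    }

    public static void main(String[] args) {
        // Cumulative provider: chunks report 10, 25, 40 input tokens total.
        Usage total = aggregate(List.of(
                new Usage(10, 5), new Usage(25, 12), new Usage(40, 30)));
        System.out.println(total.inputTokens() + " " + total.outputTokens());
    }
}
```

Note that max also handles the "usage only in last chunk" case tested in the PR: earlier chunks contribute zero, so the final chunk's value wins.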

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

Files changed:

  • agentscope-extensions/agentscope-extensions-studio/src/main/java/io/agentscope/core/tracing/telemetry/StreamChatResponseAggregator.java — Switch token usage aggregation from sum to max for streaming telemetry.
  • agentscope-extensions/agentscope-extensions-studio/src/test/java/io/agentscope/core/tracing/telemetry/StreamChatResponseAggregatorTest.java — Add tests validating the new token usage aggregation behavior.

Comment on lines 73 to 78
  ChatUsage usage = chunk.getUsage();
  if (usage != null) {
-     inputTokens.addAndGet(usage.getInputTokens());
-     outputTokens.addAndGet(usage.getOutputTokens());
+     inputTokens = Math.max(inputTokens, usage.getInputTokens());
+     outputTokens = Math.max(outputTokens, usage.getOutputTokens());
      time = usage.getTime();
  }

Copilot AI Apr 1, 2026


inputTokens/outputTokens were switched to take the max across chunks (cumulative semantics), but time is still overwritten with the latest chunk’s value. If providers emit cumulative elapsed time (similar to cumulative token totals), this can regress to a smaller value when later chunks omit/reset time. Consider aggregating time consistently (e.g., take max across chunks, or only update when the new value is greater/non-zero).
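The reviewer's suggestion (only let `time` move forward, mirroring the token logic) could look roughly like this. A minimal sketch under the assumption that `time` is a cumulative elapsed-time value; the field and method names are illustrative, not the actual aggregator code.

```java
// Sketch of the reviewer's suggestion: update `time` monotonically so a
// later chunk that omits or resets elapsed time cannot regress it.
// Names are illustrative, not the actual StreamChatResponseAggregator API.
public class TimeAggregationSketch {

    private long time; // largest cumulative elapsed time observed so far

    void onChunkTime(long chunkTime) {
        // Take the max rather than overwriting, consistent with the
        // max-based token aggregation in this PR.
        time = Math.max(time, chunkTime);
    }

    long time() {
        return time;
    }

    public static void main(String[] args) {
        TimeAggregationSketch agg = new TimeAggregationSketch();
        agg.onChunkTime(120);
        agg.onChunkTime(0); // late chunk with missing/reset time
        System.out.println(agg.time());
    }
}
```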

@codecov

codecov bot commented Apr 1, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.


