feat(telemetry): emit gen_ai.usage.cached_tokens across all providers #1497
Merged
joshua-mo-143 merged 1 commit into 0xPlaygrounds:main on Mar 16, 2026
Conversation
Cached input tokens are billed at reduced rates and processed faster, but this data was not being surfaced in telemetry spans. This adds `gen_ai.usage.cached_tokens` to every provider's tracing spans for cost tracking and cache hit rate observability.

- Update `record_token_usage` in `SpanCombinator` to emit `cached_input_tokens`
- Add `gen_ai.usage.cached_tokens` field to all span declarations
- Record cached tokens in manual recording sites (OpenAI, DeepSeek, Groq, xAI)
- Record cached tokens in agent-level spans (`prompt_request`)
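The recording pattern described above can be sketched in isolation. This is a minimal, self-contained illustration, not rig's actual `SpanCombinator` API: the `Usage` struct and `token_usage_attributes` helper are hypothetical names, standing in for a provider usage payload and a span attribute recorder. The key behavior shown is that the cached-token count is optional in provider responses and defaults to zero when absent:

```rust
// Hypothetical usage payload (not rig's real type): several provider APIs,
// e.g. OpenAI's `prompt_tokens_details.cached_tokens`, omit the cached
// count on a full cache miss, so it is modeled as an Option.
#[derive(Debug, Default)]
struct Usage {
    input_tokens: u64,
    output_tokens: u64,
    cached_input_tokens: Option<u64>,
}

// Hypothetical stand-in for recording fields on a tracing span: builds the
// gen_ai.usage.* attribute set, emitting 0 cached tokens when the provider
// did not report any.
fn token_usage_attributes(usage: &Usage) -> Vec<(&'static str, u64)> {
    vec![
        ("gen_ai.usage.input_tokens", usage.input_tokens),
        ("gen_ai.usage.output_tokens", usage.output_tokens),
        (
            "gen_ai.usage.cached_tokens",
            usage.cached_input_tokens.unwrap_or(0),
        ),
    ]
}
```

Defaulting to zero (rather than omitting the attribute) keeps the field present on every span, which simplifies downstream cache-hit-rate queries.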
Collaborator
This seems mostly ready? I think? Are there any blockers here? @alwayys-afk
Contributor
Author
Yeah should be good
Summary
- Adds `gen_ai.usage.cached_tokens` to tracing spans across all providers and agent-level spans
- Updates `record_token_usage` in `SpanCombinator` to automatically emit `cached_input_tokens` for providers using the telemetry trait

Cached input tokens are billed at reduced rates (typically 50-90% cheaper) and processed faster. Emitting them in telemetry spans enables accurate cost tracking, cache hit rate monitoring, and latency correlation.
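The cost-tracking motivation can be made concrete with a small arithmetic sketch. The function and its rates below are purely illustrative, not any provider's actual pricing: cached input tokens are billed at some fraction of the uncached rate, so the effective input cost splits the total into cached and uncached portions:

```rust
// Illustrative cost model (rates and discount are made-up example values):
//   cost = uncached_tokens * rate + cached_tokens * rate * discount
fn input_cost_usd(
    input_tokens: u64,
    cached_tokens: u64,
    rate_per_token: f64,
    cached_multiplier: f64, // e.g. 0.5 for a 50% cached-token discount
) -> f64 {
    // Guard against a provider reporting more cached than total tokens.
    let uncached = input_tokens.saturating_sub(cached_tokens) as f64;
    uncached * rate_per_token + cached_tokens as f64 * rate_per_token * cached_multiplier
}
```

With the span attribute from this PR, a telemetry backend can compute this per request from `gen_ai.usage.input_tokens` and `gen_ai.usage.cached_tokens` alone.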
This PR was generated with assistance from Claude.
Test plan
- `cargo check -p rig-core` passes
- `cargo clippy -p rig-core --all-targets --all-features` passes
- `cargo fmt -- --check` passes