ENG-2922 Token usage tracking in model responses and agent progress#890

Merged
aix-ahmet merged 4 commits into development from ENG-2922-Add-token-usage-in-llm-calls-in-SDK on Apr 1, 2026
Conversation

@aix-ahmet (Collaborator)

Summary

  • Model response token usage: Add usage (prompt/completion/total tokens) and asset fields to model response handling in both V1 and V2, with robust parsing that gracefully handles NaN, null, and string values from inconsistent backend responses.
  • Fix model.run() hanging: Resolve issue where model.run() on sync-only models would return IN_PROGRESS without polling, and where NaN token values caused deserialization errors leading to infinite retry loops.
  • Agent progress token display: Show input/output/total tokens inline on each step line at all verbosity levels, with aggregated totals in the completion summary.
  • Backend documentation: Include curl commands documenting token reporting inconsistencies across different LLM providers for the backend team.

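The "robust parsing" bullet above could look something like the following minimal sketch. The helper name `safe_token_count` is hypothetical (the PR does not show the actual function); it only illustrates coercing the inconsistent NaN/null/string token values the summary describes into `Optional[int]`:

```python
import math
from typing import Any, Optional


def safe_token_count(value: Any) -> Optional[int]:
    """Coerce a token count from an inconsistent backend payload.

    Accepts ints, floats, and numeric strings; returns None for NaN,
    null, empty strings, and anything unparseable.
    """
    if value is None:
        return None
    if isinstance(value, bool):
        # bool is an int subclass in Python; reject it explicitly
        return None
    if isinstance(value, (int, float)):
        if isinstance(value, float) and math.isnan(value):
            return None
        return int(value)
    if isinstance(value, str):
        text = value.strip()
        if not text or text.lower() in ("nan", "null", "none"):
            return None
        try:
            return int(float(text))
        except ValueError:
            return None
    return None
```

A decoder like this avoids the `int("NaN")` failure mode mentioned later in the PR, since every malformed count degrades to `None` instead of raising.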
Test plan

  • All 911 unit tests pass
  • Pre-commit hooks pass (ruff lint, ruff format, pytest)
  • Manual verification: run an agent with progress_verbosity=1,2,3 and confirm token counts appear on step lines
  • Manual verification: run check_llm_usage.py to confirm token parsing for various LLM backends

Made with Cursor

Surface token usage (prompt_tokens, completion_tokens, total_tokens)
and asset info from model serving as first-class fields on model
responses. Fix V2 poll path which previously dropped usage and asset
from the filtered response dict for async models.

Also fix pre-existing broken tests: remove test_action_inputs_proxy.py
(it imports the removed ActionInputsProxy class) and fix the
subagents -> agents assertion in test_v2_agent_duplicate.py.
@aix-ahmet force-pushed the ENG-2922-Add-token-usage-in-llm-calls-in-SDK branch from 4055df1 to e691f72 on April 1, 2026 at 21:54
Several model providers (GPT-5.4, Claude, Mistral Large) return
"NaN"/null for token counts in the usage block. This caused:
1. Usage dataclass deserialization to fail (int("NaN"))
2. sync_poll to retry the same completed response forever until timeout
3. Sync-only model.run() to return IN_PROGRESS without polling

Changes:
- Make Usage fields Optional[int] with a safe decoder that handles
  NaN, null, strings, and floats gracefully
- Add poll fallback in resource.poll() for completed responses that
  fail deserialization
- Add polling after _run_sync_v2() for sync models that return a
  poll URL

Made-with: Cursor
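The poll fallback described in the commit message might be sketched as below. All names (`sync_poll`, `poll_fn`, `parse_fn`) are hypothetical stand-ins, not the SDK's actual API; the point is only the control flow: a completed response that fails deserialization is returned raw rather than retried until timeout:

```python
import time
from typing import Any, Callable, Dict


def sync_poll(
    poll_fn: Callable[[], Dict[str, Any]],
    parse_fn: Callable[[Dict[str, Any]], Any],
    timeout_s: float = 60.0,
    wait_s: float = 0.5,
) -> Any:
    """Poll until the response reaches a terminal status.

    A deserialization failure on a *completed* payload (e.g. int("NaN")
    in the usage block) is treated as terminal: return the raw dict
    instead of retrying the same completed response forever.
    """
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        raw = poll_fn()
        if raw.get("status") in ("SUCCESS", "FAILED"):
            try:
                return parse_fn(raw)
            except (ValueError, TypeError):
                # Completed but undeserializable: fall back to raw payload.
                return raw
        time.sleep(wait_s)
    raise TimeoutError("sync_poll timed out")
```

The same terminal-status check is what lets a sync-only `model.run()` keep polling after receiving a poll URL instead of returning IN_PROGRESS immediately.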
Display input/output/total tokens inline on each step line and
aggregate totals in the completion summary. Also includes curl
commands documenting backend token reporting inconsistencies.

Made-with: Cursor
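The inline token display described in this commit could be rendered by a small formatter along these lines. The function name and the bracketed layout are assumptions for illustration, not the SDK's actual output format; missing counts (None after safe parsing) are shown as a dash:

```python
from typing import Optional


def format_step_tokens(
    in_tokens: Optional[int],
    out_tokens: Optional[int],
    total_tokens: Optional[int],
) -> str:
    """Render input/output/total token counts for one step line.

    Counts that failed to parse (None) are shown as '-'.
    """
    def show(v: Optional[int]) -> str:
        return "-" if v is None else str(v)

    return f"[tokens in={show(in_tokens)} out={show(out_tokens)} total={show(total_tokens)}]"
```

The completion summary would then sum the per-step counts, skipping `None` values so one malformed step does not poison the aggregate.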
@aix-ahmet force-pushed the ENG-2922-Add-token-usage-in-llm-calls-in-SDK branch from e691f72 to b1530cc on April 1, 2026 at 21:56
@aix-ahmet merged commit a0ca7eb into development on Apr 1, 2026
1 check passed