feat: 2602 agent workflow metrics 2 by skamenan7 · Pull Request #3491 · llamastack/llama-stack

skamenan7 · 2025-09-19T15:06:03Z

What does this PR do?

Adds workflow metrics tracking to the agent system to monitor performance and usage patterns. The implementation
tracks step execution, workflow completion times, and tool usage with proper telemetry integration.

The metrics provide visibility into agent behavior and can be queried using the telemetry system. Tool names are
normalized for consistency (knowledge_search becomes rag).
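A minimal sketch of the name normalization described above. Only the `knowledge_search` to `rag` mapping comes from this PR. The alias table, the function name, and the pass-through behavior for other tools are illustrative assumptions, not the actual implementation.

```python
# Hypothetical alias table: only knowledge_search -> rag is stated in this PR.
TOOL_NAME_ALIASES: dict[str, str] = {"knowledge_search": "rag"}


def normalize_tool_name(tool_name: str) -> str:
    """Return the label used in metrics for a tool.

    Assumption: tools without an alias keep their original name.
    """
    return TOOL_NAME_ALIASES.get(tool_name, tool_name)
```

With this sketch, `normalize_tool_name("knowledge_search")` yields `rag`, while a name with no alias, such as `web_search`, is returned unchanged.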

Closes #2602

Test Plan

Integration Test:

LLAMA_STACK_CONFIG="http://localhost:8321" python -m pytest tests/integration/agents/test_agent_metrics_integration.py::TestAgentMetricsIntegration::test_agent_metrics_end_to_end -v

This validates that:

Agent workflows generate the expected metrics (steps, tools, duration)
Tool calls are tracked with normalized names
Metrics can be queried via telemetry.query_metrics()
Both web_search and knowledge_search tools appear in results

Verifies metrics are properly collected and queryable, building on query functionality from #3074.
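The checks listed above could be expressed roughly as follows. This is a sketch over a hypothetical in-memory record type, not the real `telemetry.query_metrics()` response, whose shape is not shown in this PR. The `MetricRecord` class, the `tool` label key, and any metric names passed in are assumptions for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class MetricRecord:
    """Hypothetical stand-in for one queried metric data point."""
    name: str
    value: float
    labels: dict[str, str] = field(default_factory=dict)


def missing_metrics(records: list[MetricRecord], expected: list[str]) -> list[str]:
    """Return the expected metric names that do not appear in the records."""
    seen = {record.name for record in records}
    return sorted(set(expected) - seen)


def tools_seen(records: list[MetricRecord], tool_metric: str) -> set[str]:
    """Collect the tool labels attached to a given tool-usage metric."""
    return {
        record.labels["tool"]
        for record in records
        if record.name == tool_metric and "tool" in record.labels
    }
```

A test could then assert that `missing_metrics(...)` returns an empty list and that `tools_seen(...)` contains the expected tool labels.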

Please note: Most of this code was already reviewed in #2993, but we wanted to test the metrics using query_metrics from #3074, and I had been focusing on higher-priority items. I created this PR to make review easier, because the commits had become numerous and hard to follow.

Add comprehensive OpenTelemetry-based metrics for agent observability:
- Workflow completion/failure tracking with duration measurements
- Step execution counters for performance monitoring
- Tool usage tracking with normalized tool names
- Non-blocking telemetry emission with named async tasks
- Comprehensive unit and integration test coverage
- Graceful handling when telemetry is disabled

- simplified test to use telemetry.query_metrics for verification
- test now validates actual queryable metrics data
- verified by query metrics functionality added in llamastack#3074

skamenan7 · 2025-09-19T15:11:46Z

cc: @cdoern, please review, since you developed query_metrics and we are using it to test these metrics. Thanks!

skamenan7 added 2 commits September 19, 2025 10:39

improve agent metrics integration test and cleanup fixtures

8f0413e


meta-cla bot added the CLA Signed This label is managed by the Meta Open Source bot. label Sep 19, 2025

skamenan7 marked this pull request as ready for review September 19, 2025 15:11

skamenan7 requested review from ashwinb, bbrowning, ehhuang, hardikjshah, leseb, mattf, raghotham, reluctantfuturist, slekkala1, terrytangyuan and yanxi0830 as code owners September 19, 2025 15:11

skamenan7 changed the title ~~Feature/2602 agent workflow metrics 2~~ feat: 2602 agent workflow metrics 2 Sep 22, 2025

skamenan7 closed this Nov 3, 2025

skamenan7 wants to merge 2 commits into llamastack:main from skamenan7:feature/2602-agent-workflow-metrics-2
