Agent Performance Report - Week of 2026-03-13 #20832
Closed
Replies: 1 comment
-
|
This discussion was automatically closed because it expired on 2026-03-14T17:43:02.033Z.
|
Beta Was this translation helpful? Give feedback.
0 replies
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Uh oh!
There was an error while loading. Please reload this page.
-
Executive Summary
Performance Rankings
🏆 Top Performing Agents
📉 Agents Needing Improvement
1. Issue Monster — Quality: 20/100, Effectiveness: 0/100
2. Daily Copilot PR Merged Report —⚠️ New regression today
3. Smoke Create Cross-Repo PR — Quality: 30/100
4. Smoke Copilot (scheduled) — Quality: 40/100
⏸️ Inactive / Skipped Agents (7-day window)
61 of 80 runs in the 7-day window were skipped — these are event-triggered workflows (PRs, comments, slash commands) with no qualifying events during the period. This is expected behavior, not a problem.
Notable event-triggered agents with no recent activity:
Quality Analysis
Output Quality Distribution
Common Quality Issues
Cost & Resource Analysis
Today's runs — total cost: $4.28, 11.8M tokens
Efficiency observations:
7-Day Trend Analysis (Mar 7–13)
Runs breakdown:
Recurring failures this week:
Recoveries this week (major positive):
Engine distribution:
Behavioral Patterns
✅ Productive Patterns
Recommendations
🔴 High Priority
1. Escalate GH_AW_GITHUB_TOKEN (#20315)
2. Investigate Daily Copilot PR Merged Report regression
3. Smoke test infrastructure review
🟡 Medium Priority
4. Daily Documentation Updater cost monitoring
5. Add Smoke Gemini tracking issue
6. Contribution Check flakiness
🟢 Low Priority
7. Standardize safe_output tracking — Many workflows show safe_output_count=0 in logs even when producing outputs; metric capture may need improvement for better future analysis
8. Review event-triggered agent activation rates — 76% skip rate is healthy but audit to ensure no missed events
Trends
Actions Taken This Run
Next Steps
References:
Beta Was this translation helpful? Give feedback.
All reactions