Agent Performance Report – Week of 2026-03-15 #21099
This discussion was automatically closed because it expired on 2026-03-16T17:37:49.069Z.
Executive Summary
GH_AW_GITHUB_TOKEN lockdown ([P1] Lockdown token failures: Issue Monster, PR Triage Agent, Daily Issues Report #20315), with 2 anomalous successes today warranting investigation
Performance Rankings
Top Performing Agents 🏆
All 10 regularly scheduled agents above achieved 100% success on their triggered runs.
Notable: Push-Event Failures Are Expected Behavior
27 workflows triggered by push events all show failure/completed status with zero errors and no duration. This is the expected activation filter behavior: workflows triggered on push to non-matching branches (e.g., copilot/* branches) immediately exit the pre_activation job without running the agent. These are not real failures and should not be counted against agent quality scores.
Agents Requiring Attention 📉
Issue Monster — P0 Critical, Ongoing
lockdown: true configured but GH_AW_GITHUB_TOKEN secret not set ([P1] Lockdown token failures: Issue Monster, PR Triage Agent, Daily Issues Report #20315)
Lockdown mode is enabled (lockdown: true) but no custom GitHub token is configured
Quality Analysis
Output Quality Distribution
Based on 7-day run data and historical outputs:
Common quality patterns observed:
Effectiveness Analysis
Task Completion & Efficiency
Cost efficiency (from 2026-03-13 data):
Runtime efficiency:
Safe output usage:
Behavioral Patterns
Productive Patterns ✅
schedule, issues, and issue_comment events with consistent performance, demonstrating good event isolation
Problematic Patterns ⚠️
GH_AW_GITHUB_TOKEN) has caused a cascade affecting 4 workflows for 12+ days. The lockdown pattern, while intentionally secure, creates a hard dependency that blocks entire agent capabilities when the secret is unavailable. The monitoring has been in place but the root cause remains unresolved.
Coverage Analysis
Agent Coverage Map
Well-covered areas:
Known gaps (from P0 outage):
Engine diversity:
Recommendations
High Priority
Investigate Issue Monster anomalous successes today: runs #2895 and #2897 succeeded for the first time in ~2 weeks. Check if:
GH_AW_GITHUB_TOKEN was temporarily set (then removed?)
Implement circuit breaker for lockdown failures ([P1] Lockdown token failures: Issue Monster, PR Triage Agent, Daily Issues Report #20315): Issue Monster is burning runner minutes with 20+ failed runs/day. Consider adding a pre-check job that verifies token availability before attempting activation, or reducing schedule frequency to once per day when in degraded mode.
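The pre-check and degraded-mode ideas in the circuit-breaker recommendation could be sketched roughly as follows. This is a minimal illustration, not the actual activation logic: the function name lockdown_precheck, the PrecheckResult type, the failure threshold of 3, and the 24-hour retry interval are all hypothetical. Only the GH_AW_GITHUB_TOKEN name and the lockdown: true setting come from this report.

```python
from dataclasses import dataclass
from typing import Mapping

# Secret name taken from the report; everything else below is hypothetical.
TOKEN_ENV_VAR = "GH_AW_GITHUB_TOKEN"


@dataclass(frozen=True)
class PrecheckResult:
    should_activate: bool
    reason: str


def lockdown_precheck(
    lockdown: bool,
    env: Mapping[str, str],
    consecutive_failures: int = 0,
    hours_since_last_attempt: float = 0.0,
    failure_threshold: int = 3,            # assumed value
    degraded_interval_hours: float = 24.0,  # "once per day when in degraded mode"
) -> PrecheckResult:
    """Decide whether a scheduled run should proceed to activation."""
    if not lockdown:
        return PrecheckResult(True, "lockdown disabled")

    # Pre-check: skip cheaply when the token is missing or blank.
    if not env.get(TOKEN_ENV_VAR, "").strip():
        return PrecheckResult(False, f"lockdown enabled but {TOKEN_ENV_VAR} is not set")

    # Circuit breaker: after repeated failures, retry at most once per interval.
    if (
        consecutive_failures >= failure_threshold
        and hours_since_last_attempt < degraded_interval_hours
    ):
        return PrecheckResult(False, "circuit open: waiting for degraded-mode retry window")

    return PrecheckResult(True, "token present")
```

A caller would pass os.environ as env and skip the agent job whenever should_activate is false, which avoids spending runner minutes on runs that are known to fail. How the consecutive failure count and last-attempt time are tracked between runs is left open here.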
Medium Priority
Investigate Daily Copilot PR Merged Report failure (from 2026-03-13) — Run §23062733795 failed. Confirm if this was a one-time fluke or an emerging regression.
Close resolved tracking issues — [aw] Smoke Codex failed #20285 (Smoke Codex) and [aw] Duplicate Code Detector failed #20304 (Duplicate Code Detector) both recovered on 3/11 and should be closed to reduce issue backlog noise.
Low Priority
Trends
The ecosystem is in a stable, mature state with a persistent P0 exception. The anomalous Issue Monster successes today are the most notable development this week.
Actions Taken This Run
agent-performance-latest.md, shared-alerts.md)
References: