Agent Performance Report – Week of 2026-03-15 #21099

2026-03-15T17:37:49Z

github-actions[bot]
bot Mar 15, 2026

Executive Summary

Agents analyzed: 27 distinct workflows in 7-day window
Scheduled/event runs reviewed: 18 (excluding expected push-filter skips)
Overall quality score: 86/100 (stable, ↑1 from 3/08)
Overall effectiveness score: 86/100 (stable)
Ecosystem health: 72/100 (stable — P0 lockdown persists)
Top performers: AI Moderator, Daily Safe Outputs Conformance Checker, Semantic Function Refactoring
Critical issue: Issue Monster (P0): ~92% scheduled failure rate today, after roughly two weeks at 100%, due to GH_AW_GITHUB_TOKEN lockdown ([P1] Lockdown token failures: Issue Monster, PR Triage Agent, Daily Issues Report #20315). The 2 anomalous successes today warrant investigation.

Performance Rankings

Top Performing Agents 🏆

Agent	Schedule Success	Avg Duration	Notes
AI Moderator	100% (4/4)	7.9 min	Handles schedule + issues + issue_comment events
Semantic Function Refactoring	100% (1/1)	9.8 min	High-value code improvement agent
Chroma Issue Indexer	100% (1/1)	8.5 min	Stable indexing agent
Workflow Skill Extractor	100% (1/1)	11.7 min	Longest schedule runtime but consistent
Daily Team Evolution Insights	100% (1/1)	7.4 min	Reliable analytics
Lockfile Statistics Analysis Agent	100% (1/1)	6.3 min	Consistent
The Great Escapi	100% (1/1)	3.5 min	Fast, efficient
Contribution Check	100% (1/1)	3.7 min	Fast, accurate
Daily Safe Outputs Conformance Checker	100% (1/1)	3.4 min	Fastest scheduled agent
/cloclo	100% (event-triggered)	13.2 min	1 cancelled discussion_comment run (expected)

All 10 agents above achieved 100% success on their triggered runs: nine on schedule, plus the event-driven /cloclo.

Notable: Push-Event Failures Are Expected Behavior

27 workflows triggered by push events all show failure/completed status with zero errors and no duration. This is the expected activation filter behavior — workflows triggered on push to non-matching branches (e.g., copilot/* branches) immediately exit the pre_activation job without running the agent. These are not real failures and should not be counted against agent quality scores.
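
The exclusion rule described above can be sketched as a small filter. This is an illustration only, not the analyzer's actual logic: the `Run` record and its field names (`event`, `conclusion`, `error_count`, `duration_s`) are hypothetical, and it assumes a push-filter skip can be recognized as a push-triggered run with zero errors and no recorded duration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Run:
    # Hypothetical record shape; field names are assumptions for illustration.
    event: str                   # e.g. "push", "schedule", "issues"
    conclusion: str              # e.g. "success", "failure"
    error_count: int
    duration_s: Optional[float]  # None when the agent job never started


def is_push_filter_skip(run: Run) -> bool:
    """True for push runs that exited in pre_activation without running the agent."""
    return run.event == "push" and run.error_count == 0 and not run.duration_s


def scored_runs(runs: list[Run]) -> list[Run]:
    """Drop expected push-filter skips so they do not count against quality scores."""
    return [r for r in runs if not is_push_filter_skip(r)]


def success_rate(runs: list[Run]) -> float:
    """Fraction of scored runs that succeeded (0.0 when nothing is scorable)."""
    kept = scored_runs(runs)
    if not kept:
        return 0.0
    return sum(r.conclusion == "success" for r in kept) / len(kept)
```

Filtering before scoring keeps the denominator limited to runs where the agent actually had a chance to execute.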

Agents Requiring Attention 📉

Issue Monster — P0 Critical, Ongoing

Failure rate: ~92% on scheduled runs today (23+ errors); first 2 successful runs in weeks observed today
Root cause: lockdown: true configured but GH_AW_GITHUB_TOKEN secret not set ([P1] Lockdown token failures: Issue Monster, PR Triage Agent, Daily Issues Report #20315)
Error: Lockdown mode is enabled (lockdown: true) but no custom GitHub token is configured
Affected workflows: Issue Monster, PR Triage Agent, Daily Issues Report, Org Health Report (all sharing the same token dependency)
New signal today: Runs #2895 (15:40Z) and #2897 (16:04Z) succeeded, the first successes in ~2 weeks. This suggests the token was briefly available or a configuration change was made temporarily. Warrants investigation.
Status: Tracking issue [P1] Lockdown token failures: Issue Monster, PR Triage Agent, Daily Issues Report #20315 OPEN. No automated fix path. Human escalation required.
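
As a quick check on the ~92% figure, the rate follows from the counts reported above if exactly 23 scheduled runs failed and 2 succeeded. The report says "23+", so treating 23 as exact is an assumption, and `failure_rate` is a hypothetical helper rather than part of the analyzer.

```python
def failure_rate(failures: int, successes: int) -> float:
    """Share of runs that failed; raises if there were no runs at all."""
    total = failures + successes
    if total == 0:
        raise ValueError("no runs to score")
    return failures / total


# Assumed counts for today's scheduled Issue Monster runs: 23 failures, 2 successes.
rate = failure_rate(failures=23, successes=2)
print(f"{rate:.0%}")  # prints "92%"
```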

Quality Analysis

Output Quality Distribution

Based on 7-day run data and historical outputs:

Tier	Score	Agents
Excellent (80–100)	86+	AI Moderator, Semantic Refactoring, Safe Outputs Checker, Escapi, Chroma Indexer, Team Evolution
Good (60–79)	~75	/cloclo, Contribution Check, Workflow Skill Extractor
Needs Work (<60)	—	Issue Monster (blocked by infrastructure, not agent quality)
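
The tier boundaries in the table can be expressed as a simple lookup. A minimal sketch, assuming integer scores on a 0 to 100 scale; `quality_tier` is a hypothetical name, not something the analyzer is known to expose.

```python
def quality_tier(score: int) -> str:
    """Map a 0-100 quality score to the tier names used in the table above."""
    if not 0 <= score <= 100:
        raise ValueError(f"score out of range: {score}")
    if score >= 80:
        return "Excellent"
    if score >= 60:
        return "Good"
    return "Needs Work"
```

Note that a low tier says nothing about cause: Issue Monster lands in "Needs Work" because of infrastructure, not agent quality.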

Common quality patterns observed:

All active agents produce well-structured, actionable output
No duplication or scope creep detected in this period
Safe output limits are being respected (14 total safe items in 45 runs = 0.31 per run, well within limits)

Effectiveness Analysis

Task Completion & Efficiency

Cost efficiency (from 2026-03-13 data):

Total ecosystem cost: ~$4.28/day
Most expensive: Daily Documentation Updater ($1.41), Semantic Function Refactoring ($1.14)
Most efficient: Daily Safe Outputs Conformance Checker (3.4 min, low cost)

Runtime efficiency:

Fastest: Daily Safe Outputs Conformance Checker (3.4 min), Contribution Check (3.7 min)
Slowest: Workflow Skill Extractor (11.7 min), /cloclo interactive (13.2 min)
All within acceptable bounds; no runaway agents detected

Safe output usage:

14 safe items across 45 runs = efficient, targeted output creation
No over-creation pattern detected

Behavioral Patterns

Productive Patterns ✅

AI Moderator multi-event handling: Successfully handles schedule, issues, and issue_comment events with consistent performance — demonstrates good event isolation
Schedule reliability: 9/10 regularly scheduled agents at 100% — healthy ecosystem discipline
Push-filter discipline: All agents properly skip activation on non-matching push events (no wasted compute)

Problematic Patterns ⚠️

Issue Monster cascade failure: A single missing secret (GH_AW_GITHUB_TOKEN) has caused a cascade affecting 4 workflows for 12+ days. The lockdown pattern, while intentionally secure, creates a hard dependency that blocks entire agent capabilities when the secret is unavailable. The monitoring has been in place but the root cause remains unresolved.
Retry storm: Issue Monster runs on a 30-minute or hourly schedule, generating 20+ failed runs per day. Each failed run consumes ~50 seconds of GitHub Actions runner time even though it fails immediately. Consider implementing exponential backoff or a circuit breaker for lockdown failures.
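
One way the exponential backoff suggested above could look is sketched here. The 30-minute base interval and the 24-hour cap are assumptions chosen for illustration, and `next_delay` is a hypothetical helper; how the consecutive-failure count would be persisted between runs is left open.

```python
from datetime import timedelta

BASE_INTERVAL = timedelta(minutes=30)  # assumed normal schedule interval
MAX_INTERVAL = timedelta(hours=24)     # assumed cap: at most one attempt per day


def next_delay(consecutive_failures: int) -> timedelta:
    """Double the wait after each consecutive failure, up to MAX_INTERVAL."""
    if consecutive_failures < 0:
        raise ValueError("consecutive_failures must be non-negative")
    # Bound the exponent so the intermediate value stays small.
    exponent = min(consecutive_failures, 16)
    return min(BASE_INTERVAL * (2 ** exponent), MAX_INTERVAL)
```

With these assumed values the cap is reached after six consecutive failures, so a persistent lockdown failure would settle at one attempt per day instead of 20+.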

Coverage Analysis

Agent Coverage Map

Well-covered areas:

Code quality (Semantic Refactoring, Grumpy Code Reviewer, Security Review)
Issue triage and moderation (AI Moderator, Auto-Triage, Contribution Check)
Documentation (Daily Documentation Updater, Documentation Unbloat)
Analytics (Team Evolution Insights, Lockfile Statistics, Resource Summarizer)
Ecosystem health (Safe Outputs Checker, Workflow Health Manager, Agent Performance Analyzer)
Interactive tools (/cloclo, ACE Editor, Scout)

Known gaps (from P0 outage):

Issue creation and bulk issue management (Issue Monster blocked)
PR triage pipeline (PR Triage Agent blocked)
Daily issue reports (Daily Issues Report blocked)
Org health reporting (Org Health Report blocked)

Engine diversity:

Copilot: majority of scheduled agents
Claude: several analysis agents (Semantic Refactoring, etc.)
Codex: Smoke Codex (now recovered), other utilities
Good balance maintained

Recommendations

High Priority

Investigate Issue Monster anomalous successes today. Runs #2895 and #2897 succeeded for the first time in ~2 weeks. Check whether:
- GH_AW_GITHUB_TOKEN was temporarily set and then removed
- The lockdown configuration was briefly changed
- This is a one-time anomaly or a recurring pattern
- The successes relate to any configuration changes merged today
Implement circuit breaker for lockdown failures ([P1] Lockdown token failures: Issue Monster, PR Triage Agent, Daily Issues Report #20315) — Issue Monster is burning runner minutes with 20+ failed runs/day. Consider adding a pre-check job that verifies token availability before attempting activation, or reducing schedule frequency to once per day when in degraded mode.
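
The token pre-check mentioned above might be sketched as follows. This assumes the secret is exposed to the job as an environment variable with the same name, which the report does not confirm; `should_activate` is a hypothetical helper, not an existing workflow feature.

```python
import os
from typing import Mapping, Optional

TOKEN_ENV_VAR = "GH_AW_GITHUB_TOKEN"


def should_activate(lockdown: bool, env: Optional[Mapping[str, str]] = None) -> bool:
    """Return False when lockdown is on but the token is missing or blank."""
    env = os.environ if env is None else env
    if not lockdown:
        # Without lockdown there is no custom-token requirement to verify.
        return True
    return bool(env.get(TOKEN_ENV_VAR, "").strip())
```

A pre-check job could call `should_activate(lockdown=True)` and skip the agent job when it returns False, so that a missing secret produces one cheap skip rather than a failed activation.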

Medium Priority

Investigate Daily Copilot PR Merged Report failure (from 2026-03-13) — Run §23062733795 failed. Confirm if this was a one-time fluke or an emerging regression.
Close resolved tracking issues — [aw] Smoke Codex failed #20285 (Smoke Codex) and [aw] Duplicate Code Detector failed #20304 (Duplicate Code Detector) both recovered on 3/11 and should be closed to reduce issue backlog noise.

Low Priority

Smoke cross-repo PR ([aw] Smoke Update Cross-Repo PR failed #20288) — Ongoing P2 infrastructure issue. Consider whether this is blocking any campaigns.

Trends

Metric	Current	3/13	3/08	Direction
Quality score	86/100	86/100	85/100	↑ stable
Effectiveness	86/100	86/100	85/100	↑ stable
Health score	72/100	72/100	70/100	↑ stable
Active agents (healthy)	~162/166	~162/166	~160/166	stable
Issue Monster status	2 successes today!	0%	0%	⚠️ investigate

The ecosystem is in a stable, mature state with a persistent P0 exception. The anomalous Issue Monster successes today are the most notable development this week.

Actions Taken This Run

Generated this performance analysis discussion
Updated shared memory (agent-performance-latest.md, shared-alerts.md)
No new improvement issues created (all issues already tracked; [P1] Lockdown token failures: Issue Monster, PR Triage Agent, Daily Issues Report #20315 covers P0)

References:

§23115466613 — This report's workflow run
§23115089985 — Issue Monster failure audit (latest)
§23062733795 — Daily Copilot PR Merged Report failure (3/13)

Analysis period: 2026-03-08 to 2026-03-15 · Next report: 2026-03-22

AI generated by Agent Performance Analyzer - Meta-Orchestrator · history

expires on Mar 16, 2026, 5:37 PM UTC

2026-03-16T18:59:01Z

github-actions[bot]
bot Mar 16, 2026
Author

This discussion was automatically closed because it expired on 2026-03-16T17:37:49.069Z.

Closed by Workflow

0 replies
