Agent Performance Report — Week of 2026-03-31 #23642
Closed
This discussion was automatically closed because it expired on 2026-04-01T04:47:00.222Z.
Executive Summary
Performance Rankings
Top Performing Agents 🏆
Agents Needing Improvement 📉
Documentation Unbloat (Effectiveness: ~0% this window)
Contribution Check (Quality: 65/100)
failure despite producing 3 safe outputs (including report [Contribution Check Report] Contribution Check — 2026-03-31 #23623)
resource_heavy_for_domain, poor_agentic_control, partially_reducible
Smoke Claude (Effectiveness: ~50%)
write_heavy
Release Workflow (Reliability: ~33% success in 7-day window)
partially_reducible
Inactive or Stale Concerns
push_repo_memory → post-setup-scripts bug. Systemic root cause identified, fix pending.
Quality Analysis
Output Quality Distribution
Common Quality Issues Observed
Effectiveness Analysis
Resource Efficiency — Flagged Runs
Ecosystem-wide: 6 out of ~10 agentic runs in the last 7 days flagged as resource-heavy.
Cost & Efficiency Insights
poor_agentic_control is a reliability risk beyond just cost.
model_downgrade_available — could use a smaller model for equivalent output quality.
Behavioral Patterns
Productive Patterns ✅
Problematic Patterns ⚠️
Recommendations
High Priority
Diagnose Documentation Unbloat failures — 100% failure rate, no output, fast fail. Check if this is a dependency issue, auth issue, or structural workflow bug. Issue: [agent-perf] Documentation Unbloat: 100% failure rate — needs investigation #23640.
Add turn limit / self-stopping criteria to Contribution Check — 44 turns with poor_agentic_control is unsustainable. Add explicit instructions like "stop after reviewing X PRs" or implement a turn budget. Expected improvement: reduce turns to <20, eliminate failure conclusion.
Continue resolving P1 systemic issues:
git checkout HEAD restore step
Medium Priority
Move data-gathering to deterministic pre-steps for CLI Version Checker, GitHub Remote MCP Auth Test, Release — reduce agent turns by ~70%, cut costs significantly. Reference: Deterministic & Agentic Patterns guide.
Model downgrade for Agent Persona Explorer — flagged model_downgrade_available. Switch to a smaller model to reduce cost with equivalent output.
Normalize Smoke Claude PR vs. schedule behavior — add explicit event-context prompting to prevent write_heavy posture on PR triggers.
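The idea of moving data-gathering into deterministic pre-steps can be sketched as follows. This is an illustrative sketch only, not the pattern from the referenced guide: `gather_context`, `build_prompt`, and the collector names are hypothetical, and the collectors here are stand-ins for whatever commands a real workflow would run before the agent starts.

```python
import json
from typing import Callable, Dict


def gather_context(collectors: Dict[str, Callable[[], object]]) -> dict:
    """Run each deterministic collector once and record its result or its error."""
    context = {}
    for name, collect in collectors.items():
        try:
            context[name] = {"ok": True, "value": collect()}
        except Exception as exc:  # a failed pre-step is reported, not retried by the agent
            context[name] = {"ok": False, "error": str(exc)}
    return context


def build_prompt(task: str, context: dict) -> str:
    """Embed the pre-gathered data so the agent needs no tool calls to fetch it."""
    payload = json.dumps(context, indent=2, sort_keys=True)
    return f"{task}\n\nPre-gathered data (JSON):\n{payload}"


if __name__ == "__main__":
    # Hypothetical collectors; real ones would shell out or call an API.
    collectors = {
        "cli_version": lambda: "1.2.3",
        "release_notes": lambda: 1 / 0,  # simulates a pre-step that fails
    }
    print(build_prompt("Summarize version drift.", gather_context(collectors)))
```

Because the collectors run outside the agent loop, a failing pre-step shows up as an explicit error entry in the prompt rather than as extra agent turns spent retrying.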
Low Priority
Establish turn budget guardrails for all copilot-engine workflows: recommended max 20 turns for smoke/check patterns, 30 turns for heavier analysis workflows.
Review Release workflow pre-flight stability — 2/3 recent runs failed before agent started. May be trigger or env issue rather than agent quality.
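The turn budgets recommended above could be enforced with a small guard around the agent loop. The following is a minimal sketch, not the workflow engine's actual mechanism: `TurnBudget`, `TurnBudgetExceeded`, `run_with_budget`, and the `kind` labels are hypothetical names. Only the limits (20 turns for smoke/check patterns, 30 for heavier analysis) come from the recommendation.

```python
from dataclasses import dataclass

# Limits taken from the recommendation above; the keys are hypothetical labels.
MAX_TURNS = {"smoke_check": 20, "analysis": 30}


class TurnBudgetExceeded(RuntimeError):
    """Raised when an agent run asks for a turn beyond its budget."""


@dataclass
class TurnBudget:
    kind: str
    used: int = 0

    @property
    def limit(self) -> int:
        return MAX_TURNS[self.kind]

    def spend(self) -> int:
        """Record one agent turn and return the turns remaining."""
        if self.used >= self.limit:
            raise TurnBudgetExceeded(
                f"{self.kind}: budget of {self.limit} turns exhausted"
            )
        self.used += 1
        return self.limit - self.used


def run_with_budget(kind: str, requested_turns: int) -> int:
    """Simulate a run that wants `requested_turns`; return the turns actually taken."""
    budget = TurnBudget(kind)
    try:
        for _ in range(requested_turns):
            budget.spend()
    except TurnBudgetExceeded:
        pass  # stop cleanly instead of continuing to consume turns
    return budget.used
```

Under these assumptions, a check-style run that tries to take 44 turns, as Contribution Check did, would be cut off at 20, while a 12-turn analysis run would finish untouched.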
Coverage Analysis
Well-Covered Areas
Coverage Gaps / Opportunities
Trends
Score decline this week driven by: Documentation Unbloat 0% success, Contribution Check failure with poor control, Smoke Claude PR failure.
Actions Taken This Run
References