[WHM] Workflow Health Dashboard — 2026-04-03 #24292
Overview
Health monitoring report for 183 agentic workflows in the github/gh-aw repository. Run §23945470700 — 2026-04-03.
Score: 72/100 (↓3 from 75 last run) — increased P1 failures and stale lock files offset one resolution.
| Metric | Value |
|---|---|
| Total workflows | 183 |
| Lock files present | 183/183 ✅ |
| Stale lock files | 19 |
| P1 failures tracked | 3 |
| Resolved this run | 1 (Smoke Multi PR ✅) |
Critical Issues 🚨
1. Daily Fact About gh-aw — OLD LOCK FORMAT (P1, NEW)
- Status: 9 consecutive schedule failures (Mar 25 – Apr 3)
- Error: Unable to resolve action 'github/gh-aw-actions@v0', unable to find version 'v0'
- Root cause: Lock file compiled with old format using remote action reference; v0 tag no longer exists on gh-aw-actions (latest: v0.65.7)
- Fix: Recompile daily-fact.md → updates activation to use ./actions/setup (local)
- Issue created: [P1] Daily Fact About gh-aw: activation fails — old lock file format uses non-existent gh-aw-actions@v0 tag #24290
- Priority: P1
2. Duplicate Code Detector — CODEX API RESTRICTION (P1)
- Status: 7 consecutive failures (Mar 28 – Apr 3)
- Error: stream disconnected before completion: This user's access to this model has been temporarily limited for potentially suspicious activity related to cybersecurity.
- Root cause: Codex (OpenAI) API safety restriction triggered by workflow's code analysis tasks
- Auto-issue: #24284 (auto-created)
- Priority: P1 — externally blocked (OpenAI safety restriction)
3. Daily Issues Report Generator — AGENT FAILURE (P1)
- Status: 11 consecutive failures (Mar 24 – Apr 3)
- Error: Agent job fails at the Fetch issues data step
- Root cause: Investigation suggests a failure in the data fetch step itself (not a Codex API restriction; the pattern predates the Duplicate Code Detector failure)
- Auto-issue: #24266 (auto-created)
- Priority: P1 — needs investigation
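The "consecutive failures" figures above can be reproduced from a workflow's run history. The report does not say how the Workflow Health Manager computes them, so the following is only a minimal sketch. It assumes the history is a chronologically ordered list of run conclusions using GitHub Actions values such as "success" and "failure"; the function name `trailing_failure_streak` is hypothetical.

```python
def trailing_failure_streak(conclusions: list[str]) -> int:
    """Count consecutive failures at the end of a run history.

    `conclusions` is assumed to be ordered oldest to newest.
    """
    streak = 0
    for conclusion in reversed(conclusions):
        if conclusion != "failure":
            break  # the streak ends at the most recent non-failure run
        streak += 1
    return streak


# Illustrative history only, not data from this report.
history = ["success", "failure", "success", "failure", "failure", "failure"]
print(trailing_failure_streak(history))
```

Counting from the newest run backwards means a single success resets the streak, which matches how a recovered workflow such as Smoke Multi PR would drop off the P1 list.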
Recovered This Run ✅
Smoke Multi PR — RESOLVED
- Previous P1 from 2026-04-02 run
- Issue closed: [P1] Smoke Multi PR: safe_outputs fails on schedule runs (add_comment target:triggering) #24096
- Latest run: [Custom Engine Test] Test Pull Request - Custom Engine Safe Output #622 (2026-04-03) — schedule → SUCCESS ✅
- Root cause fixed: the problem with status-comment: true on schedule triggers is resolved
Warnings ⚠️
Stale Lock Files (19)
Workflows whose .md source was modified more recently than its .lock.yml. These workflows are running with potentially outdated configurations:
| Workflow | File |
|---|---|
| Claude Token Usage Analyzer | claude-token-usage-analyzer.md |
| Code Simplifier | code-simplifier.md |
| Copilot Token Usage Analyzer | copilot-token-usage-analyzer.md |
| Daily CLI Performance | daily-cli-performance.md |
| Daily File Diet | daily-file-diet.md |
| Daily News | daily-news.md |
| Daily Testify Uber Super Expert | daily-testify-uber-super-expert.md |
| Dependabot Go Checker | dependabot-go-checker.md |
| Discussion Task Miner | discussion-task-miner.md |
| Example Workflow Analyzer | example-workflow-analyzer.md |
| Go Logger | go-logger.md |
| Hourly CI Cleaner | hourly-ci-cleaner.md |
| Poem Bot | poem-bot.md |
| Prompt Clustering Analysis | prompt-clustering-analysis.md |
| Schema Consistency Checker | schema-consistency-checker.md |
| Semantic Function Refactor | semantic-function-refactor.md |
| Smoke Copilot | smoke-copilot.md |
| Terminal Stylist | terminal-stylist.md |
| Workflow Normalizer | workflow-normalizer.md |
Run make recompile or gh aw compile to bring lock files up to date.
Note: All 10 stale files from last run (Apr 2) were recompiled and are now current.
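The staleness rule described above (a .md source newer than its .lock.yml) can be sketched as a small check. This is an illustration under stated assumptions, not the report's actual implementation: it compares filesystem modification times, which are not reliable in a fresh git checkout (commit timestamps may be a better signal), and the function name `find_stale_lock_files` is hypothetical.

```python
from pathlib import Path


def find_stale_lock_files(workflow_dir: Path) -> list[Path]:
    """Return .md sources that are newer than their existing .lock.yml.

    Sources with no lock file are skipped here; the report tracks
    missing lock files as a separate metric.
    """
    stale = []
    for source in sorted(workflow_dir.glob("*.md")):
        lock = source.with_name(source.stem + ".lock.yml")
        if lock.exists() and source.stat().st_mtime > lock.stat().st_mtime:
            stale.append(source)
    return stale
```

Calling `find_stale_lock_files(Path(".github/workflows"))` would list candidates for recompilation, assuming the workflow sources live in that directory.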
Smoke Claude — WATCHLIST
- Issues open: #23528, #23067
- Schedule run Apr 3: SUCCESS ✅
- Pattern: Intermittent (~25-30% failure), MCP HTTP 412s timeout
- Also: Open issue #23919 — safe-outputs filename mismatch (assigned to Copilot)
Previously P2 (team decided not_planned)
- Smoke Update Cross-Repo PR: Still failing. Root: push_repo_memory git branch bug.
- Smoke Create Cross-Repo PR: Still failing. Same root cause.
- Smoke Codex: API restricted. Team: not_planned.
- Smoke Gemini: Exit code 41. Team: not_planned.
Healthy Workflows ✅
~160+ workflows operating normally with no issues detected (based on recent schedule run sampling).
Systemic Issues
- Codex API restrictions: Duplicate Code Detector (and potentially others) blocked by OpenAI safety restrictions on cybersecurity-related code analysis tasks. May affect other Codex-based workflows analyzing code.
- Stale lock files wave (19): Large batch of .md files modified without corresponding .lock.yml updates. Run make recompile to batch-fix.
- Old lock file format: daily-fact.lock.yml uses legacy remote action format (github/gh-aw-actions/setup@v0). Fix: recompile. Check for other old-format workflows periodically.
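A periodic check for other old-format lock files could look like the sketch below. It assumes legacy files can be recognized by a `uses:` line that references `github/gh-aw-actions` at the `v0` ref, as in the daily-fact.lock.yml case. The regular expression, the function name `find_legacy_action_refs`, and the directory layout are illustrative assumptions rather than part of gh-aw.

```python
import re
from pathlib import Path

# Matches e.g. "uses: github/gh-aw-actions/setup@v0" and captures
# the action path and the ref after "@". Assumed pattern, not gh-aw's own.
REMOTE_ACTION = re.compile(
    r"uses:\s*['\"]?(github/gh-aw-actions(?:/[\w./-]+)?)@([\w.-]+)"
)


def find_legacy_action_refs(
    workflow_dir: Path, legacy_ref: str = "v0"
) -> list[tuple[str, int, str]]:
    """Return (file name, line number, reference) for each legacy reference."""
    hits = []
    for lock in sorted(workflow_dir.glob("*.lock.yml")):
        lines = lock.read_text(encoding="utf-8").splitlines()
        for number, line in enumerate(lines, start=1):
            match = REMOTE_ACTION.search(line)
            # Only the exact legacy ref is flagged; v0.65.7 is left alone.
            if match and match.group(2) == legacy_ref:
                hits.append((lock.name, number, f"{match.group(1)}@{match.group(2)}"))
    return hits
```

Matching the ref exactly keeps current remote references such as v0.65.7 and local references such as ./actions/setup out of the results.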
Trends
- Overall score: 72/100 (↓3 from 75)
- Score trajectory: 73 → 74 → 75 → 72 (↓)
- New failures this week: 3 newly tracked (Daily Fact, Duplicate Code Detector, Daily Issues Report)
- Resolved: 1 (Smoke Multi PR)
- Stale lock files: 19 (↑9) — spike likely from active PR/feature work on .md files
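The arrows in the trend figures are run-over-run differences. As a quick arithmetic check, the sketch below recomputes them from the numbers in this report; the helper name `run_over_run_deltas` is hypothetical.

```python
def run_over_run_deltas(values: list[int]) -> list[int]:
    """Return the change between each pair of consecutive values."""
    return [later - earlier for earlier, later in zip(values, values[1:])]


score_trajectory = [73, 74, 75, 72]  # from the score trajectory above
stale_lock_files = [10, 19]          # last run (Apr 2) and this run

print(run_over_run_deltas(score_trajectory))  # [1, 1, -3]
print(run_over_run_deltas(stale_lock_files))  # [9]
```

The last score delta (-3) and the stale lock file delta (+9) match the ↓3 and ↑9 shown in the report.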
Actions Taken
- Created P1 issue for Daily Fact About gh-aw (old format): [P1] Daily Fact About gh-aw: activation fails — old lock file format uses non-existent gh-aw-actions@v0 tag #24290
- Confirmed Duplicate Code Detector auto-issue: #24284
- Confirmed Daily Issues Report auto-issue: #24266
- Updated shared memory (workflow-health-latest.md)
Last updated: 2026-04-03T12:03Z
Next check: 2026-04-04T12:00Z
Run: §23945470700
Generated by Workflow Health Manager - Meta-Orchestrator · ● 5.7M · expires on Apr 4, 2026, 12:15 PM UTC