[WHM] Workflow Health Dashboard — 2026-04-04 #24477

@github-actions

Overview

Health monitoring report for 179 agentic workflows in the github/gh-aw repository. Run §23978397450 — 2026-04-04.

Score: 70/100 (↓2 from 72 last run) — API rate limiting is a new systemic issue; ongoing P1s persist; stale lock files reduced from 19→13.

| Metric | Value |
| --- | --- |
| Total workflows | 179 |
| Lock files present | 179/179 ✅ |
| Stale lock files | 13 ⚠️ (↓6 from 19) |
| P1 failures tracked | 2 (ongoing) |
| New systemic issues | 1 (API rate limiting) |
| Resolved this run | 0 |

Critical Issues 🚨

1. Daily Issues Report Generator — AGENT FAILURE (P1, ongoing)

  • Status: 13+ consecutive schedule failures (since Mar 24)
  • Error: Agent job fails (likely at Fetch issues data step)
  • Auto-issue: #24461 (open, Apr 4); the previous auto-issue, "[aw] Daily Issues Report Generator failed" #24266, was closed as not_planned on Apr 3
  • Recommendation: Root cause still unresolved; investigate step-level logs in recent run §23976894248
  • Priority: P1

2. Duplicate Code Detector — CODEX API RESTRICTION (P1, ongoing)

  • Status: 8+ consecutive failures (since Mar 28)
  • Error: This user's access to this model has been temporarily limited for potentially suspicious activity related to cybersecurity.
  • Auto-issue: #24471 (open, Apr 4); the previous auto-issue, "[aw] Duplicate Code Detector failed" #24284, was closed as not_planned on Apr 3
  • Note: Codex API safety restriction is externally controlled; team had previously marked not_planned
  • Priority: P1 — externally blocked

New Systemic Issue: API Rate Limiting ⚠️

Pattern detected: multiple workflows failed with "API rate limit exceeded for installation", mostly at the pre_activation step:

| Workflow | Run | Time (UTC) | Stage |
| --- | --- | --- | --- |
| Issue Monster | §23971928758 | 05:07 | pre_activation |
| Daily CLI Performance Agent | §23972374984 | 05:35 | pre_activation |
| Agentic Maintenance | §23971979119 | 05:11 | zizmor-scan (API rate limit) |

Error: "API rate limit exceeded for installation. Request ID: ..." raised during the pre_activation call to the check runs API.

Root cause: Many workflows are scheduled in the same time window (~05:00-05:40 UTC) and hit the GitHub installation API rate limit simultaneously.

Impact: Issue Monster has a ~15% failure rate over 40 runs; when pre_activation fails, the rest of the workflow does not run at all.

Recommendation: Stagger schedule times for high-frequency workflows, or retry on rate-limit errors in the pre_activation step.
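The retry option could be sketched as follows. This is a minimal illustration in Python, not the pre_activation implementation: `RateLimitError` and `call_with_backoff` are hypothetical names, and the attempt count and delays are assumed values that would need tuning against the real rate limit window.

```python
import random
import time


class RateLimitError(Exception):
    """Hypothetical stand-in for an 'API rate limit exceeded' response."""


def call_with_backoff(fn, max_attempts=5, base_delay=2.0, max_delay=60.0,
                      sleep=time.sleep, rng=random.random):
    """Call fn(), retrying on RateLimitError with exponential backoff and jitter."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except RateLimitError:
            if attempt == max_attempts:
                raise  # out of attempts: surface the original error
            # The delay ceiling doubles on every attempt, capped at max_delay.
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            # Jitter: sleep a random fraction of the delay so that concurrent
            # workflows do not all retry at the same instant.
            sleep(delay * rng())


# Demo with a fake API call that is rate limited twice, then succeeds.
# sleep is replaced with a no-op so the demo finishes immediately.
calls = {"count": 0}


def fake_check_runs_call():
    calls["count"] += 1
    if calls["count"] <= 2:
        raise RateLimitError("API rate limit exceeded for installation")
    return "ok"


print(call_with_backoff(fake_check_runs_call, sleep=lambda seconds: None))
```

The jitter matters here because the failures are correlated: workflows that were rate limited together would otherwise retry together and hit the limit again.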


Warnings ⚠️

Stale Lock Files (13)

Down from 19 last run (good progress), but 13 workflows are still running outdated compiled definitions:

| Workflow | File |
| --- | --- |
| Tidy | tidy.md |
| Daily Security Red Team | daily-security-red-team.md |
| Agentic Observability Kit | agentic-observability-kit.md |
| Layout Spec Maintainer | layout-spec-maintainer.md |
| Dev Hawk | dev-hawk.md |
| Firewall | firewall.md |
| Prompt Clustering Analysis | prompt-clustering-analysis.md |
| GPClean | gpclean.md |
| Weekly Safe Outputs Spec Review | weekly-safe-outputs-spec-review.md |
| Release | release.md |
| Daily CLI Tools Tester | daily-cli-tools-tester.md |
| Video Analyzer | video-analyzer.md |
| Daily Malicious Code Scan | daily-malicious-code-scan.md |

Run make recompile or gh aw compile to update.
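For a quick local check before recompiling, a stale-file scan could be sketched like this. It assumes each `name.md` compiles to a sibling `name.lock.yml` and that staleness can be approximated by comparing modification times; the report does not say how the health manager actually detects stale lock files, and `find_stale_lock_files` is a hypothetical helper. Modification times are not preserved by a fresh git checkout, so this is only meaningful in a working tree with local edits.

```python
from pathlib import Path


def find_stale_lock_files(workflows_dir):
    """Return .md workflow file names whose lock file is missing or older.

    Assumption: the compiled definition for name.md lives next to it as
    name.lock.yml, and an older mtime on the lock file means it is stale.
    """
    stale = []
    for source in sorted(Path(workflows_dir).glob("*.md")):
        lock = source.with_suffix(".lock.yml")
        if not lock.exists() or lock.stat().st_mtime < source.stat().st_mtime:
            stale.append(source.name)
    return stale
```

Calling `find_stale_lock_files(".github/workflows")` from the repository root would list candidates to recompile; the directory path is an assumption about where the workflow sources live.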

Previously Marked not_planned (P2, stable)

Team decision to not investigate further:

  • Smoke Codex: Codex API restriction
  • Smoke Gemini: Exit code 41
  • Smoke Create Cross-Repo PR: push_repo_memory git branch bug
  • Smoke Update Cross-Repo PR: Same root cause

Intermittent Failures (Single-run, monitoring)

| Workflow | Run | Failure | Notes |
| --- | --- | --- | --- |
| Workflow Normalizer | §23966459696 | safe_outputs job | Artifact uploaded OK; processing error |
| Auto-Triage Issues | §23957755831 | safe_outputs job | Artifact uploaded OK; processing error |
| Daily Observability Report | §23966346682 | agent job | Docker build completed; agent failure |
| Super Linter Report | §23949152392 | EACCES on super-linter.log | Permission error uploading artifact |
| GitHub MCP Structural Analysis | §23945544719 | 1/1 runs | Isolated failure |
| Contribution Check | §23947460225 | 1/6 runs | Likely rate limit |

Healthy Workflows ✅

160+ workflows are operating normally. Notable recent results:

  • Terminal Stylist, PR Triage Agent, and the Smoke Agent workflows are all passing; Issue Monster is at 85% success.

Systemic Issues Summary

  1. API Rate Limiting: ~05:00-05:40 UTC window saturating GitHub installation API; affects pre_activation for high-frequency workflows. Multiple independent failures on Apr 4. Recommend schedule staggering.

  2. Codex API restrictions (ongoing): Any Codex workflow performing security/code analysis may trigger OpenAI safety check. Currently affecting Duplicate Code Detector; watch for others.

  3. Safe-outputs processing errors (new, isolated): Workflow Normalizer and Auto-Triage Issues both had safe_outputs job failures despite artifact upload succeeding. May indicate an issue with safe-output processing (downstream webhook or label validation).

  4. Stale lock files (13): Reduced from 19 but still elevated. Likely from active .md edits. Run make recompile.
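For item 1, schedule staggering could be sketched as below. This is an illustration under stated assumptions: `stagger_schedules` is a hypothetical helper, the output uses standard five-field cron expressions, and every workflow is treated as running once a day, which is not true of high-frequency workflows such as Issue Monster. The 05:00 start and 60-minute window are assumed values chosen to cover the affected period.

```python
def stagger_schedules(workflow_names, start_hour=5, window_minutes=60):
    """Spread daily schedules evenly across a time window.

    Returns a dict mapping each workflow name to a cron expression.
    Names are sorted first so the assignment is stable between runs.
    """
    names = sorted(workflow_names)
    schedule = {}
    for index, name in enumerate(names):
        # Integer arithmetic keeps the offsets exact and evenly spaced.
        offset = index * window_minutes // len(names)
        hour = (start_hour + offset // 60) % 24
        minute = offset % 60
        schedule[name] = f"{minute} {hour} * * *"
    return schedule


affected = ["Issue Monster", "Daily CLI Performance Agent", "Agentic Maintenance"]
for name, cron in stagger_schedules(affected).items():
    print(f"{cron:<12} {name}")
```

With three workflows in a 60-minute window the start times land 20 minutes apart. Even spacing was chosen over hashing the workflow name because a hash can still place two workflows in the same minute.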


Trends

  • Overall score: 70/100 (↓2 from 72)
  • Score trajectory: 73 → 74 → 75 → 72 → 70 (↓↓)
  • New failures this cycle: API rate limit cluster (+1 systemic pattern)
  • Resolved this cycle: 0
  • Stale lock files: 13 (↓6 from 19) ✅
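The trajectory arithmetic can be checked with a short sketch. The scores come from the list above; `trailing_declines` is a hypothetical helper that counts how many of the most recent run-to-run changes were decreases, which is one way to read the ↓↓ marker.

```python
def trailing_declines(scores):
    """Count consecutive decreases at the end of a score history."""
    count = 0
    # Walk the run-to-run pairs from newest to oldest.
    for prev, curr in zip(reversed(scores[:-1]), reversed(scores[1:])):
        if curr < prev:
            count += 1
        else:
            break
    return count


scores = [73, 74, 75, 72, 70]
deltas = [curr - prev for prev, curr in zip(scores, scores[1:])]
print(deltas)                     # [1, 1, -3, -2]
print(trailing_declines(scores))  # 2
```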

Actions Taken

  • Confirmed Daily Issues Report auto-issue: #24461
  • Confirmed Duplicate Code Detector auto-issue: #24471
  • Identified new API rate limit systemic pattern (no prior issue — part of this dashboard)
  • Updated shared memory

Last updated: 2026-04-04T12:00Z
Next check: 2026-04-05T12:00Z
Run: §23978397450

Note

🔒 Integrity filter blocked 1 item

The following item was blocked because it does not meet the GitHub integrity level.

  • #19099 search_issues: has lower integrity than agent requires. The agent cannot read data with integrity below "approved".

To allow these resources, lower min-integrity in your GitHub frontmatter:

tools:
  github:
    min-integrity: approved  # merged | approved | unapproved | none

Generated by Workflow Health Manager - Meta-Orchestrator · ● 2.7M ·

  • expires on Apr 5, 2026, 12:06 PM UTC