Agentic Workflow Audit — 2026-03-30 #23592

2026-03-30T21:21:37Z

github-actions[bot]
bot Mar 30, 2026

Daily audit covering the last 24 hours of agentic workflow runs in github/gh-aw. 27 runs observed (23 completed, 4 still in progress at audit time).

Summary

Metric	Value
Total runs	27
Successes	15 (65% of 23 completed)
Failures	8 (35% of 23 completed)
Total tokens	10.04M
Total cost	$4.38
Avg duration	~8 min
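The percentages in the table only add up if they are taken over the 23 completed runs rather than all 27 observed runs. A minimal sketch of that arithmetic, using only the figures reported above (the variable names are illustrative):

```python
# Figures from the Summary table; the 23 completed / 4 in-progress split
# comes from the audit's opening sentence.
total_runs = 27
in_progress = 4
successes = 15
failures = 8

completed = total_runs - in_progress
assert successes + failures == completed  # 15 + 8 == 23

success_rate = successes / completed
failure_rate = failures / completed
print(f"success: {success_rate:.0%}, failure: {failure_rate:.0%}")

# For comparison: the same 15 successes measured against all 27 runs.
print(f"success over all observed runs: {successes / total_runs:.0%}")
```

Measured against all 27 runs the success share would be lower, so the 65% figure should be read as "of completed runs" until the 4 in-progress runs finish.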

Workflow Health Timeline

The 65% success rate is below the expected baseline. Three distinct failure patterns were identified: lockdown check rejections, engine startup errors, and agentic control issues. Issue Monster ran reliably across 5 event-triggered runs (all succeeded).

Token & Cost Breakdown

Sergo ($1.56) and Copilot Agent Prompt Clustering ($1.34) together account for 66% of the day's cost. Four workflows have resource_heavy_for_domain (high) assessments, suggesting model downgrade or pre-computation opportunities exist.
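The 66% share can be reproduced from the reported dollar figures. This is a small worked check, assuming the per-workflow costs and the $4.38 total are rounded to the cent as shown:

```python
# Costs in USD as reported in this audit.
sergo_cost = 1.56
clustering_cost = 1.34
total_cost = 4.38

combined = sergo_cost + clustering_cost
share = combined / total_cost
print(f"${combined:.2f} of ${total_cost:.2f} = {share:.0%}")
```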

Failure Analysis

❌ Engine Startup Errors (3 runs — high severity)

Three workflows failed with "6 engine error messages in agent-stdio.log" and 0 turns / 0 tokens consumed, indicating the agent engine never started:

Run	Workflow	Trigger	Time (UTC)
§23764332247	AI Moderator	`issues:opened`	19:47
§23765952389	AI Moderator	`issue_comment`	20:25
§23766371340	Changeset Generator	`pull_request`	20:32

Pattern: All three triggered on repository events (not scheduled). The engine errors occurred before any inference was attempted. This may indicate an API authentication failure, engine binary issue, or resource provisioning problem affecting event-driven runs during this window.

Recommendation: Check whether the engine version or secrets were rotated around 19:47–20:32 UTC. Review agent-stdio.log from these three runs for the specific error codes.

🔒 Lockdown Check Failures (3 runs — low severity, expected)

Three workflows failed due to lockdown_check_failed=true, triggered between 20:26 and 20:34 UTC:

Run	Workflow	Time (UTC)
§23766131436	Daily Documentation Updater	20:28
§23766371298	Smoke Claude	20:34
§23766371329	Smoke Codex	20:34

These failures are expected during a lockdown/freeze window. No action required for the failures themselves, but if the lockdown was unplanned or longer than expected, the smoke tests and doc updater may need to be re-run manually.

⚠️ Agentic Control Issues (2 runs — medium/high severity)

Auto-Triage Issues (§23761615004) — failure

Ran 25 turns consuming 686K tokens
Produced 0 safe outputs (no issues created, no noops called)
Agent concluded success internally, but the harness marked the run as failed due to missing safe outputs
Assessments: resource_heavy_for_domain (high), poor_agentic_control (medium)
Root cause: The agent likely exhausted its work without calling any safeoutputs tool. This is the #1 cause of workflow failures per the safe-outputs guidelines.
Recommendation: Add explicit instructions to call noop when no triageable issues are found. Consider moving data gathering into deterministic pre-computation steps (the partially_reducible pattern) to reduce turns.
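Why the noop fallback matters can be illustrated with a small model of the rule described above. This is a sketch under stated assumptions, not the actual harness code: `RunResult` and `harness_conclusion` are hypothetical names, and the rule (agent success plus at least one safe output, with noop counting as one) is inferred from this run's outcome.

```python
from dataclasses import dataclass, field


@dataclass
class RunResult:
    """Hypothetical record of one agent run; not the real harness schema."""
    agent_conclusion: str  # what the agent itself reported, e.g. "success"
    safe_outputs: list[str] = field(default_factory=list)  # safe-output calls made


def harness_conclusion(run: RunResult) -> str:
    """Assumed rule: pass only if the agent succeeded AND emitted at least
    one safe output. A noop call counts as a safe output."""
    if run.agent_conclusion != "success":
        return "failure"
    if not run.safe_outputs:
        # The Auto-Triage case: success reported, nothing emitted.
        return "failure"
    return "success"


silent = RunResult(agent_conclusion="success")
with_noop = RunResult(agent_conclusion="success", safe_outputs=["noop"])

print(harness_conclusion(silent))     # failure
print(harness_conclusion(with_noop))  # success
```

Under this model, a run that finds nothing to triage still passes as long as it records that fact with a noop call.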

Daily DIFC Integrity-Filtered Events Analyzer (§23765157332) — failure

Agent concluded success but workflow conclusion is failure
0 turns, 0 tokens — likely failed in a pre-agent or post-agent step
Assessment: resource_heavy_for_domain (high) despite no token usage
Recommendation: Review the detection or upload_assets step for this workflow; it may fail silently when no filtered events exist.

Performance Concerns

Resource-Heavy Runs (high severity assessments)

8 runs flagged as resource_heavy_for_domain (high):

Workflow	Tokens	Cost	Turns	Recommendation
Sergo - Serena Go Expert	2.44M	$1.56	—	Consider `claude-haiku-4-5` for simpler Go questions
Copilot Agent Prompt Clustering	1.94M	$1.34	—	Move data collection to deterministic pre-step
Release	1.00M	$0.00	—	Review if all turns are necessary
Static Analysis Report	760K	$0.86	—	Pre-compute static data; use smaller model
Step Name Alignment	683K	$0.62	—	High cost for alignment task
Auto-Triage Issues	686K	$0.00	25	`poor_agentic_control` — reduce turns, add noop
Daily Workflow Updater	620K	$0.00	15	`poor_agentic_control` — move to deterministic steps
Daily DIFC	0	$0.00	0	Workflow-level failure
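As a consistency check, the table rows can be totalled and compared with the Summary figures. The sketch below assumes the token counts are rounded as displayed (for example, 2.44M is taken as 2,440,000), so the token share is approximate:

```python
# (workflow, tokens, cost in USD) copied from the table above.
flagged = [
    ("Sergo - Serena Go Expert", 2_440_000, 1.56),
    ("Copilot Agent Prompt Clustering", 1_940_000, 1.34),
    ("Release", 1_000_000, 0.00),
    ("Static Analysis Report", 760_000, 0.86),
    ("Step Name Alignment", 683_000, 0.62),
    ("Auto-Triage Issues", 686_000, 0.00),
    ("Daily Workflow Updater", 620_000, 0.00),
    ("Daily DIFC", 0, 0.00),
]

# Day-level totals from the Summary table.
day_tokens = 10_040_000
day_cost = 4.38

total_tokens = sum(tokens for _, tokens, _ in flagged)
total_cost = round(sum(cost for _, _, cost in flagged), 2)
token_share = total_tokens / day_tokens

print(f"flagged tokens: {total_tokens / 1e6:.2f}M ({token_share:.0%} of the day)")
print(f"flagged cost: ${total_cost:.2f} of ${day_cost:.2f}")
```

On these rounded figures, the eight flagged runs account for all of the day's reported cost and roughly four fifths of its tokens.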

Cross-cutting recommendation: For runs with partially_reducible (medium) assessments, move data-gathering logic to deterministic pre-activation steps. This can reduce turns by ~50% per the assessment estimates.

Patterns & Recommendations

Priority	Pattern	Affected Workflows	Action
🔴 High	Engine startup failures on event-triggered runs	AI Moderator, Changeset Generator	Investigate API auth / engine errors around 19:47–20:32 UTC
🟠 Medium	Missing `noop` call when no work found	Auto-Triage Issues	Add `noop` call as fallback when no issues to triage
🟡 Medium	Resource-heavy runs (8 runs)	Multiple	Evaluate model downgrade to `claude-haiku-4-5` or `gpt-4.1-mini`
🟡 Medium	Post-agent step failure	Daily DIFC Analyzer	Debug detection/upload_assets step for zero-event case
🟢 Low	Lockdown check rejections	Smoke Claude, Smoke Codex, Daily Doc Updater	Expected; re-run if lockdown was resolved

References:

§23761615004 — Auto-Triage (0 safe outputs)
§23764332247 — AI Moderator (engine error)
§23765915409 — Sergo (top cost $1.56)

AI generated by Agentic Workflow Audit Agent · history

expires on Mar 31, 2026, 9:21 PM UTC

2026-03-31T21:34:18Z

github-actions[bot]
bot Mar 31, 2026
Author

This discussion has been marked as outdated by Agentic Workflow Audit Agent.

A newer discussion is available at Discussion #23784.
