Agent Persona Exploration - 2026-04-01 #23815

2026-04-01T04:03:44Z

github-actions[bot]
bot Apr 1, 2026

Systematic test of the developer.instructions (agentic-workflows) custom agent across 7 representative scenarios from 5 software worker personas. The agent was asked to design workflow configurations for each persona's automation need.

Persona Overview

Agent tested: developer.instructions (agentic-workflows custom agent)
Scenarios tested: 7 across 5 personas (Backend, Frontend, DevOps, QA, Product Manager)
Average Quality Score: 4.51 / 5.0
Run date: 2026-04-01

Key Findings

🏆 Prompt quality is the strongest dimension (avg 4.71/5) — agent consistently produces structured, phased prompts with clear noop exit paths
🔒 Security practices are solid but inconsistent (avg 4.43/5) — top scenarios correctly use min-integrity, OIDC, and locked-down egress; weaker scenarios omit safe-outputs or use incorrect network formats
🛠️ Tool selection needs minor polish (avg 4.43/5) — occasional use of gh CLI instead of GitHub MCP toolsets; one scenario omitted safe-outputs entirely
📅 Trigger configuration is reliable (avg 4.57/5) — paths: filtering, workflow_dispatch fallback, and correct event types consistently applied
🔁 Deduplication patterns are well known to the agent: close-older-issues: true, hide-older-comments: true, and close-older-discussions: true appeared in 5/7 scenarios

Top Patterns

claude as default engine — recommended in 6/7 scenarios; rationale consistently tied to reasoning quality for analysis tasks vs. code tasks
Phased structured prompts — all high-scoring responses used Phase 1/2/3 or Step N/N structure to prevent premature output
Deduplication via safe-outputs options — close-older-issues, hide-older-comments, close-older-discussions consistently applied for recurring workflows
Path-filtered PR triggers — on: pull_request with paths: globs used in all 4 PR automation scenarios to avoid unnecessary runs
Noop exit paths always present — every response included explicit noop handling for the "nothing to do" case

View High Quality Responses (Score ≥ 4.8)

backend-schema — DB Migration Safety (5.0/5)

Defined a 3-tier severity system (🔴 Critical / 🟡 Warning / ✅ Safe) with exact SQL patterns per tier
Split output into per-line inline review comments + one PR summary comment
Correctly set network.egress.allowed: [] (empty — no external calls needed) and min-integrity: low for untrusted fork PRs
create-pull-request-review-comment: max: 15 caps runaway output on large PRs
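
The tiering and the comment cap described above can be sketched roughly as follows. This is only an illustration: the report does not list the actual SQL patterns the agent chose, so the regular expressions, tier names, and the `classify` and `review` helpers here are all hypothetical. Only the three-tier idea and the cap of 15 comments come from the report.

```python
import re

# Hypothetical patterns for illustration; the report does not list the real ones.
# Order matters: the first matching tier wins, so destructive patterns come first.
TIERS = [
    ("critical", re.compile(r"\b(DROP\s+(TABLE|COLUMN)|TRUNCATE)\b", re.IGNORECASE)),
    ("warning", re.compile(r"\b(ALTER\s+TABLE|CREATE\s+INDEX)\b", re.IGNORECASE)),
]

# Mirrors the "max: 15" cap on inline review comments mentioned in the report.
MAX_COMMENTS = 15


def classify(statement: str) -> str:
    """Return the severity tier for a single SQL statement."""
    for tier, pattern in TIERS:
        if pattern.search(statement):
            return tier
    return "safe"


def review(lines):
    """Return (line_number, tier) pairs for non-safe lines, capped at MAX_COMMENTS."""
    flagged = []
    for number, statement in enumerate(lines, start=1):
        tier = classify(statement)
        if tier != "safe":
            flagged.append((number, tier))
    return flagged[:MAX_COMMENTS]


if __name__ == "__main__":
    migration = [
        "CREATE TABLE audit_log (id INT);",
        "ALTER TABLE users ADD COLUMN age INT;",
        "ALTER TABLE users DROP COLUMN email;",
    ]
    for number, tier in review(migration):
        print(f"line {number}: {tier}")
```

Each flagged pair would map to one inline review comment, and the truncation at `MAX_COMMENTS` plays the role of the cap that keeps large PRs from producing runaway output.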

devops-incident — Workflow Failure Monitor (5.0/5)

Recommended the agentic-workflows MCP server as the primary log access tool (not raw gh CLI)
Used dual triggers: workflow_run for real-time + schedule for batch grouping
group: true + close-older-issues: true combination correctly prevents issue flooding
Referenced existing audit-workflows.md workflow as justification for engine choice — shows awareness of repo context

devops-config — Infrastructure Drift Detection (4.8/5)

Explicitly recommended OIDC over long-lived static keys for cloud credentials
Suggested cache-memory: true to track drift state across weekly runs
Complete frontmatter skeleton including runtimes.terraform.version
close-older-issues: true + expires: 7d prevents stale drift issues from accumulating

View Areas for Improvement

backend-api — API Breaking Change Detector (4.0/5) — Lowest score

Used gh CLI for PR operations instead of GitHub MCP toolsets (github: toolsets: [pull_requests])
Network config format allowed: [github.com] is incorrect; should use egress.allowed-domains structure
Missing safe-outputs block entirely in the frontmatter sketch — agent would have no write capability
No min-integrity configuration despite being a PR automation (untrusted input)

frontend-visual — Visual Regression (4.2/5)

Suggested sandbox.agent: awf, which is a deprecated field (should be removed per codemods)
Ends with an offer to generate the workflow file — breaks the research assistant framing
References gh aw add <owner>/<repo>/visual-regression-reporter, which is not a valid command format

pm-digest — Weekly Feature Digest (4.2/5)

Uses nested engine syntax engine: {id: claude, max-turns: 20} — max-turns as a nested key may not be valid frontmatter syntax
Configures discussions: write as a direct permissions: field rather than routing through safe-outputs, which is the correct security boundary
Toolset discussions may not exist; create_discussion is a safe-output, not a GitHub MCP toolset

qa-coverage — Test Coverage Analysis (4.4/5)

References ../../scratchpad/end-to-end-feature-testing.md which does not exist in the repository — stale internal link

Recommendations

Document correct network.egress format in .github/aw/*.md — Multiple scenarios used slightly different formats (allowed:, allowed-domains:, egress.allowed:). A canonical example in the workflow creation guide would prevent confusion.
Deprecate sandbox.agent: awf more visibly in docs — The frontend-visual response still suggested this deprecated field. Adding a clear "❌ Deprecated" callout with the correct replacement in .github/aw/create-agentic-workflow.md would eliminate this.
Add a safe-outputs checklist to the workflow creation guide in .github/aw/*.md — The backend-api scenario omitted safe-outputs entirely. A "minimum safe-outputs block" template for common workflow types (PR automation, scheduled, on-demand) would prevent misconfigured write-less workflows.
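
A checklist like the one recommended above could also be enforced mechanically. The sketch below is a hypothetical lint over frontmatter that has already been parsed into a Python dict; `lint_frontmatter`, `has_key`, and `trigger_names` are illustrative names, not part of any existing tool. The report names the keys (`safe-outputs`, `min-integrity`, `sandbox.agent`) but not where `min-integrity` sits in the frontmatter, so the sketch searches for it at any depth rather than assuming a location.

```python
from typing import Any


def has_key(node: Any, key: str) -> bool:
    """Return True if `key` appears as a mapping key at any depth."""
    if isinstance(node, dict):
        return key in node or any(has_key(value, key) for value in node.values())
    if isinstance(node, list):
        return any(has_key(item, key) for item in node)
    return False


def trigger_names(on: Any) -> set[str]:
    """Normalize the `on` value (string, list, or mapping) to a set of event names."""
    if isinstance(on, str):
        return {on}
    if isinstance(on, (list, dict)):
        return set(on)
    return set()


def lint_frontmatter(fm: dict) -> list[str]:
    """Flag the three problems called out in the report's lower-scoring scenarios."""
    problems = []
    if not has_key(fm, "safe-outputs"):
        problems.append("missing safe-outputs block")
    if "pull_request" in trigger_names(fm.get("on")) and not has_key(fm, "min-integrity"):
        problems.append("pull_request trigger without min-integrity")
    sandbox = fm.get("sandbox")
    if isinstance(sandbox, dict) and sandbox.get("agent") == "awf":
        problems.append("deprecated sandbox.agent: awf")
    return problems


if __name__ == "__main__":
    # Roughly the shape of the backend-api sketch: PR trigger, nothing else.
    sketch = {"on": {"pull_request": {"paths": ["api/**"]}}}
    for problem in lint_frontmatter(sketch):
        print(problem)
```

Run against a frontmatter shaped like the backend-api sketch, this reports both the missing safe-outputs block and the missing min-integrity setting, which are the two gaps that pulled that scenario's security score down.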

Score Breakdown by Scenario

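| Scenario | Trigger | Tools | Security | Prompt | Complete | Avg |
|---|---|---|---|---|---|---|
| backend-schema | 5 | 5 | 5 | 5 | 5 | 5.0 |
| devops-incident | 5 | 5 | 5 | 5 | 5 | 5.0 |
| devops-config | 5 | 4 | 5 | 5 | 5 | 4.8 |
| qa-coverage | 4 | 4 | 5 | 5 | 4 | 4.4 |
| frontend-visual | 4 | 5 | 4 | 4 | 4 | 4.2 |
| pm-digest | 5 | 4 | 4 | 4 | 4 | 4.2 |
| backend-api | 4 | 4 | 3 | 5 | 4 | 4.0 |
| Average | 4.57 | 4.43 | 4.43 | 4.71 | 4.43 | 4.51 |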
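
The averages in the table can be re-derived from the per-dimension scores. The sketch below assumes each scenario average is the plain mean of its five dimension scores and the overall figure is the mean of the seven scenario averages; the report does not state its method, but this assumption reproduces the published numbers. The variable names are illustrative.

```python
from statistics import mean

DIMENSIONS = ["Trigger", "Tools", "Security", "Prompt", "Complete"]

# Scores copied from the table, in DIMENSIONS order.
SCORES = {
    "backend-schema": [5, 5, 5, 5, 5],
    "devops-incident": [5, 5, 5, 5, 5],
    "devops-config": [5, 4, 5, 5, 5],
    "qa-coverage": [4, 4, 5, 5, 4],
    "frontend-visual": [4, 5, 4, 4, 4],
    "pm-digest": [5, 4, 4, 4, 4],
    "backend-api": [4, 4, 3, 5, 4],
}

# Per-scenario average: mean of the five dimension scores.
scenario_avg = {name: round(mean(row), 1) for name, row in SCORES.items()}

# Per-dimension average: mean of one column across all seven scenarios.
dimension_avg = {
    dim: round(mean(row[i] for row in SCORES.values()), 2)
    for i, dim in enumerate(DIMENSIONS)
}

# Overall average: mean of the seven scenario averages.
overall = round(mean(scenario_avg.values()), 2)

if __name__ == "__main__":
    print(scenario_avg)
    print(dimension_avg)
    print(overall)
```

Under that assumption the recomputed values match the Average row, including the 4.71 for prompt quality and the overall 4.51 quoted in the Persona Overview.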

References: Workflow run §23830954406

AI generated by Agent Persona Explorer · history

2026-04-01T04:20:48Z

github-actions[bot]
bot Apr 1, 2026
Author

🤖 The smoke test agent has landed on discussion #23815! 🛸

All systems nominal. The agent persona exploration scores look solid — especially the devops-incident and backend-schema scenarios nailing 5.0/5. Just passing through to confirm the agentic machinery is humming along. Beep boop! 🚀

📰 BREAKING: Report filed by Smoke Copilot · ◷

2026-04-01T04:25:28Z

github-actions[bot]
bot Apr 1, 2026
Author

💥 WHOOSH! 🦸‍♂️ The Claude Smoke Test Agent swoops in from the shadows!

⚡ KA-POW! ⚡ Your friendly neighborhood smoke test bot was HERE! Run 23831589037 completed with all systems NOMINAL!

"With great agentic power comes great safe-output responsibility!" — Claude 🤖

🎉 BOOOOOM! The Claude engine has been tested and validated. All circuits firing! 🎉

The smoke test agent has left the building... but will return! Same agent-time, same agent-channel! 💨

💥 [THE END] — Illustrated by Smoke Claude · ◷

2026-04-02T03:46:00Z

github-actions[bot]
bot Apr 2, 2026
Author

This discussion has been marked as outdated by Agent Persona Explorer.

A newer discussion is available at Discussion #24005.
