Agent Persona Exploration - 2026-03-23 #22354
Closed
Replies: 2 comments 1 reply
/q The agent persona is meant to update the files in .github/aw/*.md. It is NOT meant to use or consider the AGENTS.md in this project.
1 reply
This discussion has been marked as outdated by Agent Persona Explorer. A newer discussion is available at Discussion #22584.
0 replies
Persona Overview
developer.instructions (agentic-workflows proxy)
Key Findings
… pip install entirely
cache-memory for persistent state (flaky test quarantine) solves cross-run learning elegantly
… repo-memory field) appear fully resolved; not observed in any of 8 runs
Top Patterns
pull_request (4×), issues: labeled (2×), workflow_run: completed + if:failure (1×), schedule + workflow_dispatch (1×)
pull_requests/issues/repos (8/8), bash (5/8), cache-memory (1/8 explicit)
permissions + safe-outputs for all writes (8/8); strict:false used once with clear justification (issues:write for incident reporter)
View Perfect Scores: Top 5 Scenarios (5.0/5.0)
BE-2 · Database Issue Auto-Triage (issues: [opened, edited])
Keyword detection → severity classification → labeled comment. Standout: label allowlist in frontmatter prevents rogue labels; hide-older-comments collapses previous triage on re-edits. Full 4-step structured flow.
FE-1 · Bundle Size Diff on JS/CSS PRs (pull_request + paths)
Auto-detects build tool before running. Reports raw + gzip sizes in table form. PASS/FAIL badge at 5KB threshold. concurrency: cancel-in-progress for rapid push sequences. Handles 0-entry-point repos gracefully.
QA-1 · Flaky Test Detector from CI Logs (workflow_run: CI failure)
10-step bash pre-processing collects logs, parses 4 test output formats (Go, JUnit XML, pytest, generic), fetches 30 historical runs, cross-references failure frequency. Agent then classifies LIKELY_FLAKY/GENUINE_FAILURE. Persistent cache-memory quarantine list survives across branches.
QA-2 · Epic Test Plan Draft (issues: labeled "epic")
Posts 7-section test plan (objectives, scope, happy paths, edge cases, risks, NFRs, open questions). rate-limit: max:5, window:60 guards against bulk-label events. checkout: false speeds startup. Graceful noop for non-software epics.
BE-1 · OpenAPI Breaking Change Detector (pull_request paths: api/**)
Inline Python (stdlib json only) detects 5 categories of breaking changes. Posts up to 15 inline diff comments + REQUEST_CHANGES or COMMENT. Provides 4 migration strategies (versioning, deprecation, make-optional, accept-both) with copy-paste examples.
View Areas for Improvement (Scores 4.4–4.8)
PM-2 · PR TL;DR Summary: 4.4/5.0 (pull_request: labeled "ready-for-review")
if: condition (no wasted runs for other labels)
DO-1 · Workflow Failure Incident Reporter: 4.8/5.0 (workflow_run: completed)
strict: false required because issues: write conflicts with strict mode; this is a known system limitation, not an agent error, but worth tracking
github.event expressions not on the compiler's allowed list (resolved via API calls inside script, but adds complexity)
PM-1 · Weekly Feature Digest: 4.8/5.0 (schedule: monday 09:00)
close-older-discussions: true + close-older-key prevents digest accumulation
expires: 7d auto-cleanup keeps the Releases category tidy
shared/reporting.md (possibly an internal helper that may not exist in all repos)
discussions toolset listed but create_discussion goes through safe-outputs (minor redundancy)
Recommendations
Document the pre-processing bash step pattern. Add to AGENTS.md or a skill: for log-heavy workflows (workflow_run, CI analysis), use a bash pre-step to extract/download data before the agent runs. This dramatically reduces turn count and cost.
Add rate-limit guidance for issue-label triggers. The rate-limit: max:5, window:60 pattern (seen in QA-2) prevents bulk-label storms from spawning many concurrent agent runs. This should be standard advice for all issues: labeled and pull_request: labeled workflows.
Clarify review_comments vs comments toolsets for PR summarization. PM-2 scored lower because the agent didn't explicitly distinguish inline review threads from general comments. Adding a note to the instructions about the get_review_comments vs get_comments distinction would improve PR analysis workflows.
Score Trend
References: §23417375175
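The failure-frequency cross-referencing described for QA-1 (and the pre-processing pattern in the first recommendation) could look roughly like the sketch below. It assumes the pre-step has already reduced each historical run to a set of failed test names. The function name `classify_failures` and both thresholds are hypothetical; the report does not state which cut-offs the generated workflow used.

```python
from collections import Counter

# Hypothetical thresholds (assumption): the report does not give the real ones.
FLAKY_MIN_RATE = 0.05
FLAKY_MAX_RATE = 0.5


def classify_failures(current_failures, historical_runs):
    """Label each currently failing test by how often it failed in past runs.

    current_failures: iterable of test names failing in the triggering run.
    historical_runs: list of sets, one per past run, holding failed test names.
    """
    total = len(historical_runs)
    counts = Counter(name for run in historical_runs for name in run)
    labels = {}
    for test in current_failures:
        rate = counts[test] / total if total else 0.0
        # Intermittent past failures suggest flakiness. A test that never
        # failed before, or fails nearly every run, looks like a real break.
        if FLAKY_MIN_RATE <= rate <= FLAKY_MAX_RATE:
            labels[test] = "LIKELY_FLAKY"
        else:
            labels[test] = "GENUINE_FAILURE"
    return labels
```

Keeping this step deterministic and outside the agent means the agent only has to review a short list of labeled tests rather than raw logs.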
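To illustrate the review-comment versus general-comment distinction raised in the third recommendation: the GitHub REST API lists inline review comments under the pull request resource and general conversation comments under the issue resource with the same number. The sketch below only builds the two URLs and makes no network calls. The helper name `comment_endpoints` is hypothetical, and whether the get_review_comments and get_comments tools map exactly onto these endpoints is an assumption.

```python
API_ROOT = "https://api.github.com"


def comment_endpoints(owner, repo, pull_number):
    """Return the REST URLs for the two kinds of comments on a pull request."""
    base = f"{API_ROOT}/repos/{owner}/{repo}"
    return {
        # Inline comments attached to specific diff lines (review threads).
        "review_comments": f"{base}/pulls/{pull_number}/comments",
        # General conversation comments; a PR shares its number with an issue.
        "general_comments": f"{base}/issues/{pull_number}/comments",
    }
```

A PR summary that reads only one of the two lists will miss either the line-level feedback or the top-level discussion.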