## 📊 Executive Summary

`gh-aw-firewall` has achieved an impressive Level 4 agentic workflow maturity, with 27 compiled agentic workflows covering security, testing, observability, issue management, CI fault investigation, and documentation. The repository is one of the most agentically mature repositories observed — unsurprisingly, since it is the AWF infrastructure itself. The top opportunities lie in meta-workflow health monitoring, issue triage automation, continuous code quality, and breaking change detection, none of which are currently covered.
## 🎓 Patterns Learned from Pelis Agent Factory

Key patterns and best practices were drawn from the Pelis Agent Factory and the `githubnext/agentics` repository.

**Comparison to this repo:** This repository already implements many of the most important patterns (CI Doctor ✅, security guard ✅, smoke tests ✅, doc maintainer ✅, issue monster ✅, token analyzers ✅). The gaps are in continuous code quality, meta-monitoring, and foundational issue triage.
## 📋 Current Agentic Workflow Inventory

| Workflow | Purpose | Trigger | Assessment |
| --- | --- | --- | --- |
| `security-guard` | Reviews PRs for security regressions | PR open/sync | ✅ Excellent - Claude engine, security-focused |
| `security-review` | Daily threat modeling | Daily | ✅ Comprehensive with cache-memory |
| `dependency-security-monitor` | CVE monitoring, dep updates | Daily | ✅ Good coverage |
| `secret-digger-claude/copilot/codex` | Hourly secrets scanning (3 engines) | Hourly cron | ✅ Impressive multi-engine coverage |
| `smoke-claude` | Claude smoke test | Every 12h + PR | ✅ Core validation |
| `smoke-copilot` | Copilot smoke test | Every 12h + PR | ✅ Core validation |
| `smoke-codex` | Codex smoke test | Every 12h + PR | ⚠️ Currently failing |
| `smoke-services` | Services smoke test | Every 12h + PR | ⚠️ Currently failing |
| `smoke-chroot` | Chroot smoke test | PR (path-scoped) | ✅ Good scoping |
| `ci-doctor` | CI failure investigation | workflow_run failures | ✅ Covers 27 workflows |
| `ci-cd-gaps-assessment` | CI coverage gap analysis | Daily | ✅ Self-monitoring |
| `doc-maintainer` | Docs sync with code changes | Daily | ✅ Active |
| `issue-monster` | Dispatches issues to Copilot | Issue open + hourly | ✅ Core automation |
| `issue-duplication-detector` | Detects duplicate issues | Issue open | ✅ Uses cache-memory |
| `firewall-issue-dispatcher` | Cross-repo issue tracking (gh-aw → here) | Every 6h | ✅ Unique cross-repo pattern |
| `claude-token-usage-analyzer` | Claude token trend analysis | Daily | ✅ Feeds optimizer |
| `copilot-token-usage-analyzer` | Copilot token trend analysis | Daily | ✅ Feeds optimizer |
| `claude-token-optimizer` | Proposes Claude cost reductions | Workflow run (after analyzer) | ✅ Causal chain pattern |
| `copilot-token-optimizer` | Proposes Copilot cost reductions | Workflow run (after analyzer) | ✅ Causal chain pattern |
| `cli-flag-consistency-checker` | CLI flags vs. docs consistency | Weekly | ✅ Good hygiene |
| `test-coverage-improver` | Security-critical test coverage | Weekly | ✅ Security-focused scope |
| `update-release-notes` | Updates notes after release | Release published | ✅ Reactive automation |
| `build-test` | Build validation on PRs | PR open/sync | ✅ Standard quality gate |
| `plan` | ChatOps `/plan` command | Slash command | ✅ Interactive capability |
| `pelis-agent-factory-advisor` | This workflow | Daily | ✅ Meta-advisory |
## 🚀 Actionable Recommendations

### P0 — Implement Immediately

#### [P0] Issue Triage Agent
**What:** Automatically label incoming issues (`bug`, `enhancement`, `documentation`, `question`, `security`, `firewall-config`, etc.) and post a brief explanatory comment mentioning the author.

**Why:** This is the "hello world" of agentic workflows. The repository has `issue-monster` for dispatch and `issue-duplication-detector` for deduplication — but zero automated labeling. PRs already have `security-guard`; issues need equivalent first-response automation. Every unlabeled issue entering the tracker is a small paper cut that compounds.

**How:** Trigger on `issues: opened`. Use a GitHub-focused agent that reads the issue title/body, checks the codebase context (firewall, docker, squid, iptables, CLI, docs), applies a relevant label, and comments with a brief explanation. Add `awf-triage` and domain-specific labels like `squid`, `iptables`, `docker`, `cli`, `security`.

**Effort:** Low — can be adapted directly from the Pelis issue-triage-agent
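The domain-label mapping described above can be sketched as a deterministic fallback. This is only an illustration; in the real workflow the LLM would choose labels, and all keyword lists and names here are assumptions:

```python
# Hypothetical keyword-to-label fallback for issue triage.
# A real gh-aw workflow would let the model pick labels; this sketch
# only illustrates the domain-label mapping. All names are illustrative.

DOMAIN_KEYWORDS = {
    "squid": ["squid", "proxy", "acl"],
    "iptables": ["iptables", "nftables", "firewall rule"],
    "docker": ["docker", "container", "compose"],
    "cli": ["cli", "--", "command line"],
    "security": ["cve", "vulnerability", "secret", "exploit"],
}

def triage_labels(title: str, body: str) -> list[str]:
    """Return 'awf-triage' plus any domain labels matched in the issue text."""
    text = f"{title}\n{body}".lower()
    labels = ["awf-triage"]
    for label, keywords in DOMAIN_KEYWORDS.items():
        if any(kw in text for kw in keywords):
            labels.append(label)
    return labels
```

A simple substring heuristic like this would over-match on real issues; it stands in for the judgment the agent applies before calling the labeling safe-output.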
### P1 — Plan for Near-Term

#### [P1] Workflow Health Manager (Meta-Agent)

**What:** A meta-agent that monitors the health of all 27 agentic workflows — tracking success/failure rates, stale workflows, unhandled errors, and declining performance — and creates issues when workflows degrade.

**Why:** The repository has 27 workflows but no agent watching over them collectively. The CI Doctor handles CI failures reactively; this would proactively monitor the entire agentic ecosystem. Pelis's Workflow Health Manager created 40 issues and led to 34 PRs. Given the currently failing `smoke-codex` and `smoke-services` workflows, this gap is already causing noise.

**How:** Daily schedule. Read all workflow run histories using the `agentic-workflows` + `actions` tools. Compute failure rates, detect workflows that haven't run recently, flag consistently failing workflows, and correlate failures with recent code changes. Create issues with a `workflow-health` label for degraded workflows.

**Effort:** Medium
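The per-workflow metrics this meta-agent would compute can be sketched as a small pure function. The input shape (conclusion strings plus ISO timestamps) is an assumption for illustration, not the exact GitHub API payload:

```python
# Sketch of the health metrics a workflow health manager could compute.
# The run-record shape is assumed, not the real GitHub Actions API schema.
from datetime import datetime, timedelta, timezone

def health_summary(runs: list[dict], now: datetime, stale_after_days: int = 7) -> dict:
    """Summarize a workflow's recent runs: failure rate and staleness."""
    if not runs:
        return {"failure_rate": None, "stale": True}
    failures = sum(1 for r in runs if r["conclusion"] == "failure")
    last_run = max(datetime.fromisoformat(r["started_at"]) for r in runs)
    return {
        "failure_rate": failures / len(runs),
        "stale": now - last_run > timedelta(days=stale_after_days),
    }
```

The agent would apply thresholds to these numbers (for example, flag any workflow above a 50% failure rate or stale for a week) before opening a `workflow-health` issue.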
#### [P1] Daily Malicious Code Scan

**What:** Daily scan of recent commits (last 24–48h) for suspicious code patterns — unexpected network calls, credential harvesting, eval/exec with dynamic arguments, dependency injection — with particular focus on this security-critical codebase.

**Why:** This repository is a security firewall tool. Supply chain attacks and accidental security regressions are uniquely high-risk here. The secret diggers scan for credentials; this is a complementary behavioral scan. Pelis had a dedicated workflow for this. The security-review workflow is comprehensive but broad; a focused behavioral scan of recent changes would add a useful layer.

**Effort:** Low-Medium — use bash tools to git-diff recent commits, then have the agent analyze the changes
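A pre-filter over the diff could narrow what the agent inspects. The pattern list below is illustrative only; the real value comes from the agent reasoning about each hit in context:

```python
# Illustrative regex pre-filter over a unified diff. A real scan would
# combine this with agent reasoning; patterns and categories are assumptions.
import re

SUSPICIOUS = {
    "dynamic-exec": re.compile(r"\b(eval|exec)\s*\("),
    "raw-network": re.compile(r"(curl|wget)\s+https?://|net\.connect|http\.request"),
    "env-harvest": re.compile(r"process\.env|os\.environ"),
}

def scan_diff(diff: str) -> list[tuple[int, str]]:
    """Return (diff_line_number, category) for suspicious *added* lines only."""
    hits = []
    for lineno, line in enumerate(diff.splitlines(), 1):
        if not line.startswith("+") or line.startswith("+++"):
            continue  # inspect only added lines; skip file headers
        for category, pattern in SUSPICIOUS.items():
            if pattern.search(line):
                hits.append((lineno, category))
    return hits
```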
#### [P1] Breaking Change Checker

**What:** Monitor PRs and recent commits for changes to the public API/CLI that might break existing users — changed flag semantics, removed options, modified Docker Compose schemas, changed environment variable names, updated firewall behavior.

**Why:** AWF is a CLI tool with external users who depend on stable behavior. The CLI flag consistency checker covers docs sync; this covers backward compatibility. Given that the repository has 40k+ workflow runs, many downstream users depend on stable interfaces. Pelis's Breaking Change Checker created alert issues before regressions reached production.

**How:** Trigger on PR open/sync. Analyze changed files in `src/cli.ts`, `src/types.ts`, `src/docker-manager.ts`, `action.yml`, and container entrypoints. Check for removed/renamed flags, changed defaults, modified Docker Compose schemas, and altered environment variable names.

**Effort:** Medium
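The flag-compatibility part of this check reduces to comparing the CLI surface before and after a change. Extracting the flag map from `src/cli.ts` is assumed to happen elsewhere; this sketch only shows the comparison:

```python
# Sketch of a flag-compatibility diff: the agent would extract these maps
# from the old and new CLI source; here they are given directly.

def breaking_flag_changes(old: dict[str, str], new: dict[str, str]) -> list[str]:
    """Report removed flags and changed defaults as potential breaking changes."""
    report = []
    for flag, default in old.items():
        if flag not in new:
            report.append(f"removed flag: {flag}")
        elif new[flag] != default:
            report.append(f"changed default for {flag}: {default!r} -> {new[flag]!r}")
    return report
```

Renamed flags would show up as a removal plus an unexplained addition, which the agent could then pair up heuristically before raising an alert.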
#### [P1] Automatic Code Simplifier

**What:** Daily agent that analyzes recently modified TypeScript files (last 3–5 commits) and proposes PRs with simplifications — early returns, extracted helper functions, removed redundant logic, standard library patterns.

**Why:** Pelis's Code Simplifier had an 83% merge rate and consistently improved code quality after rapid development sessions. This repo has complex code in `src/docker-manager.ts`, `src/cli.ts`, and `containers/agent/entrypoint.sh` — prime candidates for incremental simplification. The doc-maintainer and test-coverage-improver already show that this repository values continuous-improvement agents.

**Effort:** Low-Medium — trigger daily, focus on `src/**/*.ts`, skip test files and generated files
### P2 — Consider for Roadmap

#### [P2] Schema & Type Consistency Checker

**What:** Weekly agent that checks for drift between TypeScript type definitions (`src/types.ts`), documentation (`docs/`), CLI help text (`src/cli.ts`), and `action.yml` — ensuring all surfaces stay synchronized.

**Why:** This repo has complex configuration types (`WrapperConfig`, `SquidConfig`, `DockerComposeConfig`) that are described in multiple places. Pelis's Schema Consistency Checker created 55 analysis discussions, catching drift that would have taken days to notice manually. With 27+ CLI flags and growing configuration complexity, drift is inevitable without automation.

**Effort:** Medium
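The core drift check can be sketched as set comparison between the keys a type defines and the keys the docs mention. Real extraction would use the TypeScript compiler API; the naive regex here is purely illustrative:

```python
# Sketch of schema drift detection. The regex extraction is a stand-in for
# proper TS parsing; property and key names below are hypothetical.
import re

def interface_keys(ts_source: str) -> set[str]:
    """Pull top-level property names out of an interface body (naive regex)."""
    return set(re.findall(r"^\s*(\w+)\??\s*:", ts_source, flags=re.MULTILINE))

def drift(ts_source: str, documented: set[str]) -> dict[str, set[str]]:
    """Report keys missing from the docs and docs keys missing from the type."""
    keys = interface_keys(ts_source)
    return {"undocumented": keys - documented, "phantom": documented - keys}
```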
#### [P2] Agentic Portfolio Analyst (General)

**What:** Weekly meta-agent that analyzes the overall health and cost-effectiveness of all agentic workflows — not just token usage, but merge rates, action rates, discussion creation, issue resolution velocity, and ROI per workflow.

**Why:** The repo has excellent per-engine token analyzers but no holistic portfolio view. Which workflows are delivering value? Which are running but never creating issues or PRs? A portfolio view helps prioritize which workflows to enhance or retire. Pelis's Portfolio Analyst identified unnecessarily chatty agents that were costing money.

**Effort:** Medium — build on the existing token analysis infrastructure
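One portfolio metric worth computing is the action rate: the fraction of runs that actually produce an issue, PR, or discussion. The field names below are assumptions for illustration:

```python
# Illustrative portfolio metric: workflows that run often but never produce
# output stand out as candidates for retirement. Stats fields are assumed.

def action_rate(stats: dict) -> float:
    """Fraction of runs that produced at least one issue, PR, or discussion."""
    runs = stats.get("runs", 0)
    if runs == 0:
        return 0.0
    return min(1.0, stats.get("runs_with_output", 0) / runs)

def quiet_workflows(portfolio: dict[str, dict], threshold: float = 0.05) -> list[str]:
    """Workflows whose action rate falls below the threshold, sorted by name."""
    return sorted(name for name, s in portfolio.items() if action_rate(s) < threshold)
```

A low action rate is not automatically bad (a breaking-change guard that rarely fires is doing its job), so the agent would interpret the number, not just report it.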
#### [P2] Issue Arborist (Sub-issue Linker)

**What:** Periodic agent that finds related open issues and links them as GitHub sub-issues to create hierarchy and improve organization.

**Why:** Given that the cross-repo `firewall-issue-dispatcher` already creates tracking issues from `github/gh-aw`, the issue tracker likely has related issues that would benefit from grouping. Pelis's Issue Arborist created 18 parent issues and 77 reports. With growing AWF adoption, issue volume will increase.

**Effort:** Low
#### [P2] Documentation Noob Tester

**What:** Weekly agent that role-plays as a new user and follows the installation/quick-start documentation step by step, flagging ambiguous or missing steps.

**Why:** AWF has complex installation requirements (Docker, sudo, npm link, iptables), so new-user friction is likely high. The doc-maintainer syncs docs with code but doesn't test them from a novice perspective. Pelis's version had only a 43% merge rate but still contributed 9 merged PRs by surfacing genuinely confusing steps.

**Effort:** Medium — requires bash tools to simulate install steps
### P3 — Future Ideas

#### [P3] Changeset / Version Bump Automation

**What:** Automated changelog entry and version bump proposal, triggered by merges to main between releases, that analyzes commit messages and labels the change type.

**Why:** `update-release-notes` runs after a release is published; this would automate the pre-release work. Pelis's Changeset had a 78% merge rate. Particularly valuable as release frequency increases.
#### [P3] Daily Workflow Updater (Dependency Updater for Workflows)

**What:** Periodic agent that checks for newer versions of actions and AWF workflows referenced in lock files and proposes updates.

**Why:** The `agentics-maintenance.yml` workflow handles some of this, but a dedicated agentic workflow could be more contextual.
#### [P3] Container Security Scanner Enhancement

**What:** Agent that analyzes Dockerfile/container configurations for security best practices and generates improvement PRs.

**Why:** The containers in `containers/` are security-critical. An agent specialized in container hardening (non-root users, minimal base images, capability drops) could complement the existing security review.
## 📈 Maturity Assessment

| Dimension | Score | Notes |
| --- | --- | --- |
| Issue Management | 3/5 | Dedup ✅, dispatch ✅, cross-repo ✅ — but no triage/labeling |
| Code Quality | 2/5 | CLI consistency ✅ — no simplifier, no refactoring, no schema checker |
| Testing/Validation | 5/5 | 5 smoke tests, test coverage improver, build validation |
| Security | 5/5 | Secret diggers ×3 engines hourly, security guard on PRs, daily review, dep monitor |
| Documentation | 4/5 | Doc maintainer ✅ — no noob tester, no unbloat |
| Observability | 3/5 | Token analyzers ✅ — no portfolio analyst, no meta workflow health monitor |

**Gap to 5:** Add issue triage, workflow health manager, code simplifier, breaking change checker.
## 🔄 Comparison with Best Practices

### What It Does Exceptionally Well

- **Multi-engine smoke tests** — Running Claude, Copilot, Codex, and Services smoke tests every 12h goes beyond what Pelis described; this is best-in-class validation
- **Security depth** — Three secret diggers running every hour across 3 AI engines is exceptional; the security-guard on every PR is exemplary
- **Cross-repo automation** — The `firewall-issue-dispatcher` bridging `github/gh-aw` and `gh-aw-firewall` is a sophisticated pattern not seen in the reference examples
- **Token optimization pipeline** — The analyzer → optimizer causal chain (twice, for Claude and Copilot) is a mature cost-management pattern
- **Self-referential testing** — Smoke-testing a firewall tool using that firewall tool demonstrates deep operational commitment
### What Could Improve

- **No issue triage** — The most foundational workflow in Pelis is missing; every new issue needs manual labeling
- **No meta-monitoring** — 27 workflows with no health monitor; failures go unnoticed until the CI Doctor catches them after the fact
- **No continuous code quality** — No simplification or refactoring agent; the codebase grows complex without automated cleanup
- **No backward-compatibility guard** — Critical for a tool with external users
### Unique Opportunities (Domain-Specific)

The firewall/security domain creates unique opportunities other repos don't have:

- **Firewall rule regression testing** — An agent that verifies iptables rules haven't been accidentally relaxed
- **Container escape scenario tester** — An agent that validates key security invariants haven't been violated by recent changes
- **AWF self-hosting validation** — Verifying that AWF itself can be run securely through AWF (dogfooding the security model)
## 📝 Notes Saved to Cache Memory

Saved analysis to `/tmp/gh-aw/cache-memory/pelis-patterns.md` for future runs. The next run should:

- Check whether the issue triage agent was added
- Track whether the `smoke-codex` and `smoke-services` failures were resolved
- Monitor whether the workflow health manager was implemented