📊 Executive Summary
The gh-aw-firewall repository demonstrates exceptional agentic workflow maturity — one of the most automated security tool repositories I've analyzed. With ~20 active agentic workflows spanning security, documentation, testing, and CI/CD, the team has embraced the "let's create a workflow for that" philosophy. However, several high-value Pelis Agent Factory patterns are missing, particularly around issue triage, code quality automation, workflow observability, and domain-specific firewall health monitoring.
🎓 Patterns Learned from Pelis Agent Factory
Key Patterns from the Documentation Site
From reading the Pelis Agent Factory blog series, I identified these core principles:
Cache memory enables continuity — persistent memory across runs enables trend analysis and deduplication
skip-if-match and max: limits prevent runaway agents
How This Repo Compares
| Pattern | Status |
| --- | --- |
| Issue triage automation | ❌ Missing |
| Code quality automation (simplify/refactor) | ❌ Missing |
| Fault investigation (CI Doctor) | ✅ Present |
| Documentation maintenance | ✅ Present |
| Security scanning | ✅ Excellent (3 hourly red-team agents!) |
| Workflow observability / meta-agents | ❌ Missing |
| Issue/PR management helpers | ⚠️ Partial (Monster, no Arborist/Mergefest) |
| Test coverage automation | ✅ Present |
| Cross-repo integration | ✅ Present (Dispatcher) |
| Dependency security | ✅ Present |
Agentics Reference Repository Patterns
The githubnext/agentics repository contains additional patterns not yet present here: daily-test-improver, code-simplifier, duplicate-code-detector, grumpy-reviewer, contribution-guidelines-checker, weekly-issue-summary, daily-accessibility-review, daily-perf-improver, and vex-generator.
📋 Current Agentic Workflow Inventory
| Workflow | Purpose | Trigger | Assessment |
| --- | --- | --- | --- |
| build-test.md | Build + test suite in agent | PR, dispatch | ✅ Good — multi-runtime |
| ci-cd-gaps-assessment.md | Analyze CI/CD gaps | Daily | ✅ Good |
| ci-doctor.md | Investigate CI failures | On workflow failure | ✅ Excellent — causal chain |
| cli-flag-consistency-checker.md | CLI flags vs docs drift | Weekly | ✅ Good |
| dependency-security-monitor.md | CVE monitoring | Daily | ✅ Good |
| doc-maintainer.md | Docs sync with code | Daily | ✅ Good |
| firewall-issue-dispatcher.md | Cross-repo issue routing | Every 6h | ✅ Unique/excellent |
| issue-duplication-detector.md | Deduplicate issues | On new issue | ✅ Good — uses cache-memory |
| issue-monster.md | Assign issues to Copilot | Hourly + new issue | ✅ Good — task dispatcher |
| pelis-agent-factory-advisor.md | This advisor | Daily | ✅ Meta |
| plan.md | Project planning | Slash command | ✅ Good — interactive |
| secret-digger-*.md (×3) | Red-team secret hunting | Hourly (3 engines!) | ✅✅ Exceptional |
| security-guard.md | PR security review | On PR | ✅ Good |
| security-review.md | Daily threat modeling | Daily | ✅ Excellent |
| smoke-*.md (×4) | Engine smoke tests | PR + dispatch | ✅ Good — multi-engine |
| test-coverage-improver.md | Security-focused test coverage | Weekly | ✅ Good |
🚀 Actionable Recommendations
P0 — Implement Immediately
[P0] Issue Triage Agent
What: Automatically label new issues using predefined categories, leave a comment explaining the label, and provide initial triage context for maintainers.
Why: This is the "hello world" of agentic workflows per Pelis documentation. Every new issue currently lands with zero automated triage. For a security tool, proper labeling (security, firewall-bypass, documentation, enhancement, question) would help prioritize work and let the Issue Monster make better decisions. Implementation is minimal — no write permissions beyond labels and comments.
How: Add a workflow triggered on issues: [opened, reopened] with read permissions + safe-outputs: add-labels and add-comment.
Effort: Low (1-2 hours)
```markdown
---
on:
  issues:
    types: [opened, reopened]
permissions:
  issues: read
tools:
  github:
    toolsets: [issues, labels]
safe-outputs:
  add-labels:
    allowed: [bug, security, documentation, enhancement, question, firewall-bypass, container-security, performance, help-wanted, good-first-issue, breaking-change]
  add-comment: {}
---
# Issue Triage Agent

For each unlabeled issue in ${{ github.repository }}, analyze the title and body
in the context of the AWF firewall codebase and apply an appropriate label...
```
[P0] Breaking Change Checker
What: When a PR changes CLI flags, public API interfaces, or configuration schema, automatically detect potentially breaking changes and alert maintainers with an issue or PR comment.
Why: gh-aw-firewall is a CLI tool with a published interface used across many workflows. Silently breaking --allow-domains, --enable-api-proxy, or other flags would break every user. The cli-flag-consistency-checker catches documentation drift but not backward incompatibilities. A breaking change checker would specifically analyze changed TypeScript interfaces, removed/renamed CLI flags, and Docker Compose schema changes.
How: Triggered on PR, read the diff, look for removals/renames in src/cli.ts flags, src/types.ts interfaces, action.yml inputs, and docker-compose.yml structure.
Effort: Low-Medium (2-3 hours)
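A minimal frontmatter sketch of what this could look like, following the conventions of the triage example above; the trigger types, permissions, and safe-output fields are assumptions to be verified against the current gh-aw schema:

```markdown
---
on:
  pull_request:
    types: [opened, synchronize]
permissions:
  contents: read
  pull-requests: read
safe-outputs:
  add-comment: {}
---
# Breaking Change Checker

Read the PR diff. Flag any removed or renamed CLI flags in src/cli.ts,
changed exported interfaces in src/types.ts, removed inputs in action.yml,
and structural changes to docker-compose.yml. If a potentially breaking
change is found, post one comment summarizing the affected interface.
```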
P1 — Plan for Near-Term
[P1] Workflow Health Manager (Meta-Agent)
What: A weekly meta-agent that reviews all other agentic workflow runs, identifies patterns (frequent failures, high cost, zero output), and creates issues/PRs to fix them.
Why: With 20 workflows now, manual monitoring is impossible. The Pelis team found this workflow created 40 issues with a 25-PR causal chain. Recent runs show Smoke Codex and Dependency Security Monitor failing — a health manager would auto-triage these. This is the difference between an observable agent factory and a black box.
How: Use agentic-workflows tool (already available!) to pull logs for all workflows, identify failure patterns, cost outliers, and zero-output "zombie" workflows.
Effort: Medium (3-5 hours)
```markdown
---
on:
  schedule: weekly
tools:
  agentic-workflows:
  github:
    toolsets: [default, actions]
  cache-memory:
    key: workflow-health
safe-outputs:
  create-issue:
    title-prefix: "[Workflow Health] "
    max: 5
  create-discussion:
    title-prefix: "[Workflow Health Report] "
---
# Workflow Health Manager

Analyze all agentic workflow runs from the past week using agentic-workflows.logs...
```
[P1] Container Image Freshness Monitor
What: Weekly check that base Docker images used in the firewall containers (ubuntu:22.04, ubuntu/squid, Node.js versions) have no critical security patches pending, and propose updates when needed.
Why: This is a security tool — using stale base images with unpatched CVEs would be ironic. The containers form the security boundary, so their freshness is especially critical. This is a domain-specific gap not covered by dependency-security-monitor.md (which focuses on npm packages).
How: Check Docker Hub API for latest digest/tag of each base image, compare against what's pinned in containers/*/Dockerfile, create an issue if newer security patches are available.
Effort: Medium
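A hedged sketch, modeled on the other scheduled workflows here; the cron expression, the network allowlist field, and the assumption that Docker Hub exposes the needed digests are all illustrative:

```markdown
---
on:
  schedule:
    - cron: "0 6 * * 1"  # weekly, Monday morning (illustrative)
permissions:
  contents: read
network:
  allowed:
    - hub.docker.com  # needed to query image digests (assumption)
safe-outputs:
  create-issue:
    title-prefix: "[Base Image] "
    max: 3
---
# Container Image Freshness Monitor

For each base image pinned in containers/*/Dockerfile, compare the pinned
tag/digest against the latest published digest on Docker Hub and open an
issue when a newer security patch release is available.
```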
[P1] Firewall Escape Trend Analyzer
What: Weekly analysis of results from all three secret-digger runs (Claude, Codex, Copilot), tracking which escape techniques were attempted, identifying trends, and flagging new attack vectors.
Why: Three red-team agents run hourly. Their results exist in workflow logs and discussions, but nobody aggregates the trends. Are escape attempts increasing? Are new techniques appearing? Did a recent code change create a new bypass? This analyzer would create the intelligence layer on top of the raw red-team data.
How: Use agentic-workflows.logs + cache-memory to aggregate escape attempt patterns across runs, create a weekly trend discussion.
Effort: Medium
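One possible shape for this analyzer, reusing the agentic-workflows and cache-memory tools already present in this repo; the cache key and discussion prefix are placeholders:

```markdown
---
on:
  schedule:
    - cron: "0 8 * * 1"  # weekly (illustrative)
tools:
  agentic-workflows:
  cache-memory:
    key: escape-trends
safe-outputs:
  create-discussion:
    title-prefix: "[Escape Trends] "
---
# Firewall Escape Trend Analyzer

Pull the past week's secret-digger-* run logs, classify the escape
techniques attempted, compare against the trend history kept in cache
memory, and post a weekly discussion highlighting new or growing vectors.
```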
[P1] Code Simplifier
What: Daily workflow that analyzes recently modified TypeScript files and creates PRs with simplifications that preserve functionality while improving clarity.
Why: TypeScript codebases benefit greatly from simplification automation. The Pelis team's Code Simplifier achieved 83% merge rate. In security code, unnecessary complexity is itself a vulnerability surface — simpler iptables logic, clearer domain validation, and cleaner container configuration reduce the chance of subtle bugs.
How: Trigger daily, look at commits from last 24h, identify changed .ts files, propose simplifications via create-pull-request.
Effort: Low (adapting from reference implementation)
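Adapting the reference implementation might start from frontmatter like the following; the schedule and title prefix are assumptions, not taken from the agentics repo verbatim:

```markdown
---
on:
  schedule:
    - cron: "0 5 * * *"  # daily (illustrative)
permissions:
  contents: read
safe-outputs:
  create-pull-request:
    title-prefix: "[Simplify] "
---
# Code Simplifier

List the .ts files changed in the last 24 hours, look for ways to reduce
complexity without changing behavior, and open a PR containing the
simplification plus a short rationale for each change.
```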
P2 — Consider for Roadmap
[P2] Issue Arborist
What: Periodically links related issues as sub-issues, creating parent/child relationships between related security issues, feature requests, and bug clusters.
Why: The issue tracker tends to accumulate unrelated-looking issues that are actually related (e.g., multiple issues about DNS handling, or different IPv6 bypass attempts). The Arborist created 18 parent issues in the Pelis factory, dramatically improving organization.
Effort: Medium
[P2] Mergefest (Main Branch Merger)
What: Automatically merge the main branch into open PRs that have been open for more than 2 days without conflicts.
Why: Long-lived PRs (smoke test fixes, test coverage, documentation) frequently fall behind main. Automated merging reduces the "please merge main" ceremony and prevents integration surprises.
Effort: Low
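A rough sketch under the assumption that gh-aw's push-to-pull-request-branch safe output (or an equivalent) can carry the merge; the schedule and skip label are illustrative:

```markdown
---
on:
  schedule:
    - cron: "0 */6 * * *"  # every 6h (illustrative)
permissions:
  contents: read
  pull-requests: read
safe-outputs:
  push-to-pull-request-branch: {}
---
# Mergefest

For each open PR that is more than 2 days behind main and has no merge
conflicts, merge main into the PR branch; skip PRs labeled do-not-update.
```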
[P2] Documentation Noob Tester
What: A workflow that reads the README and docs as a "first-time user" would, identifies confusing steps, and creates issues for documentation improvements.
Why: AWF's installation process (Docker, sudo requirements, iptables) is complex. A noob tester would catch places where the docs assume too much knowledge. The Pelis team's version had 43% merge rate through a causal chain.
Effort: Low-Medium
[P2] Weekly Issue/Activity Summary
What: A weekly read-only workflow that summarizes open issues by category, PR activity, and workflow health into a single discussion post.
Why: Makes the repository more legible for occasional contributors and maintainers who don't track GitHub daily. The githubnext/agentics repo has a weekly-issue-summary.md that serves this purpose.
Effort: Low
[P2] Contribution Guidelines Checker
What: On new PRs from external contributors, check that the PR follows contribution guidelines (conventional commits, tests included, docs updated) and post a friendly guidance comment.
Why: The repo has CONTRIBUTING.md but no automated enforcement. Contributors from the broader AWF community may not know the conventions. Early feedback reduces back-and-forth review cycles.
Effort: Low
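This checker could follow the same read-plus-comment pattern as the triage agent; the trigger types below are assumptions:

```markdown
---
on:
  pull_request:
    types: [opened]
permissions:
  contents: read
  pull-requests: read
safe-outputs:
  add-comment: {}
---
# Contribution Guidelines Checker

For PRs from external contributors, compare the PR against CONTRIBUTING.md
(conventional commit titles, tests for code changes, docs updates) and
post one friendly comment listing anything that is missing.
```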
P3 — Future Ideas
[P3] Documentation Unbloat
Reduce verbosity in documentation files, making them more scannable. The Pelis team's version had 85% merge rate. Current docs are well-maintained but some sections (AGENTS.md, README) could be more concise.
[P3] Duplicate Code Detector
Identify duplicate patterns in the TypeScript codebase using semantic analysis. Given the codebase is ~6000 LOC, there may be opportunities to extract utilities.
[P3] Daily Accessibility Review for Docs-Site
The Astro Starlight docs-site could be periodically tested for accessibility (using Playwright/axe). The Pelis factory's daily-accessibility-review.md is a reference implementation.
[P3] VEX Generator
Automatically generate VEX (Vulnerability Exploitability eXchange) documents for CVEs reported in dependencies, documenting whether the vulnerability is actually exploitable in AWF's deployment context.
[P3] Sub Issue Closer
Automatically close sub-issues when their parent issue is resolved. Low effort, reduces issue tracker noise.
📈 Maturity Assessment
Current level description: The repository has exceptional security automation (hourly red-team agents across 3 LLM engines!), solid CI/fault investigation, documentation maintenance, and cross-repo integration. This puts it well ahead of most repositories.
To reach Expert level:
Add workflow health manager (observe the observers)
Add issue triage (close the incoming-work loop)
Add breaking change detection (protect the public interface)
🔄 Comparison with Best Practices
What This Repository Does Exceptionally Well
Security-first automation: Running 3 red-team agents hourly (one per LLM engine) is unique and shows the repository applying its own domain expertise to its own operations
Cross-repo integration: The firewall-issue-dispatcher.md pulling issues from github/gh-aw is a sophisticated pattern not common in smaller projects
Multi-engine smoke testing: Testing against Claude, Codex, and Copilot simultaneously catches engine-specific regressions
Issue Monster + CI Doctor causal chain: The combination of automated issue dispatch and CI failure investigation creates a self-healing CI loop
What Could Be Improved
Incoming work management: New issues arrive without triage — labels, priority, or initial context
Observability of the agents themselves: No workflow monitors the health of the 20 workflows as a system
Code quality automation: No agent proposes simplifications or refactors, leaving that entirely to human code review
Trend analysis: Raw data from security agents (secret-diggers) is not aggregated into trends
Unique Opportunities Given the Domain
This repository is its own best use case for AWF. Some uniquely applicable automations:
Run the doc-maintainer and other agents inside AWF to validate their own firewall configurations
Smoke-test new domain allowlists against real CI workflows before merging
Use the CI Doctor's findings to improve the firewall's own error reporting
📝 Notes for Future Runs
Cache memory updated at /tmp/gh-aw/cache-memory/notes.txt with:
Current workflow inventory (20 agentic workflows)
Key gaps identified
Current test coverage baseline (38.39% statements)