[Pelis Agent Factory Advisor] Pelis Agent Factory Advisor: Agentic Workflow Maturity Report (2026-04-02) #1598

2026-04-02T03:37:16Z

github-actions[bot]
bot Apr 2, 2026

📊 Executive Summary

gh-aw-firewall is a heavily agentic repository: it runs 24 compiled agentic workflows spanning security scanning, smoke testing across three AI engines, CI health, documentation, and cross-repo integration. This places the repo at Level 4 of 5 on the maturity scale used in this report. The primary opportunities are in meta-agent observability (no workflow health monitor), PR/issue organization (no arborist or PR updater), and domain-specific automation unique to a firewall project (escape testing, image freshness, VEX generation).

🎓 Patterns Learned from Pelis Agent Factory

The Pelis Agent Factory operates 100+ workflows across several categories. Key patterns observed:

| Category | Pattern | Relevance to This Repo |
| --- | --- | --- |
| Meta-agents | Audit Workflows / Workflow Health Manager: agents that watch other agents | Missing here |
| Issue org | Issue Arborist: auto-links related issues as sub-issues | Missing here |
| PR hygiene | Mergefest: auto-merges main into long-lived PRs | Missing here |
| Cost control | Portfolio Analyst: identifies token/cost inefficiencies | Missing here |
| CI optimization | CI Coach: separately optimizes CI pipeline speed | Missing here |
| Breaking changes | Breaking Change Checker: alerts on backward-incompatible changes | Missing here |
| Docs testing | Docs Noob Tester: tests docs from a first-time user perspective | Partial (doc-maintainer exists) |
| Specialization | Multiple focused agents > one monolithic agent | Already well-implemented |
| skip-if-match | Prevent flooding when issues/PRs already pending | Used in several workflows here |
| Multi-engine | Claude + Codex + Copilot smoke tests are themselves agent-specific validations | Excellent usage! |

From the agentics reference repository (githubnext/agentics), particularly notable patterns not yet in use here: daily-repo-chronicle (narrative history), contribution-check (PR onboarding checks), q.md (interactive Q&A), vex-generator (container vulnerability docs).

📋 Current Agentic Workflow Inventory

| Workflow | Purpose | Trigger | Assessment |
| --- | --- | --- | --- |
| `build-test` | Verifies PRs build & pass CI | PR opened/sync | ✅ Core |
| `ci-doctor` | Investigates CI failures | Workflow failure on main | ✅ High value |
| `ci-cd-gaps-assessment` | Identifies CI/CD coverage gaps | Daily | ✅ Good meta |
| `cli-flag-consistency-checker` | Checks CLI flag docs consistency | Weekly | ✅ Domain-specific |
| `copilot-token-usage-analyzer` | Tracks LLM token costs | Daily | ✅ Analytics |
| `dependency-security-monitor` | CVE detection + dep update PRs | Daily | ✅ Security |
| `doc-maintainer` | Syncs docs with code changes | Daily | ✅ Docs |
| `firewall-issue-dispatcher` | Cross-repo AWF issue tracking | Every 6h | ✅ Unique |
| `issue-duplication-detector` | Finds duplicate issues | Issue opened | ✅ Hygiene |
| `issue-monster` | Assigns issues to Copilot agent | Hourly + issue opened | ✅ Orchestrator |
| `pelis-agent-factory-advisor` | This advisor workflow | Daily | ✅ Meta |
| `plan` | Slash command for planning | Comment `/plan` | ✅ ChatOps |
| `secret-digger-claude/codex/copilot` | Scans for exposed secrets (3 engines) | Hourly each | ✅ Unique multi-engine |
| `security-guard` | PR security review | PR opened/sync | ✅ Critical |
| `security-review` | Deep daily security + threat model | Daily | ✅ Comprehensive |
| `smoke-claude/codex/copilot/services/chroot` | End-to-end agent smoke tests | 12h + PR | ✅ Excellent |
| `test-coverage-improver` | Adds tests for coverage gaps | Weekly | ✅ Quality |
| `update-release-notes` | Enriches release notes from diff | On release | ✅ Release |

🚀 Actionable Recommendations

P0 — Implement Immediately

P0.1: Firewall Escape Test Agent

What: A dedicated agent that periodically attempts to escape the firewall sandbox from within, testing bypass vectors.

Why: The security-review workflow explicitly references a "Firewall Escape Test Agent" as complementary to its threat model analysis — but it doesn't exist yet! This creates a gap in the security testing loop. As the reference implementation of the firewall technology, this repo should dogfood active escape testing.

How: Schedule a workflow that runs inside an AWF-sandboxed environment using --build-local, attempts known bypass vectors (DNS exfiltration, IPv6 leakage, port knocking, HTTPS_PROXY bypass, large POST bodies for data exfiltration), and reports results as a daily discussion.

Effort: Medium

---
description: Daily active firewall escape testing — runs bypass attempts and reports findings
on:
  schedule: daily
  workflow_dispatch:
safe-outputs:
  create-discussion:
    title-prefix: "[Firewall Escape Test] "
    category: "general"
---
# Firewall Escape Test Agent
Attempt to escape the AWF firewall using known bypass vectors...
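
The probe-and-report loop described under How could be sketched roughly as follows. This is an illustration under stated assumptions, not the repo's implementation: `ProbeResult`, `run_probes`, `render_report`, and the stub probes are hypothetical names, and real probes would make actual network attempts from inside the sandbox rather than return fixed values.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class ProbeResult:
    vector: str    # name of the bypass vector that was attempted
    escaped: bool  # True if traffic got out when it should have been blocked
    detail: str    # short note for the report


def run_probes(probes: Dict[str, Callable[[], bool]]) -> List[ProbeResult]:
    """Run each probe; a probe returns True if the escape attempt succeeded."""
    results = []
    for vector, probe in probes.items():
        try:
            escaped = probe()
            detail = "traffic escaped" if escaped else "blocked as expected"
        except OSError as exc:
            # A connection error is treated as the attempt having been stopped.
            escaped, detail = False, f"blocked ({type(exc).__name__})"
        results.append(ProbeResult(vector, escaped, detail))
    return results


def render_report(results: List[ProbeResult]) -> str:
    """Format results as a Markdown table for the daily discussion body."""
    lines = ["| Vector | Result | Detail |", "| --- | --- | --- |"]
    for r in results:
        status = "FAIL" if r.escaped else "PASS"
        lines.append(f"| {r.vector} | {status} | {r.detail} |")
    return "\n".join(lines)


def _refused() -> bool:
    raise ConnectionRefusedError("stub: connection refused")


# Hypothetical stub probes standing in for real DNS, IPv6, and proxy attempts.
stub_probes = {
    "dns-exfiltration": lambda: False,
    "ipv6-leak": _refused,
    "https-proxy-bypass": lambda: True,
}
results = run_probes(stub_probes)
report = render_report(results)
```

Keeping probes as plain callables means each bypass vector can be added or removed independently, and a single escaped probe is enough to mark the run as a finding.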

P0.2: Issue Triage Agent

What: Automatically labels new issues with bug, enhancement, documentation, question, security, performance when opened.

Why: Issues are currently routed to issue-monster (assignment) and issue-duplication-detector (dedup), but nothing labels them. Unlabeled issues are harder to triage, search, and prioritize. Labeling is the "hello world" of agentic workflows, and the Pelis Factory pattern shows that it delivers value as soon as it is enabled.

How: Trigger on issues: [opened, reopened] with read-only issue permissions + safe-outputs: add-labels.

Effort: Low
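
As a rough illustration of the labeling step, the sketch below restricts output to the six labels named above. It is a keyword baseline, not the agent itself: the `KEYWORDS` table, `suggest_labels`, and `filter_allowed` are hypothetical, and a real workflow would rely on the model's judgment plus the add-labels safe output.

```python
from typing import Dict, List

# The six labels named in the recommendation.
ALLOWED_LABELS = ["bug", "enhancement", "documentation",
                  "question", "security", "performance"]

# Hypothetical keyword hints used only for this baseline sketch.
KEYWORDS: Dict[str, List[str]] = {
    "bug": ["crash", "error", "fails", "regression"],
    "enhancement": ["feature", "support for", "add "],
    "documentation": ["docs", "readme", "typo"],
    "question": ["how do i", "?"],
    "security": ["cve", "vulnerability", "bypass"],
    "performance": ["slow", "latency", "memory"],
}


def suggest_labels(title: str, body: str) -> List[str]:
    """Return every allowed label whose keywords appear in the issue text."""
    text = f"{title} {body}".lower()
    return [label for label in ALLOWED_LABELS
            if any(keyword in text for keyword in KEYWORDS[label])]


def filter_allowed(proposed: List[str]) -> List[str]:
    """Drop any proposed label that is outside the allow-list."""
    normalized = [p.strip().lower() for p in proposed]
    return [label for label in ALLOWED_LABELS if label in normalized]


crash_labels = suggest_labels("Squid crash on startup",
                              "The proxy fails with an error.")
howto_labels = suggest_labels("How do I allow a domain?",
                              "The README is unclear about --allow-domains.")
safe_labels = filter_allowed(["bug", "wontfix", "Security"])
```

Filtering against a fixed allow-list keeps the label set stable even if the agent proposes something unexpected.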

P1 — Plan for Near-Term

P1.1: Workflow Health Monitor (Meta-Agent)

What: A weekly meta-agent that audits all agentic workflow runs, identifies failures, flakiness, high token costs, and stale/underperforming workflows.

Why: With 24 workflows running continuously, no one is watching the watchers. The Pelis Factory's "Audit Workflows" agent created 93 audit reports and opened 9 issues that led to PRs. Token optimization alone (catching overly verbose agents) has measurable cost impact.

How: Use agenticworkflows.logs + agenticworkflows.audit tools. Create a weekly discussion summarizing: failure rates per workflow, avg turn counts, token costs, any workflow that hasn't run in 7+ days. Create issues for workflows with >50% failure rate.

Effort: Low-Medium

---
on:
  schedule: weekly
tools:
  agentic-workflows:
  cache-memory: true
safe-outputs:
  create-discussion:
    title-prefix: "[Workflow Audit] "
  create-issue:
    title-prefix: "⚠️ Workflow Health: "
---
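
The two thresholds mentioned under How (more than 50% failures, no run in 7+ days) could be computed along these lines. This is a minimal sketch: the `Run` record and `summarize` function are hypothetical, and the sample runs are made-up data rather than output from the agentic-workflows tools.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Dict, List


@dataclass
class Run:
    workflow: str
    conclusion: str        # "success" or "failure"
    finished_at: datetime


def summarize(runs: List[Run], now: datetime) -> Dict[str, dict]:
    """Compute per-workflow failure rate and flag stale or failing workflows."""
    grouped: Dict[str, List[Run]] = {}
    for run in runs:
        grouped.setdefault(run.workflow, []).append(run)

    summary = {}
    for name, items in grouped.items():
        failures = sum(1 for r in items if r.conclusion == "failure")
        rate = failures / len(items)
        last_run = max(r.finished_at for r in items)
        summary[name] = {
            "failure_rate": rate,
            "stale": now - last_run >= timedelta(days=7),  # no run in 7+ days
            "needs_issue": rate > 0.5,                     # >50% failure rate
        }
    return summary


def day(d: int, month: int = 4) -> datetime:
    """Helper for building made-up sample timestamps in 2026 (UTC)."""
    return datetime(2026, month, d, tzinfo=timezone.utc)


now = day(2)
runs = [
    Run("smoke-claude", "failure", day(1)),
    Run("smoke-claude", "failure", day(31, 3)),
    Run("smoke-claude", "success", day(30, 3)),
    Run("test-coverage-improver", "success", day(20, 3)),
]
summary = summarize(runs, now)
```

One limitation of this sketch: a workflow with no runs at all in the input never appears in the summary, so a real monitor would also need the list of expected workflows.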

P1.2: Breaking Change Checker

What: Triggered on PRs and pushes to main, checks for backward-incompatible changes in the CLI interface, container API, or public action interface.

Why: cli-flag-consistency-checker runs weekly and checks documentation gaps, but nothing actively alerts when a change is backward-incompatible (e.g., a flag is removed, a Docker network changes from 172.30.0.0/24). The Pelis Factory's Breaking Change Checker creates alert issues that prevent user-facing regressions from reaching production.

How: Trigger on pull_request + analyze action.yml, src/cli.ts, src/docker-manager.ts for breaking changes. Create a comment on PRs that introduce incompatibilities.

Effort: Medium
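
One narrow slice of this check, detecting removed CLI flags, could look like the sketch below. It assumes flags appear as `--name` string literals in the CLI source; `extract_flags`, `removed_flags`, the sample source snippets, and the `--example-flag` name are hypothetical, and a real checker would also need to cover `action.yml` inputs and container settings.

```python
import re
from typing import List, Set

# Matches long-form flags such as --allow-domains.
FLAG_PATTERN = re.compile(r"--[a-z][a-z0-9-]*")


def extract_flags(source: str) -> Set[str]:
    """Collect every long-form flag mentioned in a source file."""
    return set(FLAG_PATTERN.findall(source))


def removed_flags(base_source: str, head_source: str) -> List[str]:
    """Flags present on the base branch but missing from the PR head."""
    return sorted(extract_flags(base_source) - extract_flags(head_source))


# Made-up snippets standing in for two versions of the CLI definition.
base = (
    ".option('--allow-domains <domains>', 'Allowed domains')\n"
    ".option('--build-local', 'Build images locally')\n"
)
head = (
    ".option('--allow-domains <domains>', 'Allowed domains')\n"
    ".option('--example-flag', 'Hypothetical new flag')\n"
)
breaking = removed_flags(base, head)
```

Added flags are ignored here on purpose: only removals (and, in a fuller version, renames or changed defaults) are backward-incompatible for existing users.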

P1.3: Docker Base Image Freshness Monitor

What: Weekly check of container base images (ubuntu/squid, ubuntu:22.04, node:*) for available updates or security patches.

Why: The firewall containers use pinned base images. Without a freshness agent, the containers can quietly accumulate CVEs. This is domain-critical for a security product.

How: Trigger weekly. Use bash to inspect Dockerfiles, compare current pinned digest vs. latest available. Create issues for stale images with CVE counts from Docker Hub or a registry API.

Effort: Medium
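
The digest comparison could be sketched as follows. This assumes images are pinned as `image:tag@sha256:...` in `FROM` lines; the `latest` mapping stands in for a registry lookup that is not shown, and the digests and the `ubuntu/squid:latest` tag are placeholders, not real values.

```python
import re
from typing import Dict, List, Optional, Tuple

FROM_PATTERN = re.compile(
    r"^\s*FROM\s+(?:--platform=\S+\s+)?"
    r"(?P<image>[^\s@]+)(?:@(?P<digest>sha256:[0-9a-f]+))?",
    re.IGNORECASE | re.MULTILINE,
)


def pinned_images(dockerfile: str) -> List[Tuple[str, Optional[str]]]:
    """Return (image, digest) pairs; digest is None when the image is unpinned."""
    return [(m.group("image"), m.group("digest"))
            for m in FROM_PATTERN.finditer(dockerfile)]


def stale_images(dockerfile: str, latest: Dict[str, str]) -> List[str]:
    """Images whose pinned digest differs from the latest known digest.

    Unpinned images are reported too, since they have no digest to compare.
    Images missing from `latest` (for example build-stage aliases) are skipped.
    """
    stale = []
    for image, digest in pinned_images(dockerfile):
        newest = latest.get(image)
        if newest is not None and digest != newest:
            stale.append(image)
    return stale


# Placeholder Dockerfile content and digests for illustration only.
dockerfile = (
    "FROM ubuntu:22.04@sha256:aaa111\n"
    "FROM ubuntu/squid:latest@sha256:bbb222 AS proxy\n"
    "FROM proxy\n"
)
latest = {
    "ubuntu:22.04": "sha256:aaa111",
    "ubuntu/squid:latest": "sha256:ccc333",
}
stale = stale_images(dockerfile, latest)
```

In a real workflow the `latest` mapping would be filled from Docker Hub or another registry API, and each stale image would become one issue.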

P1.4: PR Auto-Update (Mergefest-style)

What: Automatically merges main into open PRs to keep them current.

Why: Long-lived feature PRs in this repo (e.g., smoke-* changes that take multiple iterations) frequently need main merged in. Manual merges are a friction point. The Pelis Mergefest workflow is an orchestrator that keeps PRs current and reduces "stale PR" syndrome.

Effort: Low (well-understood pattern from Pelis Factory)

P1.5: Issue Arborist

What: Weekly agent that links related issues as sub-issues, grouping related bugs, features, and security findings.

Why: The issue tracker has distinct clusters (Docker/container issues, domain filtering issues, API proxy issues, documentation). Organizing these into parent/sub-issue hierarchies improves planning and tracking. The Pelis Factory Arborist created 77 discussion reports and 18 parent issues.

Effort: Low-Medium

P2 — Consider for Roadmap

P2.1: Changeset / Version Bump Agent

What: When a PR is merged to main, analyze the nature of changes (breaking/feature/fix) and propose a version bump + changelog entry.

Why: update-release-notes enriches notes after a release is published, but nothing automates the version determination or changelog generation before release. The Pelis Factory's Changeset workflow achieved 78% merge rate (22/28 PRs).

Effort: Medium
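
The version-determination step could be as small as the sketch below. It assumes the project follows Semantic Versioning and that each merged change has already been classified as breaking, feature, or fix; `highest_change`, `bump`, and the version number are hypothetical.

```python
from typing import Iterable

# Change kinds ordered from least to most severe.
SEVERITY = ["fix", "feature", "breaking"]


def highest_change(changes: Iterable[str]) -> str:
    """Pick the most severe change kind among the merged changes."""
    return max(changes, key=SEVERITY.index)


def bump(version: str, change: str) -> str:
    """Apply a Semantic Versioning bump for the given change kind."""
    major, minor, patch = (int(part) for part in version.split("."))
    if change == "breaking":
        return f"{major + 1}.0.0"
    if change == "feature":
        return f"{major}.{minor + 1}.0"
    if change == "fix":
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown change kind: {change}")


# Made-up example: two fixes and one feature since version 1.4.2.
proposed = bump("1.4.2", highest_change(["fix", "feature", "fix"]))
```

The agent's real work is the classification itself; once that is done, the bump is mechanical and easy to review in a proposed changelog PR.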

P2.2: CI Coach (Pipeline Optimization)

What: Monthly agent analyzing CI pipeline durations, identifying slow jobs, unnecessary dependencies, and optimization opportunities.

Why: The test suite has grown significantly (integration tests, smoke tests for 3 engines, chroot tests). A dedicated CI Coach — separate from CI Doctor (which investigates failures) — focuses on performance optimization. The Pelis Factory CI Coach achieved 100% merge rate (9/9 PRs).

Effort: Medium

P2.3: Interactive Q&A Agent (`/ask` command)

What: Slash command that answers questions about the AWF codebase, configuration, domain lists, and troubleshooting.

Why: The `/plan` command exists, but there is no quick "ask the repo" ChatOps capability. Given the complexity of AWF (Docker, iptables, Squid, chroot), an interactive Q&A agent would help contributors get answers quickly. Inspired by q.md and repo-ask.md from the agentics reference repo.

Effort: Low

P2.4: Documentation "Noob Tester"

What: Monthly agent that reads documentation from a "first-time user" perspective, identifies confusing or missing steps, and creates issues.

Why: doc-maintainer keeps docs synchronized with code, but doesn't test them from a user perspective. New users of the firewall frequently struggle with iptables permissions, Docker setup, and the --allow-domains syntax. The Pelis Factory Noob Tester produced 9 merged PRs (43% causal chain merge rate).

Effort: Low

P2.5: VEX Generator for Container Images

What: After each release, generate a Vulnerability Exploitability eXchange (VEX) document for the published container images, documenting which CVEs are not actually exploitable given AWF's security posture.

Why: As a security product, AWF should proactively document its CVE posture. Trivy/Grype scans on ubuntu:22.04 will always find CVEs; a VEX document explains which ones don't apply. This is directly inspired by the vex-generator.md in the agentics reference repo and demonstrates security maturity.

Effort: Medium-High

P3 — Future Ideas

P3.1: Weekly Repository Chronicle

What: Weekly narrative summary of what happened in the repo — PRs merged, issues closed, workflows run, notable discussions.

Why: Visibility into the combined human+agent work stream. Inspired by daily-repo-chronicle.md from agentics. Useful for async team communication.

P3.2: Contribution Onboarding Checker

What: Triggered on first-time contributor PRs, runs a checklist of contribution guidelines, offers guidance on common patterns.

Why: This is a complex security/infrastructure repo. New contributors often miss the requirement to run postprocess-smoke-workflows.ts after modifying smoke .md files, or forget to add workflows to ci-doctor.md's watched list.

P3.3: Portfolio Analyst (Token Cost Optimization)

What: Weekly analysis of token usage across all workflows to identify cost optimization opportunities.

Why: copilot-token-usage-analyzer already performs daily analysis; this agent would run a deeper analysis on a longer cadence, focused on workflow tuning. It may be redundant with the existing workflow, so evaluate the overlap first.

P3.4: Security Compliance Tracker with Deadlines

What: Tracks open security issues with SLA deadlines (Critical: 7 days, High: 30 days, Medium: 90 days), escalates approaching deadlines.

Why: dependency-security-monitor and security-review surface security issues but don't track SLA compliance. The Pelis Factory Security Compliance workflow managed deadline tracking for vulnerability remediation campaigns.
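
The deadline arithmetic could be sketched as follows, using the SLA windows listed above. The `deadline` and `sla_status` helpers and the 3-day warning window are assumptions for illustration, and the dates are made up.

```python
from datetime import date, timedelta

# SLA windows from the recommendation, in days.
SLA_DAYS = {"critical": 7, "high": 30, "medium": 90}


def deadline(opened: date, severity: str) -> date:
    """Date by which an issue of this severity should be resolved."""
    return opened + timedelta(days=SLA_DAYS[severity])


def sla_status(opened: date, severity: str, today: date,
               warn_within_days: int = 3) -> str:
    """Classify an open issue as overdue, approaching, or on-track."""
    remaining = (deadline(opened, severity) - today).days
    if remaining < 0:
        return "overdue"
    if remaining <= warn_within_days:  # assumed escalation window
        return "approaching"
    return "on-track"


today = date(2026, 4, 2)
critical_status = sla_status(date(2026, 3, 28), "critical", today)
high_status = sla_status(date(2026, 3, 1), "high", today)
medium_status = sla_status(date(2026, 3, 1), "medium", today)
```

Issues in the "approaching" or "overdue" state are the ones the tracker would escalate.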

📈 Maturity Assessment

| Dimension | Score | Notes |
| --- | --- | --- |
| Coverage | 5/5 | All major areas covered: security, CI, docs, release, issue mgmt |
| Multi-engine | 5/5 | Claude + Codex + Copilot all used, smoke tests for each |
| Security automation | 5/5 | Exceptional: 3 secret diggers, daily review, PR guard, dep monitor |
| Meta-observability | 2/5 | Missing: workflow health monitor, audit workflows, portfolio analyst |
| Issue organization | 3/5 | Dedup + assignment but no labeling, no arborist |
| PR hygiene | 2/5 | No auto-update, no breaking change checker |
| Release automation | 3/5 | Notes updated but no changeset/version bump |
| Domain-specific innovation | 4/5 | firewall-issue-dispatcher and smoke tests are excellent; escape testing missing |

Current Level: 4/5 — Advanced agentic workflow consumer, leading-edge security automation, multi-engine validation.

Target Level: 4.5/5 — Add meta-observability layer and PR hygiene workflows; implement the missing domain-specific escape testing.

Gap: ~5 workflows needed to reach target.

🔄 Comparison with Pelis Factory Best Practices

What this repo does exceptionally well:

- Multi-engine validation: Secret diggers running across Claude/Codex/Copilot simultaneously is a unique pattern not commonly seen elsewhere
- Domain-specific security: The security-review and security-guard workflows are tailored to the firewall's threat model with deep technical depth
- Cross-repo integration: firewall-issue-dispatcher bridging gh-aw → gh-aw-firewall is a sophisticated integration pattern
- skip-if-match: Properly used to prevent flooding when pending work exists
- cache-memory: Used in CI Doctor and smoke tests for knowledge persistence

What could be improved vs. Pelis Factory:

- No workflow health monitor: With 24 workflows, this is the most critical gap, because no one is watching the watchers
- No meta-observability: Pelis Factory's Audit Workflows (93 discussions!) and Portfolio Analyst provide insights that are missing here
- Issue labeling gap: Every workflow in Pelis Factory that creates issues labels them; this repo's triage loop has a labeling gap
- No escape test agent: The security-review workflow references this agent as if it exists, creating a phantom dependency

Unique opportunities from the security/firewall domain:

- Firewall escape testing is uniquely valuable dogfooding: running bypass attempts against your own product
- Container image CVE posture documentation (VEX) is more relevant here than for most repos
- The awf logs summary command enables firewall telemetry in workflow summaries, which no other workflow currently exploits

📝 Session notes saved to /tmp/gh-aw/cache-memory/pelis-advisor-notes.md for next run. Previous run: first execution.

Generated by Pelis Agent Factory Advisor


expires on Apr 9, 2026, 3:37 AM UTC
