[Pelis Agent Factory Advisor] Agentic Workflow Maturity Report & Recommendations (March 2026) #1478

2026-03-28T03:33:02Z

github-actions[bot]
bot Mar 28, 2026

📊 Executive Summary

This repository has a mature and security-focused agentic workflow collection — 22 agentic workflows covering security scanning, CI fault investigation, dependency monitoring, and multi-engine smoke testing. The main gaps are in workflow observability (no meta-monitoring agent), issue lifecycle management (no triage/labeling), and release automation (no changeset generator). Given this repo is a firewall for agents, it has a unique opportunity to demonstrate eating its own dog food at every layer.

🎓 Patterns Learned from Pelis Agent Factory

Key Patterns from the Documentation Site

The Pelis Agent Factory blog series (19 parts) revealed these high-value patterns:

Specialization > Monoliths: Separate, focused workflows outperform single large agents. Each workflow does one thing well.
Meta-Agents Are Essential: When you have 10+ workflows, a "Workflow Health Manager" that monitors all other workflows becomes critical — it's the agent that watches the agents.
Causal Chain Automation: Many workflows don't open PRs directly; instead they create issues that trigger the Issue Monster, which assigns them to Copilot. This staged approach prevents parallelism conflicts.
Audit Workflows Pattern: A meta-agent that reviews all recent workflow runs, identifies failures/anomalies, and creates improvement issues. Generated 93 discussions in gh-aw — the most prolific workflow.
CI Doctor at 69% merge rate: Investigation agents that diagnose failures and propose fixes have very high merge rates because they do the tedious diagnostic work humans hate.
Changeset Generator at 78% merge rate: Automating version bump PRs from commit analysis removes friction at release time.
Read-Only Analysts: Some workflows are deliberately read-only (create discussions), providing visibility without risk. These are safe to add first.
skip-if-match: Pattern to prevent duplicate concurrent runs when a similar PR/issue already exists.

From the Agentics Reference Repository

Daily Test Improver: Identifies coverage gaps and incrementally implements new tests.
Workflow Health Manager: Monitors all other agentic workflows and creates improvement issues.

How This Repo Compares

This repo already uses many Factory patterns well (secret diggers with 3 engines, security-guard on PRs, CI doctor). The main missing layers are observability (no workflow health monitor) and issue management (no triage agent).

📋 Current Agentic Workflow Inventory

| Workflow | Purpose | Trigger | Assessment |
| --- | --- | --- | --- |
| `build-test.md` | Multi-language CI build/test suite | PR, workflow_dispatch | ✅ Good: broad runtime coverage |
| `ci-cd-gaps-assessment.md` | Daily CI/CD pipeline gap analysis | Schedule daily | ✅ Good: produces actionable discussions |
| `ci-doctor.md` | Investigates CI failures, proposes fixes | workflow_run failure | ✅ Strong: monitors 20+ workflows, but the list is manually maintained |
| `cli-flag-consistency-checker.md` | Weekly CLI/docs flag discrepancy detection | Schedule weekly | ✅ Good |
| `dependency-security-monitor.md` | Daily vulnerability detection + patch PRs | Schedule daily | ✅ Comprehensive: covers CVEs and patch bundling |
| `doc-maintainer.md` | Daily documentation sync with code changes | Schedule daily, skip-if-match | ✅ Good: uses `skip-if-match` correctly |
| `firewall-issue-dispatcher.md` | Cross-repo: dispatches `awf`-labeled issues from gh-aw | Schedule every 6h | ✅ Unique cross-repo pattern |
| `issue-duplication-detector.md` | Detects duplicates on new issue | issues.opened | ✅ Uses cache-memory for persistence |
| `issue-monster.md` | Assigns issues to Copilot coding agent | issues.opened, schedule hourly | ✅ Good orchestrator pattern |
| `pelis-agent-factory-advisor.md` | This workflow | Schedule | ✅ Meta-level |
| `plan.md` | /plan slash command | slash_command | ✅ Good ChatOps pattern |
| `secret-digger-claude.md` | Hourly container secret scan (Claude) | Cron :05 | ✅ Excellent: red-teaming the product |
| `secret-digger-codex.md` | Hourly container secret scan (Codex) | Cron :10 | ✅ Multi-engine coverage |
| `secret-digger-copilot.md` | Hourly container secret scan (Copilot) | Cron | ✅ Multi-engine coverage |
| `security-guard.md` | PR review for security boundary changes | PR | ✅ Domain-specific security review |
| `security-review.md` | Daily comprehensive security + threat modeling | Schedule daily | ✅ Comprehensive with web-fetch and cache |
| `smoke-chroot.md` | Chroot integration smoke test | PR, schedule | ✅ Good: exercises actual AWF chroot |
| `smoke-claude.md` | Claude engine smoke test | PR, schedule | ✅ End-to-end with real agent |
| `smoke-codex.md` | Codex engine smoke test | PR, schedule | ✅ Multi-engine coverage |
| `smoke-copilot.md` | Copilot engine smoke test | PR, schedule | ✅ Multi-engine coverage |
| `test-coverage-improver.md` | Weekly test coverage gap analysis + PRs | Schedule weekly | ✅ Focuses on security-critical paths |
| `update-release-notes.md` | Enhances release notes on publish | release.published | ✅ Good: triggered by actual release events |

🚀 Actionable Recommendations

P0 — Implement Immediately

P0.1: Issue Triage Agent

What: Auto-label new issues with bug, feature, enhancement, documentation, question, help-wanted, security when they have no labels. Leave a brief comment explaining the label and suggesting next steps.

Why: This is the "hello world" of agentic workflows per the Factory — immediately useful, very low risk. This repo has 20+ open issues with no labels. It reduces maintainer triage burden on every new issue.

How: Simple issues.opened trigger, read-only analysis, add-labels + add-comment safe outputs. Configure min-integrity: none for a public repo so all contributors' issues are visible.

Effort: Low (1 hour)

Example:

---
on:
  issues:
    types: [opened, reopened]
permissions:
  issues: read
tools:
  github:
    toolsets: [issues, labels]
    min-integrity: none
safe-outputs:
  add-labels:
    allowed: [bug, feature, enhancement, documentation, question, security, help-wanted]
  add-comment:
    max: 1
timeout-minutes: 5
---
# Issue Triage Agent
Analyze issue #${{ github.event.issue.number }} in ${{ github.repository }}.
Apply one label based on content. Comment explaining the label and how the issue may be addressed.

P0.2: Workflow Health Manager

What: A daily meta-agent that monitors the health of all other agentic workflows in this repository. It scans recent runs, detects patterns like repeated failures, stale workflows, disabled workflows, and high error rates, then creates improvement issues.

Why: With 22+ agentic workflows (and growing), there's no visibility into which ones are failing consistently, which are stale, which are costing the most tokens, or which are producing low-quality outputs. The Factory's "Audit Workflows" agent generated 93 discussions and 9 issues — it became the most valuable observability tool. Given that this repo is a firewall product, having a meta-layer monitoring the health of the agent ecosystem is both practical and on-brand.

How: Daily schedule, reads actions logs, uses agentic-workflows tool, cache-memory for trend detection across runs, creates issues for persistent failures.

Effort: Medium (4–6 hours)
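The trend detection described for the Workflow Health Manager could look roughly like the sketch below. It assumes run records have already been fetched and reduced to a workflow name plus a conclusion string; the `Run` type, the `summarize_health` helper, and the three-failure threshold are hypothetical choices for illustration, not part of the report or of any existing tool.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Run:
    workflow: str    # workflow name, e.g. "ci-doctor"
    conclusion: str  # e.g. "success" or "failure"


def summarize_health(runs, min_streak=3):
    """Summarize failure rate and the current failing streak per workflow.

    `runs` must be ordered oldest to newest. `min_streak` is an assumed
    threshold for flagging a workflow as persistently failing.
    """
    by_workflow = defaultdict(list)
    for run in runs:
        by_workflow[run.workflow].append(run.conclusion)

    report = {}
    for name, conclusions in by_workflow.items():
        failures = sum(1 for c in conclusions if c == "failure")
        # Count consecutive failures at the end of the history (most recent runs).
        streak = 0
        for c in reversed(conclusions):
            if c != "failure":
                break
            streak += 1
        report[name] = {
            "failure_rate": failures / len(conclusions),
            "failing_streak": streak,
            "flag": streak >= min_streak,
        }
    return report
```

Flagging on the trailing streak rather than the overall rate keeps a workflow that failed in the past but has since recovered from generating a new issue.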

P1 — Plan for Near-Term

P1.1: Breaking Change Checker

What: On every PR, detect changes to the CLI flags, public API, Docker container interfaces, environment variables, or Squid configuration schema that could break existing users. Create an alert issue or comment flagging the breaking change with migration guidance.

Why: AWF has real production users. Breaking changes to --allow-domains syntax, the Docker Compose structure, the iptables init container interface, or exported environment variables (HTTP_PROXY, SQUID_PROXY_HOST, etc.) require communication. The Factory's Breaking Change Checker creates alert issues. This is especially important for a security tool where unexpected behavior changes could silently break user pipelines.

How: pull_request trigger, analyze diff against src/cli.ts (flags), src/docker-manager.ts (compose API, env vars), src/squid-config.ts (config schema), and docs/. Create an issue with [breaking-change] label if detected.

Effort: Medium
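As a rough illustration of the first filtering step for the Breaking Change Checker, the sketch below maps changed file paths to the interface areas named in the recommendation. The paths come from the text above; the area labels and the `flag_sensitive_changes` helper are hypothetical, and a real agent would still need to inspect the diff content to decide whether a change is actually breaking.

```python
# Paths are taken from the recommendation; the area labels are illustrative.
WATCHED_PATHS = {
    "src/cli.ts": "CLI flags",
    "src/docker-manager.ts": "Compose structure / environment variables",
    "src/squid-config.ts": "Squid configuration schema",
    "docs/": "documentation",
}


def flag_sensitive_changes(changed_files):
    """Group changed files by the watched interface area they touch."""
    flagged = {}
    for path in changed_files:
        for prefix, area in WATCHED_PATHS.items():
            # Entries ending in "/" match a directory; others match one file.
            is_dir_match = prefix.endswith("/") and path.startswith(prefix)
            if path == prefix or is_dir_match:
                flagged.setdefault(area, []).append(path)
    return flagged
```

An empty result means the PR touches none of the watched areas, so the more expensive diff analysis can be skipped.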

P1.2: Container Image Security Scanner

What: Daily workflow that scans the three published Docker images (squid, agent, api-proxy) for known vulnerabilities using Trivy or Grype via bash. Creates issues for HIGH/CRITICAL CVEs with remediation guidance.

Why: AWF publishes container images to GHCR. These images run with elevated privileges (NET_ADMIN, SYS_CHROOT) in user environments. A vulnerability in the Squid or Ubuntu base image is high-severity. The dependency-security-monitor.md covers npm dependencies but not Docker image layers. This is a gap unique to this repo's published artifact type.

How: Daily schedule, run trivy image ghcr.io/github/gh-aw-firewall/agent:latest via bash, parse JSON output, create issues for findings above severity threshold.

Effort: Medium
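The "parse JSON output" step might be sketched as follows, assuming the scan is run with `trivy image --format json` and that the report follows Trivy's usual layout of a top-level `Results` list whose entries carry an optional `Vulnerabilities` list. The `high_severity_findings` helper and the sample report are made up for illustration; the CVE identifiers are placeholders, not real findings.

```python
import json

SEVERITIES = {"HIGH", "CRITICAL"}  # threshold named in the recommendation


def high_severity_findings(trivy_json):
    """Extract HIGH/CRITICAL vulnerabilities from a Trivy JSON report string."""
    report = json.loads(trivy_json)
    findings = []
    # "Results" and "Vulnerabilities" can be missing or null when nothing is found.
    for result in report.get("Results") or []:
        for vuln in result.get("Vulnerabilities") or []:
            if vuln.get("Severity") in SEVERITIES:
                findings.append({
                    "target": result.get("Target"),
                    "id": vuln.get("VulnerabilityID"),
                    "package": vuln.get("PkgName"),
                    "severity": vuln.get("Severity"),
                    "fixed_version": vuln.get("FixedVersion"),
                })
    return findings


# Placeholder report for illustration only; not real scan output.
SAMPLE = json.dumps({
    "Results": [
        {"Target": "agent (ubuntu)", "Vulnerabilities": [
            {"VulnerabilityID": "CVE-0000-0001", "PkgName": "libexample",
             "Severity": "HIGH", "FixedVersion": "1.2.3"},
            {"VulnerabilityID": "CVE-0000-0002", "PkgName": "libother",
             "Severity": "LOW"},
        ]},
        {"Target": "node_modules"},
    ]
})

for finding in high_severity_findings(SAMPLE):
    print(finding["id"], finding["package"], finding["severity"])
```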

P1.3: Changeset / Version Bump Automation

What: Weekly workflow that analyzes unreleased commits since the last tag, determines the appropriate semver bump (major/minor/patch) based on conventional commit types, and proposes a PR that updates package.json version and CHANGELOG.md.

Why: The repo uses conventional commits (enforced by commitlint). The Factory's Changeset workflow achieved 78% PR merge rate. The existing update-release-notes.md only runs after a release is published — it doesn't drive the release process. A Changeset agent would close the loop by proposing releases proactively.

How: Weekly schedule with skip-if-match to avoid duplicate PRs. Uses git log + git tag bash commands to find unreleased commits. Determines bump type from commit prefixes (feat: → minor, fix: → patch, BREAKING CHANGE: → major).

Effort: Medium
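The prefix-to-bump mapping above can be sketched as a small function over commit messages. This is a minimal sketch, not the Factory's implementation: the `bump_type` name is hypothetical, and treating a `!` after the type as breaking follows the Conventional Commits convention rather than anything stated in this report.

```python
import re

# Matches a header such as "feat: ...", "fix(cli): ..." or "feat!: ...".
HEADER = re.compile(r"^(?P<type>[a-z]+)(\([^)]*\))?(?P<bang>!)?: ")
ORDER = {"patch": 1, "minor": 2, "major": 3}


def bump_type(messages):
    """Return "major", "minor", "patch", or None for a list of commit messages."""
    bump = None
    for msg in messages:
        match = HEADER.match(msg)
        level = None
        if "BREAKING CHANGE:" in msg or (match and match.group("bang")):
            level = "major"
        elif match and match.group("type") == "feat":
            level = "minor"
        elif match and match.group("type") == "fix":
            level = "patch"
        # Keep the highest bump seen across all unreleased commits.
        if level and (bump is None or ORDER[level] > ORDER[bump]):
            bump = level
    return bump
```

Returning `None` when no commit warrants a release gives the workflow a natural point to exit without opening a PR.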

P1.4: Static Analysis Report

What: Daily workflow that runs zizmor, poutine, and actionlint on the compiled .lock.yml workflow files and posts findings as a discussion, creating issues for HIGH/CRITICAL findings.

Why: This repo has ~25 compiled lock files, many with complex permissions and network configurations. The Factory's Static Analysis Report generated 57 discussions + 12 Zizmor security reports. Given AWF is a security product, ensuring its own workflows are secure (no script injection, no unpinned actions) is essential. zizmor catches expression injection issues that are especially relevant for workflows that process issue/PR content.

How: Daily schedule, bash commands for zizmor, poutine, actionlint, structured JSON output parsed into a discussion. Issue creation for critical findings.

Effort: Low-Medium (tools already available)

P2 — Consider for Roadmap

P2.1: Issue Arborist

What: Periodic workflow that finds related open issues and links them as sub-issues, building a dependency tree. For AWF, this would group issues by component (squid, agent, api-proxy, CLI) or by feature area.

Why: The repo has 20+ open issues that are likely related (e.g., proxy compatibility issues, chroot path issues). The Factory's Arborist created 18 parent issues and 77 discussion reports. Organizing issues reduces duplication and helps prioritize.

Effort: Medium

P2.2: Audit Workflows Meta-Agent

What: Weekly meta-agent that reviews all agentic workflow run logs, identifies success/failure patterns, costly runs, and quality degradation, then produces an analytics discussion.

Why: The Factory's Audit Workflows agent was its most prolific (93 discussions, 9 issues). With 22 workflows and growing, having quantified metrics on merge rates, failure rates, and cost-per-workflow enables informed optimization decisions.

Effort: Medium-High

P2.3: Documentation Noob Tester

What: Weekly workflow that reads the repository documentation (README, docs/) from a "first-time user" perspective and identifies confusing steps, missing prerequisites, outdated examples, and broken commands. Creates issues for findings.

Why: AWF has a complex setup (Docker, iptables, sudo, chroot). New users face multiple stumbling blocks. The Factory's Docs Noob Tester created 9 merged PRs. This is particularly valuable for a security tool where misconfiguration could leave users with a false sense of security.

Effort: Low

P2.4: Enhanced CI Doctor — Auto-Discovery of Monitored Workflows

What: Modify ci-doctor.md to automatically discover workflow names from the repository instead of requiring manual maintenance of the workflows: list.

Why: The current ci-doctor.md has a hardcoded list of ~21 workflow names. When new workflows are added, this list must be updated manually (and can fall out of sync). With 22+ workflows, this is already a maintenance burden.

How: Add a pre-step that runs gh workflow list to dynamically populate the monitored workflows list, or generate the list from the workflow YAML files. This could be a simple bash pre-step in the lock file.

Effort: Low
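One way to generate the list from the workflow files is sketched below. It assumes the compiled workflows are `*.lock.yml` files in a single directory and that each declares a top-level `name:` key, as standard GitHub Actions workflows do; the `discover_workflow_names` helper is hypothetical, and it uses a simple line match rather than a full YAML parser.

```python
import re
from pathlib import Path

# Top-level "name:" only: the pattern is anchored at column 0, so indented
# job or step names are not matched.
NAME_LINE = re.compile(r"^name:\s*(.+?)\s*$", re.MULTILINE)


def discover_workflow_names(workflows_dir):
    """Return the declared names of all *.lock.yml workflows, sorted by file name."""
    names = []
    for path in sorted(Path(workflows_dir).glob("*.lock.yml")):
        match = NAME_LINE.search(path.read_text(encoding="utf-8"))
        if match:
            # Drop surrounding quotes if the name is a quoted YAML string.
            names.append(match.group(1).strip("\"'"))
    return names
```

Calling `discover_workflow_names(".github/workflows")` would then yield the names to monitor, assuming that is where the lock files live.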

P3 — Future Ideas

P3.1: AWF Ecosystem Health Report

What: Monthly report on the health of the AWF-powered agent ecosystem across the organization — aggregating firewall statistics, blocked domain patterns, and agent success rates from the awf logs summary command.

Why: AWF collects rich telemetry. A periodic report on "what domains are agents trying to reach?" and "which agent types are most productive?" would inform product decisions.

Effort: High (requires opt-in from users)

P3.2: Integration Test Flakiness Tracker

What: Agent that tracks flaky integration tests over time using cache-memory, identifies patterns in which tests fail intermittently, and creates issues for persistent flakiness.

Why: The repo has Docker-based integration tests that are susceptible to race conditions (network timing, container startup). The open issue #1354 (proxy env var leakage in tests) suggests flakiness is a real concern.

Effort: Medium
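A simple way to separate intermittent failures from consistently broken tests is to count how often a test's outcome flips between consecutive runs. The sketch below assumes the per-test pass/fail history has already been loaded from cache-memory into a dictionary; the `flaky_tests` helper and both thresholds are hypothetical.

```python
def flaky_tests(history, min_runs=5, min_flips=2):
    """Return {test_name: flip_rate} for tests whose outcome keeps changing.

    `history` maps a test name to a list of booleans (True = passed),
    ordered oldest to newest.
    """
    flaky = {}
    for name, outcomes in history.items():
        if len(outcomes) < min_runs:
            continue  # not enough data to judge
        # A flip is a pass followed by a failure, or the reverse.
        flips = sum(1 for a, b in zip(outcomes, outcomes[1:]) if a != b)
        # A single flip usually means a regression or a fix, not flakiness.
        if flips >= min_flips:
            flaky[name] = flips / (len(outcomes) - 1)
    return flaky
```

A test that passed twice and then failed four times in a row has one flip and is left out, since that pattern points to a regression rather than flakiness.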

P3.3: Proxy Compatibility Monitor

What: Monthly agent that checks open issues and discussions for proxy compatibility complaints (Zig, Maven, custom tools) and creates consolidated tracking issues or documentation PRs with workarounds.

Why: Open issues #1429 (Zig), historical Maven issues, and others suggest this is a recurring pain point. Proactive monitoring and documentation would reduce support burden.

Effort: Low

📈 Maturity Assessment

| Dimension | Current (1-5) | Notes |
| --- | --- | --- |
| Security Automation | 5 | Exceptional: secret diggers (3 engines), security guard, daily review, dependency monitoring |
| CI/CD Fault Investigation | 4 | CI doctor is excellent; missing breaking change checker |
| Documentation Maintenance | 3 | doc-maintainer present but no noob tester, no docs unbloat |
| Issue Management | 3 | Duplication detector + monster good; missing triage/labeling |
| Release Automation | 3 | update-release-notes exists; missing changeset generator |
| Workflow Observability | 2 | No meta-monitoring, no audit-workflows, no metrics collector |
| Code Quality | 2 | test-coverage-improver present; no code simplifier, no schema checker |
| Overall | 3.5/5 | Advanced but missing observability and issue lifecycle management |
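The report does not say how the overall figure is derived. As a quick check, the snippet below computes the unweighted mean of the seven dimension scores, which comes to about 3.14; the stated 3.5 therefore presumably reflects some weighting or judgment rather than a plain average.

```python
# Dimension scores copied from the maturity table.
scores = {
    "Security Automation": 5,
    "CI/CD Fault Investigation": 4,
    "Documentation Maintenance": 3,
    "Issue Management": 3,
    "Release Automation": 3,
    "Workflow Observability": 2,
    "Code Quality": 2,
}

mean = sum(scores.values()) / len(scores)  # 22 / 7
print(round(mean, 2))  # prints 3.14
```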

Current Level: 4 — Advanced (well beyond typical repos; strong security focus)

Target Level: 5 — Expert (full observability, complete lifecycle automation, meta-layer monitoring)

Gap: The primary gap is the observability layer — there's no meta-agent watching the watchers. With 22 workflows, this becomes increasingly critical. The secondary gap is issue lifecycle (no auto-triage) which is the simplest possible first step.

🔄 Comparison with Best Practices

What This Repo Does Well ✅

Security-first automation: Three secret-digger engines running hourly is beyond what any Factory workflow did — perfect for a security product.
Multi-engine smoke testing: Smoke tests for Claude, Codex, Copilot, and chroot — comprehensive coverage.
Cross-repo dispatching: firewall-issue-dispatcher.md is a sophisticated pattern not seen in the generic Factory examples.
Domain specialization: security-guard.md is deeply tailored to AWF's specific threat model (iptables, Squid config, container boundaries) rather than generic code review.
skip-if-match: Correctly used in doc-maintainer.md and test-coverage-improver.md to prevent duplicate PRs.
Cache memory: issue-duplication-detector.md uses cache-memory for persistence — a Factory best practice.

What Could Be Improved 🔧

No observability layer: No workflow monitors the health of other workflows. With 22 agents, this is a critical gap.
Manual workflow list in CI Doctor: ci-doctor.md has a hardcoded list of monitored workflows that requires manual updates.
No issue triage: New issues aren't auto-labeled, creating manual overhead for maintainers.
Release process gap: update-release-notes.md runs after release; no agent proposes the version bump.
Container image security: npm dependencies are monitored; Docker image CVEs are not.

Unique Opportunities Given the Domain 🎯

AWF as its own guinea pig: The workflows can run inside AWF with specific domain whitelists, demonstrating the product's own security model. The secret diggers already do this beautifully.
Firewall-specific observability: An agent that analyzes awf logs summary data from all CI runs and reports on domain access patterns would be uniquely valuable.
Proxy compatibility oracle: An agent that proactively tests whether new package managers / build tools work through the Squid proxy would reduce user-reported issues.

📝 Notes for Future Runs

Stored in /tmp/gh-aw/cache-memory/advisor-notes.md

22 agentic workflows as of March 2026
Existing open issues include CI/CD failures for Secret Digger (Codex/Copilot) and Issue Duplication Detector — these may indicate workflow reliability issues worth investigating via the proposed Workflow Health Manager
Issues #1427 (agent container lacks TTY, breaking color detection and interactive output) and #1426 (Squid proxy intercepts localhost/loopback traffic, breaking applications that bind to 127.0.0.1) are active technical debt items that could benefit from automated regression tracking
The ci-doctor.md monitored workflow list had ~21 entries; adding new workflows requires manual update of this list

AI generated by Pelis Agent Factory Advisor

expires on Apr 4, 2026, 3:33 AM UTC

2026-04-04T05:09:13Z

github-actions[bot]
bot Apr 4, 2026
Author

This discussion was automatically closed because it expired on 2026-04-04T03:33:02.293Z.

Closed by Workflow
