[Pelis Agent Factory Advisor] Pelis Agent Factory Advisor: Agentic Workflow Maturity Report (2026-03-27) #1473

2026-03-27T03:39:31Z

github-actions[bot]
bot Mar 27, 2026

📊 Executive Summary

gh-aw-firewall is running 22 compiled agentic workflows — an exceptionally mature collection that rivals Peli's Agent Factory in several key categories. Security automation is a particular strength, with multi-engine red-teaming, daily security reviews, and dependency monitoring already in place. The primary gaps are meta-level observability (no workflow-monitoring-the-workflows agent), code quality automation (no continuous simplifier), and a few domain-specific opportunities unique to a security/firewall tool.

🎓 Patterns Learned from Pelis Agent Factory

From the Documentation Site

The key patterns in the factory, in rough order of leverage:

Category	Key Pattern	Value Delivered
Fault Investigation	CI Doctor — investigates failures and proposes PRs	9/13 proposed PRs merged (69%)
Code Quality	Code Simplifier — daily automated cleanup PRs	5/6 PRs merged (83%)
Code Quality	Duplicate Code Detector — semantic analysis	76/96 PRs merged (79%)
Testing	CLI Consistency Checker	80/102 PRs merged (78%)
Meta-observability	Audit Workflows — meta-agent monitoring all agents	93 discussions, 9 issues
Meta-observability	Workflow Health Manager	40 issues, 34 PRs merged
Operations	Changeset Generator — auto versioning	22/28 PRs merged (78%)
Security	Daily Malicious Code Scan	Proactive supply chain defense
Issue Management	Issue Arborist — links related issues	18 parent issues created
Analytics	Portfolio Analyst — cost optimization	Identified overly chatty agents

From the Agentics Reference Repository

daily-test-improver pattern: incremental test coverage additions via daily PRs
Shared workflow import pattern (this repo already uses this well via shared/)
cache-memory for persistent cross-run state (already in use here)

Comparison to Current Implementation

This repo already matches or exceeds Pelis patterns in security automation. The gap is primarily in meta-observability and code quality automation.

📋 Current Agentic Workflow Inventory

Workflow	Purpose	Trigger	Assessment
`build-test`	Build & test suite validation	PR open/sync	✅ Well-configured, multi-language
`ci-doctor`	CI failure investigator	`workflow_run` failure	✅ Good, but hardcoded workflow list is brittle
`ci-cd-gaps-assessment`	Daily CI/CD gaps analysis	Daily	✅ Good meta-analysis
`cli-flag-consistency-checker`	CLI docs vs code sync	Weekly	✅ Solid
`dependency-security-monitor`	CVE detection & dep updates	Daily	✅ Comprehensive
`doc-maintainer`	Docs sync with code	Daily	✅ Has skip-if-match guard
`firewall-issue-dispatcher`	Cross-repo issue sync	Every 6h	✅ Unique cross-repo pattern
`issue-duplication-detector`	Duplicate issue detection	Issue opened	✅ Uses cache-memory well
`issue-monster`	Assigns issues to Copilot agent	Hourly + issue opened	✅ Task dispatcher pipeline
`plan`	/plan slash command	Slash command	✅ Interactive ChatOps pattern
`secret-digger-claude/codex/copilot`	Red team escape testing	Hourly (3 engines)	✅ Standout — multi-engine, unique to this repo
`security-guard`	PR security review	PR open/sync	✅ Claude engine, focused
`security-review`	Comprehensive security analysis	Daily	✅ Evidence-based, thorough
`smoke-claude/codex/copilot/chroot`	End-to-end smoke tests	PR + Every 12h	✅ Standout — multi-engine smoke testing
`test-coverage-improver`	Security-focused test coverage	Weekly	✅ Appropriate scope
`update-release-notes`	Release notes enrichment	On release	✅
`pelis-agent-factory-advisor`	This advisor	Daily	✅ Meta-awareness

🚀 Actionable Recommendations

P0 — Implement Immediately

P0.1: Firewall Escape Summary Report

What: A daily aggregation workflow that reads the last 24h of secret-digger-claude, secret-digger-codex, and secret-digger-copilot runs and produces a single consolidated "Escape Attempt Report" discussion, tracking which escape vectors were tried, which succeeded, and trends over time.

Why: Three red-team agents run hourly across three engines, which works out to as many as 72 runs per day. There is currently no aggregation layer: each run produces individual findings, but nothing consolidates "what was discovered this week" or "which engine found things the others missed." Much of that high-value signal is being lost.

How: New workflow firewall-escape-report.md triggered daily, using the agentic-workflows tool to fetch logs from the three secret-digger-* workflows, then producing a structured markdown discussion with findings grouped by severity, engine, and attack vector.

Effort: Low (uses existing agentic-workflows tool pattern already seen in security-review.md)

---
description: Daily aggregated report of red-team firewall escape attempts across all three engines
on:
  schedule: daily
  workflow_dispatch:
tools:
  agentic-workflows:
  cache-memory: true
safe-outputs:
  create-discussion:
    title-prefix: "[Escape Report] "
    category: "general"
timeout-minutes: 15
---
# Firewall Escape Summary Report
Analyze the last 24h of secret-digger-claude, secret-digger-codex, and secret-digger-copilot runs...
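The grouping step described under "How" can be pictured with the sketch below. It is only an illustration: the `Finding` record and its fields (`engine`, `vector`, `severity`, `succeeded`) are hypothetical, since the real shape depends on what the secret-digger logs actually contain.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    """Hypothetical record for one escape attempt parsed from a run log."""
    engine: str      # "claude", "codex", or "copilot"
    vector: str      # free-form attack vector label (assumed)
    severity: str    # "low", "medium", or "high" (assumed scale)
    succeeded: bool


def summarize(findings):
    """Group attempts by vector and record which engines tried or succeeded."""
    summary = defaultdict(lambda: {"tried": set(), "succeeded": set()})
    for finding in findings:
        summary[finding.vector]["tried"].add(finding.engine)
        if finding.succeeded:
            summary[finding.vector]["succeeded"].add(finding.engine)
    return dict(summary)


def unique_to_engine(summary):
    """Vectors attempted by exactly one engine, i.e. what the others missed."""
    return {
        vector: next(iter(info["tried"]))
        for vector, info in summary.items()
        if len(info["tried"]) == 1
    }
```

In practice the agent itself would do the log parsing; a structure like this mainly helps pin down what the consolidated discussion should report.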

P0.2: Issue Triage Agent

What: Automatically apply appropriate labels (bug, feature, security, documentation, question, help-wanted, good-first-issue) to new issues and post a brief comment explaining the label and next steps.

Why: The issue-monster assigns issues to Copilot but doesn't triage them. issue-duplication-detector detects duplicates but doesn't label. New issues currently arrive unlabeled unless a human manually triages them. This creates friction for contributors and makes label-based filtering unreliable.

How: New issue-triage.md triggering on issues: [opened, reopened], using the GitHub issues toolset to analyze content, apply labels, and leave a brief orienting comment. Because this is a security/firewall tool, the prompt should pay special attention to detecting security-relevant issues.

Effort: Low (standard pattern from the Pelis factory with a direct gh aw add-wizard template)

---
on:
  issues:
    types: [opened, reopened]
permissions:
  issues: read
tools:
  github:
    toolsets: [issues, labels]
    min-integrity: none
safe-outputs:
  add-labels:
    allowed: [bug, feature, enhancement, documentation, question, help-wanted, good-first-issue, security, performance]
  add-comment: {}
timeout-minutes: 5
---
# Issue Triage Agent
For the newly opened issue, analyze its content and apply exactly one label...

P1 — Plan for Near-Term

P1.1: Audit Workflows Meta-Agent

What: A meta-agent that runs daily and audits the runs of all 22 agentic workflows — checking success rates, error patterns, cost/token trends, and identifying workflows that are failing silently or producing low-quality outputs.

Why: With 22 workflows running continuously, the team needs observability into the agent ecosystem itself. The Pelis factory's Audit Workflows meta-agent created 93 discussions and 9 actionable issues — it became the "central nervous system" of their factory. This repo is at the stage where meta-observability becomes critical.

How: New audit-workflows.md triggering daily, using the agentic-workflows tool (which already exists in security-review.md and ci-cd-gaps-assessment.md) to download logs, analyze run patterns, detect failures, and post a discussion.

Effort: Medium (requires careful prompt engineering to stay within token limits with 22 workflows)
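As a rough sketch of the "checking success rates" part, the function below computes a per-workflow success rate from a list of run records. It assumes each record carries `name` and `conclusion` keys, as workflow-run objects from the GitHub REST API do; the 0.8 threshold is an arbitrary illustration, not a value from the Pelis factory.

```python
from collections import Counter


def success_rates(runs):
    """Return {workflow_name: fraction of completed runs that succeeded}.

    Runs that have not finished yet (conclusion is None) are skipped.
    """
    total = Counter()
    succeeded = Counter()
    for run in runs:
        conclusion = run.get("conclusion")
        if conclusion is None:
            continue
        total[run["name"]] += 1
        if conclusion == "success":
            succeeded[run["name"]] += 1
    return {name: succeeded[name] / total[name] for name in total}


def flag_unhealthy(rates, threshold=0.8):
    """Workflows whose success rate falls below `threshold` (assumed default)."""
    return sorted(name for name, rate in rates.items() if rate < threshold)
```

Pre-computing a compact table like this before handing data to the agent is one possible way to stay within token limits, since the prompt then only needs the flagged workflows rather than raw logs for all 22.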

P1.2: Breaking Change Checker

What: A workflow triggered on PRs that identifies changes which could break the public API/CLI interface — new required flags, removed options, changed behavior, Docker image changes, or container API changes.

Why: AWF has users (GitHub Actions workflows) that depend on its CLI interface and container images. An unannounced breaking change in a security tool is dangerous. The current security-guard focuses on security posture, not backward compatibility. The Pelis factory's Breaking Change Checker had a 100% actionable-issue rate.

How: New breaking-change-checker.md on PR open/sync, analyzing diffs in src/cli.ts, src/types.ts, and container entrypoint.sh for backward-incompatible changes. Creates an alert issue when detected.

Effort: Medium
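One narrow slice of this check, detecting removed CLI options, can be approximated without an agent at all. The sketch below compares the long flags mentioned in the old and new versions of a source file. It is a heuristic under the assumption that flags appear literally as `--some-flag` in the source; it would not catch changed defaults or new required flags.

```python
import re

# Long options such as --dry-run; the lookbehind skips runs like "---".
FLAG_PATTERN = re.compile(r"(?<![\w-])--[a-z][a-z0-9-]*")


def extract_flags(text):
    """Collect every long option mentioned in `text`."""
    return set(FLAG_PATTERN.findall(text))


def removed_flags(old_source, new_source):
    """Flags present before the change but absent after it."""
    return sorted(extract_flags(old_source) - extract_flags(new_source))
```

A non-empty result is a strong hint that the PR removes or renames a public option, which the agent could then confirm against the diff before raising an alert issue.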

P1.3: Container Image CVE Scanner

What: A daily workflow that scans the three AWF container images (squid, agent, api-proxy) for known CVEs using tools like Trivy or Grype, creates issues for HIGH/CRITICAL findings, and proposes base image updates.

Why: The containers are the core security surface of AWF. Base images (ubuntu/squid:latest, ubuntu:22.04) accumulate CVEs over time. The dependency-security-monitor covers npm dependencies but not container image vulnerabilities. This is a critical blind spot for a security-critical tool.

How: Traditional GitHub Actions workflow (.yml) or new agentic workflow triggering daily, using Trivy/Grype CLI in bash, with safe-outputs: create-issue for HIGH/CRITICAL findings.

Effort: Medium (requires container tooling setup but pattern is well-established)
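If Trivy is the scanner chosen, the "create issues for HIGH/CRITICAL findings" step could start from a filter like the one below. It assumes the report was produced with Trivy's JSON output format and uses the field names from that format (`Results`, `Vulnerabilities`, `Severity`, `VulnerabilityID`, `PkgName`, `FixedVersion`); these should be verified against the Trivy version actually in use.

```python
import json

BLOCKING = {"HIGH", "CRITICAL"}


def blocking_findings(report_json):
    """Extract HIGH/CRITICAL entries from a Trivy JSON report string."""
    report = json.loads(report_json)
    findings = []
    # Targets without vulnerabilities may omit the key or set it to null.
    for result in report.get("Results") or []:
        for vuln in result.get("Vulnerabilities") or []:
            if vuln.get("Severity") in BLOCKING:
                findings.append({
                    "target": result.get("Target"),
                    "id": vuln.get("VulnerabilityID"),
                    "package": vuln.get("PkgName"),
                    "fixed_in": vuln.get("FixedVersion"),
                })
    return findings
```

An empty list means no issue needs to be opened for that image. The `fixed_in` field, when present, gives the agent a starting point for proposing a base image update.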

P1.4: Fix `ci-doctor` Hardcoded Workflow List

What: The ci-doctor.md has a hardcoded list of ~26 workflow names in the workflow_run.workflows trigger. When new workflows are added, this list must be manually updated, making it fragile.

Why: New workflows are being added frequently, and it is easy to forget to add each one to ci-doctor's monitoring list. Several recently added workflows are currently not monitored.

How: GitHub Actions doesn't support wildcards in workflow_run.workflows, but two mitigations exist:

Add a comment/lint check that warns when .github/workflows/*.yml count > ci-doctor monitored count
Add the agentic-workflows tool to ci-doctor so it can also proactively check recent failed runs of workflows not in its trigger list

Effort: Low
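The first mitigation could go beyond comparing counts and report exactly which workflows are missing. The sketch below is one possible shape for such a lint check. It assumes the monitored list holds workflow names (the values of each file's top-level `name:` key), and it uses a simple regex rather than a YAML parser, so unusual formatting would need a real parser.

```python
import re

# Top-level `name:` only; indented job or step names do not match.
NAME_LINE = re.compile(r"^name:\s*(.+?)\s*$", re.MULTILINE)


def workflow_name(yaml_text):
    """Return the top-level `name:` of a workflow file, or None if absent."""
    match = NAME_LINE.search(yaml_text)
    if match is None:
        return None
    return match.group(1).strip("\"'")


def unmonitored(workflow_texts, monitored_names):
    """Names defined in workflow files but missing from the monitored list."""
    defined = {workflow_name(text) for text in workflow_texts}
    defined.discard(None)
    return sorted(defined - set(monitored_names))
```

A CI step could read every file under .github/workflows/, pass the contents to `unmonitored` together with the list from ci-doctor.md, and fail or warn when the result is non-empty.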

P2 — Consider for Roadmap

P2.1: Code Simplifier

What: A daily agent that analyzes recently modified TypeScript files, identifies complexity (deeply nested conditionals, repeated patterns, verbose error handling), and proposes PRs with simplifications.

Why: The TypeScript codebase (src/) has grown significantly. Rapid development leads to accretive complexity. The Pelis factory's Code Simplifier had an 83% merge rate. For a security tool, simpler code is also more auditable code.

How: New code-simplifier.md triggered daily, using bash tool to find files changed in the last 3 days, analyzing for TypeScript-specific simplification opportunities (array methods, early returns, type narrowing), creating draft PRs.

Effort: Medium

P2.2: Changeset / Auto-Versioning Agent

What: An agent that, after merges to main, analyzes the accumulated commits since the last release tag, determines the appropriate semver bump (patch/minor/major based on commit types), and proposes a PR updating CHANGELOG.md and package.json version.

Why: The update-release-notes workflow runs on release publish, but there's no agent that proactively suggests when to cut a release or prepares the changelog. The Pelis factory's Changeset workflow had a 78% merge rate and significantly reduced release friction.

How: New changeset.md running weekly or on-demand, using git commands to analyze commits since last tag, proposing a changelog PR.

Effort: Medium
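The "semver bump based on commit types" rule can be made concrete as follows. This sketch assumes the repository follows the Conventional Commits convention (`feat:`, `fix:`, a `!` marker or `BREAKING CHANGE:` footer for breaking changes), which the report does not confirm; other commit styles would need a different classifier.

```python
import re

# Conventional Commits header: type, optional scope, optional "!" marker.
HEADER = re.compile(r"^(?P<type>[a-z]+)(\([^)]*\))?(?P<breaking>!)?: ")


def bump_kind(messages):
    """Return "major", "minor", "patch", or None for a list of commit messages."""
    kind = None
    for message in messages:
        match = HEADER.match(message)
        if match is None:
            continue
        if match.group("breaking") or "BREAKING CHANGE:" in message:
            return "major"
        if match.group("type") == "feat":
            kind = "minor"
        elif match.group("type") == "fix" and kind is None:
            kind = "patch"
    return kind


def next_version(current, messages):
    """Apply the bump to a "MAJOR.MINOR.PATCH" string; unchanged if no bump."""
    major, minor, patch = (int(part) for part in current.split("."))
    kind = bump_kind(messages)
    if kind == "major":
        return f"{major + 1}.0.0"
    if kind == "minor":
        return f"{major}.{minor + 1}.0"
    if kind == "patch":
        return f"{major}.{minor}.{patch + 1}"
    return current
```

Keeping the bump decision deterministic like this and leaving only the changelog wording to the agent is one way to make the proposed PRs easier to review.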

P2.3: Portfolio Analyst

What: A weekly agent that analyzes the cost and token usage across all agentic workflows, identifies which workflows are expensive relative to their output quality, and proposes optimizations (shorter prompts, smaller models, reduced frequency).

Why: Running 22 workflows continuously has real token cost implications, and the three hourly secret-digger agents alone account for up to 72 runs per day. The Pelis factory's Portfolio Analyst identified "overly chatty" agents that were costing money unnecessarily.

How: New portfolio-analyst.md running weekly, using the agentic-workflows tool to pull logs and metrics, generating a cost analysis discussion.

Effort: Medium
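Before any token data is available, scheduled run counts give a first-order view of where cost is likely to concentrate. The sketch below derives them from the triggers in the inventory table. It is an approximation under stated assumptions: all four smoke workflows are treated as running every 12h, event-triggered runs (PRs, issues, releases) are ignored, and run count is only a proxy for token cost.

```python
RUNS_PER_DAY = {
    "hourly": 24,
    "every 6h": 4,
    "every 12h": 2,
    "daily": 1,
    "weekly": 1 / 7,
}

# (workflow group, number of workflows, schedule) read off the inventory table.
SCHEDULED = [
    ("secret-digger-*", 3, "hourly"),
    ("issue-monster", 1, "hourly"),
    ("firewall-issue-dispatcher", 1, "every 6h"),
    ("smoke-*", 4, "every 12h"),
    ("daily workflows", 5, "daily"),
    ("weekly workflows", 2, "weekly"),
]


def daily_runs(entries):
    """Scheduled runs per day for each workflow group."""
    return {
        name: count * RUNS_PER_DAY[schedule]
        for name, count, schedule in entries
    }


def share(entries, name):
    """Fraction of all scheduled runs attributable to one group."""
    runs = daily_runs(entries)
    return runs[name] / sum(runs.values())
```

Under these assumptions the secret-digger group accounts for 72 of roughly 113 scheduled runs per day, which is why it is the natural first target for a cost review.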

P2.4: Issue Arborist

What: An agent that periodically analyzes open issues to identify clusters of related issues, links them as sub-issues under a parent, and creates parent "tracking issues" for theme-related work.

Why: As issue volume grows from the various automated triage and dispatcher agents, the issue tracker can become cluttered without hierarchical organization. The Pelis factory's Issue Arborist created 18 parent issues and 77 organization reports.

How: New issue-arborist.md running weekly, using GitHub issues search to cluster related open issues, creating parent issues for groups.

Effort: Medium

P3 — Future Ideas

P3.1: Mergefest (Auto-merge Main into PRs)

What: A workflow that automatically merges the main branch into open, non-draft PRs to keep them current, triggered by pushes to main.

Why: Long-lived PRs (like smoke tests or doc improvements) frequently go stale. Manual rebasing is ceremony. The Pelis factory's Mergefest was an orchestrator workflow that eliminated the "please merge main" dance.

Effort: Low (but requires contents: write permission on PRs)

P3.2: Daily Malicious Code Scan

What: A daily agent that reviews code changes from the past 24h for suspicious patterns — unexpected network calls, obfuscated code, unusual capability escalations, or supply chain attack patterns.

Why: AWF processes user-supplied commands and runs agent code. A malicious contribution could compromise the firewall. The Pelis factory's Daily Malicious Code Scan added a defense layer against supply chain attacks.

Effort: Low-Medium

P3.3: Schema Consistency Checker

What: A weekly agent that checks for drift between TypeScript types in src/types.ts, CLI flags in src/cli.ts, Docker Compose configuration in src/docker-manager.ts, and their documentation counterparts.

Why: The TypeScript types, CLI help text, and documentation all describe the same configuration surface. When one drifts from another, users get confusing errors. The Pelis factory's Schema Consistency Checker created 55 analysis discussions catching drift that would have taken days to notice manually.

Effort: Medium

📈 Maturity Assessment

Dimension	Current Level	Notes
Issue Management	4/5	Has dispatch, dedup, plan command — missing triage labeling
Code Quality	2/5	Doc maintenance, CLI consistency — missing simplifier, refactoring
Testing & Validation	4/5	Multi-engine smoke, coverage improver, build test
Security	5/5	Best-in-class: multi-engine red team, daily review, dep monitoring, security guard
Operations & Release	3/5	Release notes auto-update — missing changeset/auto-versioning
Observability & Meta	2/5	CI Doctor, gaps assessment — missing audit meta-agent
Cross-repo	4/5	Firewall issue dispatcher is a sophisticated pattern

Current Level: 4/5 (Advanced)

The repository has a notably advanced agentic workflow collection, especially for a project of its size. The security automation (multi-engine red team, three smoke test engines, daily security review) is genuinely best-in-class and goes beyond what Pelis documented.

Target Level: 5/5 (Factory-Grade)

Gap to Close: The main gap is meta-observability — the repository doesn't have an agent watching all the agents. With 22 workflows now running, this becomes increasingly important for operational hygiene.

🔄 Comparison with Best Practices

What this repository does exceptionally well

Multi-engine testing: three red-team agents plus smoke tests across three engines is a pattern that goes beyond Pelis
Cross-repo integration: firewall-issue-dispatcher (bidirectional with github/gh-aw) is a sophisticated pattern
Shared imports: shared/ directory with mcp-pagination.md, secret-audit.md, version-reporting.md, gh.md is textbook Pelis factory pattern
Cache-memory usage: Multiple workflows use persistent cache for issue deduplication and investigation history
Skip-if-match guards: Prevents runaway duplicate PRs (doc-maintainer, test-coverage-improver)
Security focus: The depth of security automation (hourly red team, dependency monitor, security guard, daily security review) is remarkable

What could improve

No meta-agent: The absence of an "Audit Workflows" style meta-agent is the single biggest gap
Issue triage labeling: Issues arrive without labels, which makes label-based filtering unreliable
Code quality automation: Unlike Pelis factory (which had Code Simplifier + Duplicate Detector + CLI Consistency with 78-83% merge rates), there's no continuous code quality improvement agent
Breaking change detection: Missing for a tool with external users

Unique opportunities given the domain

The firewall escape report pattern (P0.1 above) is unique to this project; no equivalent exists in the Pelis factory
Container CVE scanning (P1.3) is especially critical because AWF is itself a security tool: vulnerabilities in its own containers would undermine the protection it provides
The multi-engine comparison in smoke tests could be extended to generate "engine comparison reports" — which agent engine is most effective at staying within the firewall?

📝 Notes for Future Runs

Stored in /tmp/gh-aw/cache-memory/notes.txt. Key items to track over time:

Whether ci-doctor's hardcoded workflow list gets updated as new workflows are added
Progress on implementing audit workflows meta-agent
Whether issue triage agent gets added
Token cost trends as the 22-workflow collection continues running

AI generated by Pelis Agent Factory Advisor

expires on Apr 3, 2026, 3:39 AM UTC

2026-04-03T05:16:21Z

github-actions[bot]
bot Apr 3, 2026
Author

This discussion was automatically closed because it expired on 2026-04-03T03:39:31.070Z.

Closed by Workflow
