# [Pelis Agent Factory Advisor] Agentic Workflow Maturity Analysis & Recommendations (March 2026) #1423
Closed — this discussion was automatically closed because it expired on 2026-04-01T03:31:46.941Z.
This report analyzes the repository's agentic workflow ecosystem, drawing on patterns from Peli's Agent Factory and the `githubnext/agentics` reference repository.

## 📊 Executive Summary

The `gh-aw-firewall` repository operates at Level 4/5 agentic workflow maturity — significantly above average, with 19 specialized agentic workflows covering security red-teaming, documentation maintenance, CI/CD health, dependency monitoring, and smoke testing. The main remaining gaps are: automated issue triage, proactive test coverage improvement (currently at 38% — barely above the configured thresholds), code simplification agents, and a meta-agent monitoring the health of all other agents.

## 🎓 Patterns Learned from Peli's Agent Factory
After crawling the full Peli's Agent Factory blog series, the key patterns that stand out are:
### Core Principles

- `safe-outputs` constraints make it safe to run agents that propose real changes

### Key Workflow Categories with Proven ROI
### Patterns from `githubnext/agentics`

The reference repository uses `daily-test-improver.md` — an incremental test coverage agent that runs daily, identifies gaps, and implements new tests autonomously. With only 38% statement coverage here, this is the single highest-ROI workflow missing from this repo.

## 📋 Current Agentic Workflow Inventory
- `issue-monster`
- `security-guard`
- `doc-maintainer`
- `ci-doctor`
- `ci-cd-gaps-assessment`
- `cli-flag-consistency-checker`
- `security-review`
- `issue-duplication-detector`
- `pelis-agent-factory-advisor`
- `secret-digger-claude/copilot/codex`
- `build-test`
- `smoke-claude/codex/copilot/chroot`
- `dependency-security-monitor`
- `plan` / `plan-comment`

## 🚀 Actionable Recommendations
### P0 — Implement Immediately

#### P0.1: Issue Triage Agent

**What:** Automatically label and comment on new issues when they're opened.

**Why:** This is the "hello world" of agentic workflows and the single highest-impact missing workflow. Every new issue currently arrives unlabeled, and maintainers must manually classify it as `bug`, `enhancement`, `documentation`, `question`, `security`, etc. This is pure ceremony that an agent handles in seconds.

**How:** Add an `issues: [opened, reopened]` trigger. Use the `issues` and `labels` GitHub toolsets. Output labels from a predefined set (`bug`, `feature`, `documentation`, `security`, `performance`, `question`). Leave a comment explaining the classification, with context about how the issue might be addressed.

**Effort:** Low — a canonical pattern from Peli's Agent Factory with a well-understood implementation.
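As a sketch, the trigger and label set for P0.1 could start from frontmatter like the following. The exact field names (`safe-outputs`, `add-labels`, toolset spelling) are assumptions based on the gh-aw patterns this report describes and should be checked against the gh-aw documentation before use:

```yaml
# Hypothetical gh-aw workflow frontmatter for an issue triage agent.
on:
  issues:
    types: [opened, reopened]
permissions:
  contents: read
safe-outputs:
  add-labels:
    allowed: [bug, feature, documentation, security, performance, question]
  add-comment: {}
tools:
  github:
    toolsets: [issues]
```

The frontmatter would be followed by a natural-language prompt describing the triage policy (how to distinguish `bug` from `question`, when to apply `security`, etc.).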
#### P0.2: Daily Test Coverage Improver

**What:** A daily agent that analyzes the current test coverage (38% statements, 31% branches), identifies the highest-value untested code paths, and either creates a PR with new tests or files an issue describing the coverage gap.

**Why:** Coverage is barely above the minimum thresholds (38% vs a 38% threshold for statements, 31% vs 30% for branches). The `githubnext/agentics` repository's `daily-test-improver.md` demonstrates this pattern. With a security-critical codebase, low branch coverage in `docker-manager.ts`, `cli.ts`, and `host-iptables.ts` is a real risk.

**How:** Run `npm test -- --coverage --json` to get current coverage, identify the files with the lowest branch coverage, write targeted tests for the most critical untested paths, and create a draft PR. Focus on `src/docker-manager.ts` (complex logic, currently low coverage), `src/host-iptables.ts` (security-critical, 55% branch coverage), and `src/cli.ts` (the main entry point).

**Effort:** Medium — requires understanding the Jest + TypeScript + Docker mocking patterns already used in the test suite.
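The "identify the files with the lowest branch coverage" step could be sketched as below. The data shape follows Jest's `json-summary` coverage reporter; the file paths and numbers are illustrative, not real measurements from this repo:

```typescript
// Rank files from a Jest json-summary coverage report by branch coverage,
// lowest first, so the agent knows where new tests pay off most.
interface CoverageMetric {
  total: number;
  covered: number;
  pct: number;
}

interface FileCoverage {
  statements: CoverageMetric;
  branches: CoverageMetric;
}

type CoverageSummary = Record<string, FileCoverage>;

function rankByBranchCoverage(summary: CoverageSummary, limit = 3): string[] {
  return Object.entries(summary)
    .filter(([file]) => file !== "total") // skip the aggregate entry
    .sort(([, a], [, b]) => a.branches.pct - b.branches.pct)
    .slice(0, limit)
    .map(([file, cov]) => `${file}: ${cov.branches.pct}% branches`);
}

// Illustrative data echoing the report's figures:
const summary: CoverageSummary = {
  total: {
    statements: { total: 100, covered: 38, pct: 38 },
    branches: { total: 100, covered: 31, pct: 31 },
  },
  "src/docker-manager.ts": {
    statements: { total: 80, covered: 30, pct: 37.5 },
    branches: { total: 40, covered: 10, pct: 25 },
  },
  "src/host-iptables.ts": {
    statements: { total: 60, covered: 40, pct: 66.7 },
    branches: { total: 20, covered: 11, pct: 55 },
  },
};

// returns ["src/docker-manager.ts: 25% branches", "src/host-iptables.ts: 55% branches"]
console.log(rankByBranchCoverage(summary));
```

The ranking alone is enough to seed the agent's prompt ("write tests for the top 3 files").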
### P1 — Plan for Near-Term

#### P1.1: Workflow Health Manager (Meta-Agent)

**What:** A meta-agent that monitors the health of all other agentic workflows in this repository — checking for stale workflows that haven't run recently, high failure rates, missing compilations, and outdated dependencies.

**Why:** At 19 agentic workflows and growing, a meta-agent becomes critical. Peli's Agent Factory's Workflow Health Manager created 40 issues and 34 downstream PRs (14 merged), making it one of the most impactful workflows in the factory. The current `agentics-maintenance.yml` is a conventional workflow, not an intelligent agent. One sign this is needed: the CI Doctor monitors a hardcoded list of workflow names that must be manually maintained.

**How:** Use the `agentic-workflows` tool to get the status of all workflows. Check last-run timestamps, failure rates, and whether lock files are up to date. Create issues for: workflows that haven't run in 7+ days, workflows with a >50% failure rate over the last 30 runs, and workflows where the `.md` is newer than the `.lock.yml` (recompile needed).

**Effort:** Medium.
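The three health checks above can be expressed as a small pure function. The `WorkflowStatus` shape is hypothetical — a real agent would populate it from the GitHub Actions API and the filesystem:

```typescript
// Apply the staleness / failure-rate / recompile checks described above
// to one workflow's metadata and return human-readable issue lines.
interface WorkflowStatus {
  name: string;
  lastRun: Date;          // most recent run timestamp
  recentRuns: boolean[];  // true = success, over the last 30 runs
  mdMtime: Date;          // mtime of the .md source
  lockMtime: Date;        // mtime of the compiled .lock.yml
}

function healthIssues(w: WorkflowStatus, now: Date): string[] {
  const issues: string[] = [];

  // Check 1: no run in 7+ days.
  const daysSinceRun = (now.getTime() - w.lastRun.getTime()) / 86_400_000;
  if (daysSinceRun > 7) {
    issues.push(`${w.name}: no run in ${Math.floor(daysSinceRun)} days`);
  }

  // Check 2: >50% failure rate over the recent runs.
  const failures = w.recentRuns.filter((ok) => !ok).length;
  if (w.recentRuns.length > 0 && failures / w.recentRuns.length > 0.5) {
    issues.push(`${w.name}: ${failures}/${w.recentRuns.length} recent runs failed`);
  }

  // Check 3: .md source newer than its compiled .lock.yml.
  if (w.mdMtime > w.lockMtime) {
    issues.push(`${w.name}: .md newer than .lock.yml (recompile needed)`);
  }

  return issues;
}
```

Each returned line maps naturally onto one `create-issue` safe-output.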
#### P1.2: Breaking Change Checker

**What:** A workflow that runs on every PR and detects backward-incompatible changes in the public API — CLI flags, environment variables, the Docker Compose schema, and container behavior.

**Why:** Peli's Agent Factory's Breaking Change Checker created issues like #14113, flagging CLI version updates before they reached production. For `awf`, breaking changes in CLI flags, environment variable names, or the Docker Compose API can silently break users' automation. The `security-guard` focuses on security regressions; this would focus on UX/API regressions.

**How:** On the PR trigger, diff `src/cli.ts` against main to identify removed or renamed CLI options. Check `src/docker-manager.ts` for changes to environment variable names, port numbers, or the compose schema. Check the container entrypoint scripts for behavioral changes. Create an issue with a breaking-change label if anything is detected.

**Effort:** Medium.
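The "diff CLI options" step could be sketched as below, assuming a commander-style `.option("--flag ...")` API in `src/cli.ts` (an assumption — the report doesn't name the CLI library). The two source versions are passed in as strings:

```typescript
// Extract declared CLI flags from a source string and report flags that
// exist on main but are missing from the PR branch.
function extractFlags(source: string): Set<string> {
  const flags = new Set<string>();
  // Matches the long-flag name in calls like .option("--allow-domains <list>", ...)
  const re = /\.option\(\s*["'](--[a-z0-9-]+)/g;
  let m: RegExpExecArray | null;
  while ((m = re.exec(source)) !== null) flags.add(m[1]);
  return flags;
}

function removedFlags(mainSrc: string, prSrc: string): string[] {
  const after = extractFlags(prSrc);
  return [...extractFlags(mainSrc)].filter((f) => !after.has(f));
}
```

A renamed flag shows up as one removal plus one unrelated addition, so the agent would still need to reason about whether a removal is a true break or a rename with an alias.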
#### P1.3: Smoke Test Consolidation Report

**What:** A daily/weekly agent that aggregates results from all 4 smoke test workflows (`smoke-claude`, `smoke-codex`, `smoke-copilot`, `smoke-chroot`) into a single health dashboard discussion.

**Why:** Currently, the 4 smoke tests run independently, so understanding the overall health of the firewall across all engines requires checking 4 separate workflow run histories. A daily consolidated report with pass/fail trends would make product health visible at a glance and catch patterns (e.g., "Claude smoke tests have been flaky for 3 days").

**How:** Use `agentic-workflows` plus the GitHub Actions tools to gather recent run results for all 4 smoke workflows. Generate a markdown table with pass/fail counts per engine per day, trend arrows, and any recurring error patterns. Post it as a discussion.

**Effort:** Low — a read-only analysis workflow.
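The markdown-table rendering step is mechanical enough to sketch directly; the run data here is illustrative, and the real per-day counts would come from the Actions API:

```typescript
// Render per-engine, per-day pass/fail counts as a markdown table
// suitable for posting in a discussion.
type DayResult = { pass: number; fail: number };

function dashboard(results: Record<string, DayResult[]>, days: string[]): string {
  const header = `| Engine | ${days.join(" | ")} |`;
  const sep = `|${"---|".repeat(days.length + 1)}`;
  const rows = Object.entries(results).map(
    ([engine, perDay]) =>
      `| ${engine} | ${perDay.map((d) => `${d.pass}✅/${d.fail}❌`).join(" | ")} |`
  );
  return [header, sep, ...rows].join("\n");
}

// Example: one engine over two days.
console.log(dashboard({ "smoke-claude": [{ pass: 3, fail: 0 }, { pass: 2, fail: 1 }] }, ["Mon", "Tue"]));
```

Trend arrows and recurring-error detection would layer on top of the same data structure.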
### P2 — Consider for Roadmap

#### P2.1: Automatic Code Simplifier

**What:** A daily agent that analyzes recently modified TypeScript code and creates PRs with simplifications — extracted helper functions, simplified boolean expressions, reduced nesting, more idiomatic patterns.

**Why:** Peli's Agent Factory's Code Simplifier achieved an 83% merge rate (5/6 PRs). This codebase has complex orchestration logic in `docker-manager.ts` (800+ lines) and `cli.ts` that tends to accumulate complexity over time. The agent cleans up after rapid development sessions.

**How:** Use `git log --since=7.days` to find recently modified `.ts` files. Analyze them for simplification opportunities. Create a draft PR with a `[simplify]` prefix and an `ai-generated` label. Cap at 1 PR per run to avoid overwhelming maintainers.

**Effort:** Medium.
#### P2.2: Daily Static Analysis Report (Discussion)

**What:** A daily agent that runs `zizmor`, `poutine`, and `actionlint` on all workflow files and posts a structured discussion with the findings.

**Why:** These tools currently run on PRs via `agenticworkflows-compile` with `--zizmor --actionlint --poutine`, but there's no persistent daily report showing trends. Peli's Agent Factory's Static Analysis Report created 57 analysis discussions and 12 Zizmor security reports, providing a continuous security audit trail. For a firewall product, workflow security is especially important.

**How:** On a daily schedule, run the three tools, parse the output, and post a discussion with a `[Static Analysis]` prefix. Use cache-memory to compare with the previous day's findings and highlight new issues.

**Effort:** Low — the tools are already installed and used.
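The "highlight new issues" comparison reduces to a set difference over stable finding keys. The `Finding` shape below is an assumption about how the agent might normalize tool output before caching it:

```typescript
// Report only findings that were not present in yesterday's cached report,
// keyed by tool + rule + file so unchanged findings don't repeat daily.
interface Finding {
  tool: "zizmor" | "poutine" | "actionlint";
  rule: string;
  file: string;
}

function newFindings(today: Finding[], yesterday: Finding[]): Finding[] {
  const key = (f: Finding) => `${f.tool}:${f.rule}:${f.file}`;
  const seen = new Set(yesterday.map(key));
  return today.filter((f) => !seen.has(key(f)));
}
```

The same keys, diffed in the other direction, would list findings that were resolved since the previous day.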
#### P2.3: Schema Consistency Checker

**What:** A weekly agent that verifies consistency between the TypeScript interfaces (`src/types.ts`), CLI option parsing (`src/cli.ts`), documentation (`docs/`), and the README.

**Why:** Peli's Agent Factory's Schema Consistency Checker created 55 analysis discussions, catching drift that would take days to notice manually. This repo's `WrapperConfig` interface is the source of truth for configuration, but it frequently drifts from the CLI's `--help` output and the documentation. The `cli-flag-consistency-checker` covers some of this, but a schema-level check would be complementary.

**How:** Extract the `WrapperConfig` TypeScript interface programmatically. Cross-reference it with all CLI `.option()` calls. Cross-reference it with the docs. Report inconsistencies (flags missing from the docs, flags documented but not in the interface, etc.).

**Effort:** Low.
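A minimal version of the interface-to-flags cross-reference could look like this. The extraction is regex-based for illustration (real code might use the TypeScript compiler API), and the camelCase-key-to-kebab-case-flag mapping is an assumption about this repo's conventions:

```typescript
// Compare WrapperConfig property names against the set of CLI flags and
// report config keys that have no corresponding --flag.
function interfaceKeys(source: string): string[] {
  // Grab the body of `interface WrapperConfig { ... }` (no nested braces assumed).
  const body = source.match(/interface\s+WrapperConfig\s*\{([\s\S]*?)\}/)?.[1] ?? "";
  const keys: string[] = [];
  const re = /^\s*(\w+)\??:/gm; // property name, optionally marked `?`
  let m: RegExpExecArray | null;
  while ((m = re.exec(body)) !== null) keys.push(m[1]);
  return keys;
}

function kebab(name: string): string {
  return name.replace(/[A-Z]/g, (c) => `-${c.toLowerCase()}`);
}

function keysWithoutFlags(configSrc: string, cliFlags: string[]): string[] {
  const flags = new Set(cliFlags);
  return interfaceKeys(configSrc).filter((k) => !flags.has(`--${kebab(k)}`));
}
```

The inverse check (flags with no matching interface key) uses the same two extraction helpers.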
### P3 — Future Ideas

#### P3.1: Issue Arborist — Link Related Issues

**What:** A weekly agent that identifies clusters of related issues (e.g., multiple issues about DNS handling, or multiple issues about a specific container) and organizes them into parent/sub-issue hierarchies.

**Why:** Peli's Agent Factory's Issue Arborist created 18 parent issues and 77 discussion reports. As this repo matures, the issue tracker will grow and benefit from better organization.

**Effort:** Medium (requires careful reasoning about issue relationships).
#### P3.2: Daily Malicious Code Scan

**What:** Reviews recent commits (past 24h) for suspicious patterns — obfuscated code, unexpected network calls in shell scripts, unusual permission-escalation patterns.

**Why:** Peli's Agent Factory runs this daily on its own codebase. It is especially relevant here, since this codebase manages iptables rules, Docker privileges, and container security. A supply-chain attack introducing a backdoor in a firewall would be particularly dangerous.

**Effort:** Low to medium.
#### P3.3: Secret Digger Results Aggregator

**What:** A daily agent that reads the output of all 3 hourly secret-digger runs (Claude, Copilot, Codex) and posts a consolidated daily security status discussion comparing what each engine found.

**Why:** Currently the 3 engines run independently with no cross-comparison. Correlating findings (e.g., "all 3 engines failed to find secrets today" vs. "Claude found X but Codex missed it") would yield insights about engine-specific capabilities.

**Effort:** Low — read-only aggregation using cache-memory.
## 📈 Maturity Assessment

**Current Level: 4/5** — This repository has exceptional security automation that goes far beyond typical repos (hourly red-team agents, daily threat modeling, real-time PR security review). The main gaps are in proactive code improvement and meta-level health monitoring.

**Target Level: 5/5** — Add issue triage, test coverage improvement, workflow health management, and code simplification.

**Gap:** Primarily the "continuous improvement" category — agents that proactively make the codebase better over time rather than just monitoring it.
## 🔄 Comparison with Peli's Agent Factory Best Practices

### What This Repo Does Exceptionally Well

- `security-guard.md` has deep domain knowledge about iptables, Squid, container security, and capability dropping — far more specialized than generic PR review agents
- `/plan`: the slash command workflow is well implemented, with parent/sub-issue creation

### What It Could Improve

### Unique Opportunities from the Security Domain
## 📝 Notes for Future Runs

Analysis stored in `/tmp/gh-aw/cache-memory/pelis-advisor-notes.json`. Key baseline metrics for tracking: