# [Pelis Agent Factory Advisor] Agentic Workflow Maturity Analysis & Recommendations (March 2026) #1423
Closed — this discussion was automatically closed because it expired on 2026-04-01T03:31:46.941Z.
This report analyzes the repository's agentic workflow ecosystem, drawing on patterns from Peli's Agent Factory and the `githubnext/agentics` reference repository.

## 📊 Executive Summary

The `gh-aw-firewall` repository operates at Level 4/5 agentic workflow maturity — significantly above average, with 19 specialized agentic workflows covering security red-teaming, documentation maintenance, CI/CD health, dependency monitoring, and smoke testing. The main remaining gaps are: automated issue triage, proactive test coverage improvement (currently at 38% — barely above the configured thresholds), code simplification agents, and a meta-agent monitoring the health of all other agents.

## 🎓 Patterns Learned from Peli's Agent Factory
After crawling the full Peli's Agent Factory blog series, the key patterns that stand out are:
### Core Principles

- `safe-outputs` constraints make it safe to run agents that propose real changes

### Key Workflow Categories with Proven ROI
### Patterns from `githubnext/agentics`

The reference repository uses `daily-test-improver.md` — an incremental test coverage agent that runs daily, identifies gaps, and implements new tests autonomously. With only 38% statement coverage here, this is the single highest-ROI workflow missing from this repo.

## 📋 Current Agentic Workflow Inventory
- `issue-monster`
- `security-guard`
- `doc-maintainer`
- `ci-doctor`
- `ci-cd-gaps-assessment`
- `cli-flag-consistency-checker`
- `security-review`
- `issue-duplication-detector`
- `pelis-agent-factory-advisor`
- `secret-digger-claude/copilot/codex`
- `build-test`
- `smoke-claude/codex/copilot/chroot`
- `dependency-security-monitor`
- `plan` / `plan-comment`

## 🚀 Actionable Recommendations
### P0 — Implement Immediately

#### P0.1: Issue Triage Agent

**What:** Automatically label and comment on new issues when they're opened.

**Why:** This is the "hello world" of agentic workflows and the single highest-impact missing workflow. Every new issue currently arrives unlabeled, and maintainers must manually classify it as `bug`, `enhancement`, `documentation`, `question`, `security`, etc. This is pure ceremony that an agent handles in seconds.

**How:** Add an `issues: [opened, reopened]` trigger. Use the `issues` and `labels` GitHub toolsets. Output labels from a predefined set (`bug`, `feature`, `documentation`, `security`, `performance`, `question`). Leave a comment explaining the classification, with context about how the issue might be addressed.

**Effort:** Low — a canonical pattern from Peli's Agent Factory with a well-understood implementation.
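As a sketch, the trigger and label set for P0.1 could start from frontmatter like the following. The exact field names (`safe-outputs`, `add-labels`, toolset spelling) are assumptions based on the gh-aw patterns this report describes and should be checked against the gh-aw documentation before use:

```yaml
# Hypothetical gh-aw workflow frontmatter for an issue triage agent.
on:
  issues:
    types: [opened, reopened]
permissions:
  contents: read
safe-outputs:
  add-labels:
    allowed: [bug, feature, documentation, security, performance, question]
  add-comment: {}
tools:
  github:
    toolsets: [issues]
```

The frontmatter would be followed by a natural-language prompt describing the triage policy (how to distinguish `bug` from `question`, when to apply `security`, etc.).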
#### P0.2: Daily Test Coverage Improver

**What:** A daily agent that analyzes the current test coverage (38% statements, 31% branches), identifies the highest-value untested code paths, and either creates a PR with new tests or files an issue describing the coverage gap.

**Why:** Coverage is barely above the minimum thresholds (38% vs a 38% threshold for statements, 31% vs 30% for branches). The `githubnext/agentics` repository's `daily-test-improver.md` demonstrates this pattern. With a security-critical codebase, low branch coverage in `docker-manager.ts`, `cli.ts`, and `host-iptables.ts` is a real risk.

**How:** Run `npm test -- --coverage --json` to get current coverage, identify the files with the lowest branch coverage, write targeted tests for the most critical untested paths, and create a draft PR. Focus on `src/docker-manager.ts` (complex logic, currently low coverage), `src/host-iptables.ts` (security-critical, 55% branch coverage), and `src/cli.ts` (the main entry point).

**Effort:** Medium — requires understanding the Jest + TypeScript + Docker mocking patterns already used in the test suite.
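The "identify the files with the lowest branch coverage" step could be sketched as below. The data shape follows Jest's `json-summary` coverage reporter; the file paths and numbers are illustrative, not real measurements from this repo:

```typescript
// Rank files from a Jest json-summary coverage report by branch coverage,
// lowest first, so the agent knows where new tests pay off most.
interface CoverageMetric {
  total: number;
  covered: number;
  pct: number;
}

interface FileCoverage {
  statements: CoverageMetric;
  branches: CoverageMetric;
}

type CoverageSummary = Record<string, FileCoverage>;

function rankByBranchCoverage(summary: CoverageSummary, limit = 3): string[] {
  return Object.entries(summary)
    .filter(([file]) => file !== "total") // skip the aggregate entry
    .sort(([, a], [, b]) => a.branches.pct - b.branches.pct)
    .slice(0, limit)
    .map(([file, cov]) => `${file}: ${cov.branches.pct}% branches`);
}

// Illustrative data echoing the report's figures:
const summary: CoverageSummary = {
  total: {
    statements: { total: 100, covered: 38, pct: 38 },
    branches: { total: 100, covered: 31, pct: 31 },
  },
  "src/docker-manager.ts": {
    statements: { total: 80, covered: 30, pct: 37.5 },
    branches: { total: 40, covered: 10, pct: 25 },
  },
  "src/host-iptables.ts": {
    statements: { total: 60, covered: 40, pct: 66.7 },
    branches: { total: 20, covered: 11, pct: 55 },
  },
};

// returns ["src/docker-manager.ts: 25% branches", "src/host-iptables.ts: 55% branches"]
console.log(rankByBranchCoverage(summary));
```

The ranking alone is enough to seed the agent's prompt ("write tests for the top 3 files").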
### P1 — Plan for Near-Term

#### P1.1: Workflow Health Manager (Meta-Agent)

**What:** A meta-agent that monitors the health of all other agentic workflows in this repository — checking for stale workflows that haven't run recently, high failure rates, missing compilations, and outdated dependencies.

**Why:** At 19 agentic workflows and growing, a meta-agent becomes critical. Peli's Agent Factory's Workflow Health Manager created 40 issues and 34 downstream PRs (14 merged), making it one of the most impactful workflows in the factory. The current `agentics-maintenance.yml` is a conventional workflow, not an intelligent agent. One sign this is needed: the CI Doctor monitors a hardcoded list of workflow names that must be manually maintained.

**How:** Use the `agentic-workflows` tool to get the status of all workflows. Check last-run timestamps, failure rates, and whether lock files are up to date. Create issues for: workflows that haven't run in 7+ days, workflows with a >50% failure rate over the last 30 runs, and workflows where the `.md` is newer than the `.lock.yml` (recompile needed).

**Effort:** Medium.
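The three health checks above can be expressed as a small pure function. The `WorkflowStatus` shape is hypothetical — a real agent would populate it from the GitHub Actions API and the filesystem:

```typescript
// Apply the staleness / failure-rate / recompile checks described above
// to one workflow's metadata and return human-readable issue lines.
interface WorkflowStatus {
  name: string;
  lastRun: Date;          // most recent run timestamp
  recentRuns: boolean[];  // true = success, over the last 30 runs
  mdMtime: Date;          // mtime of the .md source
  lockMtime: Date;        // mtime of the compiled .lock.yml
}

function healthIssues(w: WorkflowStatus, now: Date): string[] {
  const issues: string[] = [];

  // Check 1: no run in 7+ days.
  const daysSinceRun = (now.getTime() - w.lastRun.getTime()) / 86_400_000;
  if (daysSinceRun > 7) {
    issues.push(`${w.name}: no run in ${Math.floor(daysSinceRun)} days`);
  }

  // Check 2: >50% failure rate over the recent runs.
  const failures = w.recentRuns.filter((ok) => !ok).length;
  if (w.recentRuns.length > 0 && failures / w.recentRuns.length > 0.5) {
    issues.push(`${w.name}: ${failures}/${w.recentRuns.length} recent runs failed`);
  }

  // Check 3: .md source newer than its compiled .lock.yml.
  if (w.mdMtime > w.lockMtime) {
    issues.push(`${w.name}: .md newer than .lock.yml (recompile needed)`);
  }

  return issues;
}
```

Each returned line maps naturally onto one `create-issue` safe-output.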
#### P1.2: Breaking Change Checker

**What:** A workflow that runs on every PR and detects backward-incompatible changes in the public API — CLI flags, environment variables, the Docker Compose schema, and container behavior.

**Why:** Peli's Agent Factory's Breaking Change Checker created issues like #14113, flagging CLI version updates before they reached production. For `awf`, breaking changes in CLI flags, environment variable names, or the Docker Compose API can silently break users' automation. The `security-guard` focuses on security regressions; this would focus on UX/API regressions.

**How:** On the PR trigger, diff `src/cli.ts` against main to identify removed or renamed CLI options. Check `src/docker-manager.ts` for changes to environment variable names, port numbers, or the compose schema. Check the container entrypoint scripts for behavioral changes. Create an issue with a breaking-change label if anything is detected.

**Effort:** Medium.
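The "diff CLI options" step could be sketched as below, assuming a commander-style `.option("--flag ...")` API in `src/cli.ts` (an assumption — the report doesn't name the CLI library). The two source versions are passed in as strings:

```typescript
// Extract declared CLI flags from a source string and report flags that
// exist on main but are missing from the PR branch.
function extractFlags(source: string): Set<string> {
  const flags = new Set<string>();
  // Matches the long-flag name in calls like .option("--allow-domains <list>", ...)
  const re = /\.option\(\s*["'](--[a-z0-9-]+)/g;
  let m: RegExpExecArray | null;
  while ((m = re.exec(source)) !== null) flags.add(m[1]);
  return flags;
}

function removedFlags(mainSrc: string, prSrc: string): string[] {
  const after = extractFlags(prSrc);
  return [...extractFlags(mainSrc)].filter((f) => !after.has(f));
}
```

A renamed flag shows up as one removal plus one unrelated addition, so the agent would still need to reason about whether a removal is a true break or a rename with an alias.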
#### P1.3: Smoke Test Consolidation Report

**What:** A daily/weekly agent that aggregates results from all 4 smoke test workflows (`smoke-claude`, `smoke-codex`, `smoke-copilot`, `smoke-chroot`) into a single health dashboard discussion.

**Why:** Currently, the 4 smoke tests run independently, so understanding the overall health of the firewall across all engines requires checking 4 separate workflow run histories. A daily consolidated report with pass/fail trends would make product health visible at a glance and catch patterns (e.g., "Claude smoke tests have been flaky for 3 days").

**How:** Use `agentic-workflows` plus the GitHub Actions tools to gather recent run results for all 4 smoke workflows. Generate a markdown table with pass/fail counts per engine per day, trend arrows, and any recurring error patterns. Post it as a discussion.

**Effort:** Low — a read-only analysis workflow.
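The markdown-table rendering step is mechanical enough to sketch directly; the run data here is illustrative, and the real per-day counts would come from the Actions API:

```typescript
// Render per-engine, per-day pass/fail counts as a markdown table
// suitable for posting in a discussion.
type DayResult = { pass: number; fail: number };

function dashboard(results: Record<string, DayResult[]>, days: string[]): string {
  const header = `| Engine | ${days.join(" | ")} |`;
  const sep = `|${"---|".repeat(days.length + 1)}`;
  const rows = Object.entries(results).map(
    ([engine, perDay]) =>
      `| ${engine} | ${perDay.map((d) => `${d.pass}✅/${d.fail}❌`).join(" | ")} |`
  );
  return [header, sep, ...rows].join("\n");
}

// Example: one engine over two days.
console.log(dashboard({ "smoke-claude": [{ pass: 3, fail: 0 }, { pass: 2, fail: 1 }] }, ["Mon", "Tue"]));
```

Trend arrows and recurring-error detection would layer on top of the same data structure.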
### P2 — Consider for Roadmap

#### P2.1: Automatic Code Simplifier

**What:** A daily agent that analyzes recently modified TypeScript code and creates PRs with simplifications — extracted helper functions, simplified boolean expressions, reduced nesting, more idiomatic patterns.

**Why:** Peli's Agent Factory's Code Simplifier achieved an 83% merge rate (5/6 PRs). This codebase has complex orchestration logic in `docker-manager.ts` (800+ lines) and `cli.ts` that tends to accumulate complexity over time. The agent cleans up after rapid development sessions.

**How:** Use `git log --since=7.days` to find recently modified `.ts` files. Analyze them for simplification opportunities. Create a draft PR with a `[simplify]` prefix and an `ai-generated` label. Cap at 1 PR per run to avoid overwhelming maintainers.

**Effort:** Medium.
#### P2.2: Daily Static Analysis Report (Discussion)

**What:** A daily agent that runs `zizmor`, `poutine`, and `actionlint` on all workflow files and posts a structured discussion with the findings.

**Why:** These tools currently run on PRs via `agenticworkflows-compile` with `--zizmor --actionlint --poutine`, but there's no persistent daily report showing trends. Peli's Agent Factory's Static Analysis Report created 57 analysis discussions and 12 Zizmor security reports, providing a continuous security audit trail. For a firewall product, workflow security is especially important.

**How:** On a daily schedule, run the three tools, parse the output, and post a discussion with a `[Static Analysis]` prefix. Use cache-memory to compare with the previous day's findings and highlight new issues.

**Effort:** Low — the tools are already installed and used.
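The "highlight new issues" comparison reduces to a set difference over stable finding keys. The `Finding` shape below is an assumption about how the agent might normalize tool output before caching it:

```typescript
// Report only findings that were not present in yesterday's cached report,
// keyed by tool + rule + file so unchanged findings don't repeat daily.
interface Finding {
  tool: "zizmor" | "poutine" | "actionlint";
  rule: string;
  file: string;
}

function newFindings(today: Finding[], yesterday: Finding[]): Finding[] {
  const key = (f: Finding) => `${f.tool}:${f.rule}:${f.file}`;
  const seen = new Set(yesterday.map(key));
  return today.filter((f) => !seen.has(key(f)));
}
```

The same keys, diffed in the other direction, would list findings that were resolved since the previous day.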
#### P2.3: Schema Consistency Checker

**What:** A weekly agent that verifies consistency between the TypeScript interfaces (`src/types.ts`), CLI option parsing (`src/cli.ts`), documentation (`docs/`), and the README.

**Why:** Peli's Agent Factory's Schema Consistency Checker created 55 analysis discussions, catching drift that would take days to notice manually. This repo's `WrapperConfig` interface is the source of truth for configuration, but it frequently drifts from the CLI's `--help` output and the documentation. The `cli-flag-consistency-checker` covers some of this, but a schema-level check would be complementary.

**How:** Extract the `WrapperConfig` TypeScript interface programmatically. Cross-reference it with all CLI `.option()` calls. Cross-reference it with the docs. Report inconsistencies (flags missing from the docs, flags documented but not in the interface, etc.).

**Effort:** Low.
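A minimal version of the interface-to-flags cross-reference could look like this. The extraction is regex-based for illustration (real code might use the TypeScript compiler API), and the camelCase-key-to-kebab-case-flag mapping is an assumption about this repo's conventions:

```typescript
// Compare WrapperConfig property names against the set of CLI flags and
// report config keys that have no corresponding --flag.
function interfaceKeys(source: string): string[] {
  // Grab the body of `interface WrapperConfig { ... }` (no nested braces assumed).
  const body = source.match(/interface\s+WrapperConfig\s*\{([\s\S]*?)\}/)?.[1] ?? "";
  const keys: string[] = [];
  const re = /^\s*(\w+)\??:/gm; // property name, optionally marked `?`
  let m: RegExpExecArray | null;
  while ((m = re.exec(body)) !== null) keys.push(m[1]);
  return keys;
}

function kebab(name: string): string {
  return name.replace(/[A-Z]/g, (c) => `-${c.toLowerCase()}`);
}

function keysWithoutFlags(configSrc: string, cliFlags: string[]): string[] {
  const flags = new Set(cliFlags);
  return interfaceKeys(configSrc).filter((k) => !flags.has(`--${kebab(k)}`));
}
```

The inverse check (flags with no matching interface key) uses the same two extraction helpers.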
### P3 — Future Ideas

#### P3.1: Issue Arborist — Link Related Issues

**What:** A weekly agent that identifies clusters of related issues (e.g., multiple issues about DNS handling, or multiple issues about a specific container) and organizes them into parent/sub-issue hierarchies.

**Why:** Peli's Agent Factory's Issue Arborist created 18 parent issues and 77 discussion reports. As this repo matures, the issue tracker will grow and benefit from better organization.

**Effort:** Medium (requires careful reasoning about issue relationships).
#### P3.2: Daily Malicious Code Scan

**What:** Reviews recent commits (past 24h) for suspicious patterns — obfuscated code, unexpected network calls in shell scripts, unusual permission-escalation patterns.

**Why:** Peli's Agent Factory runs this daily on its own codebase. It is especially relevant here, since this codebase manages iptables rules, Docker privileges, and container security. A supply-chain attack introducing a backdoor in a firewall would be particularly dangerous.

**Effort:** Low to medium.
#### P3.3: Secret Digger Results Aggregator

**What:** A daily agent that reads the output of all 3 hourly secret-digger runs (Claude, Copilot, Codex) and posts a consolidated daily security status discussion comparing what each engine found.

**Why:** Currently the 3 engines run independently with no cross-comparison. Correlating findings (e.g., "all 3 engines failed to find secrets today" vs. "Claude found X but Codex missed it") would yield insights about engine-specific capabilities.

**Effort:** Low — read-only aggregation using cache-memory.
## 📈 Maturity Assessment

**Current Level: 4/5** — This repository has exceptional security automation that goes far beyond typical repos (hourly red-team agents, daily threat modeling, real-time PR security review). The main gaps are in proactive code improvement and meta-level health monitoring.

**Target Level: 5/5** — Add issue triage, test coverage improvement, workflow health management, and code simplification.

**Gap:** Primarily the "continuous improvement" category — agents that proactively make the codebase better over time rather than just monitoring it.
## 🔄 Comparison with Peli's Agent Factory Best Practices

### What This Repo Does Exceptionally Well

- `security-guard.md` has deep domain knowledge about iptables, Squid, container security, and capability dropping — far more specialized than generic PR review agents
- `/plan`: the slash command workflow is well implemented, with parent/sub-issue creation

### What It Could Improve

### Unique Opportunities from the Security Domain
## 📝 Notes for Future Runs

Analysis stored in `/tmp/gh-aw/cache-memory/pelis-advisor-notes.json`. Key baseline metrics for tracking: