[Pelis Agent Factory Advisor] Agentic Workflow Maturity Analysis & Recommendations (2026-03-23) #1397
This discussion was automatically closed because it expired on 2026-03-30T03:35:00.858Z.
📊 Executive Summary
`gh-aw-firewall` has a strong, security-focused agentic workflow foundation with 20+ workflows covering security, CI/CD health, documentation, smoke testing, and release automation. The repository is operating at maturity level 3.5/5 — well above average for an open-source project. The primary gaps are in continuous code quality improvement, issue organization/triage, and meta-monitoring of workflows themselves. Adding these would push the repo to level 4+ and close remaining automation gaps.

🎓 Patterns Learned from Pelis Agent Factory
The Pelis Agent Factory ran 100+ specialized agentic workflows in `github/gh-aw`. Key patterns that stand out:

Pattern 1: Specialized > Monolithic
Rather than one big "do everything" agent, Pelis uses many narrow-purpose workflows. Each workflow does one thing well (CI Doctor vs. Schema Checker vs. Breaking Change Checker). This repo has already adopted this pattern effectively.
Pattern 2: Continuous Improvement Agents
Agents that run daily and propose incremental improvements via PRs compound in value over time. The Code Simplifier achieved an 83% merge rate and the Duplicate Code Detector hit a 79% merge rate — sustained practical value. This repo has `doc-maintainer` and `test-coverage-improver` but lacks code quality simplification agents.

Pattern 3: Meta-Monitoring (Workflow Health Manager)
A meta-agent that monitors other agents created 40 issues and contributed to 19 merged PRs. No equivalent exists here. The ci-doctor covers CI failures but not workflow health/configuration drift.
Pattern 4: Issue Organization Pipeline
Pelis uses a pipeline of issue management: Issue Triage → Issue Arborist (sub-issue linking) → Issue Monster (task dispatch) → Sub-Issue Closer. This repo has the Issue Monster and duplicate detector but is missing triage labeling and sub-issue organization.
Pattern 5: Validation-Only Analysts
Some workflows are pure read-only analysts (Schema Consistency Checker, Blog Auditor, Static Analysis Report) that produce discussion reports rather than code changes. These are low-risk, high-insight tools. This repo uses this pattern well with `ci-cd-gaps-assessment`, `security-review`, and `cli-flag-consistency-checker`.

How This Repo Compares
📋 Current Agentic Workflow Inventory
- `security-guard`
- `security-review`
- `secret-digger-claude/codex/copilot`
- `dependency-security-monitor`
- `smoke-claude/codex/copilot`
- `smoke-chroot`
- `build-test`
- `ci-doctor`
- `ci-cd-gaps-assessment`
- `doc-maintainer`
- `cli-flag-consistency-checker`
- `test-coverage-improver`
- `issue-monster`
- `issue-duplication-detector`
- `plan`
- `update-release-notes`
- `pelis-agent-factory-advisor`

🚀 Actionable Recommendations
P0 — Implement Immediately
P0.1: Issue Triage Agent
What: Automatically label newly opened issues as `bug`, `feature`, `documentation`, `question`, `security`, `performance`, or `help-wanted`.
Why: The repo currently has no automatic labeling. Issues like #1356 (PATH propagation), #1355 (memory limits), and #1354 (proxy env leakage) are opened without labels, making triage manual. An AWF-specific triage agent can recognize patterns like "container", "iptables", "Squid", "domain allowlist" to assign accurate labels.
How: Create `.github/workflows/issue-triage.md` triggered on `issues: [opened, reopened]`. Include AWF-domain knowledge: security issues (keywords: CVE, vulnerability, bypass, escape), performance issues (timeout, slow, memory), networking issues (domain, Squid, proxy, iptables), and feature requests.
Effort: Low
Example frontmatter:
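A plausible sketch of such frontmatter — the field names (`on`, `permissions`, `engine`, `safe-outputs`) follow gh-aw frontmatter conventions as best understood and should be verified against the gh-aw documentation before use:

```yaml
---
# Hypothetical triage-agent frontmatter; verify field names against gh-aw docs.
on:
  issues:
    types: [opened, reopened]
permissions:
  contents: read
engine: copilot
safe-outputs:
  add-labels:
    allowed:
      [bug, feature, documentation, question, security, performance, help-wanted]
    max: 3
---
```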
P0.2: Workflow Health Manager
What: A meta-workflow that monitors all other agentic workflows in the repo, detects failures, configuration drift, and inactive/broken workflows, then creates issues.
Why: The repo already has 20+ workflows. Several recent issues (#1388, #1379, #1378) are workflow failures that needed manual identification. A dedicated health manager would catch these proactively. In Pelis Factory, this created 40 issues and contributed to 19 merged PRs.
How: Create `.github/workflows/workflow-health-manager.md` running daily. Check recent workflow runs via the agentic workflows logs, identify failures, check for workflows with stale trigger lists (ci-doctor's `workflow_run` list), and create issues with diagnostic context.
Effort: Low-Medium
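The failure-detection step described above can be sketched as follows — a minimal illustration only, assuming run data already fetched into records shaped like the output of `gh run list --json name,conclusion` (the shape is an assumption, not gh-aw's actual API):

```typescript
// Minimal sketch of the failure-detection step: group recent workflow runs by
// name and flag workflows where at least `threshold` of the runs failed.
// The RunRecord shape mirrors `gh run list --json name,conclusion` output;
// the real workflow would pull this from the agentic workflow logs instead.
interface RunRecord {
  name: string;
  conclusion: string; // "success", "failure", "cancelled", ...
}

function findUnhealthyWorkflows(runs: RunRecord[], threshold = 0.5): string[] {
  const stats = new Map<string, { total: number; failed: number }>();
  for (const run of runs) {
    const entry = stats.get(run.name) ?? { total: 0, failed: 0 };
    entry.total += 1;
    if (run.conclusion === "failure") entry.failed += 1;
    stats.set(run.name, entry);
  }
  // A workflow is "unhealthy" when its failure ratio meets the threshold.
  return [...stats.entries()]
    .filter(([, s]) => s.failed / s.total >= threshold)
    .map(([name]) => name)
    .sort();
}
```

Each returned name would become a candidate diagnostic issue, with the failing run URLs attached as context.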
P1 — Plan for Near-Term
P1.1: Breaking Change Detector
What: Monitors PRs that touch the public CLI interface (`src/cli.ts`), `action.yml`, `WrapperConfig` types (`src/types.ts`), or container entrypoints for backward-incompatible changes.
Why: AWF is used as a GitHub Action. Breaking changes to `action.yml` inputs, CLI flags, or the `WrapperConfig` API can silently break downstream users. No workflow currently catches this. The Pelis Breaking Change Checker created alert issues for CLI version updates before they reached users.
How: Trigger on PRs modifying `action.yml`, `src/cli.ts`, `src/types.ts`, `containers/*/entrypoint.sh`. Use `git diff` to analyze what changed, compare with the last release tag, and comment on the PR if breaking changes are detected.
Effort: Medium
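The core of the `action.yml` comparison can be sketched like this — a minimal illustration assuming the inputs have already been parsed from the base branch and the PR head into plain objects (any input names used are illustrative, not AWF's real inputs):

```typescript
// Minimal sketch: given action input metadata from the base branch and the
// PR head, report removed inputs and inputs that became required -- both are
// breaking for downstream users pinning the action.
interface ActionInput {
  required?: boolean;
}

function breakingInputChanges(
  base: Record<string, ActionInput>,
  head: Record<string, ActionInput>
): string[] {
  const findings: string[] = [];
  for (const name of Object.keys(base)) {
    if (!(name in head)) {
      findings.push(`removed input: ${name}`);
    } else if (!base[name].required && head[name].required) {
      findings.push(`input now required: ${name}`);
    }
  }
  return findings.sort();
}
```

Renames would surface as a removal plus an unexplained addition, which is usually enough signal for the agent to comment.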
P1.2: Continuous Code Simplifier
What: Daily agent that analyzes recently modified TypeScript files for simplification opportunities and creates PRs.
Why: The codebase is growing complex — `src/docker-manager.ts` is ~1100 lines and `src/cli.ts` handles many responsibilities. After rapid feature additions, complexity tends to accumulate. The Pelis Code Simplifier achieved an 83% merge rate on 6 PRs in a short time. AWF's TypeScript codebase would benefit from the same treatment.
How: Create `.github/workflows/code-simplifier.md` running daily. Focus on recently modified `.ts` files and look for deeply nested conditionals, repeated patterns (env var building, Docker config generation), and opportunities to extract helper functions. Skip test files and generated files.
Effort: Medium
P1.3: Docs Site Auditor
What: Validates the Astro Starlight docs site (`docs-site/`) for accuracy against the codebase — checking that documented CLI flags match `src/cli.ts`, container architecture descriptions match `src/docker-manager.ts`, and example commands still work.
Why: The repo has both `README.md` (covered by `doc-maintainer`) and a separate Astro docs site in `docs-site/`. The docs site content is not validated by any existing workflow. CLI flag documentation, architecture diagrams, and code examples can drift silently.
How: Create `.github/workflows/docs-site-auditor.md` running weekly. Cross-reference `docs-site/src/content/docs/` markdown files with `src/cli.ts` for flag accuracy, check `action.yml` inputs against the docs, and validate that code examples are syntactically correct.
Effort: Medium
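The flag cross-reference step could look something like this — a rough sketch that extracts `--flag-name` tokens from raw file contents with a regex (the flag names in any usage are illustrative, not AWF's actual flags):

```typescript
// Minimal sketch of flag cross-referencing: pull `--flag-name` tokens out of
// raw file text and report flags that appear in only one place. A real
// auditor would parse the CLI definitions properly rather than regex-scan.
function extractFlags(text: string): Set<string> {
  return new Set(text.match(/--[a-z][a-z0-9-]*/g) ?? []);
}

function flagDrift(cliSource: string, docsText: string) {
  const inCli = extractFlags(cliSource);
  const inDocs = extractFlags(docsText);
  return {
    // Present in the CLI source but never mentioned in the docs.
    undocumented: [...inCli].filter((f) => !inDocs.has(f)).sort(),
    // Documented but no longer present in the CLI source.
    stale: [...inDocs].filter((f) => !inCli.has(f)).sort(),
  };
}
```

Both lists feed directly into the weekly report: `undocumented` flags are docs debt, `stale` flags are misleading docs.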
P1.4: Integration Test Coverage Reporter
What: Weekly workflow that generates a structured report on integration test coverage gaps, cross-referencing the test inventory in `docs/INTEGRATION-TESTS.md` with actual tests in `tests/`.
Why: The repo has `docs/INTEGRATION-TESTS.md` with a "gap analysis" section, but coverage is tracked manually. The Test Coverage Improver creates PRs but doesn't produce a readable coverage map. A dedicated reporter would surface coverage debt and guide prioritization. The Pelis Workflow Health Manager pattern is analogous here.
How: Create `.github/workflows/integration-test-reporter.md` running weekly. Compare the `tests/integration/` directory with the gap analysis doc, identify security-critical paths (DLP, API proxy credential isolation, domain bypass attempts) not covered by tests, and post a discussion.
Effort: Low-Medium
P2 — Consider for Roadmap
P2.1: Issue Arborist (Sub-Issue Organizer)
What: Groups related open issues into parent-child hierarchies using GitHub's sub-issue feature.
Why: The issue tracker has multiple related issues (e.g., #1295 Node.js global tools, #1356 PATH propagation, and #1328 common API base all relate to environment setup). The Arborist would create parent issues and link them, making roadmap planning cleaner.
Effort: Medium
P2.2: Schema Consistency Checker
What: Validates that `WrapperConfig` (in `src/types.ts`), CLI flags (in `src/cli.ts`), `action.yml` inputs, and documentation all stay in sync.
Why: AWF has three "sources of truth" for its configuration interface. Adding a new feature (e.g., `--enable-dlp`) requires updating `types.ts`, `cli.ts`, `action.yml`, `README.md`, and the docs. Drift is common and hard to catch. The Pelis Schema Checker created 55 analysis discussions catching drift between JSON schemas and code.
Effort: Medium
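The comparison at the heart of such a checker can be sketched as a three-way set check — a minimal illustration that assumes option names have already been extracted from each source (the extraction itself, and kebab-case vs camelCase normalization, are deliberately left out):

```typescript
// Minimal sketch of a three-way consistency check: given option names
// collected from each source of truth, report every name that is absent
// from at least one source. The source labels and option names are
// placeholders for whatever the extraction step produces.
function configDrift(sources: Record<string, Set<string>>): string[] {
  const all = new Set<string>();
  for (const names of Object.values(sources)) {
    for (const name of names) all.add(name);
  }
  const findings: string[] = [];
  for (const name of [...all].sort()) {
    const missing = Object.keys(sources).filter((s) => !sources[s].has(name));
    if (missing.length > 0) {
      findings.push(`${name}: missing from ${missing.join(", ")}`);
    }
  }
  return findings;
}
```

An empty result means the sources agree; each finding becomes one line in the analysis discussion.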
P2.3: Changeset Generator
What: Automatically prepares changelog entries and version bump suggestions after each merged PR.
Why: The current `update-release-notes` workflow runs after a release is published. A Changeset Generator would proactively suggest changelog entries for PRs as they merge, reducing the manual burden at release time. The Pelis Changeset workflow achieved a 78% merge rate on 28 PRs.
Effort: Medium
P2.4: Daily Malicious Code Scanner
What: Reviews recently merged code changes for suspicious patterns: obfuscated code, unusual network calls, embedded strings that could be commands, or changes to security-critical paths (iptables rules, Squid config generation, credential handling).
Why: AWF is a security infrastructure tool. Malicious code introduced here could undermine the firewall for all users. Given the repository's domain (it is the security layer), this is especially high-value. The Pelis Daily Malicious Code Scan monitors for supply chain attack patterns.
Effort: Low-Medium
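A cheap mechanical pre-filter could hand the agent a shortlist of suspicious diff lines before the deeper AI review — a sketch under the assumption that added lines have already been extracted from the merged diff, with illustrative patterns only:

```typescript
// Minimal sketch of a pattern pre-filter for the scanner: flag added diff
// lines that match common supply-chain red flags so the agent can focus its
// review there. These patterns are illustrative starting points, not a
// complete or vetted ruleset.
const SUSPICIOUS_PATTERNS: [string, RegExp][] = [
  ["dynamic code execution", /\beval\s*\(|new Function\s*\(/],
  ["long base64-like blob", /[A-Za-z0-9+/]{120,}={0,2}/],
  ["pipe download to shell", /curl[^\n|]*\|\s*(ba)?sh\b/],
  ["firewall rule change", /\biptables\b/],
];

function scanAddedLines(addedLines: string[]): string[] {
  const findings: string[] = [];
  addedLines.forEach((line, i) => {
    for (const [label, pattern] of SUSPICIOUS_PATTERNS) {
      if (pattern.test(line)) findings.push(`line ${i + 1}: ${label}`);
    }
  });
  return findings;
}
```

Matches are leads, not verdicts — the agent would still read each flagged hunk in context before opening an issue.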
P3 — Future Ideas
P3.1: Performance Regression Monitor
What: Tracks AWF startup time (container initialization, Squid healthcheck) and flags PRs that significantly increase it.
Why: Issue #1376 notes integration tests taking 37+ minutes. Issue #240 asks for a performance benchmarking suite. An automated performance monitor would catch regressions before they accumulate.
Effort: High (requires benchmark infrastructure)
P3.2: Docs Noob Tester
What: Simulates a first-time user following the README and docs, identifying confusing or broken onboarding steps.
Why: AWF requires Docker, specific iptables permissions, and sudo. The onboarding path has multiple failure modes. A noob tester would systematically check the getting-started experience.
Effort: High (needs sandbox environment)
P3.3: Container Security Scanner Workflow
What: An agentic wrapper around `docker scout` or Trivy that runs daily on the published GHCR container images and creates issues for new CVEs.
Why: The ci-doctor references a "Container Security Scan" workflow in its trigger list, but no `.md` file exists for it — suggesting it's a standard YAML workflow without agentic analysis. An AI agent could provide deeper analysis and remediation suggestions beyond raw CVE counts.
Effort: Medium
📈 Maturity Assessment
Overall Current Level: 3.5/5 — Above average, particularly strong on security automation given the domain.
Target Level: 4.5/5 — Achievable by adding P0+P1 items above.
Gap to close: Continuous code quality (P1.2), issue triage (P0.1), meta-monitoring (P0.2), and breaking change detection (P1.1) are the four additions that would have the most compound effect.
🔄 Comparison with Pelis Best Practices
What gh-aw-firewall Does Exceptionally Well
- `security-review` goes beyond what most repos do

What Could Improve
- `src/docker-manager.ts` at ~1100 lines and growing is a prime candidate

Unique Opportunities Given the Security Domain
- `action.yml` inputs silently affect all downstream users

📝 Cache Memory Updated
Notes for this analysis have been saved to `/tmp/gh-aw/cache-memory/pelis-advisor-notes.md` with the workflow inventory, gaps identified, and maturity scores for tracking over time.