[Pelis Agent Factory Advisor] Agentic Workflow Maturity Report & Recommendations (March 2026) #1478
This discussion was automatically closed because it expired on 2026-04-04T03:33:02.293Z.
📊 Executive Summary
This repository has a mature and security-focused agentic workflow collection — 22 agentic workflows covering security scanning, CI fault investigation, dependency monitoring, and multi-engine smoke testing. The main gaps are in workflow observability (no meta-monitoring agent), issue lifecycle management (no triage/labeling), and release automation (no changeset generator). Given this repo is a firewall for agents, it has a unique opportunity to demonstrate eating its own dog food at every layer.
🎓 Patterns Learned from Pelis Agent Factory
Key Patterns from the Documentation Site
The Pelis Agent Factory blog series (19 parts) revealed these high-value patterns:
`skip-if-match`: Pattern to prevent duplicate concurrent runs when a similar PR/issue already exists.
From the Agentics Reference Repository
How This Repo Compares
This repo already uses many Factory patterns well (secret diggers with 3 engines, security-guard on PRs, CI doctor). The two main missing layers are observability (no workflow health monitor) and issue management (no triage agent).
📋 Current Agentic Workflow Inventory
`build-test.md`
`ci-cd-gaps-assessment.md`
`ci-doctor.md`
`cli-flag-consistency-checker.md`
`dependency-security-monitor.md`
`doc-maintainer.md` ([…] `skip-if-match` correctly)
`firewall-issue-dispatcher.md` ([…] `awf`-labeled issues from gh-aw)
`issue-duplication-detector.md`
`issue-monster.md`
`pelis-agent-factory-advisor.md`
`plan.md`
`secret-digger-claude.md`
`secret-digger-codex.md`
`secret-digger-copilot.md`
`security-guard.md`
`security-review.md`
`smoke-chroot.md`
`smoke-claude.md`
`smoke-codex.md`
`smoke-copilot.md`
`test-coverage-improver.md`
`update-release-notes.md`
🚀 Actionable Recommendations
P0 — Implement Immediately
P0.1: Issue Triage Agent
What: Auto-label new issues with `bug`, `feature`, `enhancement`, `documentation`, `question`, `help-wanted`, or `security` when they have no labels. Leave a brief comment explaining the label and suggesting next steps.
Why: This is the "hello world" of agentic workflows per the Factory: immediately useful and very low risk. This repo has 20+ open issues with no labels. It reduces maintainer triage burden on every new issue.
How: Simple `issues.opened` trigger, read-only analysis, `add-labels` + `add-comment` safe outputs. Configure `min-integrity: none` for a public repo so all contributors' issues are visible.
Effort: Low (1 hour)
Example:
P0.2: Workflow Health Manager
What: A daily meta-agent that monitors the health of all other agentic workflows in this repository. It scans recent runs, detects patterns like repeated failures, stale workflows, disabled workflows, and high error rates, then creates improvement issues.
Why: With 22+ agentic workflows (and growing), there's no visibility into which ones are failing consistently, which are stale, which are costing the most tokens, or which are producing low-quality outputs. The Factory's "Audit Workflows" agent generated 93 discussions and 9 issues — it became the most valuable observability tool. Given that this repo is a firewall product, having a meta-layer monitoring the health of the agent ecosystem is both practical and on-brand.
How: Daily schedule, reads `actions` logs, uses the `agentic-workflows` tool, `cache-memory` for trend detection across runs, creates issues for persistent failures.
Effort: Medium (4–6 hours)
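The failure detection described above can be sketched in a few lines of Python. This is a minimal illustration, not the workflow itself: the run records, their `workflow` and `conclusion` keys, and the `min_runs` and `max_failure_rate` thresholds are all assumptions made for the example.

```python
from collections import defaultdict


def find_unhealthy_workflows(runs, min_runs=3, max_failure_rate=0.5):
    """Return {workflow_name: failure_rate} for workflows that fail too often.

    `runs` is an iterable of dicts with the hypothetical keys
    "workflow" (name) and "conclusion" ("success", "failure", ...).
    """
    totals = defaultdict(int)
    failures = defaultdict(int)
    for run in runs:
        totals[run["workflow"]] += 1
        if run["conclusion"] == "failure":
            failures[run["workflow"]] += 1

    unhealthy = {}
    for name, total in totals.items():
        rate = failures[name] / total
        # Ignore workflows with too few runs to say anything meaningful.
        if total >= min_runs and rate > max_failure_rate:
            unhealthy[name] = rate
    return unhealthy
```

A health agent could persist the returned rates between runs and only open an issue when a workflow stays above the threshold for several days.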
P1 — Plan for Near-Term
P1.1: Breaking Change Checker
What: On every PR, detect changes to the CLI flags, public API, Docker container interfaces, environment variables, or Squid configuration schema that could break existing users. Create an alert issue or comment flagging the breaking change with migration guidance.
Why: AWF has real production users. Breaking changes to `--allow-domains` syntax, the Docker Compose structure, the iptables init container interface, or exported environment variables (`HTTP_PROXY`, `SQUID_PROXY_HOST`, etc.) require communication. The Factory's Breaking Change Checker creates alert issues. This is especially important for a security tool where unexpected behavior changes could silently break user pipelines.
How: `pull_request` trigger, analyze diff against `src/cli.ts` (flags), `src/docker-manager.ts` (compose API, env vars), `src/squid-config.ts` (config schema), and `docs/`. Create an issue with `[breaking-change]` label if detected.
Effort: Medium
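As a rough sketch of the first step, the changed files in a PR can be mapped to the user-facing surfaces listed above before any deeper diff analysis. The file paths come from this report; the `SURFACES` table and the `touched_surfaces` helper are hypothetical names used only for illustration.

```python
# Paths named in the recommendation, mapped to the surface they expose.
SURFACES = {
    "src/cli.ts": "CLI flags",
    "src/docker-manager.ts": "Docker Compose API and environment variables",
    "src/squid-config.ts": "Squid configuration schema",
}


def touched_surfaces(changed_files):
    """Return the user-facing surfaces a PR touches, in first-seen order."""
    surfaces = []
    for path in changed_files:
        if path in SURFACES:
            label = SURFACES[path]
        elif path.startswith("docs/"):
            label = "documentation"
        else:
            continue  # Not a file the checker cares about.
        if label not in surfaces:
            surfaces.append(label)
    return surfaces
```

A path match only says a surface was touched, not that it was broken, so the agent would still need to read the diff before filing anything.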
P1.2: Container Image Security Scanner
What: Daily workflow that scans the three published Docker images (`squid`, `agent`, `api-proxy`) for known vulnerabilities using Trivy or Grype via `bash`. Creates issues for HIGH/CRITICAL CVEs with remediation guidance.
Why: AWF publishes container images to GHCR. These images run with elevated privileges (NET_ADMIN, SYS_CHROOT) in user environments. A vulnerability in the Squid or Ubuntu base image is high-severity. The `dependency-security-monitor.md` covers npm dependencies but not Docker image layers. This is a gap unique to this repo's published artifact type.
How: Daily schedule, run `trivy image ghcr.io/github/gh-aw-firewall/agent:latest` via bash, parse JSON output, create issues for findings above severity threshold.
Effort: Medium
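The "parse JSON output" step might look like the sketch below. It assumes the scan was run with Trivy's JSON report format and that the report uses the `Results`, `Vulnerabilities`, `VulnerabilityID`, `PkgName`, `Severity`, and `FixedVersion` fields; these names should be checked against the Trivy version in use. The `high_severity_findings` helper is a hypothetical name.

```python
import json


def high_severity_findings(trivy_json, threshold=("HIGH", "CRITICAL")):
    """Extract findings at or above the severity threshold from a Trivy JSON report."""
    report = json.loads(trivy_json)
    findings = []
    # "Results" and "Vulnerabilities" may be missing or null for clean targets.
    for result in report.get("Results") or []:
        for vuln in result.get("Vulnerabilities") or []:
            if vuln.get("Severity") in threshold:
                findings.append({
                    "target": result.get("Target"),
                    "id": vuln.get("VulnerabilityID"),
                    "package": vuln.get("PkgName"),
                    "severity": vuln.get("Severity"),
                    "fixed_version": vuln.get("FixedVersion"),
                })
    return findings
```

Keeping `fixed_version` in each finding lets the created issue include the remediation guidance the recommendation asks for.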
P1.3: Changeset / Version Bump Automation
What: Weekly workflow that analyzes unreleased commits since the last tag, determines the appropriate semver bump (major/minor/patch) based on conventional commit types, and proposes a PR that updates the `package.json` version and `CHANGELOG.md`.
Why: The repo uses conventional commits (enforced by commitlint). The Factory's Changeset workflow achieved 78% PR merge rate. The existing `update-release-notes.md` only runs after a release is published; it doesn't drive the release process. A Changeset agent would close the loop by proposing releases proactively.
How: Weekly schedule with `skip-if-match` to avoid duplicate PRs. Uses `git log` + `git tag` bash commands to find unreleased commits. Determines bump type from commit prefixes (`feat:` → minor, `fix:` → patch, `BREAKING CHANGE:` → major).
Effort: Medium
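The prefix-to-bump mapping above can be sketched as follows. This is a minimal illustration under assumptions: `semver_bump` is a hypothetical helper, the commit messages are passed in as plain strings, and the `!` marker after the type (also a breaking-change signal in the Conventional Commits convention) is handled in addition to the three rules listed in this report.

```python
import re

# Matches a conventional commit header such as "feat(cli): ..." or "fix!: ...".
_HEADER = re.compile(r"^(?P<type>\w+)(\([^)]*\))?(?P<bang>!)?:")


def semver_bump(commit_messages):
    """Return "major", "minor", "patch", or None for a list of commit messages."""
    bump = None
    for message in commit_messages:
        header = message.splitlines()[0] if message else ""
        match = _HEADER.match(header)
        # A breaking change anywhere outranks every other commit.
        if "BREAKING CHANGE:" in message or (match and match.group("bang")):
            return "major"
        if match is None:
            continue
        if match.group("type") == "feat":
            bump = "minor"
        elif match.group("type") == "fix" and bump is None:
            bump = "patch"
    return bump
```

Returning `None` when no `feat` or `fix` commit is present gives the workflow a natural signal to skip proposing a release that week.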
P1.4: Static Analysis Report
What: Daily workflow that runs `zizmor`, `poutine`, and `actionlint` on the compiled `.lock.yml` workflow files and posts findings as a discussion, creating issues for HIGH/CRITICAL findings.
Why: This repo has ~25 compiled lock files, many with complex permissions and network configurations. The Factory's Static Analysis Report generated 57 discussions + 12 Zizmor security reports. Given AWF is a security product, ensuring its own workflows are secure (no script injection, no unpinned actions) is essential. `zizmor` catches expression injection issues that are especially relevant for workflows that process issue/PR content.
How: Daily schedule, bash commands for `zizmor`, `poutine`, `actionlint`, structured JSON output parsed into a discussion. Issue creation for critical findings.
Effort: Low-Medium (tools already available)
P2 — Consider for Roadmap
P2.1: Issue Arborist
What: Periodic workflow that finds related open issues and links them as sub-issues, building a dependency tree. For AWF, this would group issues by component (squid, agent, api-proxy, CLI) or by feature area.
Why: The repo has 20+ open issues that are likely related (e.g., proxy compatibility issues, chroot path issues). The Factory's Arborist created 18 parent issues and 77 discussion reports. Organizing issues reduces duplication and helps prioritize.
Effort: Medium
P2.2: Audit Workflows Meta-Agent
What: Weekly meta-agent that reviews all agentic workflow run logs, identifies success/failure patterns, costly runs, and quality degradation, then produces an analytics discussion.
Why: The Factory's Audit Workflows agent was its most prolific (93 discussions, 9 issues). With 22 workflows and growing, having quantified metrics on merge rates, failure rates, and cost-per-workflow enables informed optimization decisions.
Effort: Medium-High
P2.3: Documentation Noob Tester
What: Weekly workflow that reads the repository documentation (README, `docs/`) from a "first-time user" perspective and identifies confusing steps, missing prerequisites, outdated examples, and broken commands. Creates issues for findings.
Why: AWF has a complex setup (Docker, iptables, sudo, chroot). New users face multiple stumbling blocks. The Factory's Docs Noob Tester created 9 merged PRs. This is particularly valuable for a security tool where misconfiguration could leave users with a false sense of security.
Effort: Low
P2.4: Enhanced CI Doctor — Auto-Discovery of Monitored Workflows
What: Modify `ci-doctor.md` to automatically discover workflow names from the repository instead of requiring manual maintenance of the `workflows:` list.
Why: The current `ci-doctor.md` has a hardcoded list of ~21 workflow names. When new workflows are added, this list must be updated manually (and can fall out of sync). With 22+ workflows, this is already a maintenance burden.
How: Add a pre-step that runs `gh workflow list` to dynamically populate the monitored workflows list, or generate the list from the workflow YAML files. This could be a simple bash pre-step in the lock file.
Effort: Low
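The second option, generating the list from files on disk, could be sketched like this. It assumes each agentic workflow has a `.md` source in a single directory and that the file stem is enough to identify it; the real `workflows:` list may use display names instead, so treat `discover_workflows` and its `exclude` default as hypothetical.

```python
from pathlib import Path


def discover_workflows(workflow_dir, exclude=("ci-doctor",)):
    """List workflow names from the .md sources in a directory.

    The doctor itself is excluded by default so it does not monitor its own runs.
    """
    names = sorted(path.stem for path in Path(workflow_dir).glob("*.md"))
    return [name for name in names if name not in exclude]
```

Because the list is rebuilt on every run, a newly added workflow is picked up without anyone editing `ci-doctor.md`.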
P3 — Future Ideas
P3.1: AWF Ecosystem Health Report
What: Monthly report on the health of the AWF-powered agent ecosystem across the organization, aggregating firewall statistics, blocked domain patterns, and agent success rates from the `awf logs summary` command.
Why: AWF collects rich telemetry. A periodic report on "what domains are agents trying to reach?" and "which agent types are most productive?" would inform product decisions.
Effort: High (requires opt-in from users)
P3.2: Integration Test Flakiness Tracker
What: Agent that tracks flaky integration tests over time using cache-memory, identifies patterns in which tests fail intermittently, and creates issues for persistent flakiness.
Why: The repo has Docker-based integration tests that are susceptible to race conditions (network timing, container startup). The open issue #1354 (proxy env var leakage in tests) suggests flakiness is a real concern.
Effort: Medium
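One way to identify intermittent failures is sketched below. It assumes a working definition the report does not state: a test is flaky if it has both passed and failed on the same commit. The `(test_name, commit_sha, passed)` record shape and the `flaky_tests` helper are hypothetical.

```python
from collections import defaultdict


def flaky_tests(results):
    """Return sorted names of tests that both passed and failed on one commit.

    `results` is an iterable of (test_name, commit_sha, passed) tuples,
    for example accumulated across runs in cache-memory.
    """
    outcomes = defaultdict(set)
    for test, sha, passed in results:
        outcomes[(test, sha)].add(bool(passed))
    # Same code, different outcomes: the test itself is unstable.
    return sorted({test for (test, _sha), seen in outcomes.items() if seen == {True, False}})
```

A test that fails consistently on a commit is a real regression rather than flakiness, which is why outcomes are grouped per commit before comparing.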
P3.3: Proxy Compatibility Monitor
What: Monthly agent that checks open issues and discussions for proxy compatibility complaints (Zig, Maven, custom tools) and creates consolidated tracking issues or documentation PRs with workarounds.
Why: Open issues #1429 (Zig), historical Maven issues, and others suggest this is a recurring pain point. Proactive monitoring and documentation would reduce support burden.
Effort: Low
📈 Maturity Assessment
Current Level: 4 — Advanced (well beyond typical repos; strong security focus)
Target Level: 5 — Expert (full observability, complete lifecycle automation, meta-layer monitoring)
Gap: The primary gap is the observability layer — there's no meta-agent watching the watchers. With 22 workflows, this becomes increasingly critical. The secondary gap is issue lifecycle (no auto-triage) which is the simplest possible first step.
🔄 Comparison with Best Practices
What This Repo Does Well ✅
`firewall-issue-dispatcher.md` is a sophisticated pattern not seen in the generic Factory examples.
`security-guard.md` is deeply tailored to AWF's specific threat model (iptables, Squid config, container boundaries) rather than generic code review.
`skip-if-match`: Correctly used in `doc-maintainer.md` and `test-coverage-improver.md` to prevent duplicate PRs.
`issue-duplication-detector.md` uses cache-memory for persistence, a Factory best practice.
What Could Be Improved 🔧
`ci-doctor.md` has a hardcoded list of monitored workflows that requires manual updates.
`update-release-notes.md` runs after release; no agent proposes the version bump.
Unique Opportunities Given the Domain 🎯
`awf logs summary` data from all CI runs and reports on domain access patterns would be uniquely valuable.
📝 Notes for Future Runs
Stored in `/tmp/gh-aw/cache-memory/advisor-notes.md`
`ci-doctor.md` monitored workflow list had ~21 entries; adding new workflows requires manual update of this list