[CI/CD Assessment] CI/CD Pipelines and Integration Tests Gap Assessment #1364
Closed
This discussion was automatically closed because it expired on 2026-03-25T22:24:09.774Z.
This is an automated analysis of the CI/CD pipeline and integration test coverage in this repository, with actionable recommendations for improving PR quality measurement.
📊 Current CI/CD Pipeline Status
The repository has a well-structured, multi-layered CI/CD pipeline with 40 YAML workflows and 21 agentic (`.md`) workflows, 61 in total. The pipeline covers build verification, linting, type checking, unit tests, integration tests, security scanning, documentation, and end-to-end smoke testing.

Workflows running on pull_request events:
- `build.yml`
- `lint.yml`
- `test-integration.yml`
- `test-integration-suite.yml`
- `test-chroot.yml`
- `test-examples.yml` (`examples/*.sh` scripts end-to-end)
- `test-action.yml`
- `test-coverage.yml`
- `codeql.yml`
- `dependency-audit.yml`
- `container-scan.yml` (`containers/**` path changes)
- `pr-title.yml`
- `docs-preview.yml`
- `link-check.yml` (`*.md` path changes)
- `build-test.md`
- `security-guard.md`
- `smoke-claude.md`
- `smoke-codex.md`
- `smoke-copilot.md`
- `smoke-chroot.md`

✅ Existing Quality Gates
Code Quality
- `tsc --noEmit`
- `amannn/action-semantic-pull-request`
Testing
Security
- `npm audit --audit-level=high` for main + docs-site
- `agent` and `squid` containers
Documentation
Smoke Tests
🔍 Identified Gaps
🔴 High Priority
1. 7 Integration Test Files Not Executed in CI
Seven integration test files exist in `tests/integration/` but do not match any `--testPathPatterns` in any CI workflow:
- `api-target-allowlist.test.ts`
- `chroot-capsh-chain.test.ts`
- `chroot-copilot-home.test.ts`
- `gh-host-injection.test.ts`
- `ghes-auto-populate.test.ts`
- `skip-pull.test.ts` (`--skip-pull` flag behavior)
- `workdir-tmpfs-hiding.test.ts`

Several of these (`chroot-capsh-chain`, `gh-host-injection`, `chroot-copilot-home`) are security-critical tests that verify the firewall's isolation guarantees are not silently broken by code changes.

2. Critically Low Unit Test Coverage: Core Files at Near-Zero
From `COVERAGE_SUMMARY.md`:
- `cli.ts`
- `docker-manager.ts`
- `host-iptables.ts`

`cli.ts` (entry point, signal handling, orchestration) and `docker-manager.ts` (all container lifecycle logic, compose generation, bind mount config) are the two most important files and are essentially untested at the unit level. A refactor in either file could introduce regressions that slip through.

3. Coverage Thresholds Are Too Low to Be Meaningful
Current thresholds: Statements 38%, Branches 30%, Functions 35%, Lines 38%. Given that `cli.ts` is at 0% and `docker-manager.ts` is at 18%, these thresholds can pass while the most important code paths have no coverage at all. The thresholds do not enforce coverage on security-critical paths.

4. Container Security Scan Has a Path Filter Gap
`container-scan.yml` only triggers on `containers/**` path changes. Changes to `src/docker-manager.ts` or `src/squid-config.ts` that alter container configuration, mount points, or capabilities do not retrigger the container scan, even though those source changes directly affect runtime security posture.

🟡 Medium Priority
5. Duplicate Workflow Definition (`test-integration.yml` = `test-integration-suite.yml`)
Both files have the name `Integration Tests` and identical content (4 parallel jobs: domain/network, protocol/security, container-ops, API proxy). This causes confusion in the PR check list and doubles the build cost with no added value. One should be removed or differentiated.

6. Smoke Tests Are Not Automatic: They Require an Emoji Reaction
The agentic smoke tests (`smoke-claude.md`, `smoke-codex.md`, `smoke-copilot.md`, `smoke-chroot.md`) run on PRs, but only when a maintainer adds a specific emoji reaction (❤️, 🎉, 👀, 🚀). They do not run automatically. This means a PR that breaks the actual Claude/Copilot/Codex agent execution can merge without the smoke tests ever firing.

7. Performance Benchmarks Never Run on PRs
`performance-monitor.yml` only runs on a weekly schedule. A PR that introduces a 2× container startup regression would not be caught until the following week. No performance gate exists on the PR merge path.

8. `api-proxy` Container Not Scanned by Trivy
`container-scan.yml` scans `awf-agent` and `awf-squid`, but the API proxy sidecar (`containers/api-proxy/`) is not scanned. The API proxy handles real API credentials (OpenAI, Anthropic, Copilot tokens) and runs as a network-accessible service, making it a high-value target for CVEs.

9. No SBOM (Software Bill of Materials) Generation
No workflow generates or attaches an SBOM to releases. For a security tool distributed as a Docker image and npm binary, SBOM attestation is increasingly expected for supply chain transparency. This is especially relevant since the project publishes to GHCR.
10. No Coverage Enforcement Per File or Per Module
Coverage is enforced globally (38% statements project-wide) but not per module. A contributor could add 1000 new lines with 0% coverage to `docker-manager.ts` and the global threshold would still pass, as long as other covered files compensate.

🟢 Low Priority
11. No License Compliance Check
No workflow scans dependencies for license compatibility. As a tool used in enterprise/CI environments and distributed on npm/GHCR, license drift (a dependency changing from MIT to GPL/AGPL) should be automatically detected.
12. No Spell Check on Documentation
The link checker (`link-check.yml`) validates URLs, but there is no spell check or prose style linting on documentation. The docs site (`docs-site/`) targets enterprise users and engineers who may file issues for documentation errors.

13. Documentation Build Not Triggered by Code Changes
`docs-preview.yml` only builds the docs when `docs-site/**`, `docs/**`, or `*.md` files change. A change to `src/` that adds a new CLI flag would not trigger a docs preview build. Manual verification is needed to confirm docs remain accurate after code changes.

14. No Commit Message Validation in CI
`commitlint` is configured (via `commitlint.config.js` + husky) as a local pre-commit hook, but there is no CI enforcement. Commits merged via the GitHub UI, squash-merges from PRs, or commits from automated tools bypass the hook entirely.

📋 Actionable Recommendations
R1: Add Missing Integration Tests to CI Matrix [High | Low Complexity]
Issue: 7 integration test files never run in CI.
Fix: Add the missing test patterns to `test-integration.yml`.
Impact: Catches regressions in security-critical isolation paths that are currently invisible to CI.
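
To keep this gap from reopening, a small check could compare the files under `tests/integration/` with the patterns CI passes to `--testPathPatterns`. The following is a minimal sketch, not code from the repository: `findUnmatchedTests` is a hypothetical helper, the example patterns are invented for illustration, and the matching is a simplification of how Jest treats each pattern as a regular expression tested against the file path.

```typescript
// Hypothetical helper: list integration test files that no CI pattern selects.
// Each pattern is compiled as a regular expression and tested against the path.
function findUnmatchedTests(testFiles: string[], ciPatterns: string[]): string[] {
  const regexes = ciPatterns.map((p) => new RegExp(p));
  return testFiles.filter((file) => !regexes.some((re) => re.test(file)));
}

// Example input. The first three file names come from the gap list in this
// assessment; the fourth file and both patterns are illustrative assumptions.
const testFiles = [
  "tests/integration/api-target-allowlist.test.ts",
  "tests/integration/chroot-capsh-chain.test.ts",
  "tests/integration/skip-pull.test.ts",
  "tests/integration/blocked-domains.test.ts", // hypothetical file
];
const ciPatterns = ["blocked-domains", "chroot-capsh"];

// Files left over here would never run in CI under these patterns.
const unmatched = findUnmatchedTests(testFiles, ciPatterns);
```

Run as a CI step, a non-empty result could fail the job so that a newly added test file cannot be silently skipped.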
R2: Increase Coverage Thresholds and Add Per-File Minimums [High | Medium Complexity]
Issue: 38% global threshold allows critical files to have 0% coverage.
Fix: Raise global thresholds incrementally and add per-file overrides in `jest.config.js`.
Impact: Forces test investment in the highest-risk files.
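
As a rough illustration of the per-file overrides, the sketch below uses Jest's `coverageThreshold` option, which accepts a `global` entry plus entries keyed by path or glob. Every number is an assumption chosen for the example, and the location of `cli.ts` under `src/` is assumed (only `src/docker-manager.ts` is named in this assessment). The local `Threshold` type exists only to keep the sketch self-contained.

```typescript
// Local stand-in for the threshold shape Jest expects (all fields optional).
type Threshold = {
  statements?: number;
  branches?: number;
  functions?: number;
  lines?: number;
};

const coverageThreshold: Record<string, Threshold> = {
  // Assumed first increment above the current 38/30/35/38.
  global: { statements: 45, branches: 35, functions: 40, lines: 45 },
  // Assumed per-file floors so the riskiest files cannot hide behind the
  // project-wide average. The path for cli.ts is a guess.
  "./src/cli.ts": { statements: 20, lines: 20 },
  "./src/docker-manager.ts": { statements: 30, lines: 30 },
};

// In jest.config.js this object would be assigned to module.exports.
const config = { collectCoverage: true, coverageThreshold };
```

According to the Jest documentation, files matched by a path-specific entry are checked against their own threshold and their coverage is subtracted from the global figure, so the global numbers may need retuning once per-file entries are introduced.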
R3: Expand Container Security Scan Trigger Paths [High | Low Complexity]
Issue: Container scan skips PRs that change container config in `src/`.
Fix: Add `src/**` to the `paths:` filter in the `container-scan.yml` trigger.
Impact: Ensures every code change that could affect container security posture triggers a Trivy scan.
R4: Add Trivy Scan for API Proxy Container [Medium | Low Complexity]
Issue: API proxy container is excluded from security scanning.
Fix: Add a third `scan-api-proxy` job to `container-scan.yml`, mirroring the existing `scan-agent` job with `./containers/api-proxy`.
Impact: Closes a CVE blind spot on the component that holds real API credentials.
R5: Remove Duplicate Integration Test Workflow [Medium | Low Complexity]
Issue: `test-integration.yml` and `test-integration-suite.yml` are identical.
Fix: Delete one file; keep the one with better path filtering.
Impact: Halves unnecessary CI runtime and removes check list confusion.
R6: Make Smoke Tests Automatically Run on PRs (Opt-Out Model) [Medium | Medium Complexity]
Issue: Smoke tests only run when a maintainer adds an emoji reaction.
Fix: Run smoke tests automatically on PRs with `roles: maintainer` to avoid burning runner minutes on external contributor PRs. Alternatively, make the smoke test for a single agent (e.g., `smoke-copilot.md`) a required check that blocks merges.
Impact: Prevents merging PRs that silently break the end-to-end agent execution flow.
R7: Add Performance Gate on PRs [Medium | Medium Complexity]
Issue: Performance regressions are only detected weekly.
Fix: Add a lightweight startup-time benchmark step (container up + simple command) to `build.yml` or a new PR-targeted workflow. Fail if the measured time exceeds 2× a stored baseline.
Impact: Catches startup regressions before they reach users.
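
The pass/fail rule for such a gate can be sketched in a few lines. This is an illustration under stated assumptions rather than an existing script: `checkStartupRegression` and `GateResult` are hypothetical names, the timings are invented, and how the baseline is stored and how startup is measured are left open.

```typescript
// Result of comparing one measurement against the stored baseline.
interface GateResult {
  passed: boolean;
  ratio: number;
  message: string;
}

// Hypothetical gate: fail when the measured startup time exceeds
// `factor` times the baseline (2x by default, as suggested above).
function checkStartupRegression(
  baselineMs: number,
  measuredMs: number,
  factor = 2,
): GateResult {
  if (baselineMs <= 0) {
    throw new Error("baseline must be a positive number of milliseconds");
  }
  const ratio = measuredMs / baselineMs;
  const passed = ratio <= factor;
  const message =
    `startup ${measuredMs}ms vs baseline ${baselineMs}ms ` +
    `(${ratio.toFixed(2)}x, limit ${factor}x)`;
  return { passed, ratio, message };
}

// Invented timings: a modest slowdown passes, a 2.25x slowdown fails.
const withinLimit = checkStartupRegression(4000, 5200);
const regressed = checkStartupRegression(4000, 9000);
```

A single measurement on a shared runner can be noisy, so a real gate would probably take the median of several runs before applying the comparison.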
R8: Add SBOM Generation to Release Workflow [Medium | Low Complexity]
Issue: No supply chain transparency for releases.
Fix: Add `anchore/sbom-action` to `release.yml` and attach the SBOM to GitHub Release assets.
Impact: Meets enterprise compliance requirements and improves supply chain security posture.
R9: Add License Compliance Scanning [Low | Low Complexity]
Issue: No license drift detection.
Fix: Add `license-checker` or `licensee` as a CI step in `dependency-audit.yml`:
`npx license-checker --onlyAllow 'MIT;Apache-2.0;BSD-2-Clause;BSD-3-Clause;ISC;CC0-1.0'`
Impact: Prevents accidental introduction of copyleft dependencies.
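
The allowlist logic behind that command amounts to a set-membership check. The sketch below illustrates the idea only and is not how `license-checker` is implemented: `findLicenseViolations` is a hypothetical function, the package names are made up, and the allowlist simply mirrors the identifiers in the command above.

```typescript
// Allowlist copied from the license-checker command in this recommendation.
const ALLOWED_LICENSES = [
  "MIT",
  "Apache-2.0",
  "BSD-2-Clause",
  "BSD-3-Clause",
  "ISC",
  "CC0-1.0",
];

// Hypothetical check: return "name: license" for every dependency whose
// declared license is not on the allowlist.
function findLicenseViolations(deps: Record<string, string>): string[] {
  return Object.keys(deps)
    .filter((name) => ALLOWED_LICENSES.indexOf(deps[name]) === -1)
    .map((name) => `${name}: ${deps[name]}`);
}

// Made-up dependency data for illustration.
const violations = findLicenseViolations({
  "example-http-lib": "MIT",
  "example-parser": "GPL-3.0",
  "example-utils": "ISC",
});
```

Real packages sometimes declare compound SPDX expressions such as `(MIT OR Apache-2.0)`, which an exact-match check like this one would flag, so a dedicated tool remains the safer choice in CI.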
R10: Enforce Commitlint in CI [Low | Low Complexity]
Issue: Commit message convention only enforced locally via husky.
Fix: Add a step to `lint.yml` that runs commitlint on the PR's commits via `npx commitlint --from origin/main --to HEAD`.
Impact: Ensures consistent commit history regardless of how commits are created.
📈 Metrics Summary
- `cli.ts` coverage: 0%
- `docker-manager.ts` coverage: 18%

Generated by automated CI/CD gap assessment workflow on 2026-03-18.