This report provides a comprehensive assessment of the current CI/CD pipeline state, existing quality gates, and identified gaps in PR quality measurement for this repository.
📊 Current CI/CD Pipeline Status
The repository has a mature and layered CI/CD pipeline combining traditional GitHub Actions workflows with agentic AI-powered workflows. Overall pipeline health is good, though a dependency audit failure and several low-priority docs issues were observed in recent runs.
Total workflows: ~45 (17 standard YAML + 27 agentic Markdown + lock files)
🔍 Identified Gaps
🔴 High Priority
1. Critically Low Unit Test Coverage with Permissive Thresholds
Current state: Overall unit test coverage is only 38% statements / 31% branches / 37% functions. The two most critical files are largely untested:
cli.ts (entry point): 0% coverage — zero tests for CLI argument parsing, signal handling, and orchestration flow
docker-manager.ts (core container logic): 18% statements, 4% function coverage — the file that manages container lifecycle is almost entirely untested
The coverage thresholds in jest.config.js are set at these existing low levels (38%/30%/35%/38%), meaning they only prevent further drops and do not drive improvement.
2. No Container/Docker Image Security Scanning on PRs
Current state: CodeQL and npm audit cover source code and JS dependencies, but container images (containers/squid/, containers/agent/, containers/api-proxy/) are not scanned for OS-level vulnerabilities on PRs. There is no Trivy, Snyk, or Grype scan integrated into the PR pipeline.
Risk: A PR could introduce a base image with critical CVEs (e.g., a vulnerable Ubuntu or Squid package) without detection until manual review or external discovery.
3. Dependency Vulnerability Audit Currently Failing on Main
Current state: The most recent push to main shows Dependency Vulnerability Audit with conclusion failure. This means the main branch currently has known high/critical vulnerabilities in npm dependencies that are blocking the audit gate—yet the branch continues to receive merges.
4. No Performance Regression Gate on PRs
Current state: performance-monitor.yml runs weekly (Monday 06:00 UTC) and creates issues if regressions are detected. However, it does not run on PRs, meaning a PR could introduce a startup time or network latency regression that won't be detected until the following Monday.
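A PR-level gate of this kind could compare startup timings from the PR branch against timings from main. The following is a minimal sketch, assuming several timing samples are collected for each branch; the names `compareStartup` and `median` are hypothetical, and the 20% default mirrors the regression limit suggested in the recommendations table rather than anything performance-monitor.yml is known to implement.

```typescript
// Hypothetical helper: none of these names come from the repository.
interface StartupComparison {
  baselineMs: number;   // median startup time on main
  candidateMs: number;  // median startup time on the PR branch
  changeRatio: number;  // (candidate - baseline) / baseline
  regressed: boolean;   // true when changeRatio exceeds the allowed limit
}

// Median is less sensitive to a single noisy CI run than the mean.
function median(samples: number[]): number {
  if (samples.length === 0) {
    throw new Error("median requires at least one sample");
  }
  const sorted = [...samples].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1
    ? sorted[mid]
    : (sorted[mid - 1] + sorted[mid]) / 2;
}

// maxRegression = 0.2 is an assumption taken from the ">20%" recommendation.
function compareStartup(
  baselineSamplesMs: number[],
  candidateSamplesMs: number[],
  maxRegression = 0.2,
): StartupComparison {
  const baselineMs = median(baselineSamplesMs);
  const candidateMs = median(candidateSamplesMs);
  if (baselineMs <= 0) {
    throw new Error("baseline startup time must be positive");
  }
  const changeRatio = (candidateMs - baselineMs) / baselineMs;
  return {
    baselineMs,
    candidateMs,
    changeRatio,
    regressed: changeRatio > maxRegression,
  };
}
```

A CI job could fail when `regressed` is true and otherwise post `changeRatio` as an informational comment. Taking several samples per branch is a design choice to reduce flakiness from shared runners.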
🟡 Medium Priority
5. Coverage Thresholds Should Be Raised Incrementally
Current state: Thresholds are pinned at current coverage levels (38%/30%/35%/38%). While preventing regression, they do not create pressure to improve coverage. The test-coverage-improver agentic workflow runs weekly to generate PRs, but there's no CI gate that increases the target over time.
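One way to create that pressure is a ratchet: after each coverage run, thresholds move up toward a target but never down. The sketch below is illustrative only. The function name `ratchetThresholds`, the one-point safety margin, and the mapping of the four threshold figures to statements/branches/functions/lines are assumptions, not details confirmed by jest.config.js.

```typescript
// Hypothetical ratchet: names and the metric ordering are assumptions.
type Metric = "statements" | "branches" | "functions" | "lines";
type Thresholds = Record<Metric, number>;

const METRICS: Metric[] = ["statements", "branches", "functions", "lines"];

// Returns new thresholds that are:
//  - never lower than the current thresholds,
//  - at most (measured - margin), so small fluctuations do not break CI,
//  - capped at the target, so the ratchet stops once the goal is reached.
function ratchetThresholds(
  current: Thresholds,
  measured: Thresholds,
  target: Thresholds,
  margin = 1,
): Thresholds {
  const next: Thresholds = { ...current };
  for (const metric of METRICS) {
    const candidate = Math.min(
      Math.floor(measured[metric] - margin),
      target[metric],
    );
    next[metric] = Math.max(current[metric], candidate);
  }
  return next;
}
```

A scheduled job could run this against the latest coverage summary and open a PR whenever the result differs from the committed thresholds, which keeps the increase reviewable rather than automatic.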
6. No SBOM (Software Bill of Materials) Generation or Attestation
Current state: The release workflow publishes GHCR container images and npm packages without generating or attesting an SBOM. There's no anchore/sbom-action or sigstore/cosign attestation in the release pipeline or PR checks.
Risk: Supply chain security posture is weak; consumers cannot verify the provenance of published artifacts.
7. Secret Scanning Only via Scheduled Agents, Not as a Blocking PR Gate
Current state: Secret scanning happens via Secret Digger agents (Claude/Codex/Copilot) running on hourly/daily schedules. There is no synchronous secret-scanning gate (e.g., trufflesecurity/trufflehog or gitleaks/gitleaks-action) running as a required PR check. A PR with an accidentally committed secret would pass all PR checks and merge before being caught by the next scheduled scan.
8. No Mutation Testing to Validate Test Quality
Current state: Tests run with Jest but there's no mutation testing (e.g., Stryker.js) to validate whether tests would actually catch bugs. Given 0% coverage on cli.ts and near-zero on docker-manager.ts, adding tests without mutation validation risks writing tests that pass regardless of implementation correctness.
9. Smoke Tests Not Consistently Blocking PRs
Current state: Smoke test workflows (smoke-claude, smoke-codex, smoke-copilot) concluded with action_required in recent runs, which suggests they do not always function as hard blocking gates (e.g., when runner secrets aren't available or API rate limits are hit). The smoke-chroot workflow is path-filtered, so it only runs when src/** or containers/** change; changes to other paths that affect security behavior might bypass it.
10. No Container Image Size Tracking on PRs
Current state: Container build steps in integration tests verify images build successfully, but there's no tracking of image size deltas. A PR adding unnecessary packages to containers/agent/ could significantly increase the attack surface without triggering a warning.
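Tracking size deltas amounts to comparing per-image byte counts from a baseline build against those from the PR build. The sketch below assumes the sizes have already been collected into plain maps of image name to bytes (how they are gathered is left open); `diffImageSizes` and the 10% warning limit are hypothetical choices for illustration.

```typescript
// Hypothetical report shape; not taken from any existing workflow.
interface SizeDelta {
  image: string;
  baseBytes: number | null;    // null when the image has no baseline
  headBytes: number;
  deltaBytes: number | null;
  deltaPercent: number | null;
  flagged: boolean;            // true when growth exceeds warnPercent
}

// Compares PR image sizes (head) against baseline sizes (base).
// warnPercent = 10 is an arbitrary illustrative default.
function diffImageSizes(
  base: Record<string, number>,
  head: Record<string, number>,
  warnPercent = 10,
): SizeDelta[] {
  return Object.keys(head)
    .sort()
    .map((image): SizeDelta => {
      const headBytes = head[image];
      const hasBase =
        Object.prototype.hasOwnProperty.call(base, image) && base[image] > 0;
      if (!hasBase) {
        // New image: report its size but do not compute a delta.
        return {
          image,
          baseBytes: null,
          headBytes,
          deltaBytes: null,
          deltaPercent: null,
          flagged: false,
        };
      }
      const baseBytes = base[image];
      const deltaBytes = headBytes - baseBytes;
      const deltaPercent = (deltaBytes * 100) / baseBytes;
      return {
        image,
        baseBytes,
        headBytes,
        deltaBytes,
        deltaPercent,
        flagged: deltaPercent > warnPercent,
      };
    });
}
```

The resulting list could be rendered as a PR comment, with flagged rows highlighted so reviewers notice unexpected growth in containers/agent/ or the other images.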
🟢 Low Priority
11. Link Check Only Triggers on Markdown File Changes
Current state: link-check.yml uses paths: ['**/*.md'], so it only runs when markdown files change. If a non-markdown change (e.g., renaming a file referenced in docs) breaks links, it won't be caught until the next weekly scheduled run.
12. No Spelling/Grammar Checks on Documentation
Current state: Markdownlint enforces formatting but not spelling. Typos in documentation (especially in security-sensitive instructions) pass all checks.
13. Performance Monitor Uses Unpinned Action SHAs
Current state: performance-monitor.yml uses actions/checkout@v4 and actions/setup-node@v4 without SHA pins, contrary to the pattern followed in other workflows. This creates a supply chain risk for this workflow specifically.
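Unpinned references like these can be detected mechanically by scanning workflow files for `uses:` entries whose ref is not a full 40-character commit SHA. The following is a rough sketch using a line-based regular expression instead of a YAML parser; `findUnpinnedActions` is a hypothetical name, and the approach would miss unusual formatting such as multi-line values.

```typescript
// Hypothetical lint helper; not part of the repository.
interface UnpinnedUse {
  line: number;   // 1-based line number in the workflow file
  action: string; // e.g. "actions/checkout"
  ref: string;    // e.g. "v4"
}

// Matches lines such as "- uses: owner/repo@ref" or "uses: 'owner/repo@ref'".
const USES_PATTERN = /^\s*(?:-\s*)?uses:\s*["']?([^@\s"']+)@([^\s"'#]+)/;
// A full-length commit SHA is exactly 40 lowercase hex characters.
const FULL_SHA = /^[0-9a-f]{40}$/;

function findUnpinnedActions(workflowText: string): UnpinnedUse[] {
  const findings: UnpinnedUse[] = [];
  const lines = workflowText.split(/\r?\n/);
  lines.forEach((text, index) => {
    const match = USES_PATTERN.exec(text);
    if (!match) return; // local actions ("./path") have no "@ref" and are skipped
    const action = match[1];
    const ref = match[2];
    if (action.startsWith("docker://")) return; // image references are out of scope here
    if (!FULL_SHA.test(ref)) {
      findings.push({ line: index + 1, action, ref });
    }
  });
  return findings;
}
```

Run over the contents of each workflow file, a non-empty result could fail a lint job and list the offending lines, which would keep new workflows consistent with the SHA-pinning pattern used elsewhere.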
14. No License Compliance Checking
Current state: No FOSSA, licensee, or similar tool checks that dependencies have compatible licenses. This is relevant given the project is Apache-2.0 licensed.
15. test-integration-suite.yml Doesn't Include Chroot Tests
The chroot integration tests live in a separate test-chroot.yml and are not referenced by test-integration-suite.yml. This creates two separate integration test workflows with no shared entrypoint, making it harder to see the full integration test picture and potentially leading to duplication.
📋 Actionable Recommendations
| # | Gap | Recommended Solution | Complexity | Impact |
|---|-----|----------------------|------------|--------|
| 1 | Low unit test coverage (cli.ts=0%, docker-manager.ts=18%) | Incrementally add unit tests for cli.ts argument parsing and docker-manager.ts container lifecycle; raise thresholds by 5% per quarter | Medium | 🔴 High |
| 2 | No Docker image vulnerability scanning | Add aquasecurity/trivy-action step to build.yml to scan built images on PRs; fail on CRITICAL severity | Low | 🔴 High |
| 3 | Dependency audit failing on main | Investigate and resolve current npm audit failures; consider npm audit fix or pinning resolved versions | Low | 🔴 High |
| 4 | No performance gate on PRs | Add a lightweight performance check job to build.yml that measures CLI startup time and fails if >20% regression vs main | Medium | 🔴 High |
| 5 | Low coverage thresholds | Update jest.config.js thresholds to 50%/40%/50%/50% as a medium-term target; use the weekly test-coverage-improver to generate the tests | Low | 🟡 Medium |
| 6 | No SBOM | Add anchore/sbom-action to release.yml and optionally to PR checks to generate and attest SBOMs for container images | Low | 🟡 Medium |
| 7 | Secret scanning not a blocking gate | Add gitleaks/gitleaks-action as a required PR check alongside the scheduled Secret Digger agents | Low | 🟡 Medium |
| 8 | No mutation testing | Integrate @stryker-mutator/jest-runner for the well-covered modules (squid-config, logger) to validate test quality | Medium | 🟡 Medium |
| 9 | Smoke test reliability | Add a fallback mechanism in smoke workflow triggers and document which smoke tests are required vs informational | Medium | 🟡 Medium |
| 10 | No image size tracking | Add a docker images --format step in build workflows to report image sizes and diff vs baseline in PRs | Low | 🟡 Medium |
| 11 | Link check path filter | Remove path filter from link-check (or add a scheduled + PR-always mode) so renaming source files triggers link verification | Low | 🟢 Low |
| 12 | No spell checking | Add cspell action to lint workflow for markdown and TypeScript files | Low | 🟢 Low |
| 13 | Unpinned action SHAs in performance-monitor.yml | Pin actions/checkout and actions/setup-node to SHA digests | Low | 🟢 Low |
| 14 | No license compliance | Add licensee or FOSSA action to dependency-audit workflow | Low | 🟢 Low |
| 15 | Separate chroot test workflow | Consider consolidating test entrypoints or adding a summary workflow that references all integration test sub-workflows | | |
Recent run health (last 30 runs, 2026-04-02):
main · push
✅ Existing Quality Gates
Automatic PR Checks
Workflows listed: build.yml, lint.yml, test-integration.yml (type-check job), test-coverage.yml, codeql.yml, dependency-audit.yml, test-integration-suite.yml, test-chroot.yml, test-examples.yml, pr-title.yml, link-check.yml (*.md files), docs-preview.yml
Agentic AI-Powered PR Checks
Scheduled Quality Checks
📈 Metrics Summary
cli.ts (0% coverage): the main entry point
Generated by CI/CD Pipelines and Integration Tests Gap Assessment workflow · 2026-04-02