[CI/CD Assessment] CI/CD Pipelines and Integration Tests Gap Assessment #1633

2026-04-02T22:24:09Z

github-actions[bot]
bot Apr 2, 2026

This report provides a comprehensive assessment of the current CI/CD pipeline state, existing quality gates, and identified gaps in PR quality measurement for this repository.

📊 Current CI/CD Pipeline Status

The repository has a mature and layered CI/CD pipeline combining traditional GitHub Actions workflows with agentic AI-powered workflows. Overall pipeline health is good, though recent runs show three failures: the dependency vulnerability audit on `main`, the documentation deployment, and one Smoke Copilot run.

Total workflows: ~45 (17 standard YAML + 27 agentic Markdown + lock files)

Recent run health (last 30 runs, 2026-04-02):

Workflow	Status
Build Verification	✅ Passing (Node 20 & 22)
Lint (ESLint + Markdownlint)	✅ Passing
TypeScript Type Check	✅ Passing
Test Coverage	✅ Passing
CodeQL	✅ Passing
Integration Tests	✅ Passing
Chroot Integration Tests	✅ Passing
Examples Test	✅ Passing
PR Title Check	✅ Passing
Link Check	✅ Passing
Dependency Vulnerability Audit	❌ Failing on `main` push
Deploy Documentation	❌ Failing
Smoke Copilot	❌ Failing (one recent run)

✅ Existing Quality Gates

Automatic PR Checks

Check	Workflow	Scope
TypeScript compilation (Node 20 & 22)	`build.yml`	All PRs
ESLint static analysis	`lint.yml`, `build.yml`	All PRs
Markdownlint	`lint.yml`	All PRs
TypeScript strict type-check	`test-integration.yml` (type-check job)	All PRs
Unit test coverage (with PR comparison)	`test-coverage.yml`	All PRs (non-markdown)
CodeQL security analysis (JS/TS + Actions)	`codeql.yml`	All PRs
npm dependency vulnerability audit (main + docs-site)	`dependency-audit.yml`	All PRs
Domain filtering integration tests	`test-integration-suite.yml`	All PRs
Network security integration tests	`test-integration-suite.yml`	All PRs
Protocol & credential security integration tests	`test-integration-suite.yml`	All PRs
Container & ops integration tests	`test-integration-suite.yml`	All PRs
API proxy integration tests	`test-integration-suite.yml`	All PRs
Chroot language support tests	`test-chroot.yml`	All PRs
Chroot package manager tests	`test-chroot.yml`	All PRs
Chroot procfs tests	`test-chroot.yml`	All PRs
Chroot edge cases	`test-chroot.yml`	All PRs
Example scripts smoke tests	`test-examples.yml`	All PRs
Semantic PR title validation	`pr-title.yml`	All PRs
Documentation link checking	`link-check.yml`	PRs touching `*.md` files
Documentation preview build	`docs-preview.yml`	PRs touching docs

Agentic AI-Powered PR Checks

Check	Agent	Trigger
Security posture review	Security Guard (Claude)	Automatic on all PRs
Multi-ecosystem build test (Bun, C++, Deno, .NET, Go, Java, Node.js, Rust)	Build Test Suite (Copilot)	Automatic on all PRs
End-to-end smoke test (Claude agent)	Smoke Claude	Automatic on all PRs + scheduled
End-to-end smoke test (Codex agent)	Smoke Codex	Automatic on all PRs + scheduled
End-to-end smoke test (Copilot agent)	Smoke Copilot	Automatic on all PRs + scheduled
End-to-end smoke test (chroot mode)	Smoke Chroot	Path-filtered PRs + reactions
End-to-end smoke test (services)	Smoke Services	Automatic on all PRs + scheduled

Scheduled Quality Checks

Daily: Dependency security monitoring, token usage analysis (Claude/Copilot/Codex), security review
Weekly: Performance benchmarking, CLI flag consistency check, test coverage improvement, link check
Hourly: Secret scanning (via Secret Digger agents for Claude, Copilot, Codex)
On release: Automated release notes generation

🔍 Identified Gaps

🔴 High Priority

1. Critically Low Unit Test Coverage with Permissive Thresholds

Current state: Overall unit test coverage is only about 38% statements / 32% branches / 37% functions. The two most critical files are almost entirely untested:

cli.ts (entry point): 0% coverage — zero tests for CLI argument parsing, signal handling, and orchestration flow
docker-manager.ts (core container logic): 18% statements, 4% function coverage — the file that manages container lifecycle is almost entirely untested

The coverage thresholds in jest.config.js are set at these existing low levels (38%/30%/35%/38%), meaning they only prevent further drops and do not drive improvement.
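As a rough illustration of how such thresholds are expressed, the sketch below shows a Jest `coverageThreshold` that combines a global floor with a per-file floor. The mapping of the 38%/30%/35%/38% figures to individual metrics and the `./src/` path are assumptions, since this report does not show the actual `jest.config.js`.

```typescript
// Local structural type so the sketch is self-contained; a real config would
// usually rely on the types shipped with Jest instead.
type Threshold = { statements?: number; branches?: number; functions?: number; lines?: number };

const coverageThreshold: Record<string, Threshold> = {
  // Assumed mapping of the 38%/30%/35%/38% figures to Jest's four metrics.
  global: { statements: 38, branches: 30, functions: 35, lines: 38 },
  // Hypothetical per-file floor using the docker-manager.ts numbers quoted above;
  // the path is illustrative only.
  "./src/docker-manager.ts": { statements: 18, functions: 4 },
};

const config = {
  collectCoverage: true,
  coverageThreshold,
};
```

When a path key sits alongside `global`, Jest applies that threshold to the matching files independently and subtracts them from the global calculation, so a per-file floor can protect a critical module without being masked by well-covered ones.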

2. No Container/Docker Image Security Scanning on PRs

Current state: CodeQL and npm audit cover source code and JS dependencies, but container images (containers/squid/, containers/agent/, containers/api-proxy/) are not scanned for OS-level vulnerabilities on PRs. There is no Trivy, Snyk, or Grype scan integrated into the PR pipeline.

Risk: A PR could introduce a base image with critical CVEs (e.g., a vulnerable Ubuntu or Squid package) without detection until manual review or external discovery.
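A scanner action can usually fail the job on its own, but the gating logic is simple enough to sketch. The example below assumes a report shaped like Trivy's JSON output (`Results[].Vulnerabilities[].Severity`); the type names, function name, and sample data are hypothetical.

```typescript
// Subset of the fields found in a Trivy JSON report (assumed shape).
type Vulnerability = { VulnerabilityID: string; PkgName: string; Severity: string };
type TrivyResult = { Target: string; Vulnerabilities?: Vulnerability[] };
type TrivyReport = { Results?: TrivyResult[] };

// Collect every finding whose severity is in the blocking set.
function blockingFindings(report: TrivyReport, blocking: string[] = ["CRITICAL"]): Vulnerability[] {
  const found: Vulnerability[] = [];
  for (const result of report.Results ?? []) {
    for (const vuln of result.Vulnerabilities ?? []) {
      if (blocking.indexOf(vuln.Severity) !== -1) found.push(vuln);
    }
  }
  return found;
}

// Hypothetical report: identifiers and package names are placeholders.
const sampleReport: TrivyReport = {
  Results: [
    {
      Target: "containers/agent",
      Vulnerabilities: [
        { VulnerabilityID: "CVE-0000-0001", PkgName: "libexample", Severity: "CRITICAL" },
        { VulnerabilityID: "CVE-0000-0002", PkgName: "libother", Severity: "MEDIUM" },
      ],
    },
    { Target: "containers/squid" }, // no findings, so the field is absent
  ],
};

const blockers = blockingFindings(sampleReport);
const shouldFailPr = blockers.length > 0;
```

Keeping the blocking set as a parameter makes it easy to start with CRITICAL only and tighten the gate later.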

3. Dependency Vulnerability Audit Currently Failing on Main

Current state: The most recent push to main shows Dependency Vulnerability Audit with conclusion failure. This suggests that main currently carries known high or critical vulnerabilities in npm dependencies. The audit gate is failing, yet the branch continues to receive merges.
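To triage a failing audit, it can help to look at the severity counts directly. The sketch below reads the `metadata.vulnerabilities` summary that `npm audit --json` emits in npm 7 and later; whether `dependency-audit.yml` gates on exactly high and critical is an assumption, and the function name is hypothetical.

```typescript
// Severity summary as found under `metadata.vulnerabilities` in `npm audit --json`.
type SeverityCounts = {
  info: number; low: number; moderate: number; high: number; critical: number; total: number;
};
type AuditReport = { metadata: { vulnerabilities: SeverityCounts } };

// Pass only when there are no high or critical advisories (assumed policy).
function auditGate(report: AuditReport): { pass: boolean; reason: string } {
  const counts = report.metadata.vulnerabilities;
  if (counts.high + counts.critical === 0) {
    return { pass: true, reason: "no high or critical advisories" };
  }
  return { pass: false, reason: `${counts.high} high, ${counts.critical} critical` };
}

// Hypothetical counts for illustration.
const failing = auditGate({
  metadata: { vulnerabilities: { info: 0, low: 3, moderate: 1, high: 2, critical: 1, total: 7 } },
});
const passing = auditGate({
  metadata: { vulnerabilities: { info: 0, low: 3, moderate: 1, high: 0, critical: 0, total: 4 } },
});
```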

4. No Performance Regression Gate on PRs

Current state: performance-monitor.yml runs weekly (Monday 06:00 UTC) and creates issues if regressions are detected. However, it does not run on PRs, meaning a PR could introduce a startup time or network latency regression that won't be detected until the following Monday.
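A PR-level gate could compare a few timing samples from the PR build against samples from main. The following is a minimal sketch, assuming startup times are collected in milliseconds by some earlier step; the function names are hypothetical and the 20% default tolerance simply echoes the figure in the recommendations table.

```typescript
// Median is less sensitive than the mean to a single noisy CI run.
function median(samples: number[]): number {
  if (samples.length === 0) throw new Error("median requires at least one sample");
  const sorted = samples.slice().sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}

type RegressionResult = { baselineMs: number; candidateMs: number; change: number; regressed: boolean };

// Flag a regression when the candidate median exceeds the baseline median
// by more than `tolerance` (0.2 means 20%).
function checkRegression(baseline: number[], candidate: number[], tolerance: number = 0.2): RegressionResult {
  const baselineMs = median(baseline);
  const candidateMs = median(candidate);
  const change = (candidateMs - baselineMs) / baselineMs;
  return { baselineMs, candidateMs, change, regressed: change > tolerance };
}

// Hypothetical samples: three runs on main and three on the PR branch.
const slower = checkRegression([100, 110, 105], [130, 128, 140]);
const withinBudget = checkRegression([100, 100, 100], [115, 110, 120]);
```

Shared CI runners are noisy, so a real gate would likely need several samples per side and a tolerance tuned against observed variance.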

🟡 Medium Priority

5. Coverage Thresholds Should Be Raised Incrementally

Current state: Thresholds are pinned at current coverage levels (38%/30%/35%/38%). While preventing regression, they do not create pressure to improve coverage. The test-coverage-improver agentic workflow runs weekly to generate PRs, but there's no CI gate that increases the target over time.
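One way to create that pressure is a ratchet: after each successful run on main, thresholds are raised toward measured coverage and never lowered. The sketch below is one possible approach, not the repository's mechanism; the function name, the one-point margin, and the line-coverage figure are assumptions.

```typescript
type Metrics = { statements: number; branches: number; functions: number; lines: number };

// Raise each threshold to just below measured coverage, leaving `margin`
// percentage points of headroom, and never lower an existing threshold.
function ratchet(current: Metrics, measured: Metrics, margin: number = 1): Metrics {
  const next = (threshold: number, pct: number): number =>
    Math.max(threshold, Math.floor(pct - margin));
  return {
    statements: next(current.statements, measured.statements),
    branches: next(current.branches, measured.branches),
    functions: next(current.functions, measured.functions),
    lines: next(current.lines, measured.lines),
  };
}

// Assumed mapping of the 38%/30%/35%/38% thresholds to metrics.
const today: Metrics = { statements: 38, branches: 30, functions: 35, lines: 38 };

// Measured values from the metrics summary; the lines figure is hypothetical.
const afterThisRun = ratchet(today, { statements: 38.39, branches: 31.78, functions: 37.03, lines: 38.5 });

// Hypothetical later run after new tests have landed.
const afterImprovement = ratchet(afterThisRun, { statements: 45.2, branches: 40.9, functions: 50, lines: 46 });
```

The margin keeps small, unrelated fluctuations from failing a PR immediately after the threshold moves.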

6. No SBOM (Software Bill of Materials) Generation or Attestation

Current state: The release workflow publishes GHCR container images and npm packages without generating or attesting an SBOM. There's no anchore/sbom-action or sigstore/cosign attestation in the release pipeline or PR checks.

Risk: Supply chain security posture is weak; consumers cannot verify the provenance of published artifacts.

7. Secret Scanning Only via Scheduled Agents, Not as a Blocking PR Gate

Current state: Secret scanning happens via Secret Digger agents (Claude/Codex/Copilot) running on hourly/daily schedules. There is no synchronous secret-scanning gate (e.g., trufflesecurity/trufflehog or gitleaks/gitleaks-action) running as a required PR check. A PR with an accidentally committed secret would pass all PR checks and merge before being caught by the next scheduled scan.
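To show what a synchronous gate does differently from a scheduled scan, here is a toy sketch that inspects only the lines a PR adds. It is not a substitute for gitleaks or TruffleHog: it carries just two widely documented token shapes, and the function and rule names are hypothetical.

```typescript
type Finding = { line: number; rule: string };

// Two well-known token shapes; real scanners ship far larger rule sets.
const rules: { name: string; pattern: RegExp }[] = [
  { name: "aws-access-key-id", pattern: /\bAKIA[0-9A-Z]{16}\b/ },
  { name: "github-pat", pattern: /\bghp_[A-Za-z0-9]{36}\b/ },
];

// Scan only lines added by a unified diff ("+" prefix, skipping "+++" file headers).
function scanDiff(diff: string): Finding[] {
  const findings: Finding[] = [];
  const lines = diff.split("\n");
  for (let i = 0; i < lines.length; i++) {
    const text = lines[i];
    if (text.charAt(0) !== "+" || text.slice(0, 3) === "+++") continue;
    for (const rule of rules) {
      if (rule.pattern.test(text)) findings.push({ line: i + 1, rule: rule.name });
    }
  }
  return findings;
}

// Fake key assembled at runtime so this file does not itself contain a token-shaped string.
const fakeKey = "AKIA" + "ABCDEFGHIJKLMNOP";
const sampleDiff = [
  "+++ b/config.ts",
  '+const region = "us-east-1";',
  '+const key = "' + fakeKey + '";',
  "-const old = 1;",
].join("\n");

const secretFindings = scanDiff(sampleDiff);
```

Because the check runs on the diff before merge, a match blocks the PR rather than surfacing after the secret is already in history.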

8. No Mutation Testing to Validate Test Quality

Current state: Tests run with Jest but there's no mutation testing (e.g., Stryker.js) to validate whether tests would actually catch bugs. Given 0% coverage on cli.ts and only 18% statement coverage on docker-manager.ts, adding tests without mutation validation risks writing tests that pass regardless of implementation correctness.
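The idea behind mutation testing can be shown without any tooling. In the sketch below, `isValidPort` is a hypothetical helper (not taken from the repository), and the mutants are hand-written versions of the operator changes a tool such as Stryker generates automatically.

```typescript
type Impl = (port: number) => boolean;

// Code under test: a hypothetical helper used only for illustration.
const isValidPort: Impl = (port) => port >= 1 && port <= 65535;

// Each mutant changes a single operator in the original.
const mutants: Impl[] = [
  (port) => port > 1 && port <= 65535,  // >= became >
  (port) => port >= 1 && port < 65535,  // <= became <
  (port) => port >= 1 || port <= 65535, // && became ||
];

// A suite returns true when all of its assertions hold for `impl`.
const weakSuite = (impl: Impl): boolean => impl(8080) === true;
const strongSuite = (impl: Impl): boolean =>
  impl(1) && impl(65535) && !impl(0) && !impl(65536);

// Mutation score = killed mutants / total mutants.
// A mutant is "killed" when the suite fails on it.
function mutationScore(suite: (impl: Impl) => boolean): number {
  let killed = 0;
  for (const mutant of mutants) {
    if (!suite(mutant)) killed++;
  }
  return killed / mutants.length;
}

const weakScore = mutationScore(weakSuite);     // every mutant survives
const strongScore = mutationScore(strongSuite); // boundary checks kill every mutant
```

Both suites pass against the original and both execute every line of `isValidPort`, so line coverage cannot tell them apart; only the mutation score shows that the weak suite would miss all three bugs.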

9. Smoke Tests Not Consistently Blocking PRs

Current state: Recent runs of the smoke test workflows (smoke-claude, smoke-codex, smoke-copilot) ended with an action_required conclusion, suggesting they may not always function as hard blocking gates (e.g., when runner secrets aren't available or API rate limits are hit). The smoke-chroot workflow is path-filtered, so it only runs when src/** or containers/** change; changes to other paths that affect security behavior might bypass it.

10. No Container Image Size Tracking on PRs

Current state: Container build steps in integration tests verify images build successfully, but there's no tracking of image size deltas. A PR adding unnecessary packages to containers/agent/ could significantly increase the attack surface without triggering a warning.
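A size check only needs the byte size of each image before and after the change. The sketch below assumes those sizes have already been collected into plain maps (for example from `docker image inspect`); the function name, the 10% warning ratio, and all sizes are hypothetical.

```typescript
type SizeMap = { [image: string]: number }; // image name -> size in bytes
type SizeRow = { image: string; deltaBytes: number; ratio: number | null; warn: boolean };

// Compare current image sizes against a baseline and flag large increases.
function sizeDeltas(baseline: SizeMap, current: SizeMap, warnRatio: number = 0.1): SizeRow[] {
  const rows: SizeRow[] = [];
  for (const image of Object.keys(current)) {
    const after = current[image];
    if (!(image in baseline)) {
      // New image: nothing to compare against, so always surface it.
      rows.push({ image, deltaBytes: after, ratio: null, warn: true });
      continue;
    }
    const before = baseline[image];
    const ratio = (after - before) / before;
    rows.push({ image, deltaBytes: after - before, ratio, warn: ratio > warnRatio });
  }
  return rows;
}

const MB = 1000 * 1000;
const sizeReport = sizeDeltas(
  { agent: 500 * MB, squid: 100 * MB },
  { agent: 600 * MB, squid: 101 * MB, "api-proxy": 80 * MB }
);
```

Reporting the rows as a PR comment, rather than failing the build, may be a reasonable first step until a sensible threshold is known.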

🟢 Low Priority

11. Link Check Only Triggers on Markdown File Changes

Current state: link-check.yml uses paths: ['**/*.md'], so it only runs when markdown files change. If a non-markdown change (e.g., renaming a file referenced in docs) breaks links, it won't be caught until the next weekly scheduled run.

12. No Spelling/Grammar Checks on Documentation

Current state: Markdownlint enforces formatting but not spelling. Typos in documentation (especially in security-sensitive instructions) pass all checks.

13. Performance Monitor Uses Unpinned Action SHAs

Current state: performance-monitor.yml uses actions/checkout@v4 and actions/setup-node@v4 without SHA pins, contrary to the pattern followed in other workflows. This creates a supply chain risk for this workflow specifically.
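Unpinned references like these can be found mechanically. The sketch below scans workflow text for `uses:` entries whose ref is not a 40-character commit SHA. It is a line-based heuristic rather than a YAML parser, the function name is hypothetical, and the SHA in the sample is a placeholder, not a real commit.

```typescript
type UnpinnedUse = { line: number; action: string; ref: string };

// Report `uses: owner/repo@ref` entries where `ref` is not a full commit SHA.
function findUnpinnedActions(workflowYaml: string): UnpinnedUse[] {
  const results: UnpinnedUse[] = [];
  const usesPattern = /^\s*(?:-\s*)?uses:\s*["']?([^@\s"']+)@([^\s"'#]+)/;
  const fullSha = /^[0-9a-f]{40}$/;
  const lines = workflowYaml.split("\n");
  for (let i = 0; i < lines.length; i++) {
    const match = usesPattern.exec(lines[i]);
    if (match === null) continue; // not a `uses:` line, or a local action without "@"
    const action = match[1];
    const ref = match[2];
    if (action.indexOf("docker://") === 0) continue; // container refs follow different rules
    if (!fullSha.test(ref)) results.push({ line: i + 1, action, ref });
  }
  return results;
}

const sampleWorkflow = [
  "steps:",
  "  - uses: actions/checkout@v4",
  "  - uses: actions/setup-node@0123456789abcdef0123456789abcdef01234567 # placeholder SHA",
  "  - name: local action",
  "    uses: ./.github/actions/local",
].join("\n");

const unpinned = findUnpinnedActions(sampleWorkflow);
```

Run across `.github/workflows/`, a check like this would flag the two `@v4` references while leaving SHA-pinned and local actions alone.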

14. No License Compliance Checking

Current state: No FOSSA, licensee, or similar tool checks that dependencies have compatible licenses. This is relevant given the project is Apache-2.0 licensed.
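At its core, a license check compares each dependency's declared SPDX identifier against a policy list. The sketch below is a minimal illustration; the allowlist is an example of commonly used permissive licenses rather than a vetted policy, and the package names are placeholders.

```typescript
type Dependency = { name: string; license: string };

// Illustrative allowlist of permissive SPDX identifiers; the real policy is a project decision.
const allowedLicenses: string[] = ["Apache-2.0", "MIT", "ISC", "BSD-2-Clause", "BSD-3-Clause"];

// Return dependencies whose declared license is not on the allowlist and so needs review.
function needsReview(deps: Dependency[], allowed: string[] = allowedLicenses): Dependency[] {
  return deps.filter((dep) => allowed.indexOf(dep.license) === -1);
}

// Hypothetical dependency list.
const flagged = needsReview([
  { name: "example-http", license: "MIT" },
  { name: "example-parser", license: "GPL-3.0-only" },
  { name: "example-unknown", license: "UNLICENSED" },
]);
```

A flagged entry is a prompt for human review, not proof of incompatibility; dedicated tools also handle dual licensing and SPDX expressions, which this sketch ignores.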

15. `test-integration-suite.yml` Doesn't Include Chroot Tests

Current state: The chroot integration tests live in a separate test-chroot.yml and are not referenced by test-integration-suite.yml. This creates two separate integration test workflows with no shared entrypoint, making it harder to see the full integration test picture and potentially leading to duplication.

📋 Actionable Recommendations

#	Gap	Recommended Solution	Complexity	Impact
1	Low unit test coverage (cli.ts=0%, docker-manager.ts=18%)	Incrementally add unit tests for cli.ts argument parsing and docker-manager.ts container lifecycle; raise thresholds by 5 percentage points per quarter	Medium	🔴 High
2	No Docker image vulnerability scanning	Add `aquasecurity/trivy-action` step to `build.yml` to scan built images on PRs; fail on CRITICAL severity	Low	🔴 High
3	Dependency audit failing on main	Investigate and resolve current npm audit failures; consider `npm audit fix` or pinning resolved versions	Low	🔴 High
4	No performance gate on PRs	Add a lightweight performance check job to `build.yml` that measures CLI startup time and fails if >20% regression vs main	Medium	🔴 High
5	Low coverage thresholds	Update `jest.config.js` thresholds to 50%/40%/50%/50% as a medium-term target; use the weekly `test-coverage-improver` to generate the tests	Low	🟡 Medium
6	No SBOM	Add `anchore/sbom-action` to `release.yml` and optionally to PR checks to generate and attest SBOMs for container images	Low	🟡 Medium
7	Secret scanning not a blocking gate	Add `gitleaks/gitleaks-action` as a required PR check alongside the scheduled Secret Digger agents	Low	🟡 Medium
8	No mutation testing	Integrate `@stryker-mutator/jest-runner` for the well-covered modules (squid-config, logger) to validate test quality	Medium	🟡 Medium
9	Smoke test reliability	Add a fallback mechanism in smoke workflow triggers and document which smoke tests are required vs informational	Medium	🟡 Medium
10	No image size tracking	Add a `docker images --format` step in build workflows to report image sizes and diff vs baseline in PRs	Low	🟡 Medium
11	Link check path filter	Remove path filter from link-check (or add a scheduled + PR-always mode) so renaming source files triggers link verification	Low	🟢 Low
12	No spell checking	Add `cspell` action to lint workflow for markdown and TypeScript files	Low	🟢 Low
13	Unpinned action SHAs in performance-monitor.yml	Pin `actions/checkout` and `actions/setup-node` to SHA digests	Low	🟢 Low
14	No license compliance	Add `licensee` or FOSSA action to dependency-audit workflow	Low	🟢 Low
15	Separate chroot test workflow	Consider consolidating test entrypoints or adding a summary workflow that references all integration test sub-workflows	Low	🟢 Low

📈 Metrics Summary

Metric	Value
Total workflows	~45 (17 standard + 27 agentic + lock files)
Workflows running automatically on PRs	~12 standard + 7 agentic
Recent main push success rate (last 10 runs)	~80% (2 failures: dep-audit, deploy-docs)
Unit test count	135 tests across 6 suites
Unit test statement coverage	38.39%
Unit test branch coverage	31.78%
Unit test function coverage	37.03%
Integration test categories	9 categories (domain, network, protocol/security, container/ops, API proxy, chroot languages, chroot pkg managers, chroot procfs, chroot edge cases)
CI/CD security posture	Good (CodeQL, npm audit, secret diggers, Security Guard AI agent)
Biggest single coverage gap	`cli.ts` (0% coverage) — the main entry point

Generated by CI/CD Pipelines and Integration Tests Gap Assessment workflow · 2026-04-02

AI generated by CI/CD Pipelines and Integration Tests Gap Assessment

expires on Apr 9, 2026, 10:24 PM UTC
