test: add comprehensive Docker Compose stack validation#375

Open
bugman-007 wants to merge 2 commits into Light-Heart-Labs:main from bugman-007:test/compose-stack-validation

bugman-007 (Contributor) commented Mar 18, 2026

Summary

  • Add comprehensive test coverage for Docker Compose stack validation
  • Create tests/test-validate-compose-stack.sh with 13 test cases
  • Enhance CI to validate layered stacks (base + GPU overlays) and extension compose files
  • Test resolve-compose-stack.sh output for all GPU backends

Motivation

Current CI only validates single compose files, not the layered stacks that actually run in production:

  • Production uses: base.yml + GPU overlay + extension compose files
  • validate-compose-stack.sh exists but has no tests and isn't used in CI
  • resolve-compose-stack.sh dynamically builds stacks but output isn't validated
  • Critical gap between what CI validates vs what actually runs

This PR closes that gap by testing the actual runtime architecture.

Test Coverage

The new test suite validates:

  1. Script existence (validate-compose-stack.sh, resolve-compose-stack.sh)
  2. Docker compose availability
  3. Base compose file and GPU overlays exist
  4. resolve-compose-stack.sh works for all backends (nvidia, amd, apple)
  5. Layered stacks validate: base + nvidia, base + amd, base + apple
  6. validate-compose-stack.sh accepts valid flags
  7. validate-compose-stack.sh rejects missing --compose-flags
  8. Extension compose files are counted and validated
  9. Sample extension compose file validates successfully
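The layered-stack checks in item 5 can be sketched roughly as follows. This is a minimal illustration, not the contents of tests/test-validate-compose-stack.sh: the helper names and the overlay file name are hypothetical, and it assumes validation is done with `docker compose ... config --quiet`, which merges the given files and exits non-zero on error without starting containers.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: print "-f <file>" pairs for a layered stack.
# Files listed later override files listed earlier.
build_compose_flags() {
  local flags=()
  local file
  for file in "$@"; do
    flags+=(-f "$file")
  done
  printf '%s\n' "${flags[*]}"
}

# Hypothetical helper: validate the merged stack (requires Docker).
validate_layered_stack() {
  local flags=()
  local file
  for file in "$@"; do
    flags+=(-f "$file")
  done
  docker compose "${flags[@]}" config --quiet
}

# File names below are placeholders for illustration only.
build_compose_flags base.yml overlay-nvidia.yml
```

Calling `validate_layered_stack base.yml overlay-nvidia.yml` on a machine with Docker would then fail the test if the merged result is invalid, even when each file is valid on its own.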

CI Enhancements

test-linux.yml

Added "Compose Stack Validation Tests" step to run the test suite on every PR.

validate-compose.yml

Enhanced to validate:

  • Layered stacks: base + nvidia, base + amd, base + apple, base + arc, base + intel
  • Extension compose files: All 22 compose files in extensions/services/*/compose*.yaml
  • Compose directory: Existing validation for dream-server/compose/

Test Results (Local)

$ bash tests/test-validate-compose-stack.sh
╔═══════════════════════════════════════════════╗
║   Compose Stack Validation Test Suite        ║
╚═══════════════════════════════════════════════╝

  ✓ PASS validate-compose-stack.sh exists
  ✓ PASS resolve-compose-stack.sh exists
  ⊘ SKIP docker not available - skipping validation tests

Result: 2 passed, 0 failed (docker required for full tests)

Note: Full validation requires Docker, which will run in CI.

Impact

  • Catches composition errors before deployment
  • Validates actual runtime stack architecture (not just individual files)
  • Tests resolve-compose-stack.sh for all GPU backends
  • Ensures extension compose files are syntactically valid
  • Prevents regressions in compose stack resolution logic
  • Aligns CI validation with production runtime

Lightheartdevs (Collaborator) left a comment

Review: test/compose-stack-validation

Good motivation -- closing the gap between what CI validates vs. what actually runs is valuable. The test suite is behavioral (runs real commands, checks exit codes), not just static grep checks. A few issues need fixing before merge.


1. CLAUDE.md violations: 2>/dev/null, || true, &>/dev/null

CLAUDE.md is explicit: "Never || true or 2>/dev/null."

.github/workflows/validate-compose.yml (new "Validate extension compose files" step):

ext_files=$(find extensions/services -name "compose.yaml" -type f 2>/dev/null || true)

This silences find errors AND swallows failures. If extensions/services/ is missing, this silently produces an empty list instead of crashing visibly. Use find ... || warn "find failed (non-fatal)", or just let it crash -- if the directory doesn't exist in CI, that's worth knowing.
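One possible shape for the suggested fix is sketched below. The `warn` helper is assumed here for illustration (the review names it, but its real definition is not shown), and `list_extension_compose_files` is a hypothetical wrapper. stderr from `find` is left visible, and a failure is logged instead of being discarded.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Assumed helper: log a non-fatal problem to stderr.
warn() {
  printf 'WARN: %s\n' "$*" >&2
}

# Hypothetical wrapper: list extension compose files under a directory.
# No 2>/dev/null and no || true; a failing find is reported via warn.
list_extension_compose_files() {
  local dir="$1"
  find "$dir" -name "compose.yaml" -type f || warn "find failed for $dir (non-fatal)"
}

ext_files=$(list_extension_compose_files extensions/services)
```

Dropping the `|| warn ...` part entirely gives the stricter variant the review also mentions: under `set -e`, a missing directory would then stop the step.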

dream-server/tests/test-validate-compose-stack.sh:

  • Line 59: command -v docker &>/dev/null -- suppresses all output including stderr.
  • Line 60: docker compose version &>/dev/null 2>&1 -- double stderr redirect (&> already includes stderr, then 2>&1 is redundant). Same suppression concern.
  • Line 162: find extensions/services -name "compose.yaml" -o -name "compose.*.yaml" 2>/dev/null | wc -l -- silences find errors.

Fix: Replace 2>/dev/null with explicit handling. For capability checks (command -v), redirecting just stdout to /dev/null while letting stderr through is acceptable.
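A capability check along those lines might look like the sketch below. The `have_command` name is hypothetical; the only point is that stdout (the resolved path printed by `command -v`) is discarded while stderr is left untouched.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: return 0 if the command exists.
# Only stdout is redirected; anything on stderr stays visible.
have_command() {
  command -v "$1" >/dev/null
}

if have_command docker; then
  echo "docker found"
else
  echo "SKIP docker not available"
fi
```

Because the check runs as an `if` condition, a missing command does not trip `set -e`.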

2. CI paths filter misses extension compose files

.github/workflows/validate-compose.yml triggers on:

paths:
  - "dream-server/docker-compose*.yml"
  - "dream-server/compose/**"

But the new "Validate extension compose files" step validates extensions/services/*/compose.yaml. Changes to those files won't trigger this workflow, so the new validation step will never catch regressions in extension compose files. Add "dream-server/extensions/services/**/compose*.yaml" to the paths filter.

3. Merge conflict with PR #376

Both this PR and #376 (compose healthcheck audit) add new steps to .github/workflows/test-linux.yml at the same insertion point (after "Health Check Tests"). Whichever merges second will need a rebase. Coordinate merge order.

4. Minor: set +e / set -e toggling

The test file toggles set +e and set -e around every command whose exit code is checked (lines 93-96, 106-109, 119-122, etc.). This is functional but fragile -- if someone adds code inside a set +e block later, failures would be silently swallowed. Consider capturing exit codes inline instead:

resolve_exit=0
bash scripts/resolve-compose-stack.sh ... || resolve_exit=$?

This keeps set -e active throughout and is the idiomatic CLAUDE.md-compliant approach ("If you must tolerate a failure, log it: some_command || warn "failed (non-fatal)"").
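A small runnable sketch of the inline capture pattern, using a hypothetical `fake_command` as a stand-in for the real script under test:

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical stand-in for a command under test: exits with the given status.
fake_command() {
  return "$1"
}

# Capture exit codes inline; set -e is never disabled.
ok_exit=0
fake_command 0 || ok_exit=$?

bad_exit=0
fake_command 3 || bad_exit=$?

echo "ok=$ok_exit bad=$bad_exit"
```

The `||` makes the failure part of a handled list, so `set -e` does not abort, yet any unrelated command added nearby still fails loudly.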


Summary

The test coverage itself is solid and well-structured. The main blockers are the 2>/dev/null and || true violations (item 1) and the paths filter gap (item 2), which means the extension validation would effectively never run on relevant changes. Items 3-4 are lower severity but worth addressing.

Lightheartdevs (Collaborator) commented

What's needed to get this merged:

  1. Remove all 2>/dev/null, || true, and &>/dev/null — CLAUDE.md rule 4
  2. Fix the CI paths filter in validate-compose.yml — add extensions/services/*/compose.yaml to the trigger paths, otherwise extension compose changes never trigger validation (dead code)
  3. Replace set +e blocks with cmd || exit_code=$? pattern to keep set -e active

Tests themselves are actually behavioral (not grep-based) which is good. Just needs the error handling cleaned up.

bugman-007 (Contributor, Author) commented

I addressed the review feedback for compose stack validation.
