Skip to content

feat: env validation v2 with mode-aware runtime gates#320

Open
eva57gr wants to merge 3 commits intoLight-Heart-Labs:mainfrom
eva57gr:feat/env-validation-v2
Open

feat: env validation v2 with mode-aware runtime gates#320
eva57gr wants to merge 3 commits intoLight-Heart-Labs:mainfrom
eva57gr:feat/env-validation-v2

Conversation

@eva57gr
Copy link
Contributor

@eva57gr eva57gr commented Mar 17, 2026

No description provided.

Copy link
Collaborator

@Lightheartdevs Lightheartdevs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review: feat: env validation v2 with mode-aware runtime gates

Good concept — mode-aware validation that fails early before runtime crashes is the right pattern. Two breaking bugs need fixing before this can land.

Breaking Bugs

  1. validate_env_gate called but never defined. dream-cli calls validate_env_gate in cmd_start and cmd_update, but this function doesn't exist anywhere in the codebase or this PR. Under set -euo pipefail, this crashes dream start and dream update immediately. Either define it in dream-cli or source it from a shared lib.

  2. preflight-engine.sh argument count mismatch. The PR adds 7 new variables to the Python sys.argv[1:] unpacking (env_file, schema_file, env_validation_attempted, etc.) but does NOT update the Bash invocation line that passes arguments to the Python heredoc. Python expects 19 args, only 12 are passed → ValueError: not enough values to unpack on every preflight run.

Must-fix

  1. Duplicate usage() in validate-env.sh. The PR adds a new usage() function but doesn't remove the old one. Bash silently uses the last definition, so the new detailed help text is overridden by the old shorter version. Remove the old one.

  2. Unknown key detection silently removed. The loop that detects unknown/typo'd env keys is deleted with no replacement. The --allow-unknown flag is parsed but has no corresponding check. Unknown keys now silently pass validation — regression from current behavior.

  3. STRICT has no default value. The --strict flag description says it's the default, but no default is set in the arg-parsing block. If unset, behavior is undefined.

Conflicts

  • PR #349: Both modify get_current_version() in dream-update.sh and compare_versions in migrate-config.sh. Merge conflict guaranteed. Coordinate merge order.
  • PR #356: Minor textual conflicts on port defaults in shared files.

Non-blocking

  • DREAM_MODE not set at all has no explicit fallback — should default to local or error clearly.
  • Empty .env (zero keys) behavior is unclear.
  • The core design (schema-driven mode requirements, deprecated key tracking, exit code separation) is solid. Just needs the bugs fixed.

Copy link
Collaborator

@Lightheartdevs Lightheartdevs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review of env-validation-v2

Copy link
Collaborator

@Lightheartdevs Lightheartdevs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review: feat/env-validation-v2

Solid direction -- mode-aware env validation with schema-driven deprecation auto-fix is a good architectural choice. The test coverage in test-env-validation-v2.sh is thorough. However, there are several bugs that will break runtime behavior and some CLAUDE.md convention violations that need fixing before merge.


Bugs (blocking)

1. validate_env_gate is called but never defined (dream-cli, lines ~358/381)
cmd_start() and cmd_update() both call validate_env_gate, but this function is never defined anywhere in dream-cli or sourced from another file. This will crash every dream-cli start and dream-cli update invocation. Likely the implementation was planned but not included in this PR.

2. preflight-engine.sh -- Python argv mismatch will crash every invocation
The Python code (after the diff) unpacks 19 positional args from sys.argv[1:] (adding env_file, schema_file, env_validation_attempted, env_validation_file, env_validation_exit, env_validation_strict, skip_env_validation) but the bash invocation line (~line 83) still only passes the original 12 args. The 7 new variables are never appended to the Python invocation. This will produce a ValueError: not enough values to unpack on every preflight run, breaking dream-doctor.sh and the installer.

3. validate-env.sh -- duplicate usage() function
The diff adds a new usage() function (with the full --strict/--json docs) but does NOT remove the original usage() that remains further down in the file. Bash will silently use the second definition, meaning the new help text will never display. The old usage() also references the now-removed positional-arg interface. One of them needs to be removed.

4. validate-env.sh -- unknown-key check silently removed
The diff removes the unknown-key detection loop but the reporting section that references the unknown array still exists downstream. Unknown keys will never be detected, silently weakening validation. If this was intentional (related to --allow-unknown), the conditional logic is missing.


CLAUDE.md Convention Violations

5. || true used twice in migrate-config.sh (~line 253-254 in diff)
cmd_autofix_env || true and cmd_validate --warn-only >/dev/null || true violate the CLAUDE.md rule: "Never || true or 2>/dev/null." The second line also uses >/dev/null. If these failures are intentionally non-fatal, use the approved pattern: cmd_autofix_env || log_warn "autofix-env failed (non-fatal)".

6. Multiple set +e blocks in migrate-config.sh
Four set +e / set -e blocks to capture compare_versions return codes. Toggling set -e is fragile -- any command failing between set +e and set -e is silently swallowed. Safer pattern: compare_versions "$a" "$b" && result=0 || result=$?.

7. Broad except Exception in preflight-engine.sh Python (line ~661 in diff)
CLAUDE.md rule: "No broad or silent catches." This should catch specific exceptions (json.JSONDecodeError, FileNotFoundError, KeyError) that map to distinct statuses.

8. 2>/dev/null in migrate-config.sh get_current_version()
jq errors are suppressed with 2>/dev/null. Per CLAUDE.md, errors should be visible or logged.


Minor / Non-blocking

9. cmd_autofix_env uses ((changed+=1)) which returns exit code 1 when changed goes from 0 to 1 under set -e (arithmetic returning 0 is falsy in bash). Use : $((changed += 1)) instead.

10. dream-doctor.sh references PREFLIGHT_ENV_VALIDATION_STATUS etc. that are only emitted in --env mode. If preflight crashes due to bug #2, these vars will be unset and the ${VAR:-not_run} defaults will silently mask the failure.


Summary

Bugs #1 and #2 are ship-blockers -- they will crash dream-cli start/update and every preflight invocation. Bug #3 and #4 are functional regressions. The || true and broad-catch patterns need updating per CLAUDE.md conventions. The overall design (schema-driven mode requirements, deprecation auto-fix, JSON reporting) is sound and well-tested.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants