Conversation
- Dockerfile parser with multi-stage, continuation lines, variable resolution - Contract extractor producing Runtime Contract Model (RCM) - Goss generator outputting goss.yml and goss_wait.yml - Evidence engine for dynamic probe mode (docker/podman) - Interactive mode with dialoguer for guided Q&A - Config file support (.dgossgen.yml) with policy knobs - CLI with clap: init, probe, explain, lint commands - 34 unit tests covering all core modules https://claude.ai/code/session_015HNyDXJGiqJrdijAQxZG64
- 5 fixture Dockerfiles: nginx, node multistage, python, go minimal, complex healthcheck - 20 golden tests covering contract extraction, generation, profiles, idempotence, stable ordering, wait file generation, provenance comments https://claude.ai/code/session_015HNyDXJGiqJrdijAQxZG64
- CI workflow: check, test, fmt, clippy, multi-OS build (ubuntu, macos) - Release workflow: multi-target builds (linux gnu/musl, macos x86/arm), automatic GitHub Release on tag push https://claude.ai/code/session_015HNyDXJGiqJrdijAQxZG64
- Installation, quick start, all CLI commands with options - Architecture diagram and how-it-works explanation - Profile comparison table, configuration reference - Example inputs/outputs for nginx and Go Dockerfiles - Security considerations, exit codes, development guide https://claude.ai/code/session_015HNyDXJGiqJrdijAQxZG64
- Replace manual strip_suffix/strip_prefix with idiomatic methods - Use is_some_and instead of map_or(false, ...) - Derive Default for VariableResolver - Collapse consecutive str::replace calls - Fix never_loop clippy error in interactive preview - Add PatternDetector type alias to reduce type complexity https://claude.ai/code/session_015HNyDXJGiqJrdijAQxZG64
- Add Formula/dgossgen.rb template for dortort/homebrew-tap - Update release workflow: SHA256 checksums, auto-generate and push formula to tap repo on tagged releases - Add Homebrew installation instructions to README Setup required: 1. Create dortort/homebrew-tap repo with Formula/ directory 2. Create HOMEBREW_TAP_TOKEN secret with repo write access 3. Tag a release: git tag v0.1.0 && git push --tags https://claude.ai/code/session_015HNyDXJGiqJrdijAQxZG64
Auto-detects OS (Linux/macOS) and architecture (x86_64/aarch64), downloads the latest release binary, and installs to /usr/local/bin. Supports INSTALL_DIR override for custom install location. https://claude.ai/code/session_015HNyDXJGiqJrdijAQxZG64
Replace the static whitelist gate with a universal detection approach: every package extracted from apt-get/apk/pip/npm install commands now gets a package-manager-native assertion (dpkg -s, apk info -e, pip show, npm list -g). Known version commands (nginx -v, curl --version, etc.) are layered on as optional enrichment rather than being required for assertion generation. This means unknown/arbitrary packages now produce tests automatically instead of being silently dropped. Also adds: - PackageInstalled variant to AssertionKind with PackageManager enum - Low-value package skip list (ca-certificates, gnupg, etc.) - pip flag-remnant filtering to avoid false positives - Tests for unknown packages, pip packages, and low-value filtering https://claude.ai/code/session_015HNyDXJGiqJrdijAQxZG64
Add detection for `composer require vendor/package` in Dockerfile RUN commands. Packages are verified via `composer show vendor/package` as the native check. Version constraints (e.g. :^6.0) are stripped. Includes: - PackageManager::Composer variant - Regex pattern for composer require with flag filtering - composer --version in known_version_cmd enrichment table - PHP Composer test fixture and 7 new tests (3 unit + 4 golden) https://claude.ai/code/session_015HNyDXJGiqJrdijAQxZG64
- Use strip_prefix instead of starts_with + manual slicing in npm scoped-package handler (clippy::manual_strip) - Apply rustfmt to heuristics.rs for CI compliance https://claude.ai/code/session_015HNyDXJGiqJrdijAQxZG64
dortort
left a comment
There was a problem hiding this comment.
PR Review: feat: implement core dgossgen tool
Build & Test Verification
- Build: Compiles cleanly on Rust 1.93.0
- Tests: 65/65 passing (41 unit + 24 golden)
- Clippy: Clean (zero warnings with
-D warnings) - Formatting: Clean (
cargo fmt --checkpasses)
Architecture: APPROVE (with recommendations)
The pipeline design (parser -> extractor -> generator -> probe) is clean, maps directly to the domain, and has well-defined input/output boundaries at each stage. The RuntimeContract model captures what the container promises rather than how the Dockerfile builds it — an excellent abstraction choice. Provenance tracking (origin, source line, confidence) on every assertion is a standout feature.
Strengths
- Clean unidirectional pipeline with testable stages
Confidenceenum withOrdenables elegant profile-based filtering- Good test fixture coverage (6 Dockerfile archetypes)
- Solid CI/CD with multi-OS builds, SHA256 checksums, and Homebrew formula auto-generation
- Sensible, idiomatic Rust dependency choices
Issues Found
CRITICAL (2) — Must fix before merge
1. Command Injection via --run-arg (probe/mod.rs:127-129)
The --run-arg CLI parameter is passed directly to docker run/podman run. While Command::arg() prevents shell metacharacter injection, the threat is semantic injection — a user can pass --privileged, -v /:/host, --network=host, etc., overriding the --network none isolation. Needs an allowlist/denylist of permitted flags.
2. HEALTHCHECK Flag Parsing Drops Last Token (parser/mod.rs:403-428)
remaining.find(' ') returns None when a flag like --retries=3 is the last token before CMD, silently dropping it. Fix: let end = remaining.find(' ').unwrap_or(remaining.len());
HIGH (5) — Should fix before merge
3. Unsanitized Dockerfile Content in Shell Commands (extractor/mod.rs:63, generator/mod.rs:361-364)
sanitize_command() only strips \0 and \r. Shell metacharacters (;, |, $(), backticks) are preserved. The USER UID interpolation via format!("id -u | grep -q {}", resolved) doesn't validate that resolved is actually numeric.
4. Global ARGs Before First FROM Are Silently Ignored (parser/mod.rs:527-531)
A common pattern like ARG BASE_IMAGE=ubuntu:22.04 / FROM $BASE_IMAGE produces incorrect output since global ARGs aren't tracked.
5. Regex Compiled on Every Call (extractor/heuristics.rs:254-392)
package_install_patterns() compiles 5 regexes per invocation, called once per RUN instruction. Use std::sync::LazyLock (stable since 1.80) or OnceLock.
6. install.sh Lacks Checksum Verification
Downloads a binary from GitHub without verifying SHA256 integrity, despite .sha256 files being published by the release workflow.
7. Duplicate File Assertions for Entrypoint Scripts (extractor/mod.rs:209-243)
Both a generic FileExists and an entrypoint-specific FileExists(mode: "0755") are created for the same path. identity_key() deduplication may drop the mode assertion non-deterministically.
MEDIUM (8) — Should fix before v1.0
8. main.rs Contains ~200 Lines of Business Logic — Lint engine, LintIssue, write_output, etc. should be extracted to separate modules for testability.
9. Hand-Rolled YAML Rendering (generator/render.rs) — String concatenation with manual escaping risks YAML injection. Either use serde_yaml serialization or add a round-trip validation pass.
10. Component Detection False Positives (extractor/heuristics.rs:44-73) — Substring matching: "node" matches "download", "java" matches "javascript". Use word-boundary regex.
11. Variable Resolver Char/Byte Indexing Bug Risk (parser/resolver.rs:48-101) — Converts to Vec<char> but indexes into the &str using char-based positions. Will panic on non-ASCII content before ${} references.
12. Probe Timeout Not Implemented (probe/mod.rs:163) — _timeout: Duration is accepted but unused. docker exec commands can hang indefinitely.
13. Container Cleanup Not Guaranteed (probe/mod.rs:117-157) — No RAII guard; panic between container creation and cleanup leaves orphaned containers.
14. Config version_cmd Becomes Shell Command (config/mod.rs) — A malicious .dgossgen.yml in a cloned repo could inject arbitrary commands via the version_cmd field.
15. serde_yaml Is Deprecated — The crate is archived/unmaintained. Consider migrating to serde_yml or another actively maintained alternative.
LOW (6) — Nice to have
- 16.
thiserror,indicatif,indexmap,insta,pretty_assertionsare declared as dependencies but never used — dead weight - 17.
open_in_editordoesn't check editor exit status - 18.
has_unresolved()produces false positives for literal$in shell commands - 19. No CLI integration tests (consider
assert_cmd) - 20. GitHub Actions
dmnemec/copy_file_to_another_repo_action@mainnot hash-pinned - 21. Warnings emitted after files already written in non-interactive mode
Unused Dependencies Audit
| Dependency | Type | Status |
|---|---|---|
thiserror |
runtime | Never used (no #[derive(Error)] anywhere) |
indicatif |
runtime | Never imported |
indexmap |
runtime | Never imported |
insta |
dev | Never imported |
pretty_assertions |
dev | Never imported |
Summary
| Severity | Count |
|---|---|
| CRITICAL | 2 |
| HIGH | 5 |
| MEDIUM | 8 |
| LOW | 6 |
Verdict: REQUEST CHANGES — The CRITICAL and HIGH items (especially command injection via --run-arg, HEALTHCHECK flag parsing bug, and unsanitized shell command construction) should be addressed before merging. The architecture is sound and extensible, the test coverage is good for v0.1.0, and the pipeline design is clean. Addressing the critical/high items would make this ready to merge.
The non-interactive init path returned exit code 2 immediately when any warnings existed, skipping file output entirely. This caused Dockerfiles with a mix of low-confidence (skipped) and medium/high-confidence (valid) assertions to produce zero output files. Move write_output before the early return so goss.yml and goss_wait.yml are always written, while still signalling warnings via exit code 2.
https://claude.ai/code/session_015HNyDXJGiqJrdijAQxZG64