Skip to content

feat: implement core dgossgen tool#1

Merged
dortort merged 24 commits intomainfrom
claude/dockerfile-dgoss-converter-zHseA
Feb 12, 2026
Merged

feat: implement core dgossgen tool#1
dortort merged 24 commits intomainfrom
claude/dockerfile-dgoss-converter-zHseA

Conversation

@dortort
Copy link
Owner

@dortort dortort commented Feb 12, 2026

  • Dockerfile parser with multi-stage, continuation lines, variable resolution
  • Contract extractor producing Runtime Contract Model (RCM)
  • Goss generator outputting goss.yml and goss_wait.yml
  • Evidence engine for dynamic probe mode (docker/podman)
  • Interactive mode with dialoguer for guided Q&A
  • Config file support (.dgossgen.yml) with policy knobs
  • CLI with clap: init, probe, explain, lint commands
  • 34 unit tests covering all core modules

https://claude.ai/code/session_015HNyDXJGiqJrdijAQxZG64

- Dockerfile parser with multi-stage, continuation lines, variable resolution
- Contract extractor producing Runtime Contract Model (RCM)
- Goss generator outputting goss.yml and goss_wait.yml
- Evidence engine for dynamic probe mode (docker/podman)
- Interactive mode with dialoguer for guided Q&A
- Config file support (.dgossgen.yml) with policy knobs
- CLI with clap: init, probe, explain, lint commands
- 34 unit tests covering all core modules

https://claude.ai/code/session_015HNyDXJGiqJrdijAQxZG64
- 5 fixture Dockerfiles: nginx, node multistage, python, go minimal, complex healthcheck
- 20 golden tests covering contract extraction, generation, profiles,
  idempotence, stable ordering, wait file generation, provenance comments

https://claude.ai/code/session_015HNyDXJGiqJrdijAQxZG64
- CI workflow: check, test, fmt, clippy, multi-OS build (ubuntu, macos)
- Release workflow: multi-target builds (linux gnu/musl, macos x86/arm),
  automatic GitHub Release on tag push

https://claude.ai/code/session_015HNyDXJGiqJrdijAQxZG64
- Installation, quick start, all CLI commands with options
- Architecture diagram and how-it-works explanation
- Profile comparison table, configuration reference
- Example inputs/outputs for nginx and Go Dockerfiles
- Security considerations, exit codes, development guide

https://claude.ai/code/session_015HNyDXJGiqJrdijAQxZG64
- Replace manual strip_suffix/strip_prefix with idiomatic methods
- Use is_some_and instead of map_or(false, ...)
- Derive Default for VariableResolver
- Collapse consecutive str::replace calls
- Fix never_loop clippy error in interactive preview
- Add PatternDetector type alias to reduce type complexity

https://claude.ai/code/session_015HNyDXJGiqJrdijAQxZG64
- Add Formula/dgossgen.rb template for dortort/homebrew-tap
- Update release workflow: SHA256 checksums, auto-generate and push
  formula to tap repo on tagged releases
- Add Homebrew installation instructions to README

Setup required:
  1. Create dortort/homebrew-tap repo with Formula/ directory
  2. Create HOMEBREW_TAP_TOKEN secret with repo write access
  3. Tag a release: git tag v0.1.0 && git push --tags

https://claude.ai/code/session_015HNyDXJGiqJrdijAQxZG64
Auto-detects OS (Linux/macOS) and architecture (x86_64/aarch64),
downloads the latest release binary, and installs to /usr/local/bin.
Supports INSTALL_DIR override for custom install location.

https://claude.ai/code/session_015HNyDXJGiqJrdijAQxZG64
Replace the static whitelist gate with a universal detection approach:
every package extracted from apt-get/apk/pip/npm install commands now
gets a package-manager-native assertion (dpkg -s, apk info -e, pip show,
npm list -g). Known version commands (nginx -v, curl --version, etc.)
are layered on as optional enrichment rather than being required for
assertion generation.

This means unknown/arbitrary packages now produce tests automatically
instead of being silently dropped.

Also adds:
- PackageInstalled variant to AssertionKind with PackageManager enum
- Low-value package skip list (ca-certificates, gnupg, etc.)
- pip flag-remnant filtering to avoid false positives
- Tests for unknown packages, pip packages, and low-value filtering

https://claude.ai/code/session_015HNyDXJGiqJrdijAQxZG64
Add detection for `composer require vendor/package` in Dockerfile RUN
commands. Packages are verified via `composer show vendor/package` as
the native check. Version constraints (e.g. :^6.0) are stripped.

Includes:
- PackageManager::Composer variant
- Regex pattern for composer require with flag filtering
- composer --version in known_version_cmd enrichment table
- PHP Composer test fixture and 7 new tests (3 unit + 4 golden)

https://claude.ai/code/session_015HNyDXJGiqJrdijAQxZG64
- Use strip_prefix instead of starts_with + manual slicing in npm
  scoped-package handler (clippy::manual_strip)
- Apply rustfmt to heuristics.rs for CI compliance

https://claude.ai/code/session_015HNyDXJGiqJrdijAQxZG64
Copy link
Owner Author

@dortort dortort left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR Review: feat: implement core dgossgen tool

Build & Test Verification

  • Build: Compiles cleanly on Rust 1.93.0
  • Tests: 65/65 passing (41 unit + 24 golden)
  • Clippy: Clean (zero warnings with -D warnings)
  • Formatting: Clean (cargo fmt --check passes)

Architecture: APPROVE (with recommendations)

The pipeline design (parser -> extractor -> generator -> probe) is clean, maps directly to the domain, and has well-defined input/output boundaries at each stage. The RuntimeContract model captures what the container promises rather than how the Dockerfile builds it — an excellent abstraction choice. Provenance tracking (origin, source line, confidence) on every assertion is a standout feature.

Strengths

  • Clean unidirectional pipeline with testable stages
  • Confidence enum with Ord enables elegant profile-based filtering
  • Good test fixture coverage (6 Dockerfile archetypes)
  • Solid CI/CD with multi-OS builds, SHA256 checksums, and Homebrew formula auto-generation
  • Sensible, idiomatic Rust dependency choices

Issues Found

CRITICAL (2) — Must fix before merge

1. Command Injection via --run-arg (probe/mod.rs:127-129)
The --run-arg CLI parameter is passed directly to docker run/podman run. While Command::arg() prevents shell metacharacter injection, the threat is semantic injection — a user can pass --privileged, -v /:/host, --network=host, etc., overriding the --network none isolation. Needs an allowlist/denylist of permitted flags.

2. HEALTHCHECK Flag Parsing Drops Last Token (parser/mod.rs:403-428)
remaining.find(' ') returns None when a flag like --retries=3 is the last token before CMD, silently dropping it. Fix: let end = remaining.find(' ').unwrap_or(remaining.len());

HIGH (5) — Should fix before merge

3. Unsanitized Dockerfile Content in Shell Commands (extractor/mod.rs:63, generator/mod.rs:361-364)
sanitize_command() only strips \0 and \r. Shell metacharacters (;, |, $(), backticks) are preserved. The USER UID interpolation via format!("id -u | grep -q {}", resolved) doesn't validate that resolved is actually numeric.

4. Global ARGs Before First FROM Are Silently Ignored (parser/mod.rs:527-531)
A common pattern like ARG BASE_IMAGE=ubuntu:22.04 / FROM $BASE_IMAGE produces incorrect output since global ARGs aren't tracked.

5. Regex Compiled on Every Call (extractor/heuristics.rs:254-392)
package_install_patterns() compiles 5 regexes per invocation, called once per RUN instruction. Use std::sync::LazyLock (stable since 1.80) or OnceLock.

6. install.sh Lacks Checksum Verification
Downloads a binary from GitHub without verifying SHA256 integrity, despite .sha256 files being published by the release workflow.

7. Duplicate File Assertions for Entrypoint Scripts (extractor/mod.rs:209-243)
Both a generic FileExists and an entrypoint-specific FileExists(mode: "0755") are created for the same path. identity_key() deduplication may drop the mode assertion non-deterministically.

MEDIUM (8) — Should fix before v1.0

8. main.rs Contains ~200 Lines of Business Logic — Lint engine, LintIssue, write_output, etc. should be extracted to separate modules for testability.

9. Hand-Rolled YAML Rendering (generator/render.rs) — String concatenation with manual escaping risks YAML injection. Either use serde_yaml serialization or add a round-trip validation pass.

10. Component Detection False Positives (extractor/heuristics.rs:44-73) — Substring matching: "node" matches "download", "java" matches "javascript". Use word-boundary regex.

11. Variable Resolver Char/Byte Indexing Bug Risk (parser/resolver.rs:48-101) — Converts to Vec<char> but indexes into the &str using char-based positions. Will panic on non-ASCII content before ${} references.

12. Probe Timeout Not Implemented (probe/mod.rs:163)_timeout: Duration is accepted but unused. docker exec commands can hang indefinitely.

13. Container Cleanup Not Guaranteed (probe/mod.rs:117-157) — No RAII guard; panic between container creation and cleanup leaves orphaned containers.

14. Config version_cmd Becomes Shell Command (config/mod.rs) — A malicious .dgossgen.yml in a cloned repo could inject arbitrary commands via the version_cmd field.

15. serde_yaml Is Deprecated — The crate is archived/unmaintained. Consider migrating to serde_yml or another actively maintained alternative.

LOW (6) — Nice to have

  • 16. thiserror, indicatif, indexmap, insta, pretty_assertions are declared as dependencies but never used — dead weight
  • 17. open_in_editor doesn't check editor exit status
  • 18. has_unresolved() produces false positives for literal $ in shell commands
  • 19. No CLI integration tests (consider assert_cmd)
  • 20. GitHub Actions dmnemec/copy_file_to_another_repo_action@main not hash-pinned
  • 21. Warnings emitted after files already written in non-interactive mode

Unused Dependencies Audit

Dependency Type Status
thiserror runtime Never used (no #[derive(Error)] anywhere)
indicatif runtime Never imported
indexmap runtime Never imported
insta dev Never imported
pretty_assertions dev Never imported

Summary

Severity Count
CRITICAL 2
HIGH 5
MEDIUM 8
LOW 6

Verdict: REQUEST CHANGES — The CRITICAL and HIGH items (especially command injection via --run-arg, HEALTHCHECK flag parsing bug, and unsanitized shell command construction) should be addressed before merging. The architecture is sound and extensible, the test coverage is good for v0.1.0, and the pipeline design is clean. Addressing the critical/high items would make this ready to merge.

@dortort dortort merged commit 7f493d1 into main Feb 12, 2026
6 checks passed
@dortort dortort deleted the claude/dockerfile-dgoss-converter-zHseA branch February 16, 2026 21:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants