
Validate VCF sample options and other CLI args upfront #2

Merged
tfenne merged 1 commit into main from tf/validate-vcf-options
Apr 15, 2026

@tfenne (Collaborator) commented Apr 15, 2026

Summary

  • Validate VCF --sample name exists in the VCF header upfront, before the simulation loop, with an error message that lists available samples
  • Require --sample for multi-sample VCFs (single-sample VCFs still work without it)
  • Reject --sample without --vcf
  • Reject output prefix when parent directory doesn't exist
  • Add write_vcf_header_only() test helper for creating multi-sample VCFs
  • Add 6 integration tests covering all validation paths
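As a rough illustration of the two filesystem-only checks listed above (rejecting --sample without --vcf, and rejecting an output prefix whose parent directory is missing), here is a minimal sketch. The `SimulateArgs` struct, the `preflight` function, and the error strings are hypothetical stand-ins, not the code in src/commands/simulate.rs, and the sketch returns `Result<(), String>` instead of using the `bail!` macro so that it depends only on the standard library.

```rust
use std::path::PathBuf;

/// Hypothetical subset of the simulate command's arguments (an assumption,
/// not the real struct from this PR).
pub struct SimulateArgs {
    pub vcf: Option<PathBuf>,
    pub sample: Option<String>,
    pub output: PathBuf,
}

/// Runs the cheap argument checks before any simulation work starts.
pub fn preflight(args: &SimulateArgs) -> Result<(), String> {
    // --sample only makes sense when a VCF is supplied.
    if args.sample.is_some() && args.vcf.is_none() {
        return Err("--sample requires --vcf".to_string());
    }

    // The output value is a prefix, so it is the parent directory that must exist.
    if let Some(parent) = args.output.parent() {
        // A bare prefix such as "out" has an empty parent, meaning the
        // current directory, so there is nothing to check in that case.
        if !parent.as_os_str().is_empty() && !parent.is_dir() {
            return Err(format!(
                "output directory does not exist: {}",
                parent.display()
            ));
        }
    }
    Ok(())
}
```

Running both checks before opening any input keeps the failure fast and the message specific to the offending flag.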

Test plan

  • cargo ci-fmt passes
  • cargo ci-lint passes
  • cargo ci-test passes (177/177 tests, including 6 new validation tests)

Summary by CodeRabbit

  • Bug Fixes
    • Improved validation for --sample and --vcf option combinations with clearer error messages
    • Enhanced error messages to display up to 10 available VCF sample names (previously 5)
    • Added validation to verify the output directory exists before simulation starts
    • Strengthened VCF sample configuration validation at startup

Previously, an invalid --sample name or a multi-sample VCF without
--sample would panic or produce a cryptic error deep in the per-contig
simulation loop.  Now these checks run during argument validation so
the user gets a clear, actionable error message immediately.

- Add validate_vcf_sample() to open the VCF header and verify sample
  configuration before simulation begins
- Improve error messages to list available sample names
- Reject --sample without --vcf
- Reject output prefix when parent directory doesn't exist
- Add write_vcf_header_only() test helper for multi-sample VCFs
- Add 6 integration tests covering all validation paths
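The sample checks described above could look roughly like the sketch below. It is an assumption-laden illustration, not the PR's `validate_vcf_sample` or `resolve_sample_index`: the signatures, the error wording, the `sample_names` helper, and the handling of a VCF with no sample columns are all invented here. The only things taken from the PR are the behaviors (named sample must exist, multi-sample VCFs need --sample, single-sample VCFs do not) and the limit of 10 listed names.

```rust
/// Extracts sample names from the `#CHROM` header line of VCF text.
/// Sample columns follow the nine fixed columns (CHROM through FORMAT).
pub fn sample_names(vcf_text: &str) -> Vec<String> {
    vcf_text
        .lines()
        .find(|line| line.starts_with("#CHROM"))
        .map(|line| line.split('\t').skip(9).map(str::to_string).collect())
        .unwrap_or_default()
}

/// Resolves the requested sample to a column index (hypothetical signature).
pub fn resolve_sample_index(
    samples: &[String],
    requested: Option<&str>,
) -> Result<usize, String> {
    // Cap on how many names are echoed back, matching the limit named in the PR.
    const MAX_LISTED: usize = 10;
    let listing = || {
        let mut shown = samples
            .iter()
            .take(MAX_LISTED)
            .cloned()
            .collect::<Vec<_>>()
            .join(", ");
        if samples.len() > MAX_LISTED {
            shown.push_str(", ...");
        }
        shown
    };

    match requested {
        // A named sample must appear in the header.
        Some(name) => samples.iter().position(|s| s == name).ok_or_else(|| {
            format!("sample '{name}' not found in VCF; available samples: {}", listing())
        }),
        // A single-sample VCF needs no --sample flag.
        None if samples.len() == 1 => Ok(0),
        // Assumption: a VCF without sample columns is treated as an error here.
        None if samples.is_empty() => Err("VCF contains no samples".to_string()),
        // Multi-sample VCFs are ambiguous without --sample.
        None => Err(format!(
            "VCF has {} samples; specify one with --sample. Available samples: {}",
            samples.len(),
            listing()
        )),
    }
}
```

Because only the header is read, a check of this shape can run during argument validation without touching any variant records.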

coderabbitai bot commented Apr 15, 2026

📝 Walkthrough

The changes add preflight validation to the simulate command, verifying that --sample requires --vcf, validating VCF sample configurations, and checking that the output directory exists. A new VCF validation function is introduced with enhanced error messaging, and comprehensive integration tests are added to verify validation behavior.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Validation Logic: src/commands/simulate.rs, src/vcf/mod.rs | Enhanced Simulate::validate to reject --sample without --vcf, validate VCF sample selections via new validate_vcf_sample function, and verify output parent directory exists. Improved error messages in resolve_sample_index to show up to 10 available samples. |
| Test Infrastructure: tests/helpers/mod.rs | Added TestEnv::write_vcf_header_only method to generate minimal VCF files with headers and sample columns for testing. |
| Test Coverage: tests/test_simulate.rs | Added integration tests validating CLI behavior for --sample and --vcf option combinations, sample matching, multi-sample VCF handling, and output directory existence checks. |
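For readers unfamiliar with what a header-only VCF looks like, the sketch below shows one plausible shape for the test helper. It is written as a free function with a hypothetical signature, since the real `TestEnv::write_vcf_header_only` method is not shown on this page; the file contents follow the VCF convention of a `##fileformat` line followed by a tab-separated `#CHROM` line whose sample columns come after FORMAT.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Writes a VCF that contains only header lines and the given sample columns.
/// Hypothetical stand-in for the `TestEnv::write_vcf_header_only` method.
pub fn write_vcf_header_only(
    dir: &Path,
    name: &str,
    samples: &[&str],
) -> io::Result<PathBuf> {
    // The eight mandatory columns of the #CHROM header line.
    let mut columns = vec!["#CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER", "INFO"];
    // FORMAT and the sample columns are only present when there are samples.
    if !samples.is_empty() {
        columns.push("FORMAT");
        columns.extend_from_slice(samples);
    }
    let contents = format!("##fileformat=VCFv4.3\n{}\n", columns.join("\t"));
    let path = dir.join(name);
    fs::write(&path, contents)?;
    Ok(path)
}
```

A file like this has no variant records, which is enough for tests that only exercise header-based validation.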

Sequence Diagram

sequenceDiagram
    participant User as User/CLI
    participant Simulate as Simulate::validate
    participant VCF as vcf::validate_vcf_sample
    participant Output as Output Dir Check

    User->>Simulate: simulate --sample X --vcf file.vcf --output prefix
    Simulate->>Simulate: Check --sample without --vcf
    alt --sample without --vcf
        Simulate-->>User: Error: bail!
    end
    
    Simulate->>VCF: validate_vcf_sample(path, sample_name)
    VCF->>VCF: Open VCF, read header
    VCF->>VCF: resolve_sample_index(sample_name)
    alt Sample not found
        VCF-->>Simulate: Error with available samples
        Simulate-->>User: Error: bail!
    end
    
    Simulate->>Output: Check parent directory exists
    alt Directory missing
        Output-->>Simulate: Error
        Simulate-->>User: Error: bail!
    end
    
    alt All validations pass
        Simulate-->>User: Proceed to run_simulation
    end

Estimated Code Review Effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

🐰 A fluffy bunny hops with glee,
Validating VCF samples, one, two, three!
No sample without a VCF in sight,
Output dirs checked—everything's right! ✨
Tests blooming like carrots, orange and bright!

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately and clearly summarizes the main objective of the PR: adding upfront validation for VCF sample options and other CLI arguments before simulation execution. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |

@coderabbitai coderabbitai bot left a comment

🧹 Nitpick comments (1)
tests/test_simulate.rs (1)

2976-3111: Add /// doc comments to the six new validation tests.

The new test cases are non-trivial and should follow the same doc-comment style used by the surrounding tests for consistency and discoverability.

As per coding guidelines, "Add doc comments on all public and non-trivial private items".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/test_simulate.rs` around lines 2976 - 3111, Add triple-slash doc
comments (///) above each of the six new tests explaining what the test
validates: place short one-line descriptions above
test_sample_without_vcf_fails, test_wrong_sample_name_fails,
test_multi_sample_vcf_without_sample_flag_fails,
test_single_sample_vcf_without_sample_flag_works,
test_multi_sample_vcf_with_correct_sample_works, and
test_output_directory_does_not_exist_fails that mirror surrounding tests' style
(e.g., "Checks that ... fails/works when ..."), keeping them concise and focused
on the behavior asserted.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 954a622b-5651-49c3-aaef-f4b3f81a3407

📥 Commits

Reviewing files that changed from the base of the PR and between 5f88918 and 0a3f51d.

📒 Files selected for processing (4)
  • src/commands/simulate.rs
  • src/vcf/mod.rs
  • tests/helpers/mod.rs
  • tests/test_simulate.rs

@tfenne tfenne merged commit ce9c63d into main Apr 15, 2026
5 checks passed
@tfenne tfenne deleted the tf/validate-vcf-options branch April 15, 2026 20:41