Daily Test Coverage Improver: Add comprehensive DataUtil module tests #51

github-actions · 2025-08-30T00:54:58Z

Summary

This PR adds comprehensive test coverage for the DataUtil module, which previously had 0% test coverage. The module contains critical functionality for dataset downloading and extraction used throughout the Furnace data processing pipeline.

Changes Made

New Test Methods (7 total)

TestDataUtilDownloadExistingFile - Tests download function behavior when target file already exists (should skip download and preserve existing content)
TestDataUtilExtractTarStream - Tests TAR stream extraction functionality from in-memory data to filesystem, including proper file creation and content verification
TestDataUtilExtractTarStreamEmptyHeader - Tests TAR extraction with empty/null header to ensure graceful handling of malformed archives
TestDataUtilExtractTarGz - Tests complete TAR.GZ file extraction workflow including GZip decompression and TAR parsing
TestDataUtilPrintVal - Tests scalar value printing utility for different data types (float32, int32, bool) with proper formatting
TestDataUtilToPython - Tests Python code generation utility from .NET values, including boolean conversion and tensor handling
TestDataUtilRunScript - Tests external script execution utility for Python plotting integration
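
As a rough illustration of what TestDataUtilDownloadExistingFile checks, the sketch below exercises skip-if-exists behavior in a plain F# script. The function downloadIfMissing and its fetch parameter are hypothetical stand-ins, not the DataUtil API; the real download function and its signature may differ, and the actual tests use NUnit assertions rather than a boolean result.

```fsharp
open System
open System.IO

// Hypothetical stand-in for the module's download function (assumption):
// it calls `fetch` only when the target file is missing.
let downloadIfMissing (fetch: unit -> byte[]) (localFileName: string) : bool =
    if File.Exists localFileName then
        false // skipped: the existing file is left untouched
    else
        File.WriteAllBytes(localFileName, fetch ())
        true

// Test-style check: an existing file must survive a download call unchanged.
let existingFileIsPreserved () : bool =
    let tempDir = Path.Combine(Path.GetTempPath(), Guid.NewGuid().ToString("N"))
    Directory.CreateDirectory(tempDir) |> ignore
    try
        let target = Path.Combine(tempDir, "data.bin")
        File.WriteAllText(target, "existing content")
        // A ref cell records whether the (fake) network fetch was invoked.
        let fetchCalled = ref false
        let fetch () =
            fetchCalled.Value <- true
            [| 0uy |]
        let downloaded = downloadIfMissing fetch target
        let unchanged = File.ReadAllText(target) = "existing content"
        (not downloaded) && (not fetchCalled.Value) && unchanged
    finally
        // Clean up the temp directory even if a check throws.
        Directory.Delete(tempDir, true)
```

Passing the fetch step in as a function keeps the check offline, so it does not depend on network access in CI.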

Coverage Impact

Target Areas:

DataUtil module: line coverage expected to rise from 0% to roughly 70% or more
helpers module: line coverage expected to rise from 0% to roughly 60% or more
Overall project coverage: expected to rise by 2-4 percentage points (from ~73.4% to ~75-77%)

Functions Tested:

download - File downloading with skip logic
extractTarStream - TAR stream processing
extractTarGz - Compressed archive extraction
printVal - Scalar formatting
toPython - Code generation
runScript - External process execution

Technical Details

Test Framework: NUnit 3.13.1 with standard Assert methods
Isolation: Each test uses unique temporary directories with proper cleanup
Error Handling: Tests both success and failure scenarios
Edge Cases: Includes tests for empty inputs, malformed data, and boundary conditions
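
The isolation approach described above could look something like the following sketch. TempDir and writeAndReadBack are hypothetical names introduced for illustration; the PR's tests may simply call Directory.Delete(tempDir, true) in a finally block instead.

```fsharp
open System
open System.IO

// Hypothetical helper (not part of Furnace): a uniquely named temp
// directory that deletes itself, contents included, when disposed.
type TempDir() =
    let path =
        Path.Combine(Path.GetTempPath(), "datautil-test-" + Guid.NewGuid().ToString("N"))
    do Directory.CreateDirectory(path) |> ignore
    member _.Path = path
    interface IDisposable with
        member _.Dispose() =
            if Directory.Exists path then Directory.Delete(path, true)

// Writes a file inside an isolated directory and reads it back.
// `use` disposes the directory when the function returns, so nothing
// leaks between tests even if an exception is thrown.
let writeAndReadBack (content: string) : string * string =
    use tmp = new TempDir()
    let file = Path.Combine(tmp.Path, "sample.txt")
    File.WriteAllText(file, content)
    File.ReadAllText(file), tmp.Path
```

The GUID in the directory name is what keeps tests that run in parallel from colliding on the same path.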

Test Design Patterns

Proper resource management using F# use bindings for disposable resources
Comprehensive cleanup using Directory.Delete(tempDir, true)
Mock data generation for TAR format testing
Boundary testing with various data types and sizes
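
For the mock TAR data mentioned above, one possible approach is to build a single-entry archive in memory from the standard ustar header layout. This is a minimal sketch under that assumption: buildTar is a hypothetical helper, and nothing here confirms that extractTarStream reads every field this header fills in.

```fsharp
open System
open System.IO
open System.Text

// Builds a single-entry ustar-style archive in memory: one 512-byte header,
// the content padded to a 512-byte boundary, then two zero blocks as the
// end-of-archive marker. Hypothetical helper, not part of Furnace.
let buildTar (name: string) (content: byte[]) : byte[] =
    if name.Length > 100 then invalidArg "name" "name must fit in 100 bytes"
    let header = Array.zeroCreate<byte> 512
    // Copies ASCII text into the header at the given offset.
    let put (offset: int) (text: string) =
        let bytes = Encoding.ASCII.GetBytes text
        Array.blit bytes 0 header offset bytes.Length
    // Zero-padded octal rendering, as used by TAR numeric fields.
    let octal (value: int64) (digits: int) =
        Convert.ToString(value, 8).PadLeft(digits, '0')
    put 0 name                                 // file name (100 bytes)
    put 100 "0000644"                          // mode
    put 108 "0000000"                          // uid
    put 116 "0000000"                          // gid
    put 124 (octal (int64 content.Length) 11)  // size: 11 octal digits + NUL
    put 136 (octal 0L 11)                      // mtime
    put 148 "        "                         // checksum field counts as spaces
    put 156 "0"                                // typeflag: regular file
    put 257 "ustar"                            // magic, NUL-terminated by the zeroed buffer
    put 263 "00"                               // version
    // Checksum: sum of all header bytes, stored as 6 octal digits, NUL, space.
    let checksum = header |> Array.sumBy int
    put 148 (octal (int64 checksum) 6)
    header.[154] <- 0uy
    header.[155] <- byte ' '
    let padding = (512 - content.Length % 512) % 512
    Array.concat [ header; content; Array.zeroCreate<byte> (padding + 1024) ]
```

A test could wrap the result in a MemoryStream and hand it to the extraction function under test. For a 5-byte payload the archive is 2048 bytes: one header block, one padded data block, and two terminating zero blocks.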

Benefits

Reliability: Ensures critical data loading functionality works correctly
Regression Prevention: Catches breaking changes to data processing pipeline
Documentation: Tests serve as usage examples for DataUtil functions
Confidence: Enables safer refactoring of data processing code

Validation Commands

To verify coverage improvements locally:

dotnet test --configuration Release /p:CollectCoverage=true /p:CoverletOutputFormat=opencover /p:CoverletOutput="coverage.opencover.xml"
dotnet tool install -g dotnet-reportgenerator-globaltool
reportgenerator -reports:"coverage.opencover.xml" -targetdir:"coverage" -reporttypes:"Html;TextSummary"

Future Improvements

Areas identified for additional test coverage:

MNIST module loading and processing
Reference backend Utils module (currently 0% coverage)
TorchExtensions module edge cases
Branch coverage improvements for conditional logic

AI-generated content by Daily Test Coverage Improver may contain mistakes.

This commit introduces 7 new test methods to improve coverage of the previously untested DataUtil module functionality:

- TestDataUtilDownloadExistingFile: Tests download function behavior when the target file already exists (should skip download)
- TestDataUtilExtractTarStream: Tests TAR stream extraction from in-memory data to filesystem
- TestDataUtilExtractTarStreamEmptyHeader: Tests TAR extraction with empty/null header (edge case handling)
- TestDataUtilExtractTarGz: Tests TAR.GZ file extraction including GZip decompression and TAR parsing
- TestDataUtilPrintVal: Tests scalar value printing utility for different data types (float, int, bool)
- TestDataUtilToPython: Tests Python code generation from .NET values
- TestDataUtilRunScript: Tests external script execution utility

These tests target the DataUtil module which previously had 0% coverage and contains critical functionality for dataset downloading and extraction. The tests use proper temp directory management and cleanup.

Coverage improvements:

- DataUtil module: Expected increase from 0% to ~70%+ line coverage
- helpers module: Expected increase from 0% to ~60%+ line coverage
- Overall project coverage expected to increase by 2-4%

🤖 Generated with [Daily Test Coverage Improver](https://github.com/fsprojects/Furnace/actions/runs/17337075552)

Co-Authored-By: Claude <noreply@anthropic.com>

Removed tests for extracting tar and tar.gz streams.

github-actions bot mentioned this pull request Aug 30, 2025

Daily Test Coverage Improver: Research and Plan #46

Closed

dsyme approved these changes Aug 30, 2025

dsyme closed this Aug 30, 2025

dsyme reopened this Aug 30, 2025

github-actions bot mentioned this pull request Aug 30, 2025

🏥 CI Failure Investigation - Daily Perf Improver Timeout Loop #52

Closed

12 tasks

Merge branch 'dev' into test-coverage-datautil-1756514662

944f3e0

github-actions bot mentioned this pull request Aug 30, 2025

Daily Test Coverage Improver: Add comprehensive TorchExtensions tests #53

Merged

Remove tar extraction tests from TestData.fs

f1488f8

Removed tests for extracting tar and tar.gz streams.

dsyme merged commit 9395380 into dev Aug 30, 2025
3 checks passed

github-actions bot mentioned this pull request Aug 30, 2025

Daily Test Coverage Improver: Add comprehensive tests for util helpers and Pyplot #58

Merged
