Implement bedtools correctness test suite with DataFusion execution engine #74

## Summary

Implement a functional/integration test suite that validates GIQL operator correctness against bedtools. Each test generates controlled genomic interval datasets, executes the equivalent operation in both GIQL (transpiled to SQL and executed via DataFusion) and bedtools (via pybedtools), then compares the results.

### Operators to cover

- **INTERSECTS** — validate against `bedtools intersect`, including strand-aware modes (`-s`, `-S`)
- **MERGE** — validate against `bedtools merge`, including strand-aware merging
- **NEAREST** — validate against `bedtools closest`, including k-nearest and distance calculations
- **CLUSTER** — validate against `bedtools cluster`
- **DISTANCE** — validate against `bedtools closest -d` distance output
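
For reference, the strand modes for `bedtools intersect` only add a strand condition on top of the coordinate overlap test: `-s` keeps same-strand overlaps and `-S` keeps opposite-strand overlaps. A minimal pure-Python sketch of that predicate follows, assuming 0-based half-open BED coordinates and stranded input only. The `Interval` tuple and `overlaps` helper are hypothetical names for illustration, not part of GIQL or bedtools.

```python
from typing import NamedTuple, Optional


class Interval(NamedTuple):
    """Hypothetical minimal interval record (not a GIQL type)."""
    chrom: str
    start: int   # 0-based, inclusive
    end: int     # exclusive
    strand: str  # "+" or "-"; unstranded (".") records are not modeled here


def overlaps(a: Interval, b: Interval, strand_mode: Optional[str] = None) -> bool:
    """Return True if a and b overlap under the given strand mode.

    strand_mode: None ignores strand, "same" mirrors -s, "opposite" mirrors -S.
    """
    if strand_mode not in (None, "same", "opposite"):
        raise ValueError(f"unknown strand_mode: {strand_mode!r}")
    if a.chrom != b.chrom:
        return False
    # Half-open intervals overlap when each one starts before the other ends.
    if not (a.start < b.end and b.start < a.end):
        return False
    if strand_mode == "same":
        return a.strand == b.strand
    if strand_mode == "opposite":
        return a.strand != b.strand
    return True
```

A predicate like this can serve as a third, independent check on small hand-written cases when GIQL and bedtools disagree.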

### Architecture

```
tests/integration/bedtools/
├── conftest.py                  # Fixtures: DataFusion session, interval generator
├── test_intersect.py            # INTERSECTS correctness tests
├── test_merge.py                # MERGE correctness tests
├── test_nearest.py              # NEAREST correctness tests
├── test_cluster.py              # CLUSTER correctness tests
├── test_distance.py             # DISTANCE correctness tests
├── test_strand_aware.py         # Cross-operator strand-specific tests
└── utils/
    ├── bedtools_wrapper.py      # Pybedtools wrapper for each operation
    ├── comparison.py            # Result comparison with epsilon tolerance
    ├── data_models.py           # GenomicInterval, ComparisonResult, etc.
    ├── datafusion_engine.py     # DataFusion session setup and GIQL execution
    └── interval_generator.py    # Seeded interval generation for reproducibility
```
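
As a starting point for `data_models.py`, the two models named above could look roughly like this. Only the class names `GenomicInterval` and `ComparisonResult` come from the layout. Every field, the validation rules (including rejecting empty intervals), and the `to_bed6` method are assumptions made for this sketch.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class GenomicInterval:
    """One BED-style interval; fields beyond chrom/start/end are optional."""
    chrom: str
    start: int                    # 0-based, inclusive
    end: int                      # exclusive
    name: Optional[str] = None
    score: Optional[int] = None
    strand: Optional[str] = None  # "+", "-", "." or None

    def __post_init__(self) -> None:
        # Assumption: empty or inverted intervals are rejected up front.
        if self.start < 0 or self.end <= self.start:
            raise ValueError(f"invalid interval: {self.start}-{self.end}")
        if self.strand not in (None, "+", "-", "."):
            raise ValueError(f"invalid strand: {self.strand!r}")

    def to_bed6(self) -> str:
        """Render as a tab-separated BED6 line with placeholders for missing fields."""
        return "\t".join([
            self.chrom,
            str(self.start),
            str(self.end),
            self.name if self.name is not None else ".",
            str(self.score) if self.score is not None else "0",
            self.strand if self.strand is not None else ".",
        ])


@dataclass
class ComparisonResult:
    """Outcome of comparing GIQL output with bedtools output."""
    match: bool
    giql_row_count: int
    bedtools_row_count: int
    differences: List[str] = field(default_factory=list)
```

Making the interval frozen keeps it hashable, which is convenient if generated intervals end up in sets or as dictionary keys.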

### Test pattern

Each test follows a consistent pattern:

1. **Arrange** — Generate intervals using `IntervalGenerator` with a deterministic seed. Load into both a DataFusion session (as Arrow tables) and pybedtools BedTool objects.
2. **Act (bedtools)** — Execute the operation via the pybedtools wrapper.
3. **Act (GIQL)** — Transpile the GIQL query and execute it against DataFusion.
4. **Assert** — Compare results using order-independent comparison with epsilon tolerance for floats and exact matching for integer positions.
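
The Arrange and Assert steps can be sketched without either engine. The code below is one possible shape, not the suite's implementation: `generate_intervals` stands in for `IntervalGenerator`, `rows_match` stands in for the comparison utility, and all names, parameters, and defaults are assumptions.

```python
import math
import random
from typing import List, Sequence, Tuple

IntervalRow = Tuple[str, int, int, str]  # (chrom, start, end, strand)


def generate_intervals(seed: int, n: int,
                       chroms: Sequence[str] = ("chr1", "chr2"),
                       max_pos: int = 10_000,
                       max_len: int = 500) -> List[IntervalRow]:
    """Generate n random intervals; the same seed always yields the same list."""
    rng = random.Random(seed)  # private RNG, so global random state is untouched
    rows = []
    for _ in range(n):
        chrom = rng.choice(chroms)
        start = rng.randrange(0, max_pos)
        length = rng.randint(1, max_len)  # at least 1 bp, so start < end
        strand = rng.choice("+-")
        rows.append((chrom, start, start + length, strand))
    return rows


def values_equal(x, y, eps: float) -> bool:
    """Floats compare within eps; everything else (ints, strings) compares exactly."""
    has_float = isinstance(x, float) or isinstance(y, float)
    both_numeric = isinstance(x, (int, float)) and isinstance(y, (int, float))
    if has_float and both_numeric:
        return math.isclose(x, y, rel_tol=0.0, abs_tol=eps)
    return x == y


def rows_match(actual: Sequence[tuple], expected: Sequence[tuple],
               eps: float = 1e-9) -> bool:
    """Order-independent comparison: each actual row must pair with one expected row."""
    if len(actual) != len(expected):
        return False
    remaining = list(expected)
    for row in actual:
        for i, cand in enumerate(remaining):
            if len(row) == len(cand) and all(
                values_equal(x, y, eps) for x, y in zip(row, cand)
            ):
                del remaining[i]  # consume the match so duplicates are counted
                break
        else:
            return False
    return True
```

The greedy pairing is quadratic and could mispair rows whose float columns sit within `eps` of several candidates, which seems acceptable for small generated datasets. A sort-based comparison would be faster but needs care when nearly equal floats change the sort order.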

### DataFusion execution

Use `datafusion` (the Apache DataFusion Python bindings, which exchange data through PyArrow) as the execution engine. GIQL transpiles to SQL; DataFusion executes it. This checks that GIQL's generated SQL runs and returns correct results on the engine the project targets for production use. The test engine wrapper registers Arrow tables from the interval generator and executes transpiled GIQL queries via `SessionContext.sql()`.
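
A rough sketch of such a wrapper is shown below. It assumes the `datafusion` and `pyarrow` packages are installed and that each table has at least one row. The helper names, the fixed column layout, and `OVERLAP_SQL` are all hypothetical. In particular, `OVERLAP_SQL` is a hand-written stand-in, not actual GIQL transpiler output.

```python
from typing import Dict, List, Sequence, Tuple

COLUMNS = ("chrom", "start", "end", "strand")

# Hand-written overlap join used only as a placeholder for transpiled GIQL.
# "start" and "end" are quoted because they collide with SQL keywords.
OVERLAP_SQL = """
SELECT a.chrom, a."start", a."end", b."start" AS b_start, b."end" AS b_end
FROM a
JOIN b
  ON a.chrom = b.chrom
 AND a."start" < b."end"
 AND b."start" < a."end"
"""


def to_columns(rows: Sequence[Tuple[str, int, int, str]]) -> Dict[str, list]:
    """Pivot (chrom, start, end, strand) row tuples into a column dictionary."""
    return {name: [row[i] for row in rows] for i, name in enumerate(COLUMNS)}


def run_sql(tables: Dict[str, Sequence[Tuple[str, int, int, str]]],
            sql: str) -> List[dict]:
    """Register each table in a fresh DataFusion session and run one query."""
    # Imported lazily so to_columns stays usable without the engine installed.
    import pyarrow as pa
    from datafusion import SessionContext

    ctx = SessionContext()
    for name, rows in tables.items():
        table = pa.table(to_columns(rows))
        # One partition holding all record batches of the table.
        ctx.register_record_batches(name, [table.to_batches()])
    return ctx.sql(sql).to_arrow_table().to_pylist()
```

Creating a new session per call keeps tests isolated from each other at the cost of some setup time. A session-scoped fixture would be the alternative if that cost becomes noticeable.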

### Dependencies

- `pybedtools` — Python wrapper for bedtools CLI
- `bedtools` — System dependency (tests skip gracefully if not installed)
- `datafusion` — Apache DataFusion Python bindings
- `hypothesis` — Property-based test data generation for edge-case discovery
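
To skip gracefully, the suite needs a cheap way to detect the bedtools binary. One possible check using only the standard library is sketched here; the function names are hypothetical.

```python
import shutil
import subprocess
from typing import Optional


def bedtools_version() -> Optional[str]:
    """Return the `bedtools --version` output, or None if bedtools is unusable."""
    exe = shutil.which("bedtools")
    if exe is None:
        return None
    try:
        proc = subprocess.run(
            [exe, "--version"],
            capture_output=True, text=True, timeout=10, check=True,
        )
    except (OSError, subprocess.SubprocessError):
        # Covers a missing or broken binary, a non-zero exit, and a timeout.
        return None
    return proc.stdout.strip() or None


def bedtools_available() -> bool:
    """True when a working bedtools binary is on PATH."""
    return bedtools_version() is not None
```

In `conftest.py` the result could feed a marker such as `pytest.mark.skipif(not bedtools_available(), reason="bedtools not installed")`, applied to every test module in the suite.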

## Motivation

GIQL transpiles spatial genomic queries into SQL, but the existing unit tests only verify that the generated SQL has the expected structure — they do not verify that the SQL produces correct results on real data. bedtools is the de facto standard for genomic interval operations, making it the ideal oracle for correctness testing. Using DataFusion as the execution engine ensures the suite validates correctness on GIQL's target production engine and catches any SQL dialect incompatibilities early.

## Expected outcome

- Integration test suite under `tests/integration/bedtools/` covers the five GIQL operators that are already merged (INTERSECTS, MERGE, NEAREST, CLUSTER, DISTANCE)
- Tests use DataFusion as the execution engine for GIQL queries
- Tests skip gracefully when bedtools is not installed
- Interval generation is seeded and reproducible
- Strand-aware modes are tested for operators that support them
- The test suite passes against the current GIQL transpiler output
- COVERAGE integration tests are deferred to the COVERAGE operator PR (#62)
