add Zarr (v2) support #179

Open
petercla0119 wants to merge 26 commits into cheeseman-lab:main from petercla0119:preprocess_zarr_v2_pr

petercla0119 commented Jan 13, 2026

Description

This PR adds comprehensive Zarr and OME-Zarr support to Brieflow, enabling cloud-native data formats for improved performance with large datasets and native visualization support in Napari.

Motivation

Large-scale optical pooled screens generate massive amounts of image data. This PR introduces:

  • Zarr format support for efficient storage and processing of large datasets
  • OME-Zarr multiscale pyramids for interactive visualization in Napari
  • Format-agnostic I/O system that seamlessly handles both TIFF and Zarr formats
  • Configurable output formats allowing users to choose between TIFF, Zarr, or both
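As a rough illustration of what a multiscale pyramid involves, the sketch below computes the image shape at each resolution level. The function name `pyramid_level_shapes` is hypothetical and not part of this PR. The parameter names `coarsening_factor` and `max_levels` mirror those mentioned in the commit messages for `write_image_omezarr`, but the defaults, the validation rules, and the stopping rule shown here are assumptions.

```python
def pyramid_level_shapes(shape_yx, coarsening_factor=2, max_levels=5):
    """Return the (y, x) shape of each pyramid level, full resolution first.

    Hypothetical helper for illustration only; the PR's writer may compute
    its levels differently.
    """
    if not isinstance(coarsening_factor, int) or coarsening_factor < 2:
        raise ValueError("coarsening_factor must be an integer >= 2")
    if not isinstance(max_levels, int) or max_levels < 1:
        raise ValueError("max_levels must be an integer >= 1")

    shapes = [tuple(int(n) for n in shape_yx)]
    while len(shapes) < max_levels:
        y, x = shapes[-1]
        # Assumed stopping rule: do not downsample a level that is already
        # one pixel wide or tall.
        if min(y, x) <= 1:
            break
        # Ceiling division so odd sizes keep their last partial block.
        shapes.append((-(-y // coarsening_factor), -(-x // coarsening_factor)))
    return shapes


levels = pyramid_level_shapes((2048, 2048), coarsening_factor=2, max_levels=3)
```

Each level shrinks both spatial axes by the coarsening factor, which is what lets a viewer such as Napari load only the resolution it needs for the current zoom.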

This enhancement significantly improves Brieflow's scalability and interoperability with modern microscopy visualization tools while maintaining full backward compatibility with existing TIFF-based workflows.

This PR is related to ongoing efforts to modernize Brieflow's data handling and to support cloud-native workflows.

What is the nature of your change?

  • Enhancement (adds functionality).
  • This change requires a documentation update.

Checklist

  • My code follows the conventions of this project.
  • I have updated the pyproject.toml to reflect the change as designated by semantic versioning.
  • I have checked linting and formatting with ruff check and ruff format.
  • I have performed a self-review of my own code.
  • I have commented my code, particularly in hard-to-understand areas.
  • I have made corresponding changes to the documentation.
  • My changes generate no new warnings.
  • I have deleted all non-relevant text in this pull request template.

Backward Compatibility

This PR maintains full backward compatibility:

  • Existing TIFF-based workflows continue to work without changes
  • Zarr is the default output format; TIFF output remains available by changing the config
  • All processing scripts automatically detect and handle both formats
  • No breaking changes to existing APIs or configurations
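One way the automatic format detection described above could work is to dispatch on the path suffix. This is a minimal sketch under that assumption: `detect_image_format` and `read_image` are hypothetical names, and the PR's actual I/O layer may detect formats differently (for example by inspecting store metadata).

```python
from pathlib import Path

# Assumed suffix conventions; ".ome.tiff" and ".ome.zarr" are covered because
# Path.suffix returns only the final suffix.
TIFF_SUFFIXES = {".tif", ".tiff"}
ZARR_SUFFIXES = {".zarr"}


def detect_image_format(path):
    """Return "tiff" or "zarr" based on the final suffix of the path."""
    suffix = Path(path).suffix.lower()
    if suffix in ZARR_SUFFIXES:
        return "zarr"
    if suffix in TIFF_SUFFIXES:
        return "tiff"
    raise ValueError(f"Unrecognized image format for path: {path}")


def read_image(path, readers):
    """Call the reader registered for the detected format.

    `readers` maps a format name to a callable that takes a path.
    """
    return readers[detect_image_format(path)](path)
```

In practice the `readers` mapping would hold real loaders, for example `tifffile.imread` for TIFF and a `zarr.open`-based function for Zarr, so that calling code never branches on the format itself.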

petercla0119 and others added 26 commits January 12, 2026 17:39
- Updated `write_image_omezarr` and `write_labels_omezarr` functions to accept pixel sizes as float, tuple, or dictionary, allowing for more flexible input formats.
- Introduced `_parse_pixel_sizes` helper function to standardize pixel size extraction and validation.
- Enhanced metadata extraction in `extract_metadata_tile_nd2` and `extract_metadata_well_nd2` to include pixel size, objective magnification, zoom magnification, and binning information.
- Updated `export_omezarr_image` script to read image data from TIFF or raw formats, improving compatibility with different data sources.
- Added warnings for potential inconsistencies in pixel size calibration.
- Introduced `conftest.py` to ensure the repository root is included in `sys.path` for test imports.
- Updated `test_preprocess.py` to assert required columns in metadata instead of exact counts.
- Modified `test_omezarr_exports.py` to check for any Zarr files in the output directory.
- Enhanced `write_image_omezarr` to accept new parameters: `coarsening_factor`, `max_levels`, and `is_label`, improving flexibility in image writing.
- Added error handling for `max_levels` and `coarsening_factor` to ensure valid values.
- Updated metadata handling in `write_image_omezarr` to accommodate label images and ensure proper storage of pixel sizes.
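The commit notes above say pixel sizes may be given as a float, tuple, or dictionary and are standardized by a `_parse_pixel_sizes` helper. That helper's code is not shown here, so the following is only a sketch of how such normalization might look. The function name `parse_pixel_sizes_sketch`, the axis names, and the validation rules are all assumptions.

```python
import math
from numbers import Real


def parse_pixel_sizes_sketch(pixel_sizes):
    """Normalize pixel sizes to a dict keyed by axis name.

    Assumed conventions (not confirmed by the PR):
      - a single number applies to both y and x
      - a 2-tuple is (y, x); a 3-tuple is (z, y, x)
      - a dict must contain "y" and "x" and may contain "z"
    """
    if isinstance(pixel_sizes, Real) and not isinstance(pixel_sizes, bool):
        sizes = {"y": float(pixel_sizes), "x": float(pixel_sizes)}
    elif isinstance(pixel_sizes, dict):
        missing = {"y", "x"} - set(pixel_sizes)
        if missing:
            raise ValueError(f"pixel size dict is missing axes: {sorted(missing)}")
        sizes = {
            axis: float(pixel_sizes[axis])
            for axis in ("z", "y", "x")
            if axis in pixel_sizes
        }
    elif isinstance(pixel_sizes, (tuple, list)):
        if len(pixel_sizes) == 2:
            axes = ("y", "x")
        elif len(pixel_sizes) == 3:
            axes = ("z", "y", "x")
        else:
            raise ValueError("pixel size sequence must have 2 or 3 entries")
        sizes = {axis: float(v) for axis, v in zip(axes, pixel_sizes)}
    else:
        raise TypeError(f"unsupported pixel size type: {type(pixel_sizes).__name__}")

    # Reject values that cannot be a physical pixel size.
    for axis, value in sizes.items():
        if not math.isfinite(value) or value <= 0:
            raise ValueError(f"pixel size for axis {axis!r} must be positive and finite")
    return sizes
```

Normalizing to one representation up front keeps the metadata-writing code free of per-type branches.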
omezarr_writer.py moved under lib/shared
resolves issue with failed labels import into napari
…ss_zarr_v4

Cherry-picked from 1161474 on 79f1eea_preprocess_zarr_v4.
- Add dynamic key selection (CONVERT_SBS_KEY/CONVERT_PHENOTYPE_KEY) based on OME_ZARR_ENABLED
- IC fields respect IC_EXT (zarr vs tiff) based on config
- Downstream rules (sbs.smk, phenotype.smk) use dynamic keys for preprocess inputs
- Add image_to_omezarr.py script that uses convert_to_array + write_image_omezarr
- Add convert_sbs_omezarr and convert_phenotype_omezarr rules
- Update CONVERT_*_KEY selection to use _omezarr variants when USE_OME_ZARR=True
- This allows direct ND2→Zarr conversion, bypassing TIFF intermediates entirely
- Added integration tests for Zarr preprocessing functionality, ensuring nd2_to_zarr conversion produces outputs equivalent to TIFF conversion.
- Updated pytest markers to include integration tests.
- Modified existing tests to prioritize Zarr format over TIFF where applicable.
- Introduced new rules for Zarr conversion in the Snakemake workflow, allowing for flexible output formats based on configuration.
- Implemented a script for direct ND2 to standard Zarr conversion, streamlining the preprocessing pipeline.
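The dynamic key selection mentioned in the commits above (`CONVERT_SBS_KEY` and `CONVERT_PHENOTYPE_KEY` switching to `_omezarr` variants) might be expressed roughly as follows. This is a sketch, not the workflow's actual code: `select_convert_key` is a hypothetical helper, and the base key names `convert_sbs` and `convert_phenotype` are inferred from the rule names `convert_sbs_omezarr` and `convert_phenotype_omezarr`.

```python
def select_convert_key(base_key, use_ome_zarr):
    """Return the `_omezarr` variant of a key when OME-Zarr output is enabled."""
    return f"{base_key}_omezarr" if use_ome_zarr else base_key


# Assumed flag; the commit messages refer to both OME_ZARR_ENABLED and
# USE_OME_ZARR, so the real name in the workflow may differ.
use_ome_zarr = True

CONVERT_SBS_KEY = select_convert_key("convert_sbs", use_ome_zarr)
CONVERT_PHENOTYPE_KEY = select_convert_key("convert_phenotype", use_ome_zarr)
```

Downstream rules can then look up their inputs through these keys, so switching formats only requires changing the flag.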
petercla0119 marked this pull request as ready for review January 13, 2026 00:36
petercla0119 marked this pull request as draft January 30, 2026 07:47
petercla0119 marked this pull request as ready for review January 31, 2026 22:48