Use ngff zarr in to multiscale
#113
Conversation
Very excited to see multiscale-spatial-image and ngff-zarr coming together here @jo-mueller @thewtex
Just wanted to quickly comment that having some utility functions such as extracting scale or translation would be super useful. For multiview-stitcher I added some here, but it'd be great to have them live centrally in this repository! Happy to help with this also.
Hi @m-albert, thanks for chiming in! I think this is exactly what I was looking for :) Should that go in here or into spatial_image? (ngl, I'm not sure how exactly they relate to each other - is one built on top of the other?)
@jo-mueller True, kind of: multiscale-spatial-image uses spatial-image, but not the other way round. Here, each resolution level is represented by a spatial-image.
You're completely right, this is specific to spatial-image and would probably fit better in spatial-image/spatial-image. Then I can also imagine utility functions that'd be useful here, like "get_spatial_dims" or "get_extent".
Before I forget to mention this, one problem I noticed when using this approach is that it's not clear how to define the scale along an axis that only has a single coordinate (shape 1).
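No concrete implementation is attached in the thread, so here is a minimal sketch of what such helpers could look like, assuming a spatial-image is an `xarray.DataArray` with evenly spaced coordinate arrays; the names `get_scale` and `get_translation` are illustrative, not an existing API:

```python
import xarray as xr


def get_scale(image: xr.DataArray, spatial_dims=("z", "y", "x")) -> dict:
    """Infer per-axis spacing from evenly spaced coordinate arrays."""
    scale = {}
    for dim in spatial_dims:
        if dim not in image.coords:
            continue
        coords = image.coords[dim].values
        if coords.size > 1:
            scale[dim] = float(coords[1] - coords[0])
        else:
            # Single-coordinate axis (shape 1): the spacing cannot be recovered
            # from the coordinate values alone -- the ambiguity raised above.
            scale[dim] = 1.0
    return scale


def get_translation(image: xr.DataArray, spatial_dims=("z", "y", "x")) -> dict:
    """The origin is the first coordinate value along each spatial axis."""
    return {
        dim: float(image.coords[dim].values[0])
        for dim in spatial_dims
        if dim in image.coords
    }
```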
@jo-mueller wonderful! 🥇 🎇
Yes, I agree we should consolidate on the ngff_zarr Methods. We can release a new major version marking the breaking change. The xarray-coarsen method is equivalent to bin-shrink -- we can document as such where appropriate.
Yes :-) We can squash bugs and add features by leveraging the hardened ngff-zarr implementation -- scaling issues, etc. have been addressed there and there is extensive testing.
👍 fixes here are welcome!
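As a small self-contained illustration of the equivalence noted above (both xarray's coarsen-with-mean and a bin-shrink average non-overlapping blocks), assuming plain xarray:

```python
import numpy as np
import xarray as xr

# Average non-overlapping 2x2 blocks, which is what bin-shrink also does.
arr = xr.DataArray(np.arange(16.0).reshape(4, 4), dims=("y", "x"))
block_mean = arr.coarsen(y=2, x=2).mean()
print(block_mean.values)
# [[ 2.5  4.5]
#  [10.5 12.5]]
```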
Pull Request Overview
This PR replaces the internal multiscale creation logic with ngff-zarr library integration to address issue #112. The change aims to leverage the established ngff-zarr implementation for more standardized NGFF-compliant multiscale operations.
Key changes:
- Replaces custom downsampling method implementations with `ngff_zarr.to_multiscales()`
- Updates the dependency list to include `ngff-zarr`
- Refactors the main `to_multiscale()` function to use the `ngff_zarr` conversion workflow (sketched below)
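A minimal sketch of the conversion workflow these bullets describe, assuming the public `ngff-zarr` API (`to_ngff_image` / `to_multiscales`); the array, dims, scale values, and method chosen here are illustrative only:

```python
import numpy as np
import ngff_zarr as nz

# Synthetic volume standing in for the real input data.
data = np.random.default_rng(0).integers(0, 255, size=(32, 128, 128), dtype=np.uint8)

# Wrap the raw array as an NGFF image carrying dims/scale/translation metadata.
image = nz.to_ngff_image(
    data,
    dims=("z", "y", "x"),
    scale={"z": 2.0, "y": 0.5, "x": 0.5},
    translation={"z": 0.0, "y": 0.0, "x": 0.0},
)

# Build the multiscale pyramid; each level is an NGFF image with updated
# scale/translation metadata.
multiscales = nz.to_multiscales(image, scale_factors=[2, 4], method=nz.Methods.ITKWASM_GAUSSIAN)
for level in multiscales.images:
    print(level.data.shape, level.scale)
```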
Reviewed Changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| pyproject.toml | Adds ngff-zarr as a new dependency |
| multiscale_spatial_image/to_multiscale/to_multiscale.py | Major refactor replacing custom downsampling with ngff-zarr integration, removing internal method implementations |
Co-Authored-By: Marvin Albert <12528388+m-albert@users.noreply.github.com>
Maybe a way to ease the change would be to map the current methods to the downsample methods from `ngff_zarr`, and in case someone used a call to one of the previous methods, map it to the corresponding `ngff_zarr` method.
map methods to ngff_zarr methods
48f1180 to a9222a5
@jo-mueller 👍 yes, great plan! The mapping would be:
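The mapping itself did not survive in this thread; purely as a hypothetical sketch of what such a translation layer could look like (the `ngff_zarr.Methods` member names are assumptions to verify against the installed version):

```python
import warnings

import ngff_zarr as nz

# Hypothetical table from the method names currently exposed here to their
# assumed ngff-zarr counterparts.
METHOD_MAP = {
    "xarray_coarsen": nz.Methods.ITKWASM_BIN_SHRINK,  # coarsen is equivalent to bin-shrink
    "itk_bin_shrink": nz.Methods.ITKWASM_BIN_SHRINK,
    "itk_gaussian": nz.Methods.ITKWASM_GAUSSIAN,
    "dask_image_gaussian": nz.Methods.DASK_IMAGE_GAUSSIAN,
    "dask_image_mode": nz.Methods.DASK_IMAGE_MODE,
    "dask_image_nearest": nz.Methods.DASK_IMAGE_NEAREST,
}


def resolve_method(name: str):
    """Map a legacy method name to its ngff-zarr counterpart with a warning."""
    warnings.warn(
        f"'{name}' now delegates to ngff-zarr; pass an ngff_zarr.Methods value directly.",
        DeprecationWarning,
        stacklevel=2,
    )
    return METHOD_MAP[name]
```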
@thewtex A few follow-up questions/comments where advice would be needed:
They are a bit strange to me, because hypothetically I am just calling the ...
@jo-mueller great work!
Yes, see multiscale-spatial-image/test/_data.py, lines 89 to 104 at ecb6aa4.
We could change where they are stored, e.g. GitHub Releases or other places.
Awesome! 🥇
This should be resolved with a dependency, possibly a Python package "extra", on ...
Might be a dimension mismatch?
I am working towards this spitting out a more informative backtrace. If the test is reproducible, I can take a look. Side note: I will be offline until mid next week.
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Hi @thewtex, I think I have it working properly locally. I uploaded the data tarball to filebase, but it seems like public buckets require a paid subscription. I do have access to a few other stores, but I'm not sure whether it's actually desirable to have the tests depend on some storage that only I have access to? Maybe Zenodo would be an option?
The respective code piece is this, right? It would have to look something like this?

```python
test_data = pooch.create(
    path=test_dir,
    base_url=f"https://ipfs.filebase.io/ipfs/{test_data_ipfs_cid}/",
    # base_url="https://github.com/spatial-image/multiscale-spatial-image/releases/download/v2.0.0/",
    registry={
        "data.tar.gz": f"sha256:{test_data_sha256}",
    },
    retry_if_failed=5,
)
```
@jo-mueller that is a great idea. Want to try it?
Yes, that's it.
Hi @thewtex, I think you can give the tests another whirl. I uploaded the testing data tarball to Zenodo and the tests pass locally. I think the metadata on Zenodo is not quite right yet, though. I kept the licensing data at the default and added only myself as a contributor, which I don't think is accurate. I can still edit the metadata, so some more information on the data sources/licenses/people to cite would be appreciated 👍

Edit: Something to think about: should every contributor create a new Zenodo record if the baseline changes? How should this be handled? Not sure.
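For reference, a hedged sketch of how the earlier pooch snippet could point at a Zenodo record instead; the record ID, checksum, and cache path below are placeholders, not the actual deposit:

```python
import pooch

test_dir = "test/.data"            # placeholder cache directory
test_data_sha256 = "<SHA256>"      # placeholder checksum of data.tar.gz
zenodo_record_id = "<RECORD_ID>"   # placeholder Zenodo record

test_data = pooch.create(
    path=test_dir,
    base_url=f"https://zenodo.org/records/{zenodo_record_id}/files/",
    registry={
        "data.tar.gz": f"sha256:{test_data_sha256}",
    },
    retry_if_failed=5,
)
```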
@jo-mueller thanks for the update! It looks like we still have test errors. Yes, Zenodo does have issues in practice with setup and attribution metadata. Could you please try https://pinata.cloud ?
This is really boggling me. Just to make sure I'm using the correct procedure.
This reverts commit 35d6970.
Ok, could upload to pinata.cloud without issues. 🤞 for the tests...
Strange. Could you maybe try and pull the branch on your machine to see if the tests pass locally there?
I do get test failures locally: were the baselines created on an ARM Mac? If I recall correctly, ARM Macs do have some numerical differences in their output. We may need to add a tolerance to the baseline comparison.
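A minimal sketch of what such a tolerance could look like, assuming each scale level is compared as an xarray Dataset; the helper name is illustrative, not the project's existing test utility:

```python
import xarray as xr


def assert_level_close(result: xr.Dataset, baseline: xr.Dataset, rtol: float = 1e-5, atol: float = 0.0):
    # assert_allclose tolerates small platform-dependent floating-point
    # differences, unlike exact assert_equal / assert_identical checks.
    xr.testing.assert_allclose(result, baseline, rtol=rtol, atol=atol)
```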
I'm working on a Windows machine (Intel(R) Core(TM) Ultra 7 258V). But I have access to other machines, I'll try there!
@thewtex ok, just got around to trying it on a different machine, works perfectly :/ Maybe we can find the time at the hackathon to see this through. Definitely some interesting insights there ^^"
@jo-mueller thanks for testing! Yes, let's hack this out at the hackathon 🤝
Addresses:

```
FAILED test/test_to_multiscale_dask_image.py::test_gaussian_isotropic_scale_factors - AssertionError: Left and right DatasetView objects are not equal
FAILED test/test_to_multiscale_dask_image.py::test_gaussian_anisotropic_scale_factors - AssertionError: Left and right DatasetView objects are not equal
FAILED test/test_to_multiscale_dask_image.py::test_label_nearest_isotropic_scale_factors - AssertionError: Left and right DatasetView objects are not equal
FAILED test/test_to_multiscale_dask_image.py::test_label_nearest_anisotropic_scale_factors - AssertionError: Left and right DatasetView objects are not equal
FAILED test/test_to_multiscale_dask_image.py::test_label_mode_isotropic_scale_factors - AssertionError: Left and right DatasetView objects are not equal
FAILED test/test_to_multiscale_dask_image.py::test_label_mode_anisotropic_scale_factors - AssertionError: Left and right DatasetView objects are not equal
FAILED test/test_to_multiscale_itk.py::test_isotropic_scale_factors - AssertionError: Left and right DatasetView objects are not equal
FAILED test/test_to_multiscale_itk.py::test_gaussian_isotropic_scale_factors - AssertionError: Left and right DatasetView objects are not equal
FAILED test/test_to_multiscale_itk.py::test_label_gaussian_isotropic_scale_factors - AssertionError: Left and right DatasetView objects are not equal
FAILED test/test_to_multiscale_itk.py::test_anisotropic_scale_factors - FileNotFoundError: No such file or directory: '<zarr.storage.DirectoryStore object at 0x72b7b9ab87d0>'
FAILED test/test_to_multiscale_itk.py::test_gaussian_anisotropic_scale_factors - FileNotFoundError: No such file or directory: '<zarr.storage.DirectoryStore object at 0x72b87ff93010>'
FAILED test/test_to_multiscale_itk.py::test_label_gaussian_anisotropic_scale_factors - FileNotFoundError: No such file or directory: '<zarr.storage.DirectoryStore object at 0x72b7b7807b10>'
FAILED test/test_to_multiscale_itk.py::test_from_itk - FileNotFoundError: No such file or directory: '<zarr.storage.DirectoryStore object at 0x72b7b9e38c50>'
FAILED test/test_to_multiscale_xarray.py::test_isotropic_scale_factors - AssertionError: Left and right DatasetView objects are not equal
FAILED test/test_to_multiscale_xarray.py::test_anisotropic_scale_factors - FileNotFoundError: No such file or directory: '<zarr.storage.DirectoryStore object at 0x72b7b74f4e90>'
```
Pixi changed the name.
Bump Python versions and constrain dependencies
thewtex left a comment:
💙
Fixes #112
Hi @thewtex, so I replaced the internal multiscale creation with conversion to `ngff_image` and subsequent multiscale creation via `ngff_zarr.to_multiscale`. A few things that came up during the refactoring:

1. The implemented downsampling methods (`xarray_coarsen`, etc.) are only used in the multiscales creation here. With the latter replaced by `ngff_zarr.to_multiscales`, is there still a need to have these implementations here? They are tested extensively and I think refactoring all the tests would take some more time, so I wanted to check first. One could remove the downsampling methods here (bad for backwards compatibility) or replace them with calls to the methods in `ngff_zarr`?
2. Is there a straightforward way to infer an image's `scale` from a `spatial_image`? The `scale` is passed as a parameter to `to_spatial_image`, where I presume it is converted into indices, but it is not retained as a separate parameter. Would I have to calculate it from, say, the first two elements of each spatial index array?
3. I think the multiscale creation is a bit more correct now than before :) In the `ConvertTiffFile` example, an image of shape `(z: 242, y: 342, x: 3882)` is downsampled by the factors `[{'x': 2, 'y': 1, 'z': 1}, {'x': 4, 'y': 2, 'z': 2}, {'x': 8, 'y': 4, 'z': 4}]`. It seems like the scale factors were previously computed with respect to the next-higher level? Unless this is intended behavior, in which case I'll have to check again :)
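For illustration only, here is the arithmetic behind the two possible readings of those factors along x (full size 3882):

```python
# Two readings of scale_factors for the x axis of the ConvertTiffFile example.
full_x = 3882
factors_x = [2, 4, 8]

# Reading 1: each factor is relative to the original (level 0) image.
relative_to_level0 = [full_x // f for f in factors_x]   # [1941, 970, 485]

# Reading 2: each factor is relative to the previous level (cumulative shrink).
relative_to_previous = []
current = full_x
for f in factors_x:
    current //= f
    relative_to_previous.append(current)                # [1941, 485, 60]

print(relative_to_level0, relative_to_previous)
```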