@jo-mueller
Collaborator

@jo-mueller jo-mueller commented Aug 14, 2025

Fixes #112

Hi @thewtex, I replaced the internal multiscale creation with a conversion to ngff_image followed by multiscale creation via ngff_zarr.to_multiscales. A few things came up during the refactoring:

  • The implemented downsampling methods (xarray_coarsen, etc.) are only used in the multiscales creation here. With the latter replaced by ngff.to_multiscales, is there still a need to have these implementations here? They are tested extensively, and I think refactoring all the tests would take some more time, so I wanted to check first. One could remove the downsampling methods here (bad for backwards compatibility) or replace them with calls to the methods in ngff_zarr?

  • Is there a straightforward way to infer an image's scale from a spatial_image? The scale is passed as a parameter to to_spatial_image, where I presume it is converted into indices, but it is not retained as a separate parameter. Would I have to calculate it from, say, the first two elements of each spatial index array?

  • I think the multiscale-creation is a bit more correct now than before :) In the ConvertTiffFile example, an image of shape (z: 242, y: 342, x: 3882) is downsampled by factors

[{'x':2,'y':1,'z':1},
 {'x':4,'y':2,'z':2},
 {'x':8,'y':4,'z':4}]

| Scale  | Previous implementation    | Now                        |
|--------|----------------------------|----------------------------|
| scale0 | (z: 242, y: 342, x: 3882)  | (z: 242, y: 342, x: 3882)  |
| scale1 | (z: 242, y: 342, x: 1941)  | (z: 242, y: 342, x: 1941)  |
| scale2 | (z: 121, y: 171, x: 485)   | (z: 121, y: 171, x: 970)   |
| scale3 | (z: 30, y: 42, x: 60)      | (z: 60, y: 85, x: 485)     |

It seems like the scale factors were previously computed with respect to the next-higher level? Unless this is intended behavior, then I'll have to check again :)
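
The two columns of the table can be reproduced with plain arithmetic. The sketch below assumes floor division whenever a dimension does not divide evenly; that assumption matches the numbers in the table but is not a statement about the internals of either library, and both function names are hypothetical.

```python
# Shape and scale factors taken from the ConvertTiffFile example above.
base_shape = {"z": 242, "y": 342, "x": 3882}
scale_factors = [
    {"x": 2, "y": 1, "z": 1},
    {"x": 4, "y": 2, "z": 2},
    {"x": 8, "y": 4, "z": 4},
]


def shapes_relative_to_base(shape, factors):
    """Apply every factor to the full-resolution shape (the 'Now' column)."""
    levels = [dict(shape)]
    for factor in factors:
        levels.append({dim: size // factor[dim] for dim, size in shape.items()})
    return levels


def shapes_relative_to_previous(shape, factors):
    """Apply every factor to the level computed just before it
    (the 'Previous implementation' column)."""
    levels = [dict(shape)]
    for factor in factors:
        previous = levels[-1]
        levels.append({dim: size // factor[dim] for dim, size in previous.items()})
    return levels


for index, (old, new) in enumerate(
    zip(
        shapes_relative_to_previous(base_shape, scale_factors),
        shapes_relative_to_base(base_shape, scale_factors),
    )
):
    print(f"scale{index}: previous={old} now={new}")
```

Chaining the factors compounds them, so the last level ends up 64 times smaller along x instead of the requested 8.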

  • Lastly: I encountered a few errors in the demo notebook that likely come from zarr 2/3 compatibility issues. Do you mind if I fix these here as I go? Or would you rather have them on a separate PR?

@m-albert
Contributor

Very excited to see multiscale-spatial-image and ngff-zarr coming together here @jo-mueller @thewtex

Is there a straightforward way to infer an image's scale from a spatial_image? The scale is passed as a parameter to to_spatial_image, where I presume it is converted into indices, but it is not retained as a separate parameter. Would I have to calculate it from, say, the first two elements of each spatial index array?

Just wanted to quickly comment that having some utility functions such as extracting scale or translation would be super useful. For multiview-stitcher I added some here, but it'd be great to have them live centrally in this repository! Happy to help with this also.

@jo-mueller
Collaborator Author

jo-mueller commented Aug 14, 2025

Hi @m-albert, thanks for chiming in! I think this is exactly what I was looking for :) Should that go in here or into spatial_image? (ngl, I'm not sure how exactly they relate to each other - is multiscale-spatial-image like an extension of si?)

@m-albert
Contributor

is multiscale-spatial-image like an extension of si?

@jo-mueller True, kind of: multiscale-spatial-image uses spatial-image, but not the other way round. Here, each resolution level is represented by a spatial-image.

Should that go in here or into spatial_image?

You're completely right, this is specific to spatial-image and would probably fit better in spatial-image/spatial-image. Then I can also imagine utility functions that'd be useful here, like "get_spatial_dims" or "get_extent".

I think this is exactly what I was looking for :)

Before I forget to mention this, one problem I noticed when using this approach is that it's not clear how to define the scale along an axis that only has a single coordinate (shape 1).
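
A minimal sketch of the coordinate-based approach discussed above, including the single-coordinate case. It assumes regularly spaced, non-empty coordinates and works on plain sequences; with a real spatial-image the values could be pulled out with something like `image.coords[dim].values`. The function name is hypothetical and not part of either package.

```python
def infer_scale_and_translation(coords_by_dim):
    """Estimate per-axis spacing and origin from coordinate values.

    coords_by_dim maps a dimension name to its 1-D coordinate values.
    Assumes the coordinates along each axis are regularly spaced.
    """
    scale, translation = {}, {}
    for dim, coords in coords_by_dim.items():
        values = [float(c) for c in coords]
        translation[dim] = values[0]
        # With a single coordinate there is no spacing to measure,
        # so the scale is marked as undetermined with None.
        scale[dim] = values[1] - values[0] if len(values) > 1 else None
    return scale, translation


coords = {
    "z": [5.0],                   # singleton axis: scale cannot be inferred
    "y": [2.0, 2.5, 3.0, 3.5],
    "x": [0.0, 0.25, 0.5, 0.75],
}
scale, translation = infer_scale_and_translation(coords)
print(scale)
print(translation)
```

Returning None for a singleton axis makes the ambiguity explicit and leaves it to the caller to pick a default.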

@thewtex
Contributor

thewtex commented Aug 14, 2025

@jo-mueller wonderful! 🥇 🎇

latter replaced by ngff.to_multiscales, is there still a need to have these implementations here

Yes, I agree we should consolidate on the ngff_zarr Methods. We can release a new major version marking the breaking change. The xarray-coarsen method is equivalent to bin-shrink -- we can document as such where appropriate.

I think the multiscale-creation is a bit more correct now than before :)

Yes :-) We can squash bugs and add features by leveraging the hardened ngff-zarr implementation -- scaling issues, etc. have been addressed there and there is extensive testing.

Lastly: I encountered a few errors in the demo notebook that likely come from zarr 2/3 compatibility issues. Do you mind if I fix these here as I go? Or would you rather have them on a separate PR?

👍 fixes here are welcome!

@thewtex thewtex requested a review from Copilot August 14, 2025 13:17
Contributor

Copilot AI left a comment

Pull Request Overview

This PR replaces the internal multiscale creation logic with ngff-zarr library integration to address issue #112. The change aims to leverage the established ngff-zarr implementation for more standardized NGFF-compliant multiscale operations.

Key changes:

  • Replaces custom downsampling method implementations with ngff-zarr.to_multiscales()
  • Updates dependency list to include ngff-zarr
  • Refactors the main to_multiscale() function to use ngff_zarr conversion workflow

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.

| File | Description |
|------|-------------|
| pyproject.toml | Adds ngff-zarr as a new dependency |
| multiscale_spatial_image/to_multiscale/to_multiscale.py | Major refactor replacing custom downsampling with ngff-zarr integration, removing internal method implementations |

@jo-mueller
Collaborator Author

@thewtex

Yes, I agree we should consolidate on the ngff_zarr Methods. We can release a new major version marking the breaking change. The xarray-coarsen method is equivalent to bin-shrink -- we can document as such where appropriate.

Maybe a way to ease the change would be to map the current methods to the downsample methods from ngff_zarr. It seems like it's only a few methods that wouldn't be supported anymore.

old method ngff_zarr method
XARRAY_COARSEN ITK_BIN_SHRINK
ITK_BIN_SHRINK ITKWASM_BIN_SHRINK
ITK_GAUSSIAN ITK_GAUSSIAN
ITKWASM_GAUSSIAN
ITK_LABEL_GAUSSIAN ITKWASM_LABEL_IMAGE
ITKWASM_LABEL_IMAGE DASK_IMAGE_GAUSSIAN
DASK_IMAGE_MODE DASK_IMAGE_MODE
DASK_IMAGE_NEAREST DASK_IMAGE_NEAREST

and in case someone used a call to msi.methods.WHATEVER we could fly a small deprecation warning?

@jo-mueller jo-mueller force-pushed the use-ngff_zarr-in-`to_multiscale` branch from 48f1180 to a9222a5 Compare August 15, 2025 11:51
@thewtex
Contributor

thewtex commented Aug 15, 2025

@thewtex

Yes, I agree we should consolidate on the ngff_zarr Methods. We can release a new major version marking the breaking change. The xarray-coarsen method is equivalent to bin-shrink -- we can document as such where appropriate.

Maybe a way to ease the change would be to map the current methods to the downsample methods from ngff_zarr. It seems like it's only a few methods that wouldn't be supported anymore.

and in case someone used a call to msi.methods.WHATEVER we could fly a small deprecation warning?

@jo-mueller 👍 yes, great plan!

The mapping would be:

| old method          | ngff_zarr method    |
|---------------------|---------------------|
| XARRAY_COARSEN      | ITK_BIN_SHRINK      |
| ITK_BIN_SHRINK      | ITK_BIN_SHRINK      |
| ITK_GAUSSIAN        | ITK_GAUSSIAN        |
| ITKWASM_GAUSSIAN    | ITKWASM_GAUSSIAN    |
| ITK_LABEL_GAUSSIAN  | ITKWASM_LABEL_IMAGE |
| ITKWASM_LABEL_IMAGE | ITKWASM_LABEL_IMAGE |
| DASK_IMAGE_GAUSSIAN | DASK_IMAGE_GAUSSIAN |
| DASK_IMAGE_MODE     | DASK_IMAGE_MODE     |
| DASK_IMAGE_NEAREST  | DASK_IMAGE_NEAREST  |
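
One way the mapping plus deprecation warning could look, as a sketch only. The dictionary is copied from the table above; `resolve_method_name` is a hypothetical helper, and warning only when the name actually changes is a design choice of this sketch, not something decided in the thread.

```python
import warnings

# Old multiscale-spatial-image method name -> ngff_zarr method name,
# copied from the mapping table above.
METHOD_MAPPING = {
    "XARRAY_COARSEN": "ITK_BIN_SHRINK",
    "ITK_BIN_SHRINK": "ITK_BIN_SHRINK",
    "ITK_GAUSSIAN": "ITK_GAUSSIAN",
    "ITKWASM_GAUSSIAN": "ITKWASM_GAUSSIAN",
    "ITK_LABEL_GAUSSIAN": "ITKWASM_LABEL_IMAGE",
    "ITKWASM_LABEL_IMAGE": "ITKWASM_LABEL_IMAGE",
    "DASK_IMAGE_GAUSSIAN": "DASK_IMAGE_GAUSSIAN",
    "DASK_IMAGE_MODE": "DASK_IMAGE_MODE",
    "DASK_IMAGE_NEAREST": "DASK_IMAGE_NEAREST",
}


def resolve_method_name(old_name):
    """Translate an old method name and warn if it has been replaced."""
    try:
        new_name = METHOD_MAPPING[old_name]
    except KeyError:
        raise ValueError(f"Unknown downsampling method: {old_name!r}") from None
    if new_name != old_name:
        warnings.warn(
            f"Method {old_name} is deprecated; using {new_name} instead.",
            DeprecationWarning,
            stacklevel=2,
        )
    return new_name


print(resolve_method_name("DASK_IMAGE_GAUSSIAN"))
```

The returned name could then be looked up on the ngff_zarr side, for example with `getattr(nz.Methods, name)`, assuming the member names match the table.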

@jo-mueller
Collaborator Author

jo-mueller commented Aug 18, 2025

@thewtex A few follow-up questions/comments where advice would be needed:

  • It took me a bit to realize why tests are failing that dramatically, but the reason is the bug in the multiscales creation above. With that behavior changed, it means that all the reference images in the testing folders are no longer appropriate and need to be resaved with the correct multiscale factors. Is there a routine to do this easily?

  • Comment: Some of the data source links in the demo notebook yield timeout errors, notably in ConvertITKImage.ipynb and HelloMultiscaleSpatialImageWorld.ipynb.

  • The current version works roughly, but only with the Dask-based methods. When I use the ITK-based downsample filters, I get a variety of errors - among others:

  • For nz.Methods.ITK_GAUSSIAN: AttributeError: module 'itk' has no attribute 'DiscreteGaussianImageFilter'

  • For nz.Methods.ITK_BIN_SHRINK: Wrong number or type of arguments for overloaded function 'itkBinShrinkImageFilterIUC3IUC3_SetShrinkFactors'.

  • For nz.Methods.ITKWASM_LABEL_IMAGE: 0: 0x363688 - <unknown>!<wasm function 5110> 1: 0x387a77 - <unknown>!<wasm function 5127>

They are a bit strange to me, because hypothetically I am just calling the nz.to_multiscales function with the nz.Methods.WhatEver method keyword argument. Do these errors ring a bell for you?

@thewtex
Contributor

thewtex commented Aug 18, 2025

@jo-mueller great work!

It took me a bit to realize why tests are failing that dramatically, but the reason is the bug in the multiscales creation above. With that behavior changed, it means that all the reference images in the testing folders are no longer appropriate and need to be resaved with the correct multiscale factors. Is there a routine to do this easily?

Yes, store_new_image can be used.

def store_new_image(dataset_name, baseline_name, multiscale_image):
    """Helper method for writing output results to disk
    for later upload as test baseline"""
    path = test_data_dir / f"baseline/{dataset_name}/{baseline_name}"
    try:
        from zarr.storage import DirectoryStore
        store = DirectoryStore(
            str(path),
            dimension_separator="/",
        )
    except ImportError:
        from zarr.storage import LocalStore
        store = LocalStore(str(path))
    multiscale_image.to_zarr(store, mode="w")

Comment: Some of the data source links in the demo notebook yield timeout errors, notably in ConvertITKImage.ipynb and HelloMultiscaleSpatialImageWorld.ipynb.

We could change where they are stored, e.g. GitHub Releases or other places.

The current version works roughly, but only with the Dask-based methods.

Awesome! 🥇

For nz.Methods.ITK_GAUSSIAN: AttributeError: module 'itk' has no attribute 'DiscreteGaussianImageFilter'

This should be resolved by adding a dependency, possibly as a Python package "extra", on itk-filtering.

For nz.Methods.ITK_BIN_SHRINK: Wrong number or type of arguments for overloaded function 'itkBinShrinkImageFilterIUC3IUC3_SetShrinkFactors'.

Might be a dimension mismatch?

For nz.Methods.ITKWASM_LABEL_IMAGE: 0: 0x363688 - !<wasm function 5110> 1: 0x387a77 - !<wasm function 5127>

I am working towards this spitting out a more informative backtrace. If the test is reproducible, I can take a look.

Side note: I will be offline until mid next week.

@thewtex thewtex requested a review from melonora August 19, 2025 10:27
@jo-mueller
Collaborator Author

Hi @thewtex, I think I have it working properly locally. I uploaded the data tarball to filebase, but it seems like public buckets require a paid subscription. I do have access to a few other stores, but I'm not sure whether it's actually desirable to have the tests depend on storage that only I have access to? Maybe Zenodo would be an option?

Re the testing data: the URL also needs to be updated.

The respective code piece is this, right? Would have to look something like this?

test_data = pooch.create(
    path=test_dir,
    base_url=f"https://ipfs.filebase.io/ipfs/{test_data_ipfs_cid}/",
    # base_url="https://github.com/spatial-image/multiscale-spatial-image/releases/download/v2.0.0/",
    registry={
        "data.tar.gz": f"sha256:{test_data_sha256}",
    },
    retry_if_failed=5,
)

@jo-mueller jo-mueller changed the title WIP: Use ngff zarr in to multiscale Use ngff zarr in to multiscale Sep 16, 2025
@thewtex
Contributor

thewtex commented Sep 16, 2025

Maybe Zenodo would be an option?

@jo-mueller that is a great idea. Want to try it?

The respective code piece is this, right? Would have to look something like this?

Yes, that's it.

@jo-mueller
Collaborator Author

jo-mueller commented Sep 17, 2025

Hi @thewtex , I think you can give the tests another whirl. I uploaded the testing data tarball to Zenodo and the tests pass locally.

I think the metadata on Zenodo is not quite right yet, though. I kept the licensing data at the default and added only myself as contributor, which I don't think is accurate. I can still edit the metadata, so some more information on the data sources/licenses/people to cite would be appreciated 👍

Edit: Something to think about: Should every contributor create a new repository on Zenodo if the baseline changed? How should this be handled? Not sure.

@thewtex
Contributor

thewtex commented Sep 22, 2025

@jo-mueller thanks for the update!

It looks like we still have test errors.

Yes, Zenodo does have issues in practice with setup and attribution metadata.

Could you please try https://pinata.cloud ?

@jo-mueller
Collaborator Author

jo-mueller commented Sep 22, 2025

It looks like we still have test errors.

This is really baffling me. Just to make sure I'm using the correct procedure:

  1. Use the store_new_image to overwrite the baseline with the newly generated data
  2. Get hash value using pooch
  3. Upload to (tbd) and get correct download information and add in _data.py
  4. Delete local data.tar.gz to make sure tests download from remote data source?
  5. Run tests
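
For step 2, a standard-library sketch of computing the hash and formatting it the way the pooch registry above expects (`"sha256:<hex digest>"`). `pooch.file_hash(path, alg="sha256")` should yield the same digest; the helper names here are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def registry_entry(path):
    """Build a one-item registry dict keyed by the file name."""
    return {Path(path).name: f"sha256:{sha256_of_file(path)}"}
```

Calling `registry_entry("data.tar.gz")` on the freshly packed tarball gives the value to paste into _data.py before uploading.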

@jo-mueller
Collaborator Author

Ok, could upload to pinata.cloud without issues. 🤞 for the tests...

@jo-mueller
Collaborator Author

Strange. Could you maybe try and pull the branch on your machine to see if tests pass locally there?

@thewtex
Contributor

thewtex commented Sep 26, 2025

I do get test failures locally:

===================================================================== short test summary info ======================================================================
FAILED test/test_to_multiscale_dask_image.py::test_gaussian_isotropic_scale_factors - AssertionError: Left and right DatasetView objects are not equal
FAILED test/test_to_multiscale_dask_image.py::test_gaussian_anisotropic_scale_factors - AssertionError: Left and right DatasetView objects are not equal
FAILED test/test_to_multiscale_dask_image.py::test_label_nearest_isotropic_scale_factors - AssertionError: Left and right DatasetView objects are not equal
FAILED test/test_to_multiscale_dask_image.py::test_label_nearest_anisotropic_scale_factors - AssertionError: Left and right DatasetView objects are not equal
FAILED test/test_to_multiscale_dask_image.py::test_label_mode_isotropic_scale_factors - AssertionError: Left and right DatasetView objects are not equal
FAILED test/test_to_multiscale_dask_image.py::test_label_mode_anisotropic_scale_factors - AssertionError: Left and right DatasetView objects are not equal
FAILED test/test_to_multiscale_itk.py::test_isotropic_scale_factors - AssertionError: Left and right DatasetView objects are not equal
FAILED test/test_to_multiscale_itk.py::test_gaussian_isotropic_scale_factors - AssertionError: Left and right DatasetView objects are not equal
FAILED test/test_to_multiscale_itk.py::test_label_gaussian_isotropic_scale_factors - AssertionError: Left and right DatasetView objects are not equal
FAILED test/test_to_multiscale_itk.py::test_anisotropic_scale_factors - FileNotFoundError: No such file or directory: '<zarr.storage.DirectoryStore object at 0x7acab9eda350>'
FAILED test/test_to_multiscale_itk.py::test_gaussian_anisotropic_scale_factors - FileNotFoundError: No such file or directory: '<zarr.storage.DirectoryStore object at 0x7acab97751b0>'
FAILED test/test_to_multiscale_itk.py::test_label_gaussian_anisotropic_scale_factors - FileNotFoundError: No such file or directory: '<zarr.storage.DirectoryStore object at 0x7acab9c92ad0>'
FAILED test/test_to_multiscale_itk.py::test_from_itk - FileNotFoundError: No such file or directory: '<zarr.storage.DirectoryStore object at 0x7ac9e4405b70>'
FAILED test/test_to_multiscale_xarray.py::test_isotropic_scale_factors - AssertionError: Left and right DatasetView objects are not equal
FAILED test/test_to_multiscale_xarray.py::test_anisotropic_scale_factors - FileNotFoundError: No such file or directory: '<zarr.storage.DirectoryStore object at 0x7acab9edb700>'
============================================================ 15 failed, 9 passed, 17 warnings in 21.77s ============================================================

Were the baselines created on an ARM Mac? If I recall correctly, ARM Macs do have some numerical differences in their output. We may need to add a tolerance to the baseline comparison.
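
To illustrate what a tolerance in the comparison means, here is a standard-library sketch that compares nested lists of numbers with `math.isclose`. The helper name and the tolerance values are assumptions. For xarray objects, `xarray.testing.assert_allclose` plays this role; whether it accepts the multiscale tree directly is not something this sketch assumes.

```python
import math


def assert_allclose_nested(actual, expected, rel_tol=1e-6, abs_tol=1e-9):
    """Recursively compare nested lists of numbers within a tolerance."""
    if isinstance(expected, (list, tuple)):
        assert len(actual) == len(expected), "shape mismatch"
        for a, e in zip(actual, expected):
            assert_allclose_nested(a, e, rel_tol, abs_tol)
    else:
        assert math.isclose(actual, expected, rel_tol=rel_tol, abs_tol=abs_tol), (
            f"{actual} != {expected} within tolerance"
        )


# A last-digit difference, as could arise between platforms, passes.
assert_allclose_nested([[1.0, 2.0]], [[1.0 + 1e-9, 2.0]])
```

An exact equality check would reject the example above even though the difference is far below anything meaningful for image data.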

@jo-mueller
Collaborator Author

jo-mueller commented Sep 30, 2025

I'm working on a Windows machine (Intel(R) Core(TM) Ultra 7 258V). But I have access to other machines, I'll try there!

@jo-mueller
Collaborator Author

@thewtex ok, just got around to trying it on a different machine, works perfectly :/ Maybe we find the time at the Hackathon to see this through. Definitely some interesting insights there ^^"

@thewtex
Contributor

thewtex commented Oct 13, 2025

@jo-mueller thanks for testing! Yes, let's hack this out at the hackathon 🤝

thewtex and others added 6 commits November 13, 2025 07:31
Addresses:

FAILED test/test_to_multiscale_dask_image.py::test_gaussian_isotropic_scale_factors - AssertionError: Left and right DatasetView objects are not equal
FAILED test/test_to_multiscale_dask_image.py::test_gaussian_anisotropic_scale_factors - AssertionError: Left and right DatasetView objects are not equal
FAILED test/test_to_multiscale_dask_image.py::test_label_nearest_isotropic_scale_factors - AssertionError: Left and right DatasetView objects are not equal
FAILED test/test_to_multiscale_dask_image.py::test_label_nearest_anisotropic_scale_factors - AssertionError: Left and right DatasetView objects are not equal
FAILED test/test_to_multiscale_dask_image.py::test_label_mode_isotropic_scale_factors - AssertionError: Left and right DatasetView objects are not equal
FAILED test/test_to_multiscale_dask_image.py::test_label_mode_anisotropic_scale_factors - AssertionError: Left and right DatasetView objects are not equal
FAILED test/test_to_multiscale_itk.py::test_isotropic_scale_factors - AssertionError: Left and right DatasetView objects are not equal
FAILED test/test_to_multiscale_itk.py::test_gaussian_isotropic_scale_factors - AssertionError: Left and right DatasetView objects are not equal
FAILED test/test_to_multiscale_itk.py::test_label_gaussian_isotropic_scale_factors - AssertionError: Left and right DatasetView objects are not equal
FAILED test/test_to_multiscale_itk.py::test_anisotropic_scale_factors - FileNotFoundError: No such file or directory: '<zarr.storage.DirectoryStore object at 0x72b7b9ab87d0>'
FAILED test/test_to_multiscale_itk.py::test_gaussian_anisotropic_scale_factors - FileNotFoundError: No such file or directory: '<zarr.storage.DirectoryStore object at 0x72b87ff93010>'
FAILED test/test_to_multiscale_itk.py::test_label_gaussian_anisotropic_scale_factors - FileNotFoundError: No such file or directory: '<zarr.storage.DirectoryStore object at 0x72b7b7807b10>'
FAILED test/test_to_multiscale_itk.py::test_from_itk - FileNotFoundError: No such file or directory: '<zarr.storage.DirectoryStore object at 0x72b7b9e38c50>'
FAILED test/test_to_multiscale_xarray.py::test_isotropic_scale_factors - AssertionError: Left and right DatasetView objects are not equal
FAILED test/test_to_multiscale_xarray.py::test_anisotropic_scale_factors - FileNotFoundError: No such file or directory: '<zarr.storage.DirectoryStore object at 0x72b7b74f4e90>'
Bump Python versions and constrain dependencies
Contributor

@thewtex thewtex left a comment

💙

@thewtex thewtex merged commit 8443a7b into spatial-image:main Nov 13, 2025
9 of 10 checks passed
Development

Successfully merging this pull request may close these issues.

Implement to_multiscale with ngff-zarr