There are multiple methods of storing and loading data for use with pathology.
Zarr and LitData are two of the options we are currently considering for storing this data.
This task is deliberately left fairly open - we are trying to understand the best ways to process and load the data for fast training.
Some things (non-exhaustive list) to consider are:
- limited disk space on the training cluster
- limited streaming speed
- potential limits on disk storage space
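
One way to compare candidate loaders against the constraints above is to time how quickly each one yields samples. The following is a minimal sketch using only the Python standard library. The names `measure_throughput` and `dummy_loader` are hypothetical, and the dummy loader is a placeholder for a real Zarr- or LitData-backed loader, not either library's API.

```python
import time
from typing import Callable, Iterable, Iterator


def measure_throughput(make_loader: Callable[[], Iterable], max_items: int = 1000) -> dict:
    """Iterate a loader once and report the item count, elapsed time, and rate."""
    start = time.perf_counter()
    n_items = 0
    for _ in make_loader():
        n_items += 1
        if n_items >= max_items:
            break  # stop early so slow loaders do not run indefinitely
    elapsed = time.perf_counter() - start
    rate = n_items / elapsed if elapsed > 0 else float("inf")
    return {"items": n_items, "seconds": elapsed, "items_per_second": rate}


def dummy_loader(n: int = 50, size: int = 1024) -> Iterator[bytes]:
    """Hypothetical stand-in for a real loader: yields n zero-filled buffers."""
    for _ in range(n):
        yield bytes(size)


# Time the first 20 items of the placeholder loader.
result = measure_throughput(dummy_loader, max_items=20)
print(result["items"], "items at", round(result["items_per_second"]), "items/s")
```

Passing a factory rather than an iterator lets the same harness be rerun on a fresh loader for each candidate. Measured rates will depend heavily on caching and on whether data is read from local disk or streamed, so the numbers are only comparable when collected under the same conditions.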