Parallel data loading #41

@atrabattoni

Description

When loading from a single disk, issuing parallel data requests usually does not improve I/O speed. But on parallel hardware, or when the data needs to be inflated (e.g. because ZFP compression was used), loading data chunks in parallel can improve I/O significantly (an 8x speedup has been observed).

Two Xdas mechanisms could be improved:

  • The __array__ method of the VirtualArray class could take an optional parallel argument, which could also be configured globally with xdas.config.set("parallel_read", 8).
  • The DataArrayLoader class could load several chunks in parallel, also through an optional parallel argument (right now it only loads chunks one by one, asynchronously).
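
To make the idea concrete, here is a minimal sketch of what an optional parallel argument could look like, using a thread pool from the standard library. It does not use the Xdas API: load_chunk and load_chunks are hypothetical names, and zlib stands in for ZFP decompression purely for illustration.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor


def load_chunk(blob):
    """Hypothetical chunk reader: inflate one compressed chunk."""
    return zlib.decompress(blob)


def load_chunks(blobs, parallel=None):
    """Load all chunks, optionally with `parallel` worker threads.

    parallel=None (or 1) keeps the sequential behaviour.
    """
    if not parallel or parallel <= 1:
        return [load_chunk(blob) for blob in blobs]
    with ThreadPoolExecutor(max_workers=parallel) as executor:
        # executor.map returns results in the same order as the inputs
        return list(executor.map(load_chunk, blobs))


# Stand-in data: 16 compressed chunks (an assumption, not real Xdas files)
raw_chunks = [bytes([i]) * 1_000_000 for i in range(16)]
blobs = [zlib.compress(raw) for raw in raw_chunks]

sequential = load_chunks(blobs)
threaded = load_chunks(blobs, parallel=8)
```

Threads only help here if the decompression routine releases the GIL (zlib does for large buffers); whether that holds for the ZFP path used by Xdas would need to be checked.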

It would be nice to run some benchmarks to see which works best; implementing both is probably the best option.
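
A benchmark along these lines could start from something like the sketch below, which times the same workload for several worker counts and keeps the best of a few repeats. Everything here is an assumption for illustration: inflate_all and benchmark are hypothetical helpers, and zlib-compressed buffers replace real ZFP-compressed files.

```python
import time
import zlib
from concurrent.futures import ThreadPoolExecutor


def inflate_all(blobs, parallel):
    """Inflate every chunk, sequentially or with `parallel` threads."""
    if parallel <= 1:
        return [zlib.decompress(blob) for blob in blobs]
    with ThreadPoolExecutor(max_workers=parallel) as executor:
        return list(executor.map(zlib.decompress, blobs))


def benchmark(blobs, worker_counts=(1, 2, 4, 8), repeats=3):
    """Return the best wall-clock time (in seconds) per worker count."""
    timings = {}
    for n in worker_counts:
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            inflate_all(blobs, n)
            best = min(best, time.perf_counter() - start)
        timings[n] = best
    return timings


# Synthetic stand-in chunks; real measurements should use actual data files
blobs = [zlib.compress(bytes([i]) * 2_000_000) for i in range(16)]
timings = benchmark(blobs)
for n, seconds in timings.items():
    print(f"{n} worker(s): {seconds * 1e3:.1f} ms")
```

The numbers depend heavily on the storage, the compression codec, and the chunk size, so the synthetic data above only checks that the harness works.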

Labels

enhancement: New feature or request
