Speed up making latest.zarr #221

@peterdudfield

Description

Currently the NWP DAG copies the most recent zarr file to latest.zarr, but this can take 10 minutes as there could be 10,000 files.

  1. Change the chunking. This doesn't work so well if we are writing to the store in parallel, and we don't want to introduce too many differences between archive and consume. This could be controlled with an env var when collecting live data
  2. Use S3 batch jobs
  3. Rechunk after pulling the data in the nwp-consumer
  4. Use aws s3 sync?
  5. Save as zarr and as a zarr.zip, then the DAG could copy and unzip
  6. Try zarr3 and larger chunk sizes
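
To make option 5 concrete, here is a minimal sketch of the zip and unzip steps using only the Python standard library. It works on local paths; the function names `zip_store` and `unzip_store` and the example paths are hypothetical and not taken from the nwp-consumer or the DAG, and moving the resulting zip to and from S3 is left out.

```python
import zipfile
from pathlib import Path


def zip_store(store_dir: Path, zip_path: Path) -> int:
    """Pack every file under store_dir into a single zip; return the file count.

    zip_path should live outside store_dir so the archive does not include itself.
    """
    count = 0
    # ZIP_STORED (no compression) is an assumption: zarr chunks are typically
    # already compressed, so recompressing them would mostly cost CPU time.
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_STORED) as zf:
        for path in sorted(store_dir.rglob("*")):
            if path.is_file():
                # Store paths relative to the store root, e.g. "t2m/0.0".
                zf.write(path, path.relative_to(store_dir).as_posix())
                count += 1
    return count


def unzip_store(zip_path: Path, dest_dir: Path) -> int:
    """Unpack the zip into dest_dir; return the number of files extracted."""
    dest_dir.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as zf:
        zf.extractall(dest_dir)
        return sum(1 for info in zf.infolist() if not info.is_dir())
```

With something like this, the consumer would call `zip_store(Path("20240101.zarr"), Path("20240101.zarr.zip"))` after writing, and the DAG would copy one object and call `unzip_store(...)` into latest.zarr, instead of copying around 10,000 small files. Whether that is actually faster end to end would need measuring.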
