Something that has recently intrigued me is the choice of data format for files. Whether it's CSV, JSON, pickle, Zarr, netCDF, Parquet, Arrow, COG, or Icechunk, there isn't a right or wrong answer, just trade-offs in what each can and can't do.
I'd like to dive deep into the encodings/backends to understand why these data formats shine in one way or another. Examples include how Parquet is column-oriented and stores a schema, how Zarr/Icechunk data is compressed and chunked, etc. A quick sketch of those two cases follows below.
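To make that concrete, here is a minimal sketch (assuming `pyarrow` and `zarr` are installed; the file names are placeholders, not part of any real project): Parquet embeds its schema in the file and lets you read individual columns, while Zarr splits an array into separately compressed chunks that can be read independently.

```python
import numpy as np
import pyarrow as pa
import pyarrow.parquet as pq
import zarr

# Parquet: column-oriented storage with an embedded schema.
table = pa.table({"id": [1, 2, 3], "value": [0.1, 0.2, 0.3]})
pq.write_table(table, "example.parquet")
print(pq.read_schema("example.parquet"))                       # schema travels with the file
values = pq.read_table("example.parquet", columns=["value"])   # read only one column

# Zarr: chunked, compressed n-dimensional arrays.
z = zarr.open("example.zarr", mode="w", shape=(10_000, 10_000),
              chunks=(1_000, 1_000), dtype="f4")
z[:] = np.random.rand(10_000, 10_000)   # each 1000x1000 chunk is compressed separately
block = z[0:1_000, 0:1_000]             # reading this slice touches a single chunk
```

Reading one column from Parquet or one chunk from Zarr never touches the rest of the file, which is the concrete payoff of those encoding choices.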
I think this would be useful for understanding why to choose a given format, and for avoiding "using something because everyone else does".