Context
The lexicon repo (forecast-bio/atdata-lexicon#21) has added new array format tokens and shim schemas:
New array format tokens
sparseBytes — SciPy sparse matrix format (CSR/CSC/COO)
structuredBytes — NumPy structured array format (compound dtypes)
arrowTensor — Arrow tensor IPC format
safetensors — Hugging Face safetensors format
New shim schemas
schemas/sparse_shim.json — sparse matrix bytes
schemas/structured_shim.json — structured array bytes
schemas/arrow_tensor_shim.json — Arrow tensor bytes
schemas/safetensors_shim.json — safetensors bytes
schemas/dataframe_shim.json — Parquet-encoded tabular data
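As a rough sketch of how a consumer might relate the new tokens to the new shim files, the lookup below pairs each token with a schema path. The pairing is inferred from the file names alone and is not stated in the lexicon PR; `NEW_FORMAT_SHIMS`, `DATAFRAME_SHIM`, and `shim_schema_for` are hypothetical names used only for illustration.

```python
from typing import Optional

# Hypothetical pairing of the new format tokens with the new shim schema files.
# Assumption: the pairing is inferred from the names only; the issue does not state it.
NEW_FORMAT_SHIMS = {
    "sparseBytes": "schemas/sparse_shim.json",
    "structuredBytes": "schemas/structured_shim.json",
    "arrowTensor": "schemas/arrow_tensor_shim.json",
    "safetensors": "schemas/safetensors_shim.json",
}

# Listed among the new shims, but with no matching array format token in the list above.
DATAFRAME_SHIM = "schemas/dataframe_shim.json"


def shim_schema_for(token: str) -> Optional[str]:
    """Return the assumed shim schema path for a format token, or None if unknown."""
    return NEW_FORMAT_SHIMS.get(token)
```

Returning `None` for an unrecognized token, rather than raising, leaves the caller free to decide whether tokens outside this table should be accepted.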
NDArray shim v1.1.0
schemas/ndarray_shim_v1.1.0.json — adds optional dtype, shape, and dimensionNames annotation fields
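A minimal sketch of a consistency check for the three optional annotation fields follows. The field types (a string `dtype`, a list of integers for `shape`, a list of strings for `dimensionNames`) and the rule that `shape` and `dimensionNames` have equal length are assumptions made for illustration; the schema file itself is authoritative. `check_ndarray_annotations` is a hypothetical helper, not part of any existing API.

```python
from typing import Any, Dict, List


def check_ndarray_annotations(annotations: Dict[str, Any]) -> List[str]:
    """Return a list of problems found in the optional annotations (empty if none)."""
    problems: List[str] = []
    # All three fields are optional, so a missing key is never an error.
    dtype = annotations.get("dtype")
    shape = annotations.get("shape")
    names = annotations.get("dimensionNames")

    # Assumed type: dtype is a string such as "float32".
    if dtype is not None and not isinstance(dtype, str):
        problems.append("dtype should be a string")

    # Assumed type: shape is a list of non-negative integers (bool is excluded).
    if shape is not None:
        valid_shape = isinstance(shape, list) and all(
            isinstance(n, int) and not isinstance(n, bool) and n >= 0 for n in shape
        )
        if not valid_shape:
            problems.append("shape should be a list of non-negative integers")

    # Assumed type: dimensionNames is a list of strings.
    if names is not None:
        valid_names = isinstance(names, list) and all(isinstance(s, str) for s in names)
        if not valid_names:
            problems.append("dimensionNames should be a list of strings")

    # Assumed rule: when both are present, there is one name per dimension.
    if isinstance(shape, list) and isinstance(names, list) and len(shape) != len(names):
        problems.append("shape and dimensionNames should have the same length")

    return problems
```

For example, `check_ndarray_annotations({"shape": [2, 3], "dimensionNames": ["row"]})` reports the length mismatch, while an empty dictionary passes because every field is optional.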
Tasks
References