Releases: forecast-bio/atdata
v0.3.2b3
Fixed
- Atmosphere.upload_blob() TypeError: The timeout heuristic passed timeout= to Client.upload_blob(), which only accepts (data: bytes). Switched to the namespace method com.atproto.repo.upload_blob(), which forwards kwargs through to httpx
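The failure mode in this fix can be reproduced without the atproto SDK. The sketch below uses two hypothetical stand-in functions, strict_upload and forwarding_upload (neither is part of atproto or atdata), to show why a timeout= keyword raises TypeError against a (data: bytes) signature but is accepted by a method that forwards kwargs.

```python
def strict_upload(data: bytes) -> dict:
    """Stand-in for a method that accepts only the payload."""
    return {"size": len(data)}


def forwarding_upload(data: bytes, **kwargs) -> dict:
    """Stand-in for a method that forwards extra kwargs to its transport."""
    return {"size": len(data), "transport_kwargs": kwargs}


def try_upload(fn, data: bytes, timeout: float) -> str:
    """Call fn with a timeout and report whether the signature accepted it."""
    try:
        fn(data, timeout=timeout)
    except TypeError:
        return "TypeError"
    return "ok"


strict_result = try_upload(strict_upload, b"shard-bytes", 30.0)
forwarding_result = try_upload(forwarding_upload, b"shard-bytes", 30.0)
```

Here strict_result records the rejection and forwarding_result records the successful call; the real SDK methods may differ in everything except the shape of the problem.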
Testing
- ATProto SDK signature compatibility tests: New test_atproto_compat.py with 7 tests that instantiate a real atproto Client (with ClientRaw._invoke patched) to validate method signatures without network I/O. Covers upload_blob, create_record, list_records, get_record, delete_record, and export_session
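A minimal sketch of this style of test, assuming a hypothetical FakeClient in place of the real atproto Client (its method names and parameters are illustrative only). It combines a static signature check via inspect.signature with a patched transport method, so no call can reach the network.

```python
import inspect
from unittest import mock


class FakeClient:
    """Hypothetical stand-in for an SDK client; not the atproto Client."""

    def _invoke(self, method: str, **params):
        raise RuntimeError("network access attempted")

    def upload_blob(self, data: bytes):
        return self._invoke("upload_blob", data=data)

    def get_record(self, collection: str, rkey: str, **kwargs):
        return self._invoke("get_record", collection=collection, rkey=rkey, **kwargs)


def accepts(func, *args, **kwargs) -> bool:
    """Return True if func's signature can bind the given arguments."""
    try:
        inspect.signature(func).bind(*args, **kwargs)
    except TypeError:
        return False
    return True


client = FakeClient()

# Static check: no call is made, only the signature is inspected.
upload_takes_timeout = accepts(client.upload_blob, b"x", timeout=10)
get_record_takes_timeout = accepts(client.get_record, "col", "key", timeout=10)

# Dynamic check: patch the transport so a real call never reaches the network.
with mock.patch.object(FakeClient, "_invoke", return_value={"ok": True}) as invoke:
    result = client.get_record("col", "key", timeout=10)
    invoke.assert_called_once_with(
        "get_record", collection="col", rkey="key", timeout=10
    )
```

A check like accepts() would have flagged the timeout= mismatch before any upload was attempted.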
v0.3.2b2
Added
- Lexicon-mirror type system: StorageHttp, StorageS3, StorageBlobs, BlobEntry, ShardChecksum dataclasses that mirror ATProto lexicon definitions, with storage_from_record() union deserializer
- ShardUploadResult: Typed return from PDSBlobStore.write_shards() carrying both AT URIs and blob ref dicts
- Lexicon reference docs: Auto-generated documentation page for the ac.foundation.dataset.* namespace
- Example docs: dataset-profiler, lens-graph, and query-cookbook with plots and interactive tabsets
- Typed proxy DSL for manifest queries (foundation-ac #43)
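To illustrate what a union deserializer over lexicon-mirror dataclasses can look like, here is a rough sketch. The class names follow the release note, but the fields, the "$type" tag values, and the dispatch logic are all assumptions made for illustration; atdata's actual definitions may differ.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class StorageHttp:
    urls: list  # hypothetical field


@dataclass
class StorageS3:
    bucket: str  # hypothetical field
    keys: list  # hypothetical field


@dataclass
class StorageBlobs:
    blobs: list  # hypothetical field


Storage = Union[StorageHttp, StorageS3, StorageBlobs]

# Hypothetical "$type" tags; the real lexicon identifiers may differ.
_TYPE_TAGS = {
    "example.storageHttp": StorageHttp,
    "example.storageS3": StorageS3,
    "example.storageBlobs": StorageBlobs,
}


def storage_from_record(record: dict) -> Storage:
    """Pick the dataclass that matches the record's "$type" tag."""
    tag = record.get("$type")
    cls = _TYPE_TAGS.get(tag)
    if cls is None:
        raise ValueError(f"unknown storage type: {tag!r}")
    fields = {k: v for k, v in record.items() if k != "$type"}
    return cls(**fields)


parsed = storage_from_record(
    {"$type": "example.storageS3", "bucket": "b", "keys": ["a.tar"]}
)
```

Dispatching on a type tag keeps callers free of isinstance chains on raw dicts: they receive one typed object or a clear error.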
Changed
- DatasetPublisher refactored: Extracted _create_record() helper, fixing a bug where publish() used dataset.url instead of dataset.list_shards() for multi-shard datasets
- PDSBlobStore.write_shards() returns ShardUploadResult instead of using a _last_blob_refs side-channel
- Blob storage uploads: PDS blob uploads now use storageBlobs with embedded blob ref objects instead of string AT URIs in storageExternal, preventing PDS garbage collection of uploaded blobs
- Replaced lexicon symlinks with real files
- Guarded redis imports behind TYPE_CHECKING in index/_entry.py and index/_index.py
- Standardized benchmark outputs to .benchmarks/ directory
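The TYPE_CHECKING guard mentioned in this list follows a standard pattern, sketched below. EntryCache is a hypothetical class, not part of atdata; the point is that the redis import is seen by type checkers only, so the module loads even when redis is not installed.

```python
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Evaluated by type checkers only; never executed at runtime,
    # so the module imports cleanly even when redis is not installed.
    from redis import Redis


class EntryCache:
    """Hypothetical class that accepts an optional Redis connection."""

    # The annotation is a string, so Python never evaluates the name Redis.
    def __init__(self, connection: "Redis | None" = None) -> None:
        self.connection = connection

    def is_connected(self) -> bool:
        return self.connection is not None


cache = EntryCache()
```

The trade-off is that any runtime use of the guarded name (for example isinstance checks) needs its own local import.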
Fixed
- publish() multi-shard bug: was passing single URL instead of full shard list
- Double-write eliminated in PDSBlobStore
- Lens-graph example: removed float rounding in calibrate lens that broke law assertions
- Unused imports and E402 violations in atmosphere module
Testing
- Strengthened weak mock assertions with argument verification across 4 test files
- Fixed misleading unicode tests: real emoji (🌍🎉🚀) and CJK characters (日本語テスト, 中文测试, 한국어시험) instead of ASCII placeholders
- Exact shard count assertions instead of >= 2 bounds
- Fixed self-referential assertion in test_publish_schema
- Added content assertions for empty/corrupted shard recovery tests
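As an illustration of the difference between a weak mock assertion and one with argument verification, consider the sketch below. The publish function and the store object are hypothetical and are not taken from the atdata test suite.

```python
from unittest import mock


def publish(store, shards):
    """Hypothetical function under test: forwards the full shard list."""
    return store.write_shards(list(shards), overwrite=False)


store = mock.Mock()
store.write_shards.return_value = "ok"
outcome = publish(store, ["shard-000.tar", "shard-001.tar"])

# Weak: passes for any call, including one that dropped a shard.
store.write_shards.assert_called()

# Stronger: pins down the exact arguments and the call count.
store.write_shards.assert_called_once_with(
    ["shard-000.tar", "shard-001.tar"], overwrite=False
)
```

Only the second assertion would catch a regression in which a single URL is passed instead of the full shard list.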
v0.3.2b1
What's Changed
- release: v0.3.2b1 by @maxine-at-forecast in #5
- release: v0.3.2b1 by @maxine-at-forecast in #6
Full Changelog: v0.3.1b1...v0.3.2b1
v0.3.1b1
What's Changed
Added
- Lexicon packaging: ATProto lexicon JSON files bundled in src/atdata/lexicons/ with importlib.resources access via atdata.lexicons.get_lexicon() and list_lexicons()
- DatasetDict single-split proxy: When a DatasetDict has one split, .ordered(), .shuffled(), .list_shards(), and other Dataset methods are proxied directly
- write_samples(manifest=True): Opt-in manifest generation during sample writing for query-based access
- Example documentation: Five executable Quarto example docs covering typed pipelines, lens transforms, manifest queries, index workflows, and multi-split datasets
- Bounds checking in bytes_to_array() for truncated/corrupted input buffers
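The single-split proxy described in this list can be approximated with __getattr__ forwarding. The classes below are toy stand-ins that borrow the names from the release note; how atdata actually implements the proxy is not stated here, so treat this as one possible approach.

```python
class Dataset:
    """Toy stand-in; not the atdata Dataset class."""

    def __init__(self, shards):
        self._shards = list(shards)

    def list_shards(self):
        return list(self._shards)


class DatasetDict(dict):
    """Toy mapping of split name -> Dataset with single-split forwarding."""

    def __getattr__(self, name):
        # __getattr__ runs only when normal attribute lookup fails,
        # so dict methods such as .keys() are unaffected.
        if name.startswith("__"):
            raise AttributeError(name)
        if len(self) == 1:
            (only_split,) = self.values()
            return getattr(only_split, name)
        raise AttributeError(
            f"{name!r} is ambiguous with {len(self)} splits; pick one first"
        )


single = DatasetDict(train=Dataset(["train-000.tar"]))
multi = DatasetDict(train=Dataset(["a.tar"]), test=Dataset(["b.tar"]))
```

With one split, single.list_shards() reaches the underlying Dataset; with two splits the same attribute access raises AttributeError, which forces the caller to choose a split explicitly.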
Changed
- AtmosphereClient → Atmosphere: Renamed with factory classmethods Atmosphere.login() and Atmosphere.from_env(); AtmosphereClient remains as a deprecated alias
- sampleSchema → schema: Lexicon record type renamed from ac.foundation.dataset.sampleSchema to ac.foundation.dataset.schema (clean break, no backward compat)
- Module reorganization: local/ split into index/ (Index, entries, schema management) and stores/ (LocalDiskStore, S3DataStore); local/ remains as backward-compat re-export shim
- CLI rename: atdata local subcommand renamed to atdata infra
- Uniform Repository model: Index now treats "local" as a regular Repository, collapsing 3-way routing to 2-way (repo/atmosphere)
- SampleBatch aggregation uses np.stack() instead of np.array(list(...)) for efficiency
- Numpy scalar coercion in _make_packable: numpy scalars extracted to Python primitives before msgpack serialization
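One common way to keep an old class name working after a rename is a subclass that emits a DeprecationWarning on construction. The sketch below reuses the two names from the release note, but the constructor arguments and the body of login() are assumptions; the real alias may be implemented differently (for example as a plain assignment).

```python
import warnings


class Atmosphere:
    """Toy stand-in; constructor and factory arguments are assumptions."""

    def __init__(self, handle: str) -> None:
        self.handle = handle

    @classmethod
    def login(cls, handle: str, password: str) -> "Atmosphere":
        # A real implementation would authenticate here.
        return cls(handle)


class AtmosphereClient(Atmosphere):
    """Deprecated alias that still works but warns on construction."""

    def __init__(self, *args, **kwargs) -> None:
        warnings.warn(
            "AtmosphereClient is deprecated; use Atmosphere instead",
            DeprecationWarning,
            stacklevel=2,
        )
        super().__init__(*args, **kwargs)


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    legacy = AtmosphereClient("alice.example")
    current = Atmosphere.login("alice.example", "app-password")
```

Existing code that constructs the old name keeps running and isinstance checks against Atmosphere still pass, while the warning points callers at the new name.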
Fixed
- Schema round-trip in Index.write(): schemas with NDArray fields now survive publish/decode correctly
- Duplicate force-include in pyproject.toml causing PyPI wheel upload rejection
Full Changelog: v0.3.0b2...v0.3.1b1
v0.3.0b2
What's Changed
- Release v0.2.1a1 by @maxine-at-forecast in #1
- release: v0.3.0b2 by @maxine-at-forecast in #2
New Contributors
- @maxine-at-forecast made their first contribution in #1
Full Changelog: https://github.com/forecast-bio/atdata/commits/v0.3.0b2