Unordered Add Write Mode Support for Conflict-Free Concurrent Table Writes (#575)
Portable Manifests for Catalog Sharing (#575)
Partitioned Transaction Logs and Metafile Revisions to Improve Cloud Metadata Performance at Scale (#575)
DeltaCAT Catalog Writer Version Tracking via Version Files (#575)
Catalog Config File Support (#569)

Bug Fixes:

Full Changelog: 2.0.0.post2...2.0.0.post3

06 Sep 17:57

pdames

DeltaCAT 2.0.0.post2

Release Notes

Post-release bug fixes for DeltaCAT 2.0.

Bug Fixes:

Fixed Ray Data and Daft Infinite Retry Loop Issue when Writing to Cloud Storage (#574)
Fixed a bug that caused Ray Data, Pandas, NumPy, and Polars to return None/NaN values for new fields after schema evolution in some write modes (#573)
Improved support for external multimodal URL processing (#573)

Full Changelog: 2.0.0.post1...2.0.0.post2

03 Sep 06:57

pdames

DeltaCAT 2.0

Initial implementation of core DeltaCAT 2.0 catalog APIs for Daft, Ray Data, Pandas, PyArrow, NumPy, and Polars.

Among other features, it provides:

Inline copy-on-write table compaction and table properties to control automated compaction.
Automatic/manual schema evolution support, and table properties to control table schema evolution behavior.
Support for writing/reading both schemaless tables and tables with schemas.
Full cross-catalog, recursive metadata copy and backfill support (e.g., to easily backfill major revisions to the catalog metadata storage specification).
Frontpage "overview"/"quickstart" documentation and more detailed Storage, Table, and Schema README doc pages.
Multi-table/namespace/etc. transaction support (i.e., transactions that can operate over any number of objects within the bounds of a single catalog).
Comprehensive, auto-generated reader/writer support matrix in reader_compatibility_mapping.py (generated via the new make type-mappings Makefile target) covering all Arrow data types, supported dataset types (PyArrow, Pandas, Polars, NumPy, Daft, Ray Data), and supported content types with inline schema (Parquet, Avro, ORC, Feather). This lets DeltaCAT quickly detect and short-circuit any write that would break a declared supported reader before persisting data or doing any computationally expensive work.
Transaction log queries and time travel.
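
As a rough illustration of the reader/writer support matrix idea in the list above, the following Python sketch shows how a pre-write check might short-circuit an incompatible write. It is not DeltaCAT's implementation: the names SUPPORT_MATRIX, find_unsupported_readers, validate_write, and IncompatibleWriteError are hypothetical, and the matrix entries are made-up placeholders rather than real compatibility results.

```python
from typing import Dict, FrozenSet, Iterable, List, Tuple

# Hypothetical support matrix: (arrow_type, content_type) -> readers that can read it.
# These entries are invented for illustration and are NOT real compatibility data.
SUPPORT_MATRIX: Dict[Tuple[str, str], FrozenSet[str]] = {
    ("int64", "parquet"): frozenset({"pyarrow", "pandas", "polars", "numpy"}),
    ("string", "parquet"): frozenset({"pyarrow", "pandas", "polars"}),
    ("example_exotic_type", "parquet"): frozenset({"pyarrow"}),
}


class IncompatibleWriteError(ValueError):
    """Raised when a write would break a declared supported reader (hypothetical)."""


def find_unsupported_readers(
    arrow_types: Iterable[str],
    content_type: str,
    declared_readers: Iterable[str],
) -> Dict[str, List[str]]:
    """Map each Arrow type to the declared readers that cannot read it."""
    declared = set(declared_readers)
    problems: Dict[str, List[str]] = {}
    for arrow_type in arrow_types:
        # Unknown (type, content type) pairs are treated as supported by no reader.
        supported = SUPPORT_MATRIX.get((arrow_type, content_type), frozenset())
        missing = sorted(declared - supported)
        if missing:
            problems[arrow_type] = missing
    return problems


def validate_write(
    arrow_types: Iterable[str],
    content_type: str,
    declared_readers: Iterable[str],
) -> None:
    """Fail fast, before any data is persisted, if a declared reader would break."""
    problems = find_unsupported_readers(arrow_types, content_type, declared_readers)
    if problems:
        raise IncompatibleWriteError(f"write would break declared readers: {problems}")
```

Because the check is a dictionary lookup per field, it can run before any file is written or any expensive computation starts, which is the short-circuit behavior the release note describes.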

Full Changelog: 2.0.0b11...2.0.0.post1

28 Aug 18:36

pfaraone

1.1.38

What's Changed

fix typo by @raghumdani in #538
Adds output_record_count field to compaction session audit info by @pfaraone in #571
Version bump to 1.1.38 by @pfaraone in #572

Full Changelog: 1.1.37...1.1.38