Add UNSHREDDED Parquet writer and read support for Variant #17746

@the-other-tim-brown

Task Description

What needs to be done:
Add support for writing Parquet files that contain Variant data types in unshredded form.

Additionally, set up Spark functional tests (or augment existing ones) for the Variant type.

Also ensure that files with this data type can be read back into Hoodie records or engine-specific row types in both MOR and COW tables.

Update the Spark functional tests to verify that the data is read back correctly.
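
For context on what "unshredded" implies for the round-trip tests: in the Parquet Variant specification, an unshredded Variant column is stored as a group with two binary fields, `metadata` and `value`, so a read-back check can compare both buffers byte for byte. The sketch below is only an illustration of that layout in plain Python, not Hudi's writer or reader. The class and function names are hypothetical, and the header constants reflect my reading of the Variant binary encoding (version 1 metadata, primitive `int8`, and short strings only).

```python
from dataclasses import dataclass

# Assumed constants from the Variant binary encoding (not taken from the issue).
VARIANT_VERSION = 1           # low 4 bits of the metadata header byte
BASIC_TYPE_PRIMITIVE = 0      # low 2 bits of the value header byte
BASIC_TYPE_SHORT_STRING = 1
PRIMITIVE_INT8 = 3            # primitive type id, stored in the upper 6 bits


@dataclass(frozen=True)
class UnshreddedVariant:
    """Hypothetical holder for the two binary fields of an unshredded Variant."""
    metadata: bytes
    value: bytes


def empty_metadata() -> bytes:
    # Header byte (version 1, 1-byte offsets), dictionary size 0, one end offset 0.
    return bytes([VARIANT_VERSION, 0, 0])


def encode_int8(n: int) -> UnshreddedVariant:
    if not -128 <= n <= 127:
        raise ValueError("value does not fit in int8")
    header = (PRIMITIVE_INT8 << 2) | BASIC_TYPE_PRIMITIVE
    payload = n.to_bytes(1, "little", signed=True)
    return UnshreddedVariant(empty_metadata(), bytes([header]) + payload)


def encode_short_string(s: str) -> UnshreddedVariant:
    raw = s.encode("utf-8")
    if len(raw) > 63:
        raise ValueError("short strings hold at most 63 bytes")
    header = (len(raw) << 2) | BASIC_TYPE_SHORT_STRING
    return UnshreddedVariant(empty_metadata(), bytes([header]) + raw)


def decode(v: UnshreddedVariant):
    header = v.value[0]
    basic_type = header & 0b11
    if basic_type == BASIC_TYPE_SHORT_STRING:
        length = header >> 2
        return v.value[1:1 + length].decode("utf-8")
    if basic_type == BASIC_TYPE_PRIMITIVE and (header >> 2) == PRIMITIVE_INT8:
        return int.from_bytes(v.value[1:2], "little", signed=True)
    raise NotImplementedError("only int8 and short strings are sketched here")


def round_trips(written: UnshreddedVariant, read_back: UnshreddedVariant) -> bool:
    # An unshredded round trip should preserve both buffers exactly.
    return written.metadata == read_back.metadata and written.value == read_back.value
```

A functional test for COW and MOR tables could follow the same shape: write rows, read them back through each engine path, and assert that `metadata` and `value` are unchanged before comparing decoded values.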

Why this task is needed:
This lays the foundation for supporting Variant in Hudi.

Once this is complete, a simple CoW and MoR workflow with Variant should work for unshredded data.

Task Type

Code improvement/refactoring

Metadata

Labels: type:devtask (Development tasks and maintenance work)

Status: In progress
