GH-506: Clarify field ordering for partially shredded object union#565

Open
nssalian wants to merge 1 commit into apache:master from nssalian:variant-union-spec-change

@nssalian

The construct_variant pseudocode in VariantShredding.md uses .union() without specifying result field ordering. This adds a comment cross-referencing VariantEncoding.md's requirement that object field IDs must be sorted lexicographically.

I dug into the code and found that parquet-java already enforces this via Collections.sort(fields) in VariantBuilder.endObject(), verified by TestReadVariant#testPartiallyShreddedObjectOutOfOrder.
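
For illustration, the ordering behavior described above can be sketched in Python. This is a hypothetical helper, not code from parquet-java or from either spec document: `union_object_fields`, its dict-based inputs, and the choice to raise on a field that appears in both halves are assumptions made for this example. It also assumes the rule is that the resulting object's fields are ordered lexicographically by field name, and it uses Python's default string comparison as a stand-in for that ordering.

```python
def union_object_fields(shredded, unshredded):
    """Merge shredded and non-shredded object fields into one ordered mapping.

    Both arguments are plain dicts of field name -> value (an assumption for
    this sketch; a real reader works with encoded Variant values instead).
    """
    # A field present in both halves is treated as an error here. That is an
    # illustrative choice for this sketch, not a claim about any reader.
    overlap = shredded.keys() & unshredded.keys()
    if overlap:
        raise ValueError(f"field present in both halves: {sorted(overlap)}")

    merged = {**unshredded, **shredded}

    # Emit fields in lexicographic order of their names, regardless of the
    # order in which the two halves supplied them.
    return dict(sorted(merged.items(), key=lambda item: item[0]))


if __name__ == "__main__":
    result = union_object_fields({"b": 2, "a": 1}, {"c": 3, "aa": 0})
    print(list(result))
```

The point of the sketch is only that the order of the merged object depends on the field names, not on which column a field came from or the order in which it was read.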

Closes #506

nssalian marked this pull request as ready for review on April 15, 2026, 19:56
@nssalian
Author

CC: @emkornfield

Comment thread VariantShredding.md

# union the shredded fields and non-shredded fields
# (the result is a Variant object; field ID ordering rules
# from VariantEncoding.md apply)
Contributor

probably better to summarize them here?

Development

Successfully merging this pull request may close these issues.

[VariantShredding] Union of partially shredded objects has undefined field order in result object
