fix: Support data pruning using nested partition columns #18126

linliu-code · 2026-02-07T23:25:56Z

Describe the issue this Pull Request addresses

The behavior of SparkHoodieTableFileIndex has changed since 0.14.1: the StructType(partitionFields) it returns no longer carries the full path of nested partition fields, which causes data validation failures. This behavior was changed as part of this PR: https://github.com/apache/hudi/pull/9863/changes

Summary and Changelog

If a table has a nested partition column whose leaf name conflicts with another top-level field, the partitionedSchema passed to the new file group reader is incorrect. The fix is to return the partition field with its full path name instead of the inner (leaf) field name.
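
To illustrate the name conflict described above, here is a minimal, self-contained sketch in plain Python. It is not Hudi or Spark code: the schema, the column names (`country`, `address.country`), and the helper functions are all hypothetical, and the schema is modeled as nested dictionaries rather than a real StructType. It only shows why a partition field identified by its leaf name can resolve to the wrong column, while the full path stays unambiguous.

```python
# Hypothetical table schema (an assumption for illustration only):
# a top-level "country" column plus a nested "address.country" column
# that is used as the partition field.
SCHEMA = {
    "id": "long",
    "country": "string",       # top-level field
    "address": {               # struct column
        "country": "int",      # nested partition column, same leaf name
        "city": "string",
    },
}


def resolve(schema, dotted_path):
    """Walk a dotted path through the nested dicts and return the leaf type."""
    node = schema
    for part in dotted_path.split("."):
        node = node[part]
    return node


def partition_fields_leaf_name(schema, partition_paths):
    """Name each partition field by its leaf name only (the problematic case)."""
    return [(p.split(".")[-1], resolve(schema, p)) for p in partition_paths]


def partition_fields_full_path(schema, partition_paths):
    """Name each partition field by its full dotted path (the approach of the fix)."""
    return [(p, resolve(schema, p)) for p in partition_paths]


partition_paths = ["address.country"]

leaf_fields = partition_fields_leaf_name(SCHEMA, partition_paths)
full_fields = partition_fields_full_path(SCHEMA, partition_paths)

# Looking the leaf name up in the table schema hits the top-level column,
# whose type differs from the partition column's type.
leaf_name, leaf_type = leaf_fields[0]
leaf_lookup_type = resolve(SCHEMA, leaf_name)

# Looking the full path up finds the intended nested column.
full_name, full_type = full_fields[0]
full_lookup_type = resolve(SCHEMA, full_name)

print(leaf_name, leaf_type, leaf_lookup_type)
print(full_name, full_type, full_lookup_type)
```

In this sketch the leaf-name variant yields a field called `country`, which maps back to the top-level string column rather than the nested partition column; the full-path variant yields `address.country`, which maps back to the column that was actually meant.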

Impact

Medium

Risk Level

Low.

Documentation Update

Contributor's checklist

Read through contributor's guide
Enough context is provided in the sections above
Adequate tests were added if applicable

nsivabalan · 2026-02-09T03:10:07Z

@hudi-bot run azure

linliu-code · 2026-02-09T17:33:52Z

@hudi-bot run azure

The command does not seem to be working. Let me push again to trigger the Azure test.

hudi-bot · 2026-02-10T01:22:26Z

CI report:

eaf9bd8 Azure: PENDING
0000 Unknown: CANCELED

Bot commands

@hudi-bot supports the following commands:

@hudi-bot run azure: re-run the last Azure build

Handle nested map and array columns in MDT

ff71010

github-actions bot added the size:M label (PR with lines of changes in (100, 300]) Feb 7, 2026

linliu-code force-pushed the nested_partitioning branch 3 times, most recently from d6f9ca7 to 413fa60 February 8, 2026 01:00

linliu-code changed the title ~~fix: Reproduce nested partition columns pruning data validation failure~~ fix: Support data pruning using nested partition columns Feb 8, 2026

linliu-code marked this pull request as ready for review February 8, 2026 05:50

linliu-code requested a review from yihua February 8, 2026 05:54

nsivabalan approved these changes Feb 9, 2026

Fix the issue and add tests

eaf9bd8

linliu-code force-pushed the nested_partitioning branch from 413fa60 to eaf9bd8 February 9, 2026 17:34

apache deleted a comment from hudi-bot Feb 10, 2026

4 participants
