[VL][BUG]Spark UTs from suite DynamicPartitionPruningHiveScanSuite are failing #11692

@manikumararyas

Backend

VL (Velox)

Bug description

Some UTs from the Spark test suites DynamicPartitionPruningHiveScanSuiteAEOff and DynamicPartitionPruningHiveScanSuiteAEOn have been failing since Gluten version 1.5.1.

Error Message:
requirement failed: input[0, bigint, true] IN dynamicpruning#13980 has not finished

The failing UTs are listed below.

avoid reordering broadcast join keys to match input hash partitioning
broadcast a single key in a HashedRelation
broadcast multiple keys in a LongHashedRelation
broadcast multiple keys in an UnsafeHashedRelation
cleanup any DPP filter that isn't pushed down due to expression id clashes
different broadcast subqueries with identical children
DPP should not be rewritten as an existential join
DPP triggers only for certain types of query
nB
filtering ratio policy with stats when the broadcast pruning is disabled
Gluten - Make sure dynamic pruning works on uncorrelated queries
Gluten - SPARK-38674: Remove useless deduplicate in SubqueryBroadcastExec
join key with multiple references on the filtering plan
partition pruning in broadcast hash joins
partition pruning in broadcast hash joins with aliases
Plan broadcast pruning only when the broadcast can be reused
simple inner join triggers DPP with mock-up tables
SPARK-32659: Fix the data issue when pruning DPP on non-atomic type
SPARK-32817: DPP throws error when the broadcast side is empty
SPARK-34436: DPP support LIKE ANY/ALL expression
SPARK-34595: DPP support RLIKE expression
SPARK-34637: DPP side broadcast query stage is created firstly
SPARK-36444: Remove OptimizeSubqueries from batch of PartitionPruning
SPARK-38148: Do not add dynamic partition pruning if there exists static partition pruning
SPARK-39217: Makes DPP support the pruning side has Union
SPARK-39338: Remove dynamic pruning subquery if pruningKey's references is empty
Subquery reuse across the whole plan

It looks like #11113 is causing the issue. The failures are consistently reproducible.

Gluten version
Gluten 1.5.1 - latest

Spark version
4.0.x

Gluten version

Gluten-1.5

Spark version

spark-4.0.x

Spark configurations

No response

System information

No response
