-
@FelixYBW can you please take a look?
-
@shadowmmu Thanks for sharing the findings and analysis. It looks like in your example run of Q17 with Gluten, Spark's runtime filter is not triggered. Have you also tried lowering the application-side threshold?
Please also note that by default DBX enables the local caching feature, which can significantly improve performance for subsequent queries in a power-test run. It also collects runtime statistics, helping later queries generate better execution plans (their CBO is enabled by default).
-
@zhouyuan do you have any suggestions on how I can overcome this runtime filter bottleneck?
-
Velox uses a very simple BloomFilter (https://github.com/facebookincubator/velox/blob/main/velox/common/base/BloomFilter.h#L120) that differs from Spark's BloomFilter; this may affect the filter's efficiency.
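To illustrate the kind of design difference being pointed out: below is a toy "blocked" Bloom filter in the general spirit of the Velox header linked above (each key sets a few bits inside a single 64-bit word, which is cache-friendly but loses accuracy at the same total size), in contrast to Spark's classic filter, which scatters its hash bits across the whole bit array. This is an illustrative sketch only, not the actual Velox code; the class name and hashing scheme are made up.

```python
# Illustrative sketch only -- NOT the Velox implementation. Shows the general
# "blocked" design: each key touches bits within one 64-bit word.
import hashlib

class BlockedBloomFilter:
    def __init__(self, num_words: int):
        # One 64-bit word per block.
        self.words = [0] * num_words

    def _word_and_bits(self, key: bytes):
        # Derive one 64-bit hash, pick a word, then 4 bit positions inside it.
        h = int.from_bytes(hashlib.blake2b(key, digest_size=8).digest(), "little")
        word = h % len(self.words)
        bits = [(h >> (8 + 6 * i)) & 63 for i in range(4)]
        return word, bits

    def insert(self, key: bytes) -> None:
        word, bits = self._word_and_bits(key)
        for b in bits:
            self.words[word] |= 1 << b

    def may_contain(self, key: bytes) -> bool:
        # May return false positives, never false negatives.
        word, bits = self._word_and_bits(key)
        return all(self.words[word] & (1 << b) for b in bits)
```

Because all of a key's bits live in one word, a lookup costs a single memory access, but once individual words fill up the false-positive rate degrades faster than in a classic filter of the same size.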


-
While benchmarking TPC-H 1TB on Gluten+Velox, we identified a major performance bottleneck compared to Databricks Photon in shuffle-heavy queries (Q8, Q9, Q17, Q18, Q21).
The Evidence
Databricks Photon filters lineitem records using Bloom filters, keeping disk I/O below 10%.
The Root Cause
Spark imposes a default 10 MB size limit on its runtime Bloom filter. We increased the limit to 1 GB to make the comparison with Databricks Photon fair; the Bloom filter is created after the config change, but its filtering effect is negligible.
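For reference, these are the Spark SQL settings we would expect to govern this behavior. This is a hedged sketch: the config names below come from Spark 3.3+'s SQLConf, and the values are illustrative, so verify both against your Spark version before relying on them.

```python
# Hedged sketch: Spark 3.3+ runtime Bloom filter knobs (names from SQLConf,
# values illustrative). Pass via --conf or SparkSession.builder.config(...).
runtime_filter_confs = {
    # Turn the runtime Bloom filter optimization on.
    "spark.sql.optimizer.runtime.bloomFilter.enabled": "true",
    # Raise the 10 MB build-side limit mentioned above to 1 GB.
    "spark.sql.optimizer.runtime.bloomFilter.creationSideThreshold": "1GB",
    # Allow the filter itself to grow beyond the small default bit count.
    "spark.sql.optimizer.runtime.bloomFilter.maxNumBits": str(8 * 1024**3),
    "spark.sql.optimizer.runtime.bloomFilter.maxNumItems": str(400_000_000),
    # Lower the probe-side scan threshold so smaller scans still qualify.
    "spark.sql.optimizer.runtime.bloomFilter.applicationSideScanSizeThreshold": "1GB",
}

for key, value in runtime_filter_confs.items():
    print(f"--conf {key}={value}")
```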
The Velox backend appears limited by a hardcoded Bloom filter size of 4,194,304 bits. This limit is too low to maintain a low false-positive rate at 1 TB cardinality, rendering the filter ineffective.
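A quick back-of-the-envelope calculation shows why 4,194,304 bits cannot stay selective at this scale. This uses the generic Bloom filter estimate p ≈ (1 − e^(−kn/m))^k, not Velox's exact implementation, and the key counts are illustrative:

```python
import math

def bloom_fpp(m_bits: int, n_items: int, k_hashes: int) -> float:
    """Approximate Bloom filter false-positive probability:
    p ~ (1 - e^(-k*n/m))^k."""
    return (1.0 - math.exp(-k_hashes * n_items / m_bits)) ** k_hashes

M = 4_194_304  # the hardcoded limit cited above

# At a million build-side keys the filter still rejects most probes...
fpp_small = bloom_fpp(M, 1_000_000, 3)
# ...but at the key cardinalities a 1 TB join produces, nearly every probe
# is a false positive, so effectively nothing is pruned.
fpp_large = bloom_fpp(M, 100_000_000, 3)

print(f"FPP @ 1M keys:   {fpp_small:.3f}")
print(f"FPP @ 100M keys: {fpp_large:.6f}")
```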
Questions for Maintainers
Is there a way to configure maxNumBits for Bloom filters to scale beyond the current hardcoded limit?
Spark Plan for Q17
Databricks: [query plan screenshot]

Velox: [query plan screenshot]