[pull] main from apache:main by pull[bot] · Pull Request #93 · buraksenn/datafusion

pull · 2026-04-10T00:33:25Z

See Commits and Changes for more details.

Created by pull[bot] (v2.0.0-alpha.4)

Can you help keep this open source service alive? 💖 Please sponsor : )

## Which issue does this PR close? - Closes #21518. ## Rationale for this change Similar to other recent changes, `substr` currently checks for NULLs and builds the result NULL bitmap on a per-row basis. It is faster to instead compute the result NULL bitmap in bulk via bitwise AND. Benchmarks (ARM64): ``` - substr, no count, short strings/substr_large_string [size=1024]: 21.4µs → 20.9µs (-2.3%) - substr, no count, short strings/substr_large_string [size=4096]: 83.1µs → 83.0µs (-0.1%) - substr, no count, short strings/substr_string [size=1024]: 20.5µs → 19.8µs (-3.4%) - substr, no count, short strings/substr_string [size=4096]: 78.8µs → 77.0µs (-2.3%) - substr, no count, short strings/substr_string_view [size=1024]: 18.9µs → 16.1µs (-14.8%) - substr, no count, short strings/substr_string_view [size=4096]: 74.0µs → 61.6µs (-16.8%) - substr, scalar args, long strings/substr_large_string [size=1024]: 35.2µs → 34.0µs (-3.4%) - substr, scalar args, long strings/substr_large_string [size=4096]: 140.6µs → 134.5µs (-4.3%) - substr, scalar args, long strings/substr_string [size=1024]: 35.5µs → 33.8µs (-4.8%) - substr, scalar args, long strings/substr_string [size=4096]: 138.9µs → 134.2µs (-3.4%) - substr, scalar args, long strings/substr_string_view [size=1024]: 34.0µs → 31.0µs (-8.8%) - substr, scalar args, long strings/substr_string_view [size=4096]: 132.0µs → 121.8µs (-7.7%) - substr, scalar args, short strings/substr_string [size=1024]: 31.0µs → 29.2µs (-5.8%) - substr, scalar args, short strings/substr_string [size=4096]: 120.8µs → 111.5µs (-7.7%) - substr, scalar args, short strings/substr_string_view [size=1024]: 26.8µs → 23.1µs (-13.8%) - substr, scalar args, short strings/substr_string_view [size=4096]: 101.6µs → 86.4µs (-14.9%) - substr, scalar start, no count, long strings/substr_string [size=1024]: 34.5µs → 33.2µs (-3.8%) - substr, scalar start, no count, long strings/substr_string [size=4096]: 134.4µs → 133.6µs (-0.6%) - substr, scalar start, no count, long strings/substr_string_view [size=1024]: 32.9µs → 29.4µs (-10.6%) - substr, scalar start, no count, long strings/substr_string_view [size=4096]: 126.1µs → 115.2µs (-8.6%) - substr, scalar start, no count, short strings/substr_string [size=1024]: 20.9µs → 20.1µs (-3.8%) - substr, scalar start, no count, short strings/substr_string [size=4096]: 80.1µs → 77.5µs (-3.2%) - substr, scalar start, no count, short strings/substr_string_view [size=1024]: 19.9µs → 16.7µs (-16.1%) - substr, scalar start, no count, short strings/substr_string_view [size=4096]: 74.4µs → 62.4µs (-16.1%) - substr, short count, long strings/substr_large_string [size=1024]: 30.3µs → 28.4µs (-6.3%) - substr, short count, long strings/substr_large_string [size=4096]: 117.1µs → 112.0µs (-4.4%) - substr, short count, long strings/substr_string [size=1024]: 30.2µs → 28.3µs (-6.3%) - substr, short count, long strings/substr_string [size=4096]: 118.0µs → 111.0µs (-5.9%) - substr, short count, long strings/substr_string_view [size=1024]: 26.1µs → 22.8µs (-12.6%) - substr, short count, long strings/substr_string_view [size=4096]: 101.5µs → 87.7µs (-13.6%) - substr, with count, long strings/substr_large_string [size=1024]: 34.6µs → 32.8µs (-5.2%) - substr, with count, long strings/substr_large_string [size=4096]: 136.7µs → 133.0µs (-2.7%) - substr, with count, long strings/substr_string [size=1024]: 34.2µs → 32.7µs (-4.4%) - substr, with count, long strings/substr_string [size=4096]: 136.6µs → 132.3µs (-3.1%) - substr, with count, long strings/substr_string_view [size=1024]: 33.3µs → 30.3µs (-9.0%) - substr, with count, long strings/substr_string_view [size=4096]: 129.1µs → 119.6µs (-7.4%) ``` ## What changes are included in this PR? * Implement optimization * Rename `make_and_append_view` to `append_view`, and have callers deal with NULL handling; making it part of `append_view` encourages per-row NULL computations, which should be avoided when possible. * Mark `append_view` as never-inline; this avoids a performance regression on some of the `substr` microbenchmarks, where LLVM is a little eager to inline a large-ish function into a hot loop. ## Are these changes tested? Yes. ## Are there any user-facing changes? No.

…erState> (#21517) ## Which issue does this PR close? Partially addresses #20910, might be the last one for now. ## Rationale for this change In full outer joins with filters, `BufferedBatch` tracks which buffered rows had all filter evaluations fail using a `HashMap<u64, bool>`. This map is read and written per-row in a hot loop during `freeze_streamed_matched`. The HashMap pays ~40-60 bytes per entry (8-byte u64 key + 1-byte bool value + hash table overhead), hashes every key twice per iteration (once for `get`, once for `insert`), and scatters entries across heap allocations with poor cache locality. ## What changes are included in this PR? Replaces `HashMap<u64, bool>` with `Vec<FilterState>` indexed by absolute row position within the batch. `FilterState` is a `#[repr(u8)]` enum with three variants (`Unvisited`, `AllFailed`, `SomePassed`), so the Vec is 1 byte per row — allocated once, direct-indexed, no hashing. At the default batch size of 8192 rows the Vec is 8 KB (fits in L1 cache). Even at large batch sizes (32K+), 32 KB is still within L1 on most machines, while the HashMap at 32K entries would consume ~1-2 MB of scattered heap memory. Three states are needed because a simple `Vec<bool>` cannot distinguish "never matched" (handled separately by `null_joined`) from "matched but all filters failed" (must be emitted as null-joined). The enum variant names are self-documenting, unlike `Option<bool>` where `None`/`Some(true)`/`Some(false)` would be opaque. ## Are these changes tested? Existing tests. ## Are there any user-facing changes? No.

neilconway and others added 2 commits April 9, 2026 21:38

pull bot locked and limited conversation to collaborators Apr 10, 2026

pull bot added the ⤵️ pull label Apr 10, 2026

pull bot merged commit e8990c4 into buraksenn:main Apr 10, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[pull] main from apache:main#93

[pull] main from apache:main#93
pull[bot] merged 2 commits intoburaksenn:mainfrom
apache:main

pull bot commented Apr 10, 2026 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

pull bot commented Apr 10, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

pull bot commented Apr 10, 2026 •

edited

Loading