arrow-select: implement specialized interleave_list#8953
Conversation
|
run benchmark interleave_kernels |
|
🤖 |
|
🤖: Benchmark completed Details
|
|
I noticed there were no benchmarks for interleaving ListArrays, so I made a PR to add some |
alamb
left a comment
There was a problem hiding this comment.
Thank you @asubiotto -- the code looks good to me, but I don't think we should merge this in without some results showing it make things faster
I made a PR to add some benchmark coverage -- once that is merged i can test this pR again.
I also verified coverage using
cargo llvm-cov --html -p arrow-selectIt looks like we don't have coverage for interleaving LargeLists
However as this is the same code for normal lists I think it is ok (though it would be nice to add more coverage)
36392c1 to
1281cf3
Compare
Added coverage for large lists by making the existing test generic over the offset type and running it for both list and largelist. Thanks for the review! This is ready for another look. |
1281cf3 to
a291eec
Compare
# Which issue does this PR close? - Part of #8953 # Rationale for this change While reviewing #8953 from @asubiotto I noticed there was no benchmark for interleave with ListArray. Let's add some so we can evaluate the performance impact of ttps://github.com//pull/8953 and future changes. # What changes are included in this PR? Add benchmark for list interleaving # Are these changes tested? I ran the bechmarks manually ```shell cargo bench --bench interleave_kernels -- list ``` # Are there any user-facing changes? No
|
run benchmark interleave_kernels |
|
🤖 |
|
🤖: Benchmark completed Details
|
|
(I think we need to merge/rebase this branch with the latest main so it also contains the list benchmarks) |
a291eec to
be7904c
Compare
|
Triggered an automatic rebase with Github. |
|
run benchmark interleave_kernels |
|
🤖 |
|
🤖: Benchmark completed Details
|
Looks good to me 👍 |
alamb
left a comment
There was a problem hiding this comment.
Thank you @asubiotto -- this looks great to me ❤️
be7904c to
a1674ad
Compare
|
run benchmark interleave_kernels |
|
🤖 |
|
🤖: Benchmark completed Details
|
Previously, List and LargeList would fall through to the interleave_fallback match arm, which is inefficient. This commit implements interleave_list, which interleaves a list's child arrays and rebuilds the offsets buffer. Running it on production tests reduced memory by 80%. Signed-off-by: Alfonso Subiotto Marques <alfonso.subiotto@polarsignals.com>
a1674ad to
5b441ec
Compare
|
Addressed the latest comment and rebased. This should be ready to merge. |
|
run benchmark interleave_kernels |
This comment was marked as outdated.
This comment was marked as outdated.
|
Thank you for this PR @asubiotto and all the help @Dandandan |
|
🤖 |
|
🤖: Benchmark completed Details
|
Previously, List and LargeList would fall through to the interleave_fallback match arm, which is inefficient. This commit implements interleave_list, which interleaves a list's child arrays and rebuilds the offsets buffer. Running it on production tests reduced memory by 80%.
Which issue does this PR close?
interleavefor List/LargeList #8952Rationale for this change
Performance and memory usage when interleaving List/LargeList
Are these changes tested?
This PR does not include tests because interleave tests for Lists already exist
Are there any user-facing changes?
No, purely performance