Implement native interleave for ListView #9558
Do you mind comparing this to the fallthrough performance of #9562?
Oh for sure, thanks for reminding me!
Updated the description with results now. It's not looking like a win..!
I would say let's merge the fallthrough and iterate on this version. I'm sure there are several possibilities for optimizations.
FWIW I pushed up the branch I've had marinating locally for a month or two in case it's helpful: main...polarsignals:arrow-rs:asubiotto/lvinterleave. I believe the benchmarks showed a slight regression for interleaves of small lists, but overall the perf was an improvement. I'm not able to take a closer look right now, but sharing in case it's helpful.
Thank you! |
Updated implementation and results now!
Sorry for dropping the ball on this! I think this is going in the right direction, but when I pulled this in to try it out I realized that it doesn't work very well when interleaving listviews with a high number of shared elements (i.e. offset/size windows are overlapping). I think we can get the best of both worlds by computing a heuristic: compare how many values are referenced with how many values are in the backing array, then decide whether to do per-row copies as this PR does or a full concat of the backing slice, which preserves overlapping encodings and can be much cheaper in the end. Here is a commit that implements that on top of this PR with a benchmark: polarsignals@7cb6880. There is a slight perf hit vs your branch to compute the heuristic (summing referenced sizes), but I think it's worth it in the grand scheme of things.
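For readers following along, the heuristic described above could be sketched roughly as below. This is only an illustration and not the code in the linked commit: the names `ValuesStrategy` and `choose_strategy` are hypothetical, and the cutoff (referenced values exceeding the backing length) is an assumed threshold chosen for the example.

```rust
/// How to build the values of the interleaved ListView.
/// (Hypothetical type for illustration, not part of arrow-rs.)
#[derive(Debug, PartialEq, Eq)]
enum ValuesStrategy {
    /// Copy each selected row's offset/size window into a fresh values buffer.
    PerRowCopy,
    /// Concatenate the backing values as-is and only rebase the offsets,
    /// which keeps overlapping windows shared.
    ConcatBacking,
}

/// Picks a strategy by comparing the number of referenced values
/// (the sum of the selected rows' sizes) with the backing values length.
/// The cutoff used here is an assumption, not a tuned constant.
fn choose_strategy(selected_sizes: &[usize], backing_len: usize) -> ValuesStrategy {
    let referenced: usize = selected_sizes.iter().sum();
    if referenced > backing_len {
        // Windows overlap heavily: per-row copies would duplicate shared values.
        ValuesStrategy::ConcatBacking
    } else {
        // Only part of the backing array is referenced: copy just those rows.
        ValuesStrategy::PerRowCopy
    }
}
```

With four selected rows of size 4 over a backing array of 5 values, `choose_strategy(&[4, 4, 4, 4], 5)` picks `ConcatBacking`; with two small rows over a backing array of 100 values it picks `PerRowCopy`.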
Could you perhaps make a PR that adds this case as a benchmark?
    let list_i64_no_nulls =
        create_primitive_list_array_with_seed::<i32, Int64Type>(8192, 0.0, 0.0, 20, 42);

    let list_view_i64: ListViewArray =
Can we please add this benchmark as a separate PR (to make it easier to run the automated benchmark runners)?
Ref #9558 (comment) --------- Co-authored-by: Alfonso Subiotto Marques <alfonso.subiotto@polarsignals.com>
ListView support for interleave kernel #9342.
This PR adds a native implementation of interleave for the ListView type. It also adds a benchmark.
Performance improves by more than 30% in all cases in the benchmark:
list_view<i64>(0.1,0.1,20) 100
list_view<i64>(0.1,0.1,20) 400
list_view<i64>(0.1,0.1,20) 1024
list_view<i64>(0.1,0.1,20) 1024 4-arr
list_view<i64>(0.0,0.0,20) 100
list_view<i64>(0.0,0.0,20) 400
list_view<i64>(0.0,0.0,20) 1024
list_view<i64>(0.0,0.0,20) 1024 4-arr