arrow-select: implement specialized interleave_list#8953

Merged
alamb merged 1 commit into apache:main from polarsignals:asubiotto/interleavelist on Dec 13, 2025
@asubiotto
Contributor

Previously, List and LargeList fell through to the interleave_fallback match arm, which is inefficient. This commit implements interleave_list, which interleaves a list's child arrays and rebuilds the offsets buffer. Running it on production tests reduced memory usage by 80%.
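As a rough illustration of the idea (not the code in this PR), here is a std-only sketch of interleaving list rows by first computing the child positions to gather and then rebuilding the offsets. `SimpleList` and `interleave_list_sketch` are hypothetical names invented for this example; the sketch ignores nulls and hard-codes `i32` offsets and `i64` values.

```rust
/// Hypothetical, simplified stand-in for a list array:
/// `offsets[i]..offsets[i + 1]` is the range of `values` owned by row `i`.
/// Null handling is deliberately left out.
#[derive(Debug, PartialEq)]
struct SimpleList {
    offsets: Vec<i32>,
    values: Vec<i64>,
}

/// Interleave rows selected by `(array index, row index)` pairs.
/// Step 1 collects the child positions each selected row covers and records
/// the running total as the new offset; step 2 gathers the child values.
fn interleave_list_sketch(arrays: &[&SimpleList], indices: &[(usize, usize)]) -> SimpleList {
    let mut offsets: Vec<i32> = Vec::with_capacity(indices.len() + 1);
    offsets.push(0);
    let mut child_indices: Vec<(usize, usize)> = Vec::new();

    for &(array_idx, row) in indices {
        let list = arrays[array_idx];
        let start = list.offsets[row] as usize;
        let end = list.offsets[row + 1] as usize;
        // Every child element of this row is taken, in order.
        child_indices.extend((start..end).map(|pos| (array_idx, pos)));
        offsets.push(child_indices.len() as i32);
    }

    // "Interleave" the child values using the positions computed above.
    let values = child_indices
        .iter()
        .map(|&(array_idx, pos)| arrays[array_idx].values[pos])
        .collect();

    SimpleList { offsets, values }
}
```

Only the selected child elements are copied, so the output holds exactly the values its offsets refer to.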

Which issue does this PR close?

Rationale for this change

Performance and memory usage when interleaving List/LargeList

Are these changes tested?

This PR does not include new tests because interleave tests for Lists already exist.

Are there any user-facing changes?

No, purely performance

@github-actions github-actions bot added the arrow Changes to the arrow crate label Dec 4, 2025
@asubiotto
Contributor Author

cc @alamb / @tustvold

@alamb
Contributor

alamb commented Dec 10, 2025

run benchmark interleave_kernels

@alamb-ghbot

🤖 ./gh_compare_arrow.sh gh_compare_arrow.sh Running
Linux aal-dev 6.14.0-1018-gcp #19~24.04.1-Ubuntu SMP Wed Sep 24 23:23:09 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Comparing asubiotto/interleavelist (36392c1) to bab30ae diff
BENCH_NAME=interleave_kernels
BENCH_COMMAND=cargo bench --features=arrow,async,test_common,experimental --bench interleave_kernels
BENCH_FILTER=
BENCH_BRANCH_NAME=asubiotto_interleavelist
Results will be posted here when complete

@alamb-ghbot

🤖: Benchmark completed

Details

group                                                                                        asubiotto_interleavelist               main
-----                                                                                        ------------------------               ----
interleave dict(20, 0.0) 100 [0..100, 100..230, 450..1000]                                   1.01      3.2±0.07µs        ? ?/sec    1.00      3.1±0.07µs        ? ?/sec
interleave dict(20, 0.0) 1024 [0..100, 100..230, 450..1000, 0..1000]                         1.00     15.6±0.13µs        ? ?/sec    1.00     15.5±0.27µs        ? ?/sec
interleave dict(20, 0.0) 1024 [0..100, 100..230, 450..1000]                                  1.02     15.5±0.05µs        ? ?/sec    1.00     15.3±0.06µs        ? ?/sec
interleave dict(20, 0.0) 400 [0..100, 100..230, 450..1000]                                   1.00      7.0±0.08µs        ? ?/sec    1.01      7.1±0.22µs        ? ?/sec
interleave dict_distinct 100                                                                 1.03      3.0±0.02µs        ? ?/sec    1.00      2.9±0.03µs        ? ?/sec
interleave dict_distinct 1024                                                                1.00      2.8±0.07µs        ? ?/sec    1.02      2.9±0.02µs        ? ?/sec
interleave dict_distinct 2048                                                                1.01      2.9±0.03µs        ? ?/sec    1.00      2.9±0.04µs        ? ?/sec
interleave dict_sparse(20, 0.0) 100 [0..100, 100..230, 450..1000]                            1.01      3.2±0.02µs        ? ?/sec    1.00      3.1±0.02µs        ? ?/sec
interleave dict_sparse(20, 0.0) 1024 [0..100, 100..230, 450..1000, 0..1000]                  1.00     15.6±0.07µs        ? ?/sec    1.00     15.6±0.17µs        ? ?/sec
interleave dict_sparse(20, 0.0) 1024 [0..100, 100..230, 450..1000]                           1.01     15.5±0.19µs        ? ?/sec    1.00     15.4±0.09µs        ? ?/sec
interleave dict_sparse(20, 0.0) 400 [0..100, 100..230, 450..1000]                            1.00      7.0±0.05µs        ? ?/sec    1.00      7.0±0.07µs        ? ?/sec
interleave i32(0.0) 100 [0..100, 100..230, 450..1000]                                        1.01    315.1±8.22ns        ? ?/sec    1.00    311.6±0.38ns        ? ?/sec
interleave i32(0.0) 1024 [0..100, 100..230, 450..1000, 0..1000]                              1.00   1851.4±8.06ns        ? ?/sec    1.01  1863.9±87.62ns        ? ?/sec
interleave i32(0.0) 1024 [0..100, 100..230, 450..1000]                                       1.04   1901.7±8.53ns        ? ?/sec    1.00  1835.3±31.36ns        ? ?/sec
interleave i32(0.0) 400 [0..100, 100..230, 450..1000]                                        1.01   917.5±12.49ns        ? ?/sec    1.00    912.7±5.74ns        ? ?/sec
interleave i32(0.5) 100 [0..100, 100..230, 450..1000]                                        1.01    606.4±4.21ns        ? ?/sec    1.00    600.7±2.47ns        ? ?/sec
interleave i32(0.5) 1024 [0..100, 100..230, 450..1000, 0..1000]                              1.00      4.3±0.02µs        ? ?/sec    1.00      4.3±0.14µs        ? ?/sec
interleave i32(0.5) 1024 [0..100, 100..230, 450..1000]                                       1.00      4.3±0.06µs        ? ?/sec    1.00      4.3±0.01µs        ? ?/sec
interleave i32(0.5) 400 [0..100, 100..230, 450..1000]                                        1.04  1918.9±63.50ns        ? ?/sec    1.00  1846.1±53.39ns        ? ?/sec
interleave str(20, 0.0) 100 [0..100, 100..230, 450..1000]                                    1.00   806.6±26.48ns        ? ?/sec    1.00   807.3±26.28ns        ? ?/sec
interleave str(20, 0.0) 1024 [0..100, 100..230, 450..1000, 0..1000]                          1.01      6.3±0.07µs        ? ?/sec    1.00      6.2±0.20µs        ? ?/sec
interleave str(20, 0.0) 1024 [0..100, 100..230, 450..1000]                                   1.00      6.3±0.03µs        ? ?/sec    1.01      6.3±1.14µs        ? ?/sec
interleave str(20, 0.0) 400 [0..100, 100..230, 450..1000]                                    1.03      2.6±0.01µs        ? ?/sec    1.00      2.6±0.04µs        ? ?/sec
interleave str(20, 0.5) 100 [0..100, 100..230, 450..1000]                                    1.00  1045.5±10.13ns        ? ?/sec    1.03  1074.9±106.77ns        ? ?/sec
interleave str(20, 0.5) 1024 [0..100, 100..230, 450..1000, 0..1000]                          1.00     10.4±0.06µs        ? ?/sec    1.00     10.3±0.33µs        ? ?/sec
interleave str(20, 0.5) 1024 [0..100, 100..230, 450..1000]                                   1.00     10.3±0.06µs        ? ?/sec    1.00     10.3±0.09µs        ? ?/sec
interleave str(20, 0.5) 400 [0..100, 100..230, 450..1000]                                    1.01      3.7±0.03µs        ? ?/sec    1.00      3.7±0.03µs        ? ?/sec
interleave str_view(0.0) 100 [0..100, 100..230, 450..1000]                                   1.00   788.6±34.57ns        ? ?/sec    1.11   876.9±21.59ns        ? ?/sec
interleave str_view(0.0) 1024 [0..100, 100..230, 450..1000, 0..1000]                         1.00      4.9±0.04µs        ? ?/sec    1.01      5.0±0.07µs        ? ?/sec
interleave str_view(0.0) 1024 [0..100, 100..230, 450..1000]                                  1.00      4.9±0.11µs        ? ?/sec    1.02      5.0±0.07µs        ? ?/sec
interleave str_view(0.0) 400 [0..100, 100..230, 450..1000]                                   1.00      2.1±0.01µs        ? ?/sec    1.07      2.2±0.08µs        ? ?/sec
interleave struct(i32(0.0), i32(0.0) 100 [0..100, 100..230, 450..1000]                       1.03    863.3±7.98ns        ? ?/sec    1.00    837.6±5.50ns        ? ?/sec
interleave struct(i32(0.0), i32(0.0) 1024 [0..100, 100..230, 450..1000, 0..1000]             1.01      4.0±0.05µs        ? ?/sec    1.00      4.0±0.04µs        ? ?/sec
interleave struct(i32(0.0), i32(0.0) 1024 [0..100, 100..230, 450..1000]                      1.03      4.0±0.17µs        ? ?/sec    1.00      3.9±0.04µs        ? ?/sec
interleave struct(i32(0.0), i32(0.0) 400 [0..100, 100..230, 450..1000]                       1.03   1925.7±7.66ns        ? ?/sec    1.00  1861.6±35.82ns        ? ?/sec
interleave struct(i32(0.0), str(20, 0.0) 100 [0..100, 100..230, 450..1000]                   1.02   1372.7±5.26ns        ? ?/sec    1.00   1341.1±7.20ns        ? ?/sec
interleave struct(i32(0.0), str(20, 0.0) 1024 [0..100, 100..230, 450..1000, 0..1000]         1.00      8.4±0.08µs        ? ?/sec    1.00      8.4±0.08µs        ? ?/sec
interleave struct(i32(0.0), str(20, 0.0) 1024 [0..100, 100..230, 450..1000]                  1.00      8.4±0.05µs        ? ?/sec    1.00      8.4±0.05µs        ? ?/sec
interleave struct(i32(0.0), str(20, 0.0) 400 [0..100, 100..230, 450..1000]                   1.01      3.7±0.11µs        ? ?/sec    1.00      3.7±0.19µs        ? ?/sec
interleave struct(str(20, 0.0), str(20, 0.0)) 100 [0..100, 100..230, 450..1000]              1.03  1896.7±30.53ns        ? ?/sec    1.00  1845.8±26.15ns        ? ?/sec
interleave struct(str(20, 0.0), str(20, 0.0)) 1024 [0..100, 100..230, 450..1000, 0..1000]    1.00     12.8±0.09µs        ? ?/sec    1.00     12.9±0.08µs        ? ?/sec
interleave struct(str(20, 0.0), str(20, 0.0)) 1024 [0..100, 100..230, 450..1000]             1.00     12.8±0.06µs        ? ?/sec    1.00     12.8±0.09µs        ? ?/sec
interleave struct(str(20, 0.0), str(20, 0.0)) 400 [0..100, 100..230, 450..1000]              1.02      5.5±0.03µs        ? ?/sec    1.00      5.4±0.10µs        ? ?/sec

@alamb
Contributor

alamb commented Dec 10, 2025

I noticed there were no benchmarks for interleaving ListArrays, so I made a PR to add some

Contributor

@alamb alamb left a comment

Thank you @asubiotto -- the code looks good to me, but I don't think we should merge this without some results showing it makes things faster.

I made a PR to add some benchmark coverage -- once that is merged I can test this PR again.

I also verified coverage using

cargo llvm-cov --html -p arrow-select

It looks like we don't have coverage for interleaving LargeLists

Image

However as this is the same code for normal lists I think it is ok (though it would be nice to add more coverage)

Comment thread arrow-select/src/interleave.rs Outdated
@asubiotto asubiotto force-pushed the asubiotto/interleavelist branch from 36392c1 to 1281cf3 Compare December 11, 2025 08:37
@asubiotto
Contributor Author

> I also verified coverage using
>
> `cargo llvm-cov --html -p arrow-select`
>
> It looks like we don't have coverage for interleaving LargeLists

Added coverage for LargeList by making the existing test generic over the offset type and running it for both List and LargeList. Thanks for the review! This is ready for another look.
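The pattern of sharing one test body across both offset widths can be sketched without any arrow dependency. In arrow-rs, List uses `i32` offsets and LargeList uses `i64` offsets, abstracted by `OffsetSizeTrait`; the `Offset` trait and the helper names below are hypothetical stand-ins made up for this example, not the test added in this PR.

```rust
/// Hypothetical stand-in for an offset-width abstraction.
trait Offset: Copy + PartialEq + std::fmt::Debug {
    fn from_usize(n: usize) -> Self;
}

impl Offset for i32 {
    fn from_usize(n: usize) -> Self {
        n as i32
    }
}

impl Offset for i64 {
    fn from_usize(n: usize) -> Self {
        n as i64
    }
}

/// Build an offsets buffer from per-row list lengths, generic over the width.
fn offsets_from_lengths<O: Offset>(lengths: &[usize]) -> Vec<O> {
    let mut total = 0usize;
    let mut offsets = Vec::with_capacity(lengths.len() + 1);
    offsets.push(O::from_usize(0));
    for &len in lengths {
        total += len;
        offsets.push(O::from_usize(total));
    }
    offsets
}

/// A single check body that can be instantiated for each offset width.
fn check_offsets<O: Offset>() {
    let got = offsets_from_lengths::<O>(&[3, 0, 2]);
    let want: Vec<O> = [0, 3, 3, 5].iter().map(|&n| O::from_usize(n)).collect();
    assert_eq!(got, want);
}
```

Calling `check_offsets::<i32>()` and `check_offsets::<i64>()` then exercises the same logic for the List-style and LargeList-style widths.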

Comment thread arrow-select/src/interleave.rs Outdated
@asubiotto asubiotto force-pushed the asubiotto/interleavelist branch from 1281cf3 to a291eec Compare December 11, 2025 10:07
Dandandan pushed a commit that referenced this pull request Dec 11, 2025
# Which issue does this PR close?


- Part of  #8953

# Rationale for this change

While reviewing #8953 from
@asubiotto I noticed there was no benchmark for interleave with
ListArray. Let's add some so we can evaluate the performance impact of
#8953 and future changes.

# What changes are included in this PR?

Add benchmark for list interleaving

# Are these changes tested?
I ran the benchmarks manually:
```shell
cargo bench --bench interleave_kernels -- list
```

# Are there any user-facing changes?

No
@Dandandan
Contributor

run benchmark interleave_kernels

@alamb-ghbot

🤖 ./gh_compare_arrow.sh gh_compare_arrow.sh Running
Linux aal-dev 6.14.0-1018-gcp #19~24.04.1-Ubuntu SMP Wed Sep 24 23:23:09 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Comparing asubiotto/interleavelist (a291eec) to bab30ae diff
BENCH_NAME=interleave_kernels
BENCH_COMMAND=cargo bench --features=arrow,async,test_common,experimental --bench interleave_kernels
BENCH_FILTER=
BENCH_BRANCH_NAME=asubiotto_interleavelist
Results will be posted here when complete

@alamb-ghbot

🤖: Benchmark completed

Details

group                                                                                        asubiotto_interleavelist               main
-----                                                                                        ------------------------               ----
interleave dict(20, 0.0) 100 [0..100, 100..230, 450..1000]                                   1.04      3.2±0.05µs        ? ?/sec    1.00      3.1±0.02µs        ? ?/sec
interleave dict(20, 0.0) 1024 [0..100, 100..230, 450..1000, 0..1000]                         1.04     16.3±0.31µs        ? ?/sec    1.00     15.7±0.19µs        ? ?/sec
interleave dict(20, 0.0) 1024 [0..100, 100..230, 450..1000]                                  1.03     15.9±0.20µs        ? ?/sec    1.00     15.5±0.35µs        ? ?/sec
interleave dict(20, 0.0) 400 [0..100, 100..230, 450..1000]                                   1.03      7.3±0.07µs        ? ?/sec    1.00      7.1±0.19µs        ? ?/sec
interleave dict_distinct 100                                                                 1.00      2.9±0.06µs        ? ?/sec    1.00      2.9±0.03µs        ? ?/sec
interleave dict_distinct 1024                                                                1.00      2.8±0.04µs        ? ?/sec    1.01      2.9±0.07µs        ? ?/sec
interleave dict_distinct 2048                                                                1.00      2.8±0.03µs        ? ?/sec    1.02      2.9±0.02µs        ? ?/sec
interleave dict_sparse(20, 0.0) 100 [0..100, 100..230, 450..1000]                            1.04      3.2±0.04µs        ? ?/sec    1.00      3.1±0.05µs        ? ?/sec
interleave dict_sparse(20, 0.0) 1024 [0..100, 100..230, 450..1000, 0..1000]                  1.04     16.3±0.06µs        ? ?/sec    1.00     15.6±0.06µs        ? ?/sec
interleave dict_sparse(20, 0.0) 1024 [0..100, 100..230, 450..1000]                           1.03     15.9±0.16µs        ? ?/sec    1.00     15.4±0.35µs        ? ?/sec
interleave dict_sparse(20, 0.0) 400 [0..100, 100..230, 450..1000]                            1.05      7.3±0.07µs        ? ?/sec    1.00      7.0±0.03µs        ? ?/sec
interleave i32(0.0) 100 [0..100, 100..230, 450..1000]                                        1.00    307.9±1.37ns        ? ?/sec    1.01    309.7±1.03ns        ? ?/sec
interleave i32(0.0) 1024 [0..100, 100..230, 450..1000, 0..1000]                              1.00   1852.1±7.07ns        ? ?/sec    1.00  1855.3±25.45ns        ? ?/sec
interleave i32(0.0) 1024 [0..100, 100..230, 450..1000]                                       1.05  1916.7±84.61ns        ? ?/sec    1.00  1829.8±22.21ns        ? ?/sec
interleave i32(0.0) 400 [0..100, 100..230, 450..1000]                                        1.00   912.0±11.69ns        ? ?/sec    1.00   915.8±17.66ns        ? ?/sec
interleave i32(0.5) 100 [0..100, 100..230, 450..1000]                                        1.00    590.6±4.26ns        ? ?/sec    1.01    598.8±7.93ns        ? ?/sec
interleave i32(0.5) 1024 [0..100, 100..230, 450..1000, 0..1000]                              1.00      4.3±0.05µs        ? ?/sec    1.00      4.3±0.06µs        ? ?/sec
interleave i32(0.5) 1024 [0..100, 100..230, 450..1000]                                       1.02      4.4±0.05µs        ? ?/sec    1.00      4.3±0.05µs        ? ?/sec
interleave i32(0.5) 400 [0..100, 100..230, 450..1000]                                        1.04  1918.0±88.95ns        ? ?/sec    1.00  1841.1±33.25ns        ? ?/sec
interleave str(20, 0.0) 100 [0..100, 100..230, 450..1000]                                    1.00    788.1±7.31ns        ? ?/sec    1.00    791.9±8.24ns        ? ?/sec
interleave str(20, 0.0) 1024 [0..100, 100..230, 450..1000, 0..1000]                          1.00      6.2±0.17µs        ? ?/sec    1.01      6.2±0.09µs        ? ?/sec
interleave str(20, 0.0) 1024 [0..100, 100..230, 450..1000]                                   1.00      6.1±0.15µs        ? ?/sec    1.00      6.1±0.03µs        ? ?/sec
interleave str(20, 0.0) 400 [0..100, 100..230, 450..1000]                                    1.00      2.5±0.02µs        ? ?/sec    1.00      2.5±0.03µs        ? ?/sec
interleave str(20, 0.5) 100 [0..100, 100..230, 450..1000]                                    1.00   1038.7±5.74ns        ? ?/sec    1.02   1059.9±8.77ns        ? ?/sec
interleave str(20, 0.5) 1024 [0..100, 100..230, 450..1000, 0..1000]                          1.00     10.3±0.18µs        ? ?/sec    1.00     10.3±0.11µs        ? ?/sec
interleave str(20, 0.5) 1024 [0..100, 100..230, 450..1000]                                   1.00     10.4±0.05µs        ? ?/sec    1.00     10.4±0.21µs        ? ?/sec
interleave str(20, 0.5) 400 [0..100, 100..230, 450..1000]                                    1.01      3.7±0.09µs        ? ?/sec    1.00      3.6±0.02µs        ? ?/sec
interleave str_view(0.0) 100 [0..100, 100..230, 450..1000]                                   1.09    888.8±9.96ns        ? ?/sec    1.00    813.4±3.72ns        ? ?/sec
interleave str_view(0.0) 1024 [0..100, 100..230, 450..1000, 0..1000]                         1.00      4.9±0.01µs        ? ?/sec    1.03      5.0±0.08µs        ? ?/sec
interleave str_view(0.0) 1024 [0..100, 100..230, 450..1000]                                  1.03      5.1±0.05µs        ? ?/sec    1.00      5.0±0.08µs        ? ?/sec
interleave str_view(0.0) 400 [0..100, 100..230, 450..1000]                                   1.02      2.2±0.04µs        ? ?/sec    1.00      2.1±0.08µs        ? ?/sec
interleave struct(i32(0.0), i32(0.0) 100 [0..100, 100..230, 450..1000]                       1.00   830.1±12.00ns        ? ?/sec    1.09   905.6±11.33ns        ? ?/sec
interleave struct(i32(0.0), i32(0.0) 1024 [0..100, 100..230, 450..1000, 0..1000]             1.00      3.9±0.02µs        ? ?/sec    1.01      4.0±0.02µs        ? ?/sec
interleave struct(i32(0.0), i32(0.0) 1024 [0..100, 100..230, 450..1000]                      1.00      3.9±0.03µs        ? ?/sec    1.03      4.0±0.02µs        ? ?/sec
interleave struct(i32(0.0), i32(0.0) 400 [0..100, 100..230, 450..1000]                       1.00  1839.2±20.57ns        ? ?/sec    1.06   1940.5±9.53ns        ? ?/sec
interleave struct(i32(0.0), str(20, 0.0) 100 [0..100, 100..230, 450..1000]                   1.00   1316.1±3.14ns        ? ?/sec    1.02   1344.8±7.47ns        ? ?/sec
interleave struct(i32(0.0), str(20, 0.0) 1024 [0..100, 100..230, 450..1000, 0..1000]         1.00      8.3±0.56µs        ? ?/sec    1.00      8.3±0.13µs        ? ?/sec
interleave struct(i32(0.0), str(20, 0.0) 1024 [0..100, 100..230, 450..1000]                  1.00      8.2±0.15µs        ? ?/sec    1.00      8.2±0.08µs        ? ?/sec
interleave struct(i32(0.0), str(20, 0.0) 400 [0..100, 100..230, 450..1000]                   1.00      3.6±0.02µs        ? ?/sec    1.01      3.7±0.03µs        ? ?/sec
interleave struct(str(20, 0.0), str(20, 0.0)) 100 [0..100, 100..230, 450..1000]              1.00  1826.5±22.17ns        ? ?/sec    1.02   1858.7±8.62ns        ? ?/sec
interleave struct(str(20, 0.0), str(20, 0.0)) 1024 [0..100, 100..230, 450..1000, 0..1000]    1.00     12.6±0.04µs        ? ?/sec    1.02     12.8±0.08µs        ? ?/sec
interleave struct(str(20, 0.0), str(20, 0.0)) 1024 [0..100, 100..230, 450..1000]             1.00     12.5±0.07µs        ? ?/sec    1.01     12.6±0.06µs        ? ?/sec
interleave struct(str(20, 0.0), str(20, 0.0)) 400 [0..100, 100..230, 450..1000]              1.00      5.5±0.02µs        ? ?/sec    1.01      5.5±0.05µs        ? ?/sec

@alamb
Contributor

alamb commented Dec 11, 2025

(I think we need to merge/rebase this branch with the latest main so it also contains the list benchmarks)

@asubiotto asubiotto force-pushed the asubiotto/interleavelist branch from a291eec to be7904c Compare December 11, 2025 13:14
@asubiotto
Contributor Author

Triggered an automatic rebase with GitHub.

@Dandandan
Contributor

run benchmark interleave_kernels

@alamb-ghbot

🤖 ./gh_compare_arrow.sh gh_compare_arrow.sh Running
Linux aal-dev 6.14.0-1018-gcp #19~24.04.1-Ubuntu SMP Wed Sep 24 23:23:09 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Comparing asubiotto/interleavelist (be7904c) to 026a260 diff
BENCH_NAME=interleave_kernels
BENCH_COMMAND=cargo bench --features=arrow,async,test_common,experimental --bench interleave_kernels
BENCH_FILTER=
BENCH_BRANCH_NAME=asubiotto_interleavelist
Results will be posted here when complete

@alamb-ghbot

🤖: Benchmark completed

Details

group                                                                                        asubiotto_interleavelist               main
-----                                                                                        ------------------------               ----
interleave dict(20, 0.0) 100 [0..100, 100..230, 450..1000]                                   1.00      3.3±0.03µs        ? ?/sec    1.00      3.3±0.07µs        ? ?/sec
interleave dict(20, 0.0) 1024 [0..100, 100..230, 450..1000, 0..1000]                         1.00     16.5±0.12µs        ? ?/sec    1.00     16.5±0.42µs        ? ?/sec
interleave dict(20, 0.0) 1024 [0..100, 100..230, 450..1000]                                  1.04     16.8±0.24µs        ? ?/sec    1.00     16.1±0.08µs        ? ?/sec
interleave dict(20, 0.0) 400 [0..100, 100..230, 450..1000]                                   1.06      7.9±0.08µs        ? ?/sec    1.00      7.5±0.10µs        ? ?/sec
interleave dict_distinct 100                                                                 1.00      3.0±0.18µs        ? ?/sec    1.00      3.0±0.05µs        ? ?/sec
interleave dict_distinct 1024                                                                1.00      2.9±0.04µs        ? ?/sec    1.01      3.0±0.04µs        ? ?/sec
interleave dict_distinct 2048                                                                1.00      2.9±0.08µs        ? ?/sec    1.01      3.0±0.04µs        ? ?/sec
interleave dict_sparse(20, 0.0) 100 [0..100, 100..230, 450..1000]                            1.01      3.3±0.03µs        ? ?/sec    1.00      3.3±0.02µs        ? ?/sec
interleave dict_sparse(20, 0.0) 1024 [0..100, 100..230, 450..1000, 0..1000]                  1.00     16.3±0.21µs        ? ?/sec    1.01     16.5±0.13µs        ? ?/sec
interleave dict_sparse(20, 0.0) 1024 [0..100, 100..230, 450..1000]                           1.05     16.8±1.12µs        ? ?/sec    1.00     16.1±0.11µs        ? ?/sec
interleave dict_sparse(20, 0.0) 400 [0..100, 100..230, 450..1000]                            1.05      7.9±0.10µs        ? ?/sec    1.00      7.5±0.05µs        ? ?/sec
interleave i32(0.0) 100 [0..100, 100..230, 450..1000]                                        1.01    312.6±4.10ns        ? ?/sec    1.00    308.8±3.35ns        ? ?/sec
interleave i32(0.0) 1024 [0..100, 100..230, 450..1000, 0..1000]                              1.00  1856.5±23.00ns        ? ?/sec    1.01  1872.4±100.56ns        ? ?/sec
interleave i32(0.0) 1024 [0..100, 100..230, 450..1000]                                       1.04  1905.9±18.40ns        ? ?/sec    1.00  1828.6±22.86ns        ? ?/sec
interleave i32(0.0) 400 [0..100, 100..230, 450..1000]                                        1.01   917.0±15.51ns        ? ?/sec    1.00    911.0±8.97ns        ? ?/sec
interleave i32(0.5) 100 [0..100, 100..230, 450..1000]                                        1.01   608.0±52.58ns        ? ?/sec    1.00   599.3±16.33ns        ? ?/sec
interleave i32(0.5) 1024 [0..100, 100..230, 450..1000, 0..1000]                              1.00      4.3±0.01µs        ? ?/sec    1.02      4.4±0.06µs        ? ?/sec
interleave i32(0.5) 1024 [0..100, 100..230, 450..1000]                                       1.00      4.3±0.05µs        ? ?/sec    1.03      4.4±0.01µs        ? ?/sec
interleave i32(0.5) 400 [0..100, 100..230, 450..1000]                                        1.00  1832.5±41.58ns        ? ?/sec    1.00   1841.6±6.54ns        ? ?/sec
interleave list<i64>(0.0,0.0,20) 100 [0..100, 100..230, 450..1000]                           1.00      3.3±0.14µs        ? ?/sec    1.40      4.7±0.06µs        ? ?/sec
interleave list<i64>(0.0,0.0,20) 1024 [0..100, 100..230, 450..1000, 0..1000]                 1.03     30.2±0.36µs        ? ?/sec    1.00     29.3±0.30µs        ? ?/sec
interleave list<i64>(0.0,0.0,20) 1024 [0..100, 100..230, 450..1000]                          1.00     30.0±0.35µs        ? ?/sec    1.02     30.6±3.77µs        ? ?/sec
interleave list<i64>(0.0,0.0,20) 400 [0..100, 100..230, 450..1000]                           1.00     11.5±0.13µs        ? ?/sec    1.07     12.2±0.27µs        ? ?/sec
interleave list<i64>(0.1,0.1,20) 100 [0..100, 100..230, 450..1000]                           1.00      6.4±0.03µs        ? ?/sec    1.21      7.8±0.13µs        ? ?/sec
interleave list<i64>(0.1,0.1,20) 1024 [0..100, 100..230, 450..1000, 0..1000]                 1.00     49.8±1.00µs        ? ?/sec    1.21     60.3±1.99µs        ? ?/sec
interleave list<i64>(0.1,0.1,20) 1024 [0..100, 100..230, 450..1000]                          1.00     51.1±0.82µs        ? ?/sec    1.18     60.4±0.99µs        ? ?/sec
interleave list<i64>(0.1,0.1,20) 400 [0..100, 100..230, 450..1000]                           1.00     19.7±0.07µs        ? ?/sec    1.21     23.8±0.43µs        ? ?/sec
interleave str(20, 0.0) 100 [0..100, 100..230, 450..1000]                                    1.00    800.2±7.13ns        ? ?/sec    1.11   884.3±18.17ns        ? ?/sec
interleave str(20, 0.0) 1024 [0..100, 100..230, 450..1000, 0..1000]                          1.01      6.2±0.07µs        ? ?/sec    1.00      6.1±0.06µs        ? ?/sec
interleave str(20, 0.0) 1024 [0..100, 100..230, 450..1000]                                   1.01      6.2±0.55µs        ? ?/sec    1.00      6.2±0.08µs        ? ?/sec
interleave str(20, 0.0) 400 [0..100, 100..230, 450..1000]                                    1.00      2.6±0.02µs        ? ?/sec    1.00      2.6±0.02µs        ? ?/sec
interleave str(20, 0.5) 100 [0..100, 100..230, 450..1000]                                    1.00   1054.9±5.37ns        ? ?/sec    1.01  1065.7±16.32ns        ? ?/sec
interleave str(20, 0.5) 1024 [0..100, 100..230, 450..1000, 0..1000]                          1.01     10.2±0.09µs        ? ?/sec    1.00     10.1±0.13µs        ? ?/sec
interleave str(20, 0.5) 1024 [0..100, 100..230, 450..1000]                                   1.00     10.2±0.06µs        ? ?/sec    1.00     10.2±0.13µs        ? ?/sec
interleave str(20, 0.5) 400 [0..100, 100..230, 450..1000]                                    1.00      3.6±0.03µs        ? ?/sec    1.01      3.7±0.07µs        ? ?/sec
interleave str_view(0.0) 100 [0..100, 100..230, 450..1000]                                   1.00   796.6±17.19ns        ? ?/sec    1.00    795.3±2.13ns        ? ?/sec
interleave str_view(0.0) 1024 [0..100, 100..230, 450..1000, 0..1000]                         1.00      4.9±0.06µs        ? ?/sec    1.00      4.9±0.07µs        ? ?/sec
interleave str_view(0.0) 1024 [0..100, 100..230, 450..1000]                                  1.01      4.9±0.16µs        ? ?/sec    1.00      4.9±0.07µs        ? ?/sec
interleave str_view(0.0) 400 [0..100, 100..230, 450..1000]                                   1.02      2.1±0.08µs        ? ?/sec    1.00      2.1±0.02µs        ? ?/sec
interleave struct(i32(0.0), i32(0.0) 100 [0..100, 100..230, 450..1000]                       1.01    871.0±8.88ns        ? ?/sec    1.00   862.8±51.55ns        ? ?/sec
interleave struct(i32(0.0), i32(0.0) 1024 [0..100, 100..230, 450..1000, 0..1000]             1.00      4.0±0.01µs        ? ?/sec    1.00      4.0±0.01µs        ? ?/sec
interleave struct(i32(0.0), i32(0.0) 1024 [0..100, 100..230, 450..1000]                      1.00      4.0±0.07µs        ? ?/sec    1.00      4.0±0.02µs        ? ?/sec
interleave struct(i32(0.0), i32(0.0) 400 [0..100, 100..230, 450..1000]                       1.00  1876.1±22.49ns        ? ?/sec    1.05  1976.1±10.37ns        ? ?/sec
interleave struct(i32(0.0), str(20, 0.0) 100 [0..100, 100..230, 450..1000]                   1.00  1336.3±22.84ns        ? ?/sec    1.01  1348.9±22.57ns        ? ?/sec
interleave struct(i32(0.0), str(20, 0.0) 1024 [0..100, 100..230, 450..1000, 0..1000]         1.00      8.4±0.08µs        ? ?/sec    1.00      8.4±0.23µs        ? ?/sec
interleave struct(i32(0.0), str(20, 0.0) 1024 [0..100, 100..230, 450..1000]                  1.01      8.4±0.07µs        ? ?/sec    1.00      8.3±0.21µs        ? ?/sec
interleave struct(i32(0.0), str(20, 0.0) 400 [0..100, 100..230, 450..1000]                   1.01      3.7±0.04µs        ? ?/sec    1.00      3.6±0.02µs        ? ?/sec
interleave struct(str(20, 0.0), str(20, 0.0)) 100 [0..100, 100..230, 450..1000]              1.03  1889.3±28.74ns        ? ?/sec    1.00  1837.6±24.24ns        ? ?/sec
interleave struct(str(20, 0.0), str(20, 0.0)) 1024 [0..100, 100..230, 450..1000, 0..1000]    1.01     12.8±0.11µs        ? ?/sec    1.00     12.7±0.07µs        ? ?/sec
interleave struct(str(20, 0.0), str(20, 0.0)) 1024 [0..100, 100..230, 450..1000]             1.00     12.7±0.10µs        ? ?/sec    1.00     12.7±0.24µs        ? ?/sec
interleave struct(str(20, 0.0), str(20, 0.0)) 400 [0..100, 100..230, 450..1000]              1.00      5.4±0.04µs        ? ?/sec    1.00      5.4±0.08µs        ? ?/sec

@alamb
Contributor

alamb commented Dec 11, 2025

interleave list<i64>(0.0,0.0,20) 100 [0..100, 100..230, 450..1000]                           1.00      3.3±0.14µs        ? ?/sec    1.40      4.7±0.06µs        ? ?/sec
interleave list<i64>(0.0,0.0,20) 1024 [0..100, 100..230, 450..1000, 0..1000]                 1.03     30.2±0.36µs        ? ?/sec    1.00     29.3±0.30µs        ? ?/sec
interleave list<i64>(0.0,0.0,20) 1024 [0..100, 100..230, 450..1000]                          1.00     30.0±0.35µs        ? ?/sec    1.02     30.6±3.77µs        ? ?/sec
interleave list<i64>(0.0,0.0,20) 400 [0..100, 100..230, 450..1000]                           1.00     11.5±0.13µs        ? ?/sec    1.07     12.2±0.27µs        ? ?/sec
interleave list<i64>(0.1,0.1,20) 100 [0..100, 100..230, 450..1000]                           1.00      6.4±0.03µs        ? ?/sec    1.21      7.8±0.13µs        ? ?/sec
interleave list<i64>(0.1,0.1,20) 1024 [0..100, 100..230, 450..1000, 0..1000]                 1.00     49.8±1.00µs        ? ?/sec    1.21     60.3±1.99µs        ? ?/sec
interleave list<i64>(0.1,0.1,20) 1024 [0..100, 100..230, 450..1000]                          1.00     51.1±0.82µs        ? ?/sec    1.18     60.4±0.99µs        ? ?/sec
interleave list<i64>(0.1,0.1,20) 400 [0..100, 100..230, 450..1000]                           1.00     19.7±0.07µs        ? ?/sec    1.21     23.8±0.43µs        ? ?/sec

Looks good to me 👍
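For readers unfamiliar with the table layout: the number before each timing appears to be that side's time relative to the faster of the two, so 1.00 marks the faster side. A minimal sketch of that calculation, under that assumption, follows. `relative_ratios` is a hypothetical helper, and because the posted timings are rounded, recomputing from them may differ slightly from the printed ratios.

```rust
/// Relative cost of each side against the faster one (1.0 marks the faster side).
/// Returns `(branch_ratio, main_ratio)`. Assumes both times are positive and
/// expressed in the same unit.
fn relative_ratios(branch_time: f64, main_time: f64) -> (f64, f64) {
    let fastest = branch_time.min(main_time);
    (branch_time / fastest, main_time / fastest)
}
```

Read this way, a `main` ratio of about 1.2 on the `list<i64>(0.1,0.1,20)` rows means `main` took roughly 20% longer than the branch on those cases.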

Contributor

@alamb alamb left a comment

Thank you @asubiotto -- this looks great to me ❤️

@asubiotto asubiotto force-pushed the asubiotto/interleavelist branch from be7904c to a1674ad Compare December 11, 2025 20:44
@Dandandan
Contributor

run benchmark interleave_kernels

Comment thread arrow-select/src/interleave.rs Outdated
@alamb-ghbot

🤖 ./gh_compare_arrow.sh gh_compare_arrow.sh Running
Linux aal-dev 6.14.0-1018-gcp #19~24.04.1-Ubuntu SMP Wed Sep 24 23:23:09 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Comparing asubiotto/interleavelist (a1674ad) to 6ff8cc4 diff
BENCH_NAME=interleave_kernels
BENCH_COMMAND=cargo bench --features=arrow,async,test_common,experimental --bench interleave_kernels
BENCH_FILTER=
BENCH_BRANCH_NAME=asubiotto_interleavelist
Results will be posted here when complete

@alamb-ghbot

🤖: Benchmark completed

Details

group                                                                                        asubiotto_interleavelist                main
-----                                                                                        ------------------------                ----
interleave dict(20, 0.0) 100 [0..100, 100..230, 450..1000]                                   1.05   840.4±11.67ns        ? ?/sec     1.00    800.0±6.93ns        ? ?/sec
interleave dict(20, 0.0) 1024 [0..100, 100..230, 450..1000, 0..1000]                         1.19      2.7±0.02µs        ? ?/sec     1.00      2.3±0.01µs        ? ?/sec
interleave dict(20, 0.0) 1024 [0..100, 100..230, 450..1000]                                  1.20      2.6±0.02µs        ? ?/sec     1.00      2.2±0.01µs        ? ?/sec
interleave dict(20, 0.0) 400 [0..100, 100..230, 450..1000]                                   1.16  1447.6±13.35ns        ? ?/sec     1.00  1253.0±21.04ns        ? ?/sec
interleave dict_distinct 100                                                                 1.05      3.0±0.09µs        ? ?/sec     1.00      2.9±0.03µs        ? ?/sec
interleave dict_distinct 1024                                                                1.04      3.0±0.06µs        ? ?/sec     1.00      2.9±0.02µs        ? ?/sec
interleave dict_distinct 2048                                                                1.05      3.0±0.12µs        ? ?/sec     1.00      2.9±0.03µs        ? ?/sec
interleave dict_sparse(20, 0.0) 100 [0..100, 100..230, 450..1000]                            1.03      2.9±0.20µs        ? ?/sec     1.00      2.8±0.17µs        ? ?/sec
interleave dict_sparse(20, 0.0) 1024 [0..100, 100..230, 450..1000, 0..1000]                  1.07      5.4±0.30µs        ? ?/sec     1.00      5.1±0.30µs        ? ?/sec
interleave dict_sparse(20, 0.0) 1024 [0..100, 100..230, 450..1000]                           1.08      4.7±0.22µs        ? ?/sec     1.00      4.4±0.22µs        ? ?/sec
interleave dict_sparse(20, 0.0) 400 [0..100, 100..230, 450..1000]                            1.09      3.5±0.20µs        ? ?/sec     1.00      3.2±0.18µs        ? ?/sec
interleave i32(0.0) 100 [0..100, 100..230, 450..1000]                                        1.02    314.3±4.11ns        ? ?/sec     1.00    309.3±4.54ns        ? ?/sec
interleave i32(0.0) 1024 [0..100, 100..230, 450..1000, 0..1000]                              1.01  1874.4±136.70ns        ? ?/sec    1.00  1857.5±36.58ns        ? ?/sec
interleave i32(0.0) 1024 [0..100, 100..230, 450..1000]                                       1.04  1903.9±31.81ns        ? ?/sec     1.00  1829.3±13.25ns        ? ?/sec
interleave i32(0.0) 400 [0..100, 100..230, 450..1000]                                        1.00   912.5±11.09ns        ? ?/sec     1.00   915.3±11.15ns        ? ?/sec
interleave i32(0.5) 100 [0..100, 100..230, 450..1000]                                        1.07    639.2±9.29ns        ? ?/sec     1.00    600.2±4.02ns        ? ?/sec
interleave i32(0.5) 1024 [0..100, 100..230, 450..1000, 0..1000]                              1.00      4.3±0.21µs        ? ?/sec     1.00      4.3±0.18µs        ? ?/sec
interleave i32(0.5) 1024 [0..100, 100..230, 450..1000]                                       1.00      4.3±0.04µs        ? ?/sec     1.02      4.4±0.17µs        ? ?/sec
interleave i32(0.5) 400 [0..100, 100..230, 450..1000]                                        1.00   1844.2±8.94ns        ? ?/sec     1.00  1838.2±14.71ns        ? ?/sec
interleave list<i64>(0.0,0.0,20) 100 [0..100, 100..230, 450..1000]                           1.00      3.2±0.14µs        ? ?/sec     1.46      4.7±0.15µs        ? ?/sec
interleave list<i64>(0.0,0.0,20) 1024 [0..100, 100..230, 450..1000, 0..1000]                 1.03     30.3±0.23µs        ? ?/sec     1.00     29.5±0.35µs        ? ?/sec
interleave list<i64>(0.0,0.0,20) 1024 [0..100, 100..230, 450..1000]                          1.02     30.3±0.48µs        ? ?/sec     1.00     29.6±0.46µs        ? ?/sec
interleave list<i64>(0.0,0.0,20) 400 [0..100, 100..230, 450..1000]                           1.00     11.9±0.04µs        ? ?/sec     1.02     12.1±0.12µs        ? ?/sec
interleave list<i64>(0.1,0.1,20) 100 [0..100, 100..230, 450..1000]                           1.00      6.5±0.80µs        ? ?/sec     1.20      7.7±0.09µs        ? ?/sec
interleave list<i64>(0.1,0.1,20) 1024 [0..100, 100..230, 450..1000, 0..1000]                 1.00     50.4±1.68µs        ? ?/sec     1.19     59.8±1.80µs        ? ?/sec
interleave list<i64>(0.1,0.1,20) 1024 [0..100, 100..230, 450..1000]                          1.00     51.5±0.27µs        ? ?/sec     1.15     59.2±0.27µs        ? ?/sec
interleave list<i64>(0.1,0.1,20) 400 [0..100, 100..230, 450..1000]                           1.00     20.3±0.33µs        ? ?/sec     1.16     23.5±0.24µs        ? ?/sec
interleave str(20, 0.0) 100 [0..100, 100..230, 450..1000]                                    1.00   787.8±13.48ns        ? ?/sec     1.09   859.4±19.78ns        ? ?/sec
interleave str(20, 0.0) 1024 [0..100, 100..230, 450..1000, 0..1000]                          1.02      6.2±0.03µs        ? ?/sec     1.00      6.1±0.05µs        ? ?/sec
interleave str(20, 0.0) 1024 [0..100, 100..230, 450..1000]                                   1.02      6.3±0.26µs        ? ?/sec     1.00      6.2±0.05µs        ? ?/sec
interleave str(20, 0.0) 400 [0..100, 100..230, 450..1000]                                    1.00      2.5±0.05µs        ? ?/sec     1.00      2.5±0.02µs        ? ?/sec
interleave str(20, 0.5) 100 [0..100, 100..230, 450..1000]                                    1.03   1076.9±5.63ns        ? ?/sec     1.00  1045.1±17.07ns        ? ?/sec
interleave str(20, 0.5) 1024 [0..100, 100..230, 450..1000, 0..1000]                          1.01     10.3±0.16µs        ? ?/sec     1.00     10.2±0.31µs        ? ?/sec
interleave str(20, 0.5) 1024 [0..100, 100..230, 450..1000]                                   1.02     10.4±0.15µs        ? ?/sec     1.00     10.1±0.10µs        ? ?/sec
interleave str(20, 0.5) 400 [0..100, 100..230, 450..1000]                                    1.02      3.7±0.04µs        ? ?/sec     1.00      3.7±0.02µs        ? ?/sec
interleave str_view(0.0) 100 [0..100, 100..230, 450..1000]                                   1.00   816.4±17.56ns        ? ?/sec     1.07    870.6±7.31ns        ? ?/sec
interleave str_view(0.0) 1024 [0..100, 100..230, 450..1000, 0..1000]                         1.02      5.1±0.02µs        ? ?/sec     1.00      5.0±0.02µs        ? ?/sec
interleave str_view(0.0) 1024 [0..100, 100..230, 450..1000]                                  1.01      5.1±0.02µs        ? ?/sec     1.00      5.0±0.03µs        ? ?/sec
interleave str_view(0.0) 400 [0..100, 100..230, 450..1000]                                   1.00      2.2±0.01µs        ? ?/sec     1.01      2.2±0.02µs        ? ?/sec
interleave struct(i32(0.0), i32(0.0) 100 [0..100, 100..230, 450..1000]                       1.00   872.0±46.16ns        ? ?/sec     1.00   869.9±41.01ns        ? ?/sec
interleave struct(i32(0.0), i32(0.0) 1024 [0..100, 100..230, 450..1000, 0..1000]             1.00      4.0±0.12µs        ? ?/sec     1.00      4.0±0.12µs        ? ?/sec
interleave struct(i32(0.0), i32(0.0) 1024 [0..100, 100..230, 450..1000]                      1.00      4.0±0.05µs        ? ?/sec     1.01      4.0±0.04µs        ? ?/sec
interleave struct(i32(0.0), i32(0.0) 400 [0..100, 100..230, 450..1000]                       1.00  1943.6±43.02ns        ? ?/sec     1.00  1941.8±14.78ns        ? ?/sec
interleave struct(i32(0.0), str(20, 0.0) 100 [0..100, 100..230, 450..1000]                   1.04  1410.1±20.30ns        ? ?/sec     1.00  1359.0±19.49ns        ? ?/sec
interleave struct(i32(0.0), str(20, 0.0) 1024 [0..100, 100..230, 450..1000, 0..1000]         1.00      8.4±0.09µs        ? ?/sec     1.00      8.4±0.07µs        ? ?/sec
interleave struct(i32(0.0), str(20, 0.0) 1024 [0..100, 100..230, 450..1000]                  1.00      8.3±0.10µs        ? ?/sec     1.01      8.4±0.18µs        ? ?/sec
interleave struct(i32(0.0), str(20, 0.0) 400 [0..100, 100..230, 450..1000]                   1.00      3.6±0.11µs        ? ?/sec     1.01      3.7±0.37µs        ? ?/sec
interleave struct(str(20, 0.0), str(20, 0.0)) 100 [0..100, 100..230, 450..1000]              1.01  1894.2±19.22ns        ? ?/sec     1.00  1882.9±124.40ns        ? ?/sec
interleave struct(str(20, 0.0), str(20, 0.0)) 1024 [0..100, 100..230, 450..1000, 0..1000]    1.02     12.9±0.38µs        ? ?/sec     1.00     12.7±0.12µs        ? ?/sec
interleave struct(str(20, 0.0), str(20, 0.0)) 1024 [0..100, 100..230, 450..1000]             1.00     12.7±0.15µs        ? ?/sec     1.00     12.7±1.08µs        ? ?/sec
interleave struct(str(20, 0.0), str(20, 0.0)) 400 [0..100, 100..230, 450..1000]              1.00      5.4±0.04µs        ? ?/sec     1.00      5.5±0.13µs        ? ?/sec

Previously, List and LargeList would fall through to the interleave_fallback
match arm, which is inefficient. This commit implements interleave_list, which
interleaves a list's child arrays and rebuilds the offsets buffer. Running it
on production tests reduced memory by 80%.

Signed-off-by: Alfonso Subiotto Marques <alfonso.subiotto@polarsignals.com>
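
For readers unfamiliar with the list layout, the idea in the commit message (interleave the child values, then rebuild the offsets buffer) can be sketched in plain Rust. This is a minimal illustration and not the code merged in this PR: `SimpleList` and `interleave_list_sketch` are hypothetical names, the child type is fixed to `i64`, null handling is omitted, and child slices are copied directly rather than delegating to a child interleave. The `(array index, row index)` pairs mirror the shape of the `indices` argument of the `interleave` kernel.

```rust
/// Hypothetical stand-in for an Arrow list array (assumption, not the arrow-rs type):
/// row `i` holds `values[offsets[i]..offsets[i + 1]]`.
#[derive(Debug, Clone, PartialEq)]
pub struct SimpleList {
    pub offsets: Vec<usize>,
    pub values: Vec<i64>,
}

impl SimpleList {
    /// Build the flat offsets/values layout from nested rows.
    pub fn from_rows(rows: &[Vec<i64>]) -> Self {
        let mut offsets = vec![0];
        let mut values = Vec::new();
        for row in rows {
            values.extend_from_slice(row);
            offsets.push(values.len());
        }
        SimpleList { offsets, values }
    }

    /// Borrow the child values of row `i`.
    pub fn row(&self, i: usize) -> &[i64] {
        &self.values[self.offsets[i]..self.offsets[i + 1]]
    }
}

/// Sketch of a list-aware interleave: `indices` holds (array index, row index) pairs.
pub fn interleave_list_sketch(arrays: &[&SimpleList], indices: &[(usize, usize)]) -> SimpleList {
    // Pass 1: rebuild the offsets buffer from the lengths of the selected rows.
    let mut offsets = Vec::with_capacity(indices.len() + 1);
    offsets.push(0);
    let mut total = 0;
    for &(array_idx, row_idx) in indices {
        total += arrays[array_idx].row(row_idx).len();
        offsets.push(total);
    }

    // Pass 2: copy each selected row's child values once into a buffer
    // whose capacity was computed up front.
    let mut values = Vec::with_capacity(total);
    for &(array_idx, row_idx) in indices {
        values.extend_from_slice(arrays[array_idx].row(row_idx));
    }

    SimpleList { offsets, values }
}
```

Because the output size is known after the first pass, the child buffer in this sketch is allocated once and never grown, which is one plausible way a specialized path can use less memory than a generic fallback.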
@asubiotto asubiotto force-pushed the asubiotto/interleavelist branch from a1674ad to 5b441ec Compare December 13, 2025 09:55
@asubiotto
Contributor Author

Addressed the latest comment and rebased. This should be ready to merge.

@alamb
Contributor

alamb commented Dec 13, 2025

run benchmark interleave_kernels

@alamb-ghbot

This comment was marked as outdated.

@alamb alamb merged commit 5db072f into apache:main Dec 13, 2025
26 checks passed
@alamb
Contributor

alamb commented Dec 13, 2025

Thank you for this PR @asubiotto and all the help @Dandandan

@alamb-ghbot

🤖 ./gh_compare_arrow.sh gh_compare_arrow.sh Running
Linux aal-dev 6.14.0-1018-gcp #19~24.04.1-Ubuntu SMP Wed Sep 24 23:23:09 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Comparing asubiotto/interleavelist (5b441ec) to f8796fd diff
BENCH_NAME=interleave_kernels
BENCH_COMMAND=cargo bench --features=arrow,async,test_common,experimental --bench interleave_kernels
BENCH_FILTER=
BENCH_BRANCH_NAME=asubiotto_interleavelist
Results will be posted here when complete

@alamb-ghbot

🤖: Benchmark completed

Details

group                                                                                        asubiotto_interleavelist                main
-----                                                                                        ------------------------                ----
interleave dict(20, 0.0) 100 [0..100, 100..230, 450..1000]                                   1.01    805.0±4.59ns        ? ?/sec     1.00   799.7±10.74ns        ? ?/sec
interleave dict(20, 0.0) 1024 [0..100, 100..230, 450..1000, 0..1000]                         1.00      2.3±0.03µs        ? ?/sec     1.00      2.3±0.04µs        ? ?/sec
interleave dict(20, 0.0) 1024 [0..100, 100..230, 450..1000]                                  1.00      2.2±0.01µs        ? ?/sec     1.00      2.2±0.06µs        ? ?/sec
interleave dict(20, 0.0) 400 [0..100, 100..230, 450..1000]                                   1.03  1294.7±25.89ns        ? ?/sec     1.00  1256.8±16.40ns        ? ?/sec
interleave dict_distinct 100                                                                 1.01      2.9±0.02µs        ? ?/sec     1.00      2.9±0.03µs        ? ?/sec
interleave dict_distinct 1024                                                                1.02      3.0±0.02µs        ? ?/sec     1.00      2.9±0.03µs        ? ?/sec
interleave dict_distinct 2048                                                                1.02      3.0±0.02µs        ? ?/sec     1.00      2.9±0.02µs        ? ?/sec
interleave dict_sparse(20, 0.0) 100 [0..100, 100..230, 450..1000]                            1.00      2.7±0.17µs        ? ?/sec     1.09      2.9±0.28µs        ? ?/sec
interleave dict_sparse(20, 0.0) 1024 [0..100, 100..230, 450..1000, 0..1000]                  1.00      4.9±0.28µs        ? ?/sec     1.02      5.0±0.39µs        ? ?/sec
interleave dict_sparse(20, 0.0) 1024 [0..100, 100..230, 450..1000]                           1.05      4.3±0.26µs        ? ?/sec     1.00      4.1±0.24µs        ? ?/sec
interleave dict_sparse(20, 0.0) 400 [0..100, 100..230, 450..1000]                            1.03      3.4±0.25µs        ? ?/sec     1.00      3.3±0.23µs        ? ?/sec
interleave i32(0.0) 100 [0..100, 100..230, 450..1000]                                        1.00    308.7±1.15ns        ? ?/sec     1.00    308.5±2.25ns        ? ?/sec
interleave i32(0.0) 1024 [0..100, 100..230, 450..1000, 0..1000]                              1.00  1854.2±14.90ns        ? ?/sec     1.00  1855.3±26.36ns        ? ?/sec
interleave i32(0.0) 1024 [0..100, 100..230, 450..1000]                                       1.04  1902.4±18.44ns        ? ?/sec     1.00  1826.5±13.04ns        ? ?/sec
interleave i32(0.0) 400 [0..100, 100..230, 450..1000]                                        1.00   913.8±29.18ns        ? ?/sec     1.00   913.5±13.81ns        ? ?/sec
interleave i32(0.5) 100 [0..100, 100..230, 450..1000]                                        1.01    607.8±1.66ns        ? ?/sec     1.00    599.9±6.30ns        ? ?/sec
interleave i32(0.5) 1024 [0..100, 100..230, 450..1000, 0..1000]                              1.00      4.3±0.03µs        ? ?/sec     1.01      4.3±0.07µs        ? ?/sec
interleave i32(0.5) 1024 [0..100, 100..230, 450..1000]                                       1.00      4.3±0.02µs        ? ?/sec     1.02      4.4±0.06µs        ? ?/sec
interleave i32(0.5) 400 [0..100, 100..230, 450..1000]                                        1.00   1836.5±7.53ns        ? ?/sec     1.00  1841.8±38.42ns        ? ?/sec
interleave list<i64>(0.0,0.0,20) 100 [0..100, 100..230, 450..1000]                           1.00      2.9±0.02µs        ? ?/sec     1.59      4.7±0.02µs        ? ?/sec
interleave list<i64>(0.0,0.0,20) 1024 [0..100, 100..230, 450..1000, 0..1000]                 1.00     29.1±0.67µs        ? ?/sec     1.01     29.2±0.45µs        ? ?/sec
interleave list<i64>(0.0,0.0,20) 1024 [0..100, 100..230, 450..1000]                          1.00     28.9±0.24µs        ? ?/sec     1.02     29.5±0.56µs        ? ?/sec
interleave list<i64>(0.0,0.0,20) 400 [0..100, 100..230, 450..1000]                           1.00     11.6±0.05µs        ? ?/sec     1.03     12.0±0.18µs        ? ?/sec
interleave list<i64>(0.1,0.1,20) 100 [0..100, 100..230, 450..1000]                           1.00      6.0±0.05µs        ? ?/sec     1.31      7.8±0.11µs        ? ?/sec
interleave list<i64>(0.1,0.1,20) 1024 [0..100, 100..230, 450..1000, 0..1000]                 1.00     48.9±0.60µs        ? ?/sec     1.22     59.5±0.49µs        ? ?/sec
interleave list<i64>(0.1,0.1,20) 1024 [0..100, 100..230, 450..1000]                          1.00     49.4±0.46µs        ? ?/sec     1.21     59.7±1.80µs        ? ?/sec
interleave list<i64>(0.1,0.1,20) 400 [0..100, 100..230, 450..1000]                           1.00     19.7±0.31µs        ? ?/sec     1.19     23.4±0.33µs        ? ?/sec
interleave str(20, 0.0) 100 [0..100, 100..230, 450..1000]                                    1.00   778.1±12.99ns        ? ?/sec     1.02    790.1±6.96ns        ? ?/sec
interleave str(20, 0.0) 1024 [0..100, 100..230, 450..1000, 0..1000]                          1.01      6.3±0.26µs        ? ?/sec     1.00      6.2±0.06µs        ? ?/sec
interleave str(20, 0.0) 1024 [0..100, 100..230, 450..1000]                                   1.00      6.3±0.21µs        ? ?/sec     1.00      6.2±0.08µs        ? ?/sec
interleave str(20, 0.0) 400 [0..100, 100..230, 450..1000]                                    1.01      2.6±0.10µs        ? ?/sec     1.00      2.5±0.02µs        ? ?/sec
interleave str(20, 0.5) 100 [0..100, 100..230, 450..1000]                                    1.00   1038.0±5.93ns        ? ?/sec     1.00   1041.4±3.20ns        ? ?/sec
interleave str(20, 0.5) 1024 [0..100, 100..230, 450..1000, 0..1000]                          1.01     10.3±0.08µs        ? ?/sec     1.00     10.2±0.05µs        ? ?/sec
interleave str(20, 0.5) 1024 [0..100, 100..230, 450..1000]                                   1.02     10.4±0.10µs        ? ?/sec     1.00     10.2±0.30µs        ? ?/sec
interleave str(20, 0.5) 400 [0..100, 100..230, 450..1000]                                    1.02      3.7±0.04µs        ? ?/sec     1.00      3.6±0.05µs        ? ?/sec
interleave str_view(0.0) 100 [0..100, 100..230, 450..1000]                                   1.00    757.9±8.39ns        ? ?/sec     1.12    852.3±2.96ns        ? ?/sec
interleave str_view(0.0) 1024 [0..100, 100..230, 450..1000, 0..1000]                         1.00      4.4±0.03µs        ? ?/sec     1.15      5.0±0.05µs        ? ?/sec
interleave str_view(0.0) 1024 [0..100, 100..230, 450..1000]                                  1.00      4.8±0.06µs        ? ?/sec     1.02      5.0±0.06µs        ? ?/sec
interleave str_view(0.0) 400 [0..100, 100..230, 450..1000]                                   1.00  1969.9±20.05ns        ? ?/sec     1.12      2.2±0.13µs        ? ?/sec
interleave struct(i32(0.0), i32(0.0) 100 [0..100, 100..230, 450..1000]                       1.05    907.5±8.01ns        ? ?/sec     1.00    864.8±5.03ns        ? ?/sec
interleave struct(i32(0.0), i32(0.0) 1024 [0..100, 100..230, 450..1000, 0..1000]             1.00      3.9±0.01µs        ? ?/sec     1.00      4.0±0.03µs        ? ?/sec
interleave struct(i32(0.0), i32(0.0) 1024 [0..100, 100..230, 450..1000]                      1.01      4.0±0.02µs        ? ?/sec     1.00      4.0±0.02µs        ? ?/sec
interleave struct(i32(0.0), i32(0.0) 400 [0..100, 100..230, 450..1000]                       1.00  1955.0±28.64ns        ? ?/sec     1.00  1954.2±44.55ns        ? ?/sec
interleave struct(i32(0.0), str(20, 0.0) 100 [0..100, 100..230, 450..1000]                   1.01   1336.9±6.62ns        ? ?/sec     1.00  1328.0±23.72ns        ? ?/sec
interleave struct(i32(0.0), str(20, 0.0) 1024 [0..100, 100..230, 450..1000, 0..1000]         1.00      8.3±0.03µs        ? ?/sec     1.00      8.3±0.05µs        ? ?/sec
interleave struct(i32(0.0), str(20, 0.0) 1024 [0..100, 100..230, 450..1000]                  1.02      8.3±0.04µs        ? ?/sec     1.00      8.2±0.05µs        ? ?/sec
interleave struct(i32(0.0), str(20, 0.0) 400 [0..100, 100..230, 450..1000]                   1.00      3.6±0.02µs        ? ?/sec     1.00      3.6±0.05µs        ? ?/sec
interleave struct(str(20, 0.0), str(20, 0.0)) 100 [0..100, 100..230, 450..1000]              1.00  1829.5±101.17ns        ? ?/sec    1.01  1853.1±10.68ns        ? ?/sec
interleave struct(str(20, 0.0), str(20, 0.0)) 1024 [0..100, 100..230, 450..1000, 0..1000]    1.00     12.7±0.06µs        ? ?/sec     1.01     12.7±0.28µs        ? ?/sec
interleave struct(str(20, 0.0), str(20, 0.0)) 1024 [0..100, 100..230, 450..1000]             1.00     12.7±0.11µs        ? ?/sec     1.00     12.7±0.05µs        ? ?/sec
interleave struct(str(20, 0.0), str(20, 0.0)) 400 [0..100, 100..230, 450..1000]              1.00      5.4±0.09µs        ? ?/sec     1.00      5.4±0.05µs        ? ?/sec
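
A note on reading the tables above: the leading number in each column appears to be that side's mean time divided by the fastest mean in the same row, so the faster side shows 1.00. The bot output does not state this, so treat it as an assumption. A small sketch of that arithmetic (`relative_ratios` is a hypothetical helper; both inputs must be in the same unit, since some rows mix ns and µs):

```rust
/// Hypothetical helper: relative cost of each side of one benchmark row,
/// assuming the ratio is "mean time / fastest mean time in the row".
/// Both arguments must use the same unit.
pub fn relative_ratios(branch_mean: f64, main_mean: f64) -> (f64, f64) {
    let fastest = branch_mean.min(main_mean);
    (branch_mean / fastest, main_mean / fastest)
}
```

For the row `interleave list<i64>(0.1,0.1,20) 1024 [0..100, 100..230, 450..1000, 0..1000]` in the final run, `relative_ratios(48.9, 59.5)` gives roughly `(1.00, 1.22)`, matching the reported columns. Ratios recomputed from the displayed times can differ slightly from the reported ones, since the displayed times are rounded.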


Labels

arrow (Changes to the arrow crate), performance

Development

Successfully merging this pull request may close these issues.

Support performant interleave for List/LargeList

4 participants