Commit d17c213
arrow-select: improve dictionary interleave fallback performance
The naive interleave_fallback used MutableArrayData and extended it with the
full values slice each time the target array changed in the indices slice.
This commit introduces a new approach: dictionary values are concatenated
once, and new keys (offsets into the concatenated values) are then computed
from the indices. In the microbenchmarks below this is roughly 65-74% faster
for dict and 12-56% faster for dict_sparse, with dict_distinct essentially
unchanged. It should also reduce memory usage during interleaves (used heavily
in sorts).
Note that this path is only taken when should_merge_dictionary_values returns
false.
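
The idea can be sketched on plain vectors. This is a minimal illustration, not the arrow-rs code from this commit: the `Dict` struct and the `interleave_dicts` and `decode` functions are hypothetical names, keys are modelled as `Option<usize>` with `None` standing in for a null slot, and `indices` is assumed to be a list of (array index, row index) pairs.

```rust
/// Toy dictionary array: `keys[i]` indexes into `values`; `None` is a null slot.
/// (Hypothetical type for illustration; not an arrow-rs type.)
struct Dict {
    keys: Vec<Option<usize>>,
    values: Vec<&'static str>,
}

/// Interleave rows from several dictionaries without merging their values.
/// `indices` holds (array index, row index) pairs.
fn interleave_dicts(arrays: &[Dict], indices: &[(usize, usize)]) -> Dict {
    // Concatenate every values vector exactly once, recording where each starts.
    let mut offsets = Vec::with_capacity(arrays.len());
    let mut values = Vec::new();
    for a in arrays {
        offsets.push(values.len());
        values.extend_from_slice(&a.values);
    }

    // Rebase each selected key by the start offset of its source dictionary.
    let keys = indices
        .iter()
        .map(|&(arr, row)| arrays[arr].keys[row].map(|k| k + offsets[arr]))
        .collect();

    Dict { keys, values }
}

/// Resolve keys back to values, to check that the rebased keys are correct.
fn decode(d: &Dict) -> Vec<Option<&'static str>> {
    d.keys.iter().map(|k| k.map(|k| d.values[k])).collect()
}
```

With two inputs whose values are `["x", "y"]` and `["p", "q"]`, every key taken from the second input is shifted by 2, so the values are copied once regardless of how often the indices switch between inputs. A real kernel would additionally have to ensure the rebased keys still fit in the dictionary's key type.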
```
$ cargo bench --bench interleave_kernels -- 'dict' --baseline main
interleave dict(20, 0.0) 100 [0..100, 100..230, 450..1000]
time: [627.14 ns 634.76 ns 644.13 ns]
change: [−65.614% −65.345% −65.002%] (p = 0.00 < 0.05)
Performance has improved.
Found 9 outliers among 100 measurements (9.00%)
2 (2.00%) low mild
6 (6.00%) high mild
1 (1.00%) high severe
interleave dict(20, 0.0) 400 [0..100, 100..230, 450..1000]
time: [934.35 ns 937.51 ns 940.60 ns]
change: [−71.488% −71.340% −71.208%] (p = 0.00 < 0.05)
Performance has improved.
Found 3 outliers among 100 measurements (3.00%)
3 (3.00%) high mild
interleave dict(20, 0.0) 1024 [0..100, 100..230, 450..1000]
time: [1.6485 µs 1.6528 µs 1.6566 µs]
change: [−74.307% −74.190% −74.088%] (p = 0.00 < 0.05)
Performance has improved.
interleave dict(20, 0.0) 1024 [0..100, 100..230, 450..1000, 0..1000]
time: [1.6723 µs 1.6782 µs 1.6842 µs]
change: [−74.664% −74.544% −74.438%] (p = 0.00 < 0.05)
Performance has improved.
Found 2 outliers among 100 measurements (2.00%)
1 (1.00%) low mild
1 (1.00%) high severe
interleave dict_sparse(20, 0.0) 100 [0..100, 100..230, 450..1000]
time: [1.5985 µs 1.6064 µs 1.6148 µs]
change: [−12.510% −12.116% −11.715%] (p = 0.00 < 0.05)
Performance has improved.
Found 19 outliers among 100 measurements (19.00%)
10 (10.00%) low mild
6 (6.00%) high mild
3 (3.00%) high severe
interleave dict_sparse(20, 0.0) 400 [0..100, 100..230, 450..1000]
time: [1.9310 µs 1.9466 µs 1.9680 µs]
change: [−41.488% −41.091% −40.628%] (p = 0.00 < 0.05)
Performance has improved.
Found 15 outliers among 100 measurements (15.00%)
3 (3.00%) low mild
6 (6.00%) high mild
6 (6.00%) high severe
interleave dict_sparse(20, 0.0) 1024 [0..100, 100..230, 450..1000]
time: [2.7812 µs 2.8516 µs 2.9401 µs]
change: [−56.097% −55.276% −54.274%] (p = 0.00 < 0.05)
Performance has improved.
Found 15 outliers among 100 measurements (15.00%)
8 (8.00%) high mild
7 (7.00%) high severe
interleave dict_sparse(20, 0.0) 1024 [0..100, 100..230, 450..1000, 0..1000]
time: [3.4926 µs 3.6558 µs 3.8427 µs]
change: [−48.423% −46.405% −44.379%] (p = 0.00 < 0.05)
Performance has improved.
interleave dict_distinct 100
time: [2.0013 µs 2.0106 µs 2.0205 µs]
change: [−1.6162% −1.0465% −0.4647%] (p = 0.00 < 0.05)
Change within noise threshold.
Found 4 outliers among 100 measurements (4.00%)
4 (4.00%) high mild
interleave dict_distinct 1024
time: [1.9784 µs 1.9855 µs 1.9924 µs]
change: [−2.4655% −1.8461% −1.2265%] (p = 0.00 < 0.05)
Performance has improved.
interleave dict_distinct 2048
time: [1.9832 µs 1.9959 µs 2.0087 µs]
change: [−2.9917% −2.3003% −1.6062%] (p = 0.00 < 0.05)
Performance has improved.
```
Signed-off-by: Alfonso Subiotto Marques <alfonso.subiotto@polarsignals.com>

1 parent dff6402 commit d17c213
1 file changed, 91 additions and 1 deletion.