Skip to content

Commit 200fb06

Browse files
authored
Avoid overallocating arrays in coalesce primitives / views (#9132)
# Which issue does this PR close? - Closes #9135 # Rationale for this change The code was calling `.reserve(batch_size)` which reserves space to at least `batch_size` additional elements (https://doc.rust-lang.org/std/vec/struct.Vec.html#method.reserve). This also improves performance a bit: ``` filter: primitive, 8192, nulls: 0, selectivity: 0.001 time: [59.509 ms 59.660 ms 59.856 ms] change: [−3.0781% −2.7917% −2.4795%] (p = 0.00 < 0.05) Performance has improved. Found 1 outliers among 100 measurements (1.00%) 1 (1.00%) high severe filter: primitive, 8192, nulls: 0, selectivity: 0.01 time: [6.0072 ms 6.0226 ms 6.0428 ms] change: [−8.7042% −7.1161% −6.0455%] (p = 0.00 < 0.05) Performance has improved. Found 4 outliers among 100 measurements (4.00%) 2 (2.00%) high mild 2 (2.00%) high severe Benchmarking filter: primitive, 8192, nulls: 0, selectivity: 0.1: Warming up for 3.0000 s Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 9.5s, enable flat sampling, or reduce sample count to 50. filter: primitive, 8192, nulls: 0, selectivity: 0.1 time: [1.8664 ms 1.8709 ms 1.8772 ms] change: [−15.187% −14.905% −14.632%] (p = 0.00 < 0.05) Performance has improved. Found 11 outliers among 100 measurements (11.00%) 5 (5.00%) high mild 6 (6.00%) high severe filter: primitive, 8192, nulls: 0, selectivity: 0.8 time: [2.5191 ms 2.5444 ms 2.5717 ms] change: [−13.064% −11.414% −10.003%] (p = 0.00 < 0.05) Performance has improved. Found 2 outliers among 100 measurements (2.00%) 1 (1.00%) low mild 1 (1.00%) high severe Benchmarking filter: primitive, 8192, nulls: 0.1, selectivity: 0.001: Warming up for 3.0000 s Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 7.7s, or reduce sample count to 60. filter: primitive, 8192, nulls: 0.1, selectivity: 0.001 time: [76.422 ms 76.671 ms 76.973 ms] change: [−5.5096% −4.0229% −2.8048%] (p = 0.00 < 0.05) Performance has improved. Found 2 outliers among 100 measurements (2.00%) 1 (1.00%) high mild 1 (1.00%) high severe filter: primitive, 8192, nulls: 0.1, selectivity: 0.01 time: [10.197 ms 10.228 ms 10.262 ms] change: [−3.6627% −3.0569% −2.4919%] (p = 0.00 < 0.05) Performance has improved. Found 1 outliers among 100 measurements (1.00%) 1 (1.00%) high mild filter: primitive, 8192, nulls: 0.1, selectivity: 0.1 time: [4.6635 ms 4.6750 ms 4.6915 ms] change: [−9.4939% −8.5908% −7.8383%] (p = 0.00 < 0.05) Performance has improved. Found 7 outliers among 100 measurements (7.00%) 5 (5.00%) high mild 2 (2.00%) high severe filter: primitive, 8192, nulls: 0.1, selectivity: 0.8 time: [4.7777 ms 4.8115 ms 4.8467 ms] change: [−9.9226% −9.1384% −8.3813%] (p = 0.00 < 0.05) Performance has improved. Found 1 outliers among 100 measurements (1.00%) 1 (1.00%) high mild ``` # What changes are included in this PR? Changes it to use `self.views.reserve(self.batch_size - self.views.len())` to avoid allocating more than necessary (i.e. 2x the amount). # Are these changes tested? Existing tests # Are there any user-facing changes?
1 parent 298d3aa commit 200fb06

3 files changed

Lines changed: 14 additions & 2 deletions

File tree

arrow-select/src/coalesce.rs

Lines changed: 6 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -615,6 +615,7 @@ mod tests {
615615
use crate::concat::concat_batches;
616616
use arrow_array::builder::StringViewBuilder;
617617
use arrow_array::cast::AsArray;
618+
use arrow_array::types::Int32Type;
618619
use arrow_array::{
619620
BinaryViewArray, Int32Array, Int64Array, RecordBatchOptions, StringArray, StringViewArray,
620621
TimestampNanosecondArray, UInt32Array, UInt64Array,
@@ -1625,6 +1626,11 @@ mod tests {
16251626
// Now should have a completed batch (100 rows total)
16261627
assert!(coalescer.has_completed_batch());
16271628
let output_batch = coalescer.next_completed_batch().unwrap();
1629+
let size = output_batch
1630+
.column(0)
1631+
.as_primitive::<Int32Type>()
1632+
.get_buffer_memory_size();
1633+
assert_eq!(size, 400); // 100 rows * 4 bytes each
16281634
assert_eq!(output_batch.num_rows(), 100);
16291635

16301636
assert_eq!(coalescer.get_buffered_rows(), 0);

arrow-select/src/coalesce/byte_view.rs

Lines changed: 4 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -98,7 +98,10 @@ impl<B: ByteViewType> InProgressByteViewArray<B> {
9898
/// This is done on write (when we know it is necessary) rather than
9999
/// eagerly to avoid allocations that are not used.
100100
fn ensure_capacity(&mut self) {
101-
self.views.reserve(self.batch_size);
101+
if self.views.capacity() == 0 {
102+
self.views.reserve(self.batch_size);
103+
}
104+
debug_assert_eq!(self.views.capacity(), self.batch_size);
102105
}
103106

104107
/// Finishes in progress buffer, if any

arrow-select/src/coalesce/primitive.rs

Lines changed: 4 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -55,7 +55,10 @@ impl<T: ArrowPrimitiveType> InProgressPrimitiveArray<T> {
5555
/// This is done on write (when we know it is necessary) rather than
5656
/// eagerly to avoid allocations that are not used.
5757
fn ensure_capacity(&mut self) {
58-
self.current.reserve(self.batch_size);
58+
if self.current.capacity() == 0 {
59+
self.current.reserve(self.batch_size);
60+
}
61+
debug_assert_eq!(self.current.capacity(), self.batch_size);
5962
}
6063
}
6164

0 commit comments

Comments
 (0)