Avoid a clone when creating RunEndArray from ArrayData#9189
Avoid a clone when creating RunEndArray from ArrayData#9189alamb merged 4 commits intoapache:mainfrom
RunEndArray from ArrayData#9189Conversation
843036f to
e19ec7e
Compare
| .try_into() | ||
| .expect("RunArray data should have exactly two child arrays"); | ||
|
|
||
| // deconstruct the run ends child array |
There was a problem hiding this comment.
this is what the current code does too (reaches into the child array data and gets the first buffer)
The current code also checks that there is exactly 1 buffer in the child array
| }; | ||
|
|
||
| let values = make_array(data.child_data()[1].clone()); | ||
| let [run_end_child, values_child]: [ArrayData; 2] = child_data |
There was a problem hiding this comment.
This was a nice way to check the length and destructure the ArrayData in one command
https://stackoverflow.com/questions/29570607/is-there-a-good-way-to-convert-a-vect-to-an-array
| RunEndBuffer::new_unchecked(scalar, data.offset(), data.len()) | ||
| }; | ||
|
|
||
| let values = make_array(data.child_data()[1].clone()); |
There was a problem hiding this comment.
This is real clone of ArrayData , which allocates a Vec, which is no longer done by this PR
| let values = make_array(values_child); | ||
| Self { | ||
| data_type: data.data_type().clone(), | ||
| data_type, |
There was a problem hiding this comment.
Also avoids a DataType::drop which is not likely to make a large difference but is still something
| .try_into() | ||
| .expect("Run ends should have exactly one buffer"); | ||
| let scalar = ScalarBuffer::from(run_end_buffer); | ||
| let run_ends = unsafe { RunEndBuffer::new_unchecked(scalar, offset, len) }; |
There was a problem hiding this comment.
note the previous code also uses unsafe to create a RunEndBuffer (which is valid during construction of ArrayData)
|
Thank you @Jefffrey |
Which issue does this PR close?
make_array) #9061ArrayData(speed up reading from Parquet reader) #9058Rationale for this change
Let's make arrow-rs the fastest we can and the fewer allocations the better
What changes are included in this PR?
Apply pattern from #9114
Are these changes tested?
Existing tests
Are there any user-facing changes?
No