refactor: remove arrow-ord dependency in arrow-cast #11

martin-augment wants to merge 2 commits into main

Conversation
Walkthrough

The PR removes the arrow-ord dependency from arrow-cast.
PR Review: Refactor to Remove arrow-ord Dependency
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (3)
arrow-cast/src/cast/run_array.rs (3)
64-84: Fix double-Arc return (type mismatch).

`new_run_array` is already an `ArrayRef` (`Arc<dyn Array>`). Returning `Ok(Arc::new(new_run_array))` yields `Arc<Arc<dyn Array>>` and won't type-check. Return the `ArrayRef` directly.

```diff
- Ok(Arc::new(new_run_array))
+ Ok(new_run_array)
```
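The type mismatch above can be reproduced with plain std types. A minimal sketch — the `Array` trait, `ArrayRef` alias, and `RunArray` struct here are simplified stand-ins, not the real arrow-rs definitions:

```rust
use std::sync::Arc;

// Stand-ins for arrow's `Array` trait and `ArrayRef` alias (assumptions,
// simplified from the real arrow-rs definitions).
trait Array {
    fn len(&self) -> usize;
}
type ArrayRef = Arc<dyn Array>;

struct RunArray {
    n: usize,
}
impl Array for RunArray {
    fn len(&self) -> usize {
        self.n
    }
}

fn build() -> ArrayRef {
    let new_run_array: ArrayRef = Arc::new(RunArray { n: 3 });
    // Wrapping again as `Arc::new(new_run_array)` would have type
    // `Arc<ArrayRef>`, which does not coerce to `ArrayRef`; return directly.
    new_run_array
}

fn main() {
    assert_eq!(build().len(), 3);
}
```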
161-164: Use signed indices for take() (Int32/Int64), not UInt32.

The compute::take kernel expects signed integer index types (Int32 or Int64). Using UInt32 is unsupported and may error at runtime.

```diff
- let indices = PrimitiveArray::<UInt32Type>::from_iter_values(
-     values_indexes.iter().map(|&idx| idx as u32),
- );
+ // Prefer Int32 where possible; switch to Int64 if len > i32::MAX.
+ let indices = PrimitiveArray::<Int32Type>::from_iter_values(
+     values_indexes.iter().map(|&idx| idx as i32),
+ );
```

Optionally gate on length to choose Int64 for very large arrays.
86-106: REE → logical expansion builds wrong-length indices and mishandles 1-based run_ends.

- Iterates over `run_ends.len()` (number of runs) instead of logical length.
- run_ends are cumulative, 1-based; equality check against 0-based logical_idx is wrong.
- Risks OOB on `run_ends[physical_idx]` and produces too few elements.
- Example: with run_ends=[2, 5, 6] (6 logical elements), current code produces only 3 indices, causing incorrect take() operation.

Apply this rewrite to construct a per-logical index mapping by expanding each run length:

```diff
- let run_ends = run_array.run_ends().values().to_vec();
- let mut indices = Vec::with_capacity(run_array.run_ends().len());
- let mut physical_idx: usize = 0;
- for logical_idx in 0..run_array.run_ends().len() {
-     // If the logical index is equal to the (next) run end, increment the physical index,
-     // since we are at the end of a run.
-     if logical_idx == run_ends[physical_idx].as_usize() {
-         physical_idx += 1;
-     }
-     indices.push(physical_idx as i32);
- }
+ let run_ends = run_array.run_ends().values();
+ // logical_len = last run end (1-based)
+ let logical_len = {
+     let &last = run_ends.last().expect("non-empty REE");
+     last.as_usize()
+ };
+ let mut indices: Vec<i32> = Vec::with_capacity(logical_len);
+ let mut prev = 0usize;
+ for (run_idx, &end_native) in run_ends.iter().enumerate() {
+     let end = end_native.as_usize();
+     let run_len = end.checked_sub(prev).expect("non-decreasing run_ends");
+     // Repeat run_idx for this run's logical length
+     indices.extend(std::iter::repeat(run_idx as i32).take(run_len));
+     prev = end;
+ }
```
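The expansion logic in the suggested fix can be checked in isolation. A minimal sketch over plain `usize` run ends — the helper name is hypothetical and null handling is omitted:

```rust
// Expand 1-based cumulative run ends into one take-index per logical element.
// run_ends = [2, 5, 6] describes runs of lengths 2, 3, 1 → 6 logical slots.
fn expand_run_ends(run_ends: &[usize]) -> Vec<i32> {
    let logical_len = run_ends.last().copied().unwrap_or(0);
    let mut indices = Vec::with_capacity(logical_len);
    let mut prev = 0;
    for (run_idx, &end) in run_ends.iter().enumerate() {
        // Each run contributes (end - prev) copies of its physical index.
        indices.extend(std::iter::repeat(run_idx as i32).take(end - prev));
        prev = end;
    }
    indices
}

fn main() {
    assert_eq!(expand_run_ends(&[2, 5, 6]), vec![0, 0, 1, 1, 1, 2]);
    assert!(expand_run_ends(&[]).is_empty());
}
```

This produces one index per logical element (6 here), whereas the original loop produced only one per run (3), which is the bug the comment describes.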
🧹 Nitpick comments (4)
arrow-cast/src/cast/run_array.rs (4)
531-549: runs_generic is allocation-heavy with O(n) slices per element; consider a leaner comparator.

Repeated `slice(idx, 1).to_data()` allocates/clones ArrayData each iteration. Prefer a type-directed fast path (like the existing runs_for_* helpers) or compare physical buffers/offsets directly for the element at `idx`. This will reduce allocations and improve hot-path performance for view/complex types.

If keeping the generic fallback, add a micro-benchmark to quantify impact vs the old partition-based approach.
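The type-directed fast path amounts to one comparison per element with no per-element allocation. A hedged sketch over a plain slice — null handling and the exact return contract of runs_for_primitive are assumed, not taken from the PR:

```rust
// One-pass run detection: returns (run start indexes, 1-based run ends).
fn runs(values: &[i64]) -> (Vec<usize>, Vec<usize>) {
    let mut starts = Vec::new();
    let mut ends = Vec::new();
    for i in 0..values.len() {
        if i == 0 || values[i] != values[i - 1] {
            if i > 0 {
                ends.push(i); // previous run ends just before i
            }
            starts.push(i);
        }
    }
    if !values.is_empty() {
        ends.push(values.len());
    }
    (starts, ends)
}

fn main() {
    let (starts, ends) = runs(&[1, 1, 2, 3, 3, 3]);
    assert_eq!(starts, vec![0, 2, 3]);
    assert_eq!(ends, vec![2, 3, 6]);
}
```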
150-169: Index type width selection for values_indexes.

When building `values_array` via take of run-starts, choose Int64 indices when `cast_array.len() > i32::MAX` to avoid overflow on `.map(|&idx| idx as i32)`. Consider:

```rust
if cast_array.len() <= i32::MAX as usize {
    // Int32 indices
} else {
    // Int64 indices
}
```
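A sketch of the length-gated width selection in plain Rust — the `Indices` enum and function name are illustrative, not arrow-rs types:

```rust
// Pick the index width from the source length so `idx as i32` can't overflow.
enum Indices {
    I32(Vec<i32>),
    I64(Vec<i64>),
}

fn build_indices(positions: &[usize], source_len: usize) -> Indices {
    if source_len <= i32::MAX as usize {
        Indices::I32(positions.iter().map(|&i| i as i32).collect())
    } else {
        Indices::I64(positions.iter().map(|&i| i as i64).collect())
    }
}

fn main() {
    // Small arrays get the narrower index type.
    assert!(matches!(build_indices(&[0, 2, 5], 10), Indices::I32(_)));
}
```

Gating on the source length (rather than the number of runs) is what prevents overflow, since run-start positions can be as large as the last logical index.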
217-257: Dictionary runs should consider logical values, not just key equality (edge case).

Using `runs_for_primitive(array.keys())` splits runs when distinct keys map to identical dictionary values (permitted by Arrow). If observable equality should be by values, consider dispatching on `cast(array, value_type)` first (you already do that in cast_to_run_end_encoded) or normalizing keys to a canonicalized dictionary.
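The edge case is visible with a generic run counter. A sketch where dictionary resolution is simulated with a lookup table — the helper is hypothetical:

```rust
// Count runs under an arbitrary equality key.
fn count_runs<U, T: PartialEq>(items: &[U], key: impl Fn(&U) -> T) -> usize {
    let mut n = 0;
    for i in 0..items.len() {
        if i == 0 || key(&items[i]) != key(&items[i - 1]) {
            n += 1;
        }
    }
    n
}

fn main() {
    // Keys 0 and 1 both map to "a": Arrow permits duplicate dictionary values.
    let dict = ["a", "a", "b"];
    let keys: Vec<usize> = vec![0, 1, 2];
    assert_eq!(count_runs(&keys, |&k| k), 3);       // key equality: 3 runs
    assert_eq!(count_runs(&keys, |&k| dict[k]), 2); // value equality: 2 runs
}
```

Run detection over keys yields 3 runs where logical-value equality yields 2, which is the over-fragmentation the comment flags.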
559-565: ensure_capacity reserves up to total_len every time capacity is hit.

This is safe but can over-reserve for arrays with few runs. Geometric growth (the Vec default) or `reserve_exact` with a modest chunk (e.g., +len/8) can reduce memory spikes for very long arrays with sparse runs.
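A sketch of the chunked reservation this suggests — the function signature and +len/8 growth factor are illustrative choices, not taken from the PR:

```rust
// Reserve in modest chunks instead of jumping straight to total_len,
// but never reserve beyond what the final array can need.
fn ensure_capacity<T>(v: &mut Vec<T>, additional: usize, total_len: usize) {
    if v.capacity() - v.len() < additional {
        let headroom = additional + v.len() / 8; // +len/8 growth chunk
        let remaining = total_len - v.len();     // hard upper bound
        v.reserve_exact(headroom.min(remaining).max(additional));
    }
}

fn main() {
    let mut v: Vec<u8> = Vec::new();
    ensure_capacity(&mut v, 4, 1000);
    assert!(v.capacity() >= 4);
    assert!(v.capacity() < 1000); // did not jump to total_len
}
```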
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
- arrow-cast/Cargo.toml (0 hunks)
- arrow-cast/src/cast/run_array.rs (3 hunks)
💤 Files with no reviewable changes (1)
- arrow-cast/Cargo.toml
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (12)
- GitHub Check: MIRI
- GitHub Check: Test
- GitHub Check: Audit
- GitHub Check: Lint (cargo fmt)
- GitHub Check: Rustdocs are clean
- GitHub Check: Test on Windows
- GitHub Check: Release Audit Tool (RAT)
- GitHub Check: Test on Mac
- GitHub Check: codex
- GitHub Check: Archery test With other arrows
- GitHub Check: Test
- GitHub Check: claude-review
```rust
Float16 => runs_for_primitive(array.as_primitive::<Float16Type>()),
Float32 => runs_for_primitive(array.as_primitive::<Float32Type>()),
Float64 => runs_for_primitive(array.as_primitive::<Float64Type>()),
```
Float runs: NaN handling breaks every element; use bitwise equality for floats.
Using != on floats treats NaN != NaN, fragmenting runs of NaNs. Implement float‑specific run detection using to_bits (including f16) so all equal bit patterns, including NaN, stay in a single run.
Minimal dispatch change:
```diff
- Float16 => runs_for_primitive(array.as_primitive::<Float16Type>()),
- Float32 => runs_for_primitive(array.as_primitive::<Float32Type>()),
- Float64 => runs_for_primitive(array.as_primitive::<Float64Type>()),
+ Float16 => runs_for_float16(array.as_primitive::<Float16Type>()),
+ Float32 => runs_for_float32(array.as_primitive::<Float32Type>()),
+ Float64 => runs_for_float64(array.as_primitive::<Float64Type>()),
```

Add specialized helpers (outline):
```rust
fn runs_for_float32(array: &PrimitiveArray<Float32Type>) -> (Vec<usize>, Vec<usize>) {
    // mirror runs_for_primitive but compare value.to_bits() instead of value
    // and use null semantics identical to runs_for_primitive
}
fn runs_for_float64(...) -> (...) { /* same with u64 */ }
fn runs_for_float16(...) -> (...) { /* use half::f16::to_bits() (u16) */ }
```

🤖 Prompt for AI Agents
In arrow-cast/src/cast/run_array.rs around lines 189-191, the float arms
currently call runs_for_primitive and use `!=` which splits NaNs because IEEE
floats compare NaN != NaN; replace those arms to call new float-specific run
detectors and implement three helpers:
runs_for_float16(&PrimitiveArray<Float16Type>),
runs_for_float32(&PrimitiveArray<Float32Type>), and
runs_for_float64(&PrimitiveArray<Float64Type>). Each helper should mirror
runs_for_primitive but compare raw bit representations (use half::f16::to_bits()
for f16, f32::to_bits() for f32, f64::to_bits() for f64) so equal bit patterns
(including NaN payloads) form a single run, and preserve the same null-handling
and return types as runs_for_primitive; update the match dispatch to call these
helpers for Float16/Float32/Float64.
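The NaN behavior driving this comment is easy to demonstrate with std floats. A minimal sketch of the bitwise comparison the proposed helpers would use (the function name is illustrative):

```rust
// IEEE `!=` splits NaN runs; bitwise comparison keeps them together.
fn same_run_f32(a: f32, b: f32) -> bool {
    a.to_bits() == b.to_bits()
}

fn main() {
    let nan = f32::NAN;
    assert!(nan != nan);             // `!=` fragments a run of NaNs
    assert!(same_run_f32(nan, nan)); // same bit pattern → one run
    assert!(same_run_f32(1.5, 1.5));
    // Note: 0.0 and -0.0 compare equal with `==` but differ bitwise,
    // so bitwise run detection treats them as separate runs.
    assert!(!same_run_f32(0.0, -0.0));
}
```

One side effect worth noting: bitwise equality distinguishes 0.0 from -0.0 and NaNs with different payloads, so run boundaries can differ from `==` semantics in both directions.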