Bug Description
The root cause is the missing invariant validation between Array::len() and Array::logical_nulls().len() in try_binary, leading to OOB writes when processing arrays with inconsistent metadata. The try_binary function performs unchecked memory writes based on the length reported by logical_nulls(), but only allocates buffer space based on Array::len(). Since Array is not an unsafe trait, user implementations can violate the implicit assumption that these two values must match.
|
let len = a.len(); |
|
|
|
if a.null_count() == 0 && b.null_count() == 0 { |
|
try_binary_no_nulls(len, a, b, op) |
|
} else { |
|
let nulls = |
|
NullBuffer::union(a.logical_nulls().as_ref(), b.logical_nulls().as_ref()).unwrap(); |
|
|
|
let mut buffer = BufferBuilder::<O::Native>::new(len); |
|
buffer.append_n_zeroed(len); |
|
let slice = buffer.as_slice_mut(); |
|
|
|
nulls.try_for_each_valid_idx(|idx| { |
|
unsafe { |
|
*slice.get_unchecked_mut(idx) = op(a.value_unchecked(idx), b.value_unchecked(idx))? |
|
}; |
|
Ok::<_, ArrowError>(()) |
|
})?; |
|
|
|
let values = buffer.finish().into(); |
|
Ok(PrimitiveArray::new(values, Some(nulls))) |
Array is not an
unsafe trait, so implementors have no safety contract to uphold. The implicit assumption is violated where
logical_nulls().len() == len(), but this is not documented as a requirement, not enforced by the type system, and also not validated at runtime. Therefore, the attack can forge
len() = 1 and
logical_nulls().len() = 2, which causes 1-element buffer allocation, but 2-iteration loop ,
OOB write at index 1
To Reproduce
use arrow_arith::arity::try_binary;
use arrow_array::{Array, ArrayAccessor, Int32Array};
use arrow_array::types::Int32Type;
use arrow_buffer::NullBuffer;
use arrow_data::ArrayData;
use std::sync::Arc;
#[derive(Debug, Clone)]
struct Evil {
inner: Int32Array, fake_nulls: NullBuffer,
}
impl Array for Evil {
fn as_any(&self) -> &dyn std::any::Any { self }
fn to_data(&self) -> ArrayData { self.inner.to_data() }
fn into_data(self) -> ArrayData { self.inner.into_data() }
fn data_type(&self) -> &arrow_schema::DataType { self.inner.data_type() }
fn slice(&self, o: usize, l: usize) -> Arc<dyn Array> {
Arc::new(Evil {
inner: self.inner. slice(o, l),
fake_nulls: self.fake_nulls.slice(o, l),
})
}
fn len(&self) -> usize { 1 }
fn is_empty(&self) -> bool { false }
fn offset(&self) -> usize { 0 }
fn nulls(&self) -> Option<&NullBuffer> { Some(&self.fake_nulls) }
fn logical_nulls(&self) -> Option<NullBuffer> {
Some(NullBuffer::new_valid(2))
}
fn get_buffer_memory_size(&self) -> usize { 0 }
fn get_array_memory_size(&self) -> usize { 0 }
}
impl ArrayAccessor for Evil {
type Item = i32;
fn value(&self, i: usize) -> i32 { self.inner.value(i) }
unsafe fn value_unchecked(&self, i: usize) -> i32 {
self.inner.value_unchecked(i)
}
}
fn main() {
let inner = Int32Array::from(vec![42, 100]);
let fake_nulls = NullBuffer::from(vec![false]); // [null]
let a = Evil { inner: inner. clone(), fake_nulls: fake_nulls.clone() };
let b = Evil { inner, fake_nulls };
let result = try_binary::<_, _, _, Int32Type>(a, b, |x, y| {
println! (" op({}, {})", x, y);
Ok(x + y)
});
match result {
Ok(arr) => println!("\n Result: {:?}", arr),
Err(e) => println!("\n Error: {}", e),
}
}
with cargo run,
thread 'main' (3689217) panicked at /root/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/arrow-arith-57.1.0/src/arity.rs:285:24:
unsafe precondition(s) violated: slice::get_unchecked_mut requires that the index is within the slice
This indicates a bug in the program. This Undefined Behavior check is optional, and cannot be relied on for safety.
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
thread caused non-unwinding panic. aborting.
Aborted (core dumped)
Additional Context
Relevant PR: #9092
Bug Description
The root cause is the missing invariant validation between
Array::len()andArray::logical_nulls().len()intry_binary, leading to OOB writes when processing arrays with inconsistent metadata. Thetry_binaryfunction performs unchecked memory writes based on the length reported bylogical_nulls(), but only allocates buffer space based onArray::len(). SinceArrayis not anunsafe trait, user implementations can violate the implicit assumption that these two values must match.arrow-rs/arrow-arith/src/arity.rs
Lines 271 to 291 in 2507946
Arrayis not anunsafe trait, so implementors have no safety contract to uphold. The implicit assumption is violated wherelogical_nulls().len() == len(), but this is not documented as a requirement, not enforced by the type system, and also not validated at runtime. Therefore, the attack can forgelen() = 1andlogical_nulls().len() = 2, which causes 1-element buffer allocation, but 2-iteration loop , OOB write at index 1To Reproduce
with cargo run,
Additional Context
Relevant PR: #9092