Merged
Conversation
Jefffrey
commented
Dec 19, 2025
| /// scaled well for larger inputs. | ||
| /// | ||
| /// See <https://github.com/apache/arrow-rs/pull/3622#issuecomment-1407753727> for more details. | ||
| // TODO: this technically should be a method on RunEndBuffer |
Contributor
Author
There was a problem hiding this comment.
Not sure it's worth making an API change solely for this (or we could keep this and make it a thin wrapper)
Contributor
There was a problem hiding this comment.
yeah, I think a thin wrapper would be good -- as a follow on PR perhaps
24 tasks
alamb
approved these changes
Dec 19, 2025
| /// | ||
| /// Note: any slicing of this [`RunArray`] array is not applied to the returned array | ||
| /// and must be handled separately | ||
| /// Any slicing of this [`RunArray`] array is **not** applied to the returned |
Contributor
There was a problem hiding this comment.
this is the same as how ListArray works, which is definitely tricky to use correctly, as @rluvaton has noted
| /// scaled well for larger inputs. | ||
| /// | ||
| /// See <https://github.com/apache/arrow-rs/pull/3622#issuecomment-1407753727> for more details. | ||
| // TODO: this technically should be a method on RunEndBuffer |
Contributor
There was a problem hiding this comment.
yeah, I think a thin wrapper would be good -- as a follow on PR perhaps
| /// └─────────┘ | ||
| /// logical indices | ||
| /// physical logical | ||
| /// ┌─────────┬─────────┐ ┌─────────┬─────────┐ |
Contributor
There was a problem hiding this comment.
this is a nice improvement (to show values and indexes)
| run_ends: ScalarBuffer<E>, | ||
| len: usize, | ||
| offset: usize, | ||
| logical_length: usize, |
Contributor
There was a problem hiding this comment.
😍 -- I like these names much better
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
Contributor
Author
|
Thanks @alamb |
Contributor
|
Thank YOU -- this is so much better now |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Which issue does this PR close?
N/A
Rationale for this change
Whilst reviewing apache/datafusion#18981 I found it very confusing trying to follow along the logic for accounting for slicing of RunArrays. Decided to see if I can improve the documentation around it, especially to make it clear slicing acts logically not physically.
What changes are included in this PR?
Improve docstrings for RunArrays and RunEndBuffers.
Also add some tests for kernel operations on sliced RunArrays which seemed to be missing.
Are these changes tested?
Doc changes only. Added some (expected) failing tests to showcase #9018 too.
Are there any user-facing changes?
Doc changes only.