Replace cudf::detail::valid_if with cudf::bools_to_mask#4301
mythrocks wants to merge 6 commits into NVIDIA:main
This commit is part of the continuing effort to reduce the dependency of spark-rapids-jni on `cudf::detail` APIs. In this commit, some of the references to `cudf::detail::valid_if` are replaced with `cudf::bools_to_mask`. The functionality should not be altered. Existing tests ought to cover the changes.
Signed-off-by: MithunR <mithunr@nvidia.com>
Greptile Summary
This PR migrates four files away from the
Confidence Score: 4/5
Important Files Changed
Flowchart
%%{init: {'theme': 'neutral'}}%%
flowchart TD
A["cudf::detail::valid_if\n(iterator + predicate → bitmask)"] -->|"Replaced by"| B["cudf::bools_to_mask\n(device_span<bool> → bitmask)"]
subgraph Old["Old Pattern (detail API)"]
A1["Iterator range + predicate functor"] --> A2["valid_if computes bools &\nproduces bitmask in one step"]
A2 --> A3["Returns pair<rmm::device_buffer, size_type>"]
end
subgraph New["New Pattern (public API)"]
B1["Materialize bools into\ndevice_uvector<bool>"] --> B2["Construct device_span<bool const>"]
B2 --> B3["bools_to_mask converts\nbools → bitmask"]
B3 --> B4["Returns pair<unique_ptr<device_buffer>, size_type>"]
B4 --> B5["Extract via *ptr.release()"]
end
Last reviewed commit: 505b3b6
This change is more controversial. The only way to get away from using `cudf::detail::valid_if` in the files modified here is to materialize a temporary bool vector (that is then packed). Signed-off-by: MithunR <mithunr@nvidia.com>
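The trade-off described above can be illustrated with a host-side sketch. This is not libcudf code: `pack_bools`, `fused_valid_if`, and `materialize_then_pack` are hypothetical names, and `std::vector` stands in for device memory. It assumes the usual convention that a set bit marks a valid row, packed least-significant-bit first into 32-bit words. The fused version evaluates the predicate while packing, whereas the two-step version first writes a temporary bool vector and then packs it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

using bitmask_word = std::uint32_t;
constexpr std::size_t bits_per_word = 32;
using mask_and_null_count = std::pair<std::vector<bitmask_word>, std::size_t>;

// Packs already-materialized bools into a bitmask and counts the false entries (nulls).
mask_and_null_count pack_bools(std::vector<bool> const& bools)
{
  std::vector<bitmask_word> mask((bools.size() + bits_per_word - 1) / bits_per_word, 0u);
  std::size_t null_count = 0;
  for (std::size_t i = 0; i < bools.size(); ++i) {
    if (bools[i]) {
      mask[i / bits_per_word] |= bitmask_word{1} << (i % bits_per_word);
    } else {
      ++null_count;
    }
  }
  return {std::move(mask), null_count};
}

// Old pattern (hypothetical stand-in): the predicate is evaluated while packing,
// so no intermediate bool vector is allocated.
template <typename Predicate>
mask_and_null_count fused_valid_if(std::size_t num_rows, Predicate pred)
{
  std::vector<bitmask_word> mask((num_rows + bits_per_word - 1) / bits_per_word, 0u);
  std::size_t null_count = 0;
  for (std::size_t i = 0; i < num_rows; ++i) {
    if (pred(i)) {
      mask[i / bits_per_word] |= bitmask_word{1} << (i % bits_per_word);
    } else {
      ++null_count;
    }
  }
  return {std::move(mask), null_count};
}

// New pattern (hypothetical stand-in): materialize the bools first, then pack them.
// The temporary vector is the extra allocation and extra pass under discussion.
template <typename Predicate>
mask_and_null_count materialize_then_pack(std::size_t num_rows, Predicate pred)
{
  std::vector<bool> bools(num_rows);
  for (std::size_t i = 0; i < num_rows; ++i) { bools[i] = pred(i); }
  return pack_bools(bools);
}
```

Both paths yield the same mask and null count; the difference is only the temporary buffer and the additional pass over the rows, which is where any performance cost would come from.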
c3f2550 is slightly controversial; the only way to stop using There might be value in requesting for a
ttnghia left a comment
Please hold off a little bit. We need to discuss how to mitigate the issue of duplicated code and the unavoidable dependency on the cudf detail namespace. We should also avoid any performance impact from this change.
std::pair<rmm::device_buffer, cudf::size_type> create_null_mask(
  cudf::size_type num_rows,
  std::unique_ptr<cudf::column> const& should_be_nullified,
Would it be viable to change this to `should_be_valid`, and forgo the logical-not?
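A rough host-side sketch of what this suggestion amounts to, under stated assumptions: the `values[i] < 0` nullification rule, the function names, and the use of `std::vector` are all hypothetical and are not taken from the PR. If the caller produces validity flags directly, the separate negation pass disappears and the resulting mask is unchanged.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Packs up to 32 validity flags into one word; bit i set means row i is valid.
std::uint32_t pack_valid_bits(std::vector<bool> const& valid)
{
  assert(valid.size() <= 32);
  std::uint32_t word = 0;
  for (std::size_t i = 0; i < valid.size(); ++i) {
    if (valid[i]) { word |= std::uint32_t{1} << i; }
  }
  return word;
}

// Current shape: flags mean "nullify this row", so they must be negated before packing.
std::uint32_t mask_from_nullified(std::vector<int> const& values)
{
  std::vector<bool> should_be_nullified(values.size());
  for (std::size_t i = 0; i < values.size(); ++i) {
    should_be_nullified[i] = values[i] < 0;  // hypothetical rule for illustration
  }
  std::vector<bool> valid(values.size());
  for (std::size_t i = 0; i < values.size(); ++i) {
    valid[i] = !should_be_nullified[i];  // the extra logical-not pass
  }
  return pack_valid_bits(valid);
}

// Suggested shape: flags already mean "this row is valid", so they are packed as-is.
std::uint32_t mask_from_valid(std::vector<int> const& values)
{
  std::vector<bool> should_be_valid(values.size());
  for (std::size_t i = 0; i < values.size(); ++i) {
    should_be_valid[i] = values[i] >= 0;  // same hypothetical rule, inverted at the source
  }
  return pack_valid_bits(should_be_valid);
}
```

Whether this is viable in the PR depends on how `should_be_nullified` is produced upstream, which the excerpt does not show.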