Decouple from cudf::detail::make_counting_transform_iterator #4306
mythrocks wants to merge 6 commits into NVIDIA:main
This change introduces a version of `make_counting_transform_iterator` that is specific to Spark RAPIDS JNI. The previous version of this function is from `cudf::detail`, which is now deemed private to cuDF. This commit should allow Spark RAPIDS JNI to be insulated from changes to interfaces in `cudf::detail`. Note that this version does not use `thrust::transform_iterator`. It banks on `cuda::make_transform_iterator` instead.
Signed-off-by: MithunR <mithunr@nvidia.com>
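To illustrate what a counting transform iterator does, here is a minimal host-only sketch in plain C++. It is not the code from this PR: the PR's version is device code built on `cuda::make_transform_iterator`, whereas the class below is a hypothetical stand-in written only to show the idea that dereferencing position `i` yields `f(i)` without any backing storage. The names `counting_transform_iterator` and `sum_of_squares` are assumptions made for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <numeric>
#include <utility>

// Hypothetical host-only stand-in (not the PR's implementation).
// Dereferencing the iterator at index i returns f(i); nothing is stored.
template <typename Index, typename UnaryFunction>
class counting_transform_iterator {
 public:
  using value_type = decltype(std::declval<UnaryFunction const&>()(std::declval<Index>()));
  using difference_type   = std::ptrdiff_t;
  using reference         = value_type;
  using pointer           = void;
  using iterator_category = std::input_iterator_tag;

  counting_transform_iterator(Index start, UnaryFunction f) : index_(start), f_(std::move(f)) {}

  // Apply the functor to the current index.
  reference operator*() const { return f_(index_); }

  counting_transform_iterator& operator++()
  {
    ++index_;
    return *this;
  }

  counting_transform_iterator operator++(int)
  {
    auto old = *this;
    ++index_;
    return old;
  }

  counting_transform_iterator operator+(difference_type n) const
  {
    return {static_cast<Index>(index_ + n), f_};
  }

  // Two iterators compare equal when they sit at the same index.
  bool operator==(counting_transform_iterator const& other) const { return index_ == other.index_; }
  bool operator!=(counting_transform_iterator const& other) const { return !(*this == other); }

 private:
  Index index_;
  UnaryFunction f_;
};

// Factory mirroring the shape of the call sites: a start index plus a functor.
template <typename Index, typename UnaryFunction>
counting_transform_iterator<Index, UnaryFunction> make_counting_transform_iterator(Index start,
                                                                                   UnaryFunction f)
{
  return counting_transform_iterator<Index, UnaryFunction>(start, std::move(f));
}

// Usage: sum i*i for i in [0, n) without materializing the squares.
inline int sum_of_squares(int n)
{
  assert(n >= 0);
  auto first = make_counting_transform_iterator(0, [](int i) { return i * i; });
  return std::accumulate(first, first + n, 0);
}
```

The point of the design is that algorithms see an ordinary iterator range, while the values are computed lazily from the index on each dereference.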
Greptile Summary
This PR decouples spark-rapids-jni from `cudf::detail::make_counting_transform_iterator`.
The implementation correctly replicates the cuDF detail functionality while insulating the codebase from future cuDF internal API changes.
Confidence Score: 5/5
Flowchart
%%{init: {'theme': 'neutral'}}%%
flowchart TD
A[cudf::detail::make_counting_transform_iterator] -->|replaced by| B[spark_rapids_jni::util::make_counting_transform_iterator]
C[cudf::detail::make_pair_iterator] -->|replaced by| D[spark_rapids_jni::util::make_pair_iterator]
E[thrust::counting_iterator] -->|changed to| F[cuda::counting_iterator]
B --> G[from_json_to_structs.cu]
B --> H[shuffle_split.cu]
B --> I[hyper_log_log_plus_plus.cu]
B --> J[Other source files]
D --> K[multiply.cu]
L[cudf/detail/iterator.cuh] -->|removed| M[utilities/iterator.cuh]
M -->|new include| N[All modified files]
style B fill:#90EE90
style D fill:#90EE90
style M fill:#90EE90
style F fill:#FFD700
Last reviewed commit: e10722a
ttnghia left a comment
Please hold off a little bit. We need to discuss how to mitigate the code duplication and the unavoidable dependency on the cudf detail namespace.
I think the pair-wise iterator should probably remain in CUDF. I'll check whether
    : dscalar(cudf::get_scalar_device_view(
        static_cast<ScalarType&>(const_cast<cudf::scalar&>(scalar_value))))
  {
    CUDF_EXPECTS(type_id_matches_device_storage_type<Element>(scalar_value.type().id()),
Can't use CUDF_EXPECTS here. It throws cudf-specific errors instead of spark-rapids-jni errors.
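One way to address this kind of concern is a project-local check macro that throws a project-local exception type. The sketch below is only an illustration under that assumption: `jni_sketch::logic_error` and `JNI_SKETCH_EXPECTS` are hypothetical names invented here, not identifiers from spark-rapids-jni or cuDF, and the PR may resolve the issue differently.

```cpp
#include <cassert>
#include <stdexcept>

namespace jni_sketch {
// Hypothetical project-specific exception type, so callers can tell
// these failures apart from errors thrown by cuDF itself.
struct logic_error : public std::logic_error {
  using std::logic_error::logic_error;
};
}  // namespace jni_sketch

// Hypothetical precondition macro: throws the project-local exception
// when the condition does not hold.
#define JNI_SKETCH_EXPECTS(cond, msg)         \
  do {                                        \
    if (!(cond)) {                            \
      throw jni_sketch::logic_error(msg);     \
    }                                         \
  } while (0)

// Returns true when the check fired with the project-local exception type.
inline bool throws_project_error(bool condition)
{
  try {
    JNI_SKETCH_EXPECTS(condition, "type does not match device storage type");
  } catch (jni_sketch::logic_error const&) {
    return true;
  }
  return false;
}

// Usage example: a failed condition raises the project-local error,
// a satisfied condition does not.
inline void self_check()
{
  assert(throws_project_error(false));
  assert(!throws_project_error(true));
}
```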
This commit introduces utility iterators to be used in place of `cudf::detail` iterators. This is to further reduce dependencies on `cudf::detail` APIs that are now deemed private to the CUDF project.

`make_counting_transform_iterator`
This change introduces a version of `make_counting_transform_iterator` that is specific to Spark RAPIDS JNI.
The previous version of this function is from `cudf::detail`, which is now deemed private to cuDF. This commit should allow Spark RAPIDS JNI to be insulated from changes to interfaces in `cudf::detail`.
Note that this version does not use `thrust::transform_iterator`. It banks on `cuda::make_transform_iterator` instead.

`make_pair_iterator`
This commit also introduces `make_pair_iterator(column_device_view const&)` and `make_pair_iterator(scalar const&)`. Much like their counterparts in `cudf::detail`, these functions produce pair-iterators that allow iteration over a column's rows, along with a bool indicating whether the row is valid (i.e. non-null).
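The pair-iterator idea can be sketched on the host in plain C++. This is an illustration only, not the PR's device code: `nullable_column`, `pair_accessor`, `count_valid`, and `sum_valid` are hypothetical names, and a `std::vector<bool>` stands in for a column's validity information. Each row is read as a `(value, is_valid)` pair, which is the shape of element the description above attributes to `make_pair_iterator`.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical host-side stand-in for a nullable column:
// one value per row plus one validity flag per row.
template <typename T>
struct nullable_column {
  std::vector<T> values;
  std::vector<bool> valid;
};

// Maps a row index to a (value, is_valid) pair. Combined with a counting
// transform iterator, a functor like this is what yields "pair iteration".
template <typename T>
struct pair_accessor {
  nullable_column<T> const* col;

  std::pair<T, bool> operator()(std::size_t row) const
  {
    return {col->values[row], col->valid[row]};
  }
};

// Count rows whose validity flag is set.
template <typename T>
std::size_t count_valid(nullable_column<T> const& col)
{
  pair_accessor<T> at{&col};
  std::size_t n = 0;
  for (std::size_t row = 0; row < col.values.size(); ++row) {
    if (at(row).second) { ++n; }
  }
  return n;
}

// Sum only the valid rows, skipping nulls.
template <typename T>
T sum_valid(nullable_column<T> const& col)
{
  pair_accessor<T> at{&col};
  T total{};
  for (std::size_t row = 0; row < col.values.size(); ++row) {
    auto const element = at(row);
    if (element.second) { total += element.first; }
  }
  return total;
}
```

Reading the value and the validity flag together keeps null handling next to the computation, instead of requiring a separate pass over the validity information.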