Migrate base memory resources to native CCCL resource concept#2289
Migrate base memory resources to native CCCL resource concept#2289bdice merged 5 commits intorapidsai:stagingfrom
Conversation
|
Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually. Contributors can view more details about this message here. |
Migrate all 8 base (non-adaptor) memory resources to natively satisfy the CCCL cuda::mr::resource concept, so that concrete types satisfy the concept without virtual dispatch through device_memory_resource. Stateless resources (cuda_memory_resource, managed_memory_resource, pinned_host_memory_resource, cuda_async_view_memory_resource, system_memory_resource) get allocate/deallocate/allocate_sync/ deallocate_sync with cuda::stream_ref directly on the class. Stateful resources (cuda_async_memory_resource, cuda_async_managed_memory_resource, sam_headroom_memory_resource) use the cuda::mr::shared_resource<Impl> pattern with _impl classes extracted to detail/ headers and .cpp source files, matching the adaptor convention established in prior PRs. device_memory_resource inheritance is kept for backward compatibility, with do_allocate/do_deallocate delegating to the new CCCL methods.
| /** | ||
| * @brief Allocates memory of size at least \p bytes. | ||
| * | ||
| * The returned pointer will have at minimum 256 byte alignment. | ||
| * | ||
| * @param bytes The size of the allocation | ||
| * @param stream Stream on which to perform allocation | ||
| * @return void* Pointer to the newly allocated memory | ||
| * @brief Enables the `cuda::mr::device_accessible` property | ||
| */ | ||
| void* do_allocate(std::size_t bytes, rmm::cuda_stream_view stream) override | ||
| RMM_CONSTEXPR_FRIEND void get_property(cuda_async_managed_memory_resource const&, | ||
| cuda::mr::device_accessible) noexcept | ||
| { | ||
| return pool_.allocate(stream, bytes); | ||
| } | ||
|
|
There was a problem hiding this comment.
Is it possible to get these properties from the impl via using shared_base::get_property?
There was a problem hiding this comment.
get_property is a friend free function (ADL), not a member, so using cannot pull it in. All shared_resource classes duplicate it.
…or== shortcircuit - Replace (void)param casts with [[maybe_unused]] on parameter declarations in all 5 stateless base resource headers - Add cudaStreamSynchronize in allocate_sync for cuda_memory_resource, managed_memory_resource, pinned_host_memory_resource, system_memory_resource, and sam_headroom_memory_resource_impl - Add this==addressof(other) shortcircuit to operator== in cuda_async_memory_resource_impl and cuda_async_managed_memory_resource_impl - Add <memory> include for std::addressof in the two async impl headers
…ons" This reverts commit 5e4d5de.
wence-
left a comment
There was a problem hiding this comment.
This looks good AFAICT, with some small comments to address/discuss in followup
| @@ -0,0 +1,90 @@ | |||
| /* | |||
| * SPDX-FileCopyrightText: Copyright (c) 2021-2026, NVIDIA CORPORATION. | |||
There was a problem hiding this comment.
nit: This is a new file, is this the correct copyright line?
There was a problem hiding this comment.
I kept an older copyright year because most of the file contents are being moved/copied from another file.
|
|
||
| #include <cstddef> | ||
|
|
||
| namespace RMM_NAMESPACE { |
There was a problem hiding this comment.
Just a reminder that RMM_NAMESPACE has __attribute((visibility("default"))) IIRC, so all of these impl types are, today, still in the public symbols. I think we do not fix that here, but once the dust has settled perhaps we can go round and more correctly deal with this.
There was a problem hiding this comment.
Yes, absolutely. I plan to do a thorough public symbol audit once I get through this migration.
| // Begin legacy device_memory_resource compatibility layer | ||
| using device_memory_resource::allocate; | ||
| using device_memory_resource::allocate_sync; | ||
| using device_memory_resource::deallocate; | ||
| using device_memory_resource::deallocate_sync; |
There was a problem hiding this comment.
This is going away soon, but remind me, why do we get the concrete implementations from device_memory_resource rather than shared_base?
See previous comments here: #2246 (comment)
There was a problem hiding this comment.
This is being fixed in a later phase. I have a draft of it locally. I think it's fine to delay this until that later PR is ready.
Co-authored-by: Lawrence Mitchell <wence@gmx.li>
Description
Closes #2287
Migrates all 8 base (non-adaptor) memory resources to natively satisfy the CCCL
cuda::mr::resourceconcept, so that concrete types (e.g.cuda_memory_resource&) satisfy the concept without virtual dispatch throughdevice_memory_resource.Stateless resources get
allocate/deallocate/allocate_sync/deallocate_syncacceptingcuda::stream_refdirectly on the class:cuda_memory_resource—device_accessiblemanaged_memory_resource—device_accessible+host_accessiblepinned_host_memory_resource—device_accessible+host_accessiblecuda_async_view_memory_resource—device_accessiblesystem_memory_resource—device_accessible+host_accessibleStateful, non-copyable resources use
cuda::mr::shared_resource<Impl>with_implclasses extracted todetail/headers and.cppsource files, matching the adaptor convention from prior PRs (e.g.limiting_resource_adaptor):cuda_async_memory_resource—device_accessiblecuda_async_managed_memory_resource—device_accessible+host_accessiblesam_headroom_memory_resource—device_accessible+host_accessibledevice_memory_resourceinheritance is kept for backward compatibility.do_allocate/do_deallocatedelegate to the new CCCL methods. Default alignment isrmm::CUDA_ALLOCATION_ALIGNMENT(256 bytes).Checklist