[FEA] Support memory resources from CCCL 3.2 #2011

@bdice

Is your feature request related to a problem? Please describe.
Recently, RAPIDS/CCCL nightlies began failing due to a change in CCCL's memory resources. See NVIDIA/cccl#5313

Describe the solution you'd like
RMM should support CCCL's new memory resources, which are targeting CCCL 3.2. RAPIDS is currently using CCCL 3.1.

Current plan for adoption:

Allocation Interfaces

This list of tasks requires CCCL 3.1+, so we can ship these changes in 25.12.

  1. Build RMM with CCCL 3.1 via a polyfill and allocate updates (Support building with CCCL 3.1.0 #2017) (25.10)
    • Verify that all of RAPIDS builds against CCCL 3.1 with these RMM changes, and ask Spark to test with the same CCCL 3.1 pre-release. The goal is to unblock adoption of CCCL 3.1 for RAPIDS.
    • After that, CCCL+RAPIDS CI should hopefully work again.
  2. Update polyfill to new allocate signature, an RMM-internal refactoring (Use CCCL MR interface internally #2112) (25.12)
  3. Refactor RAPIDS to use new allocate signature (Migrate RAPIDS to CCCL MR interface (new allocation APIs) #2126) (25.12)
  4. Deprecate old allocate signature (Add deprecation warnings for legacy MR interface #2128) (25.12)
  5. Remove deprecated legacy allocate interfaces (Remove legacy memory resource interface in favor of CCCL interface #2150) (26.02)
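
The polyfill and deprecation steps above can be sketched roughly as follows. This is a minimal, CUDA-free illustration, not RMM's or CCCL's actual code: `stream_handle`, `legacy_resource`, and `polyfill_resource` are hypothetical names, and the stream-first `allocate(stream, bytes, alignment)` / `allocate_sync(bytes, alignment)` shape is an assumption about the new interface that should be checked against NVIDIA/cccl#5313.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>
#include <stdexcept>

// Hypothetical stand-in for a CUDA stream so the sketch builds without CUDA.
struct stream_handle {
  int id{0};
};

// Hypothetical legacy-style resource: size first, stream last, no alignment.
class legacy_resource {
 public:
  void* allocate(std::size_t bytes, stream_handle /*stream*/)
  {
    void* ptr = std::malloc(bytes);
    if (ptr == nullptr && bytes != 0) { throw std::bad_alloc{}; }
    ++outstanding_;
    return ptr;
  }

  void deallocate(void* ptr, std::size_t /*bytes*/, stream_handle /*stream*/) noexcept
  {
    assert(outstanding_ > 0);
    std::free(ptr);
    --outstanding_;
  }

  std::size_t outstanding() const noexcept { return outstanding_; }

 private:
  std::size_t outstanding_{0};
};

// Hypothetical polyfill: exposes the assumed new signature on top of the
// legacy resource, and keeps the old spelling as a deprecated forwarder.
class polyfill_resource {
 public:
  static constexpr std::size_t default_alignment = alignof(std::max_align_t);

  explicit polyfill_resource(legacy_resource& upstream) : upstream_{&upstream} {}

  // Assumed new stream-ordered form: stream first, then size and alignment.
  void* allocate(stream_handle stream,
                 std::size_t bytes,
                 std::size_t alignment = default_alignment)
  {
    check_alignment(alignment);
    return upstream_->allocate(bytes, stream);
  }

  void deallocate(stream_handle stream,
                  void* ptr,
                  std::size_t bytes,
                  std::size_t /*alignment*/ = default_alignment) noexcept
  {
    upstream_->deallocate(ptr, bytes, stream);
  }

  // Assumed synchronous form. A real implementation would also synchronize;
  // this sketch only forwards with a default-constructed stream.
  void* allocate_sync(std::size_t bytes, std::size_t alignment = default_alignment)
  {
    return allocate(stream_handle{}, bytes, alignment);
  }

  void deallocate_sync(void* ptr,
                       std::size_t bytes,
                       std::size_t alignment = default_alignment) noexcept
  {
    deallocate(stream_handle{}, ptr, bytes, alignment);
  }

  // Legacy argument order, kept for one release with a compiler warning.
  [[deprecated("use allocate(stream, bytes, alignment) instead")]]
  void* allocate(std::size_t bytes, stream_handle stream)
  {
    return allocate(stream, bytes);
  }

 private:
  // The malloc-backed upstream only guarantees max_align_t alignment.
  static void check_alignment(std::size_t alignment)
  {
    bool const power_of_two = alignment != 0 && (alignment & (alignment - 1)) == 0;
    if (!power_of_two || alignment > default_alignment) {
      throw std::invalid_argument{"unsupported alignment"};
    }
  }

  legacy_resource* upstream_;
};
```

Callers that migrate early write `mr.allocate(stream, 64)`, while code still using `mr.allocate(64, stream)` keeps compiling with a deprecation warning until the forwarder is removed. Keeping both spellings on one type for a release is one way to stage the change; whether RMM does it this way is not stated in this issue.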

Memory Resource Handling

This list of tasks requires CCCL 3.2+, so we will need to work on that migration in 26.02.

Remove device_memory_resource and legacy interface

Post-tasks

  • Delete is_resource_adaptor.hpp and its test usages (no longer meaningful after shared_resource adoption)
  • Use cuda_mr or cuda_async_mr in tests rather than get_current_device_resource_ref (see comment)
  • Check on equality docstrings after removing legacy code (see comment)
  • Update documentation: remove references to device_memory_resource, do_allocate, virtual dispatch from Doxygen and Python docstrings
  • Remove stale #include directives for deleted headers (device_memory_resource.hpp, device_memory_resource_view.hpp, cccl_adaptors.hpp)
  • Switch adaptors to be property-agnostic (e.g. support host-accessible pools) and have them expose Upstream& upstream_resource()
    • This isn't in scope for the core of this issue.
  • Audit public symbols (see comment)
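
As a rough illustration of the adaptor item above, the sketch below shows an adaptor that is templated on its upstream, uses no virtual dispatch, and exposes `Upstream& upstream_resource()`. Everything here is hypothetical (`stream_handle`, `malloc_resource`, `tracking_adaptor`), the stream-first signature is assumed, and CCCL property forwarding is deliberately left out; it only shows that the adaptor itself does not hard-code whether memory is device- or host-accessible.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>
#include <utility>

// Hypothetical stand-in for a CUDA stream so the sketch builds without CUDA.
struct stream_handle {
  int id{0};
};

// Hypothetical host-accessible upstream. It ignores the alignment argument
// because malloc already satisfies alignments up to alignof(std::max_align_t).
struct malloc_resource {
  void* allocate(stream_handle /*stream*/, std::size_t bytes, std::size_t /*alignment*/)
  {
    void* ptr = std::malloc(bytes);
    if (ptr == nullptr && bytes != 0) { throw std::bad_alloc{}; }
    return ptr;
  }

  void deallocate(stream_handle /*stream*/,
                  void* ptr,
                  std::size_t /*bytes*/,
                  std::size_t /*alignment*/) noexcept
  {
    std::free(ptr);
  }

  // Stateless, so any two instances can free each other's allocations.
  bool operator==(malloc_resource const&) const noexcept { return true; }
};

// Hypothetical adaptor: works with any Upstream that has the same
// allocate/deallocate shape, resolved at compile time.
template <typename Upstream>
class tracking_adaptor {
 public:
  explicit tracking_adaptor(Upstream upstream) : upstream_{std::move(upstream)} {}

  void* allocate(stream_handle stream, std::size_t bytes, std::size_t alignment)
  {
    void* ptr = upstream_.allocate(stream, bytes, alignment);
    bytes_in_use_ += bytes;  // only count allocations that succeeded
    return ptr;
  }

  void deallocate(stream_handle stream,
                  void* ptr,
                  std::size_t bytes,
                  std::size_t alignment) noexcept
  {
    upstream_.deallocate(stream, ptr, bytes, alignment);
    assert(bytes_in_use_ >= bytes);
    bytes_in_use_ -= bytes;
  }

  // Typed access to the wrapped resource, as the task list proposes.
  Upstream& upstream_resource() noexcept { return upstream_; }
  Upstream const& upstream_resource() const noexcept { return upstream_; }

  std::size_t bytes_in_use() const noexcept { return bytes_in_use_; }

 private:
  Upstream upstream_;
  std::size_t bytes_in_use_{0};
};
```

Because the upstream type is a template parameter, the same adaptor could wrap a device pool or a host-accessible pool without change, and callers get the concrete upstream type back from `upstream_resource()` rather than a base-class pointer.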

Update RAPIDS libraries

Status

In Progress
