…mory

Introduce a dry-run execution framework that replaces device and host memory resources with lightweight fake allocators to measure peak memory usage without holding real memory.

New files:
- dry_run_memory_resource.hpp: dry_run_allocator (lock-free bump allocator), dry_run_device_memory_resource, dry_run_host_memory_resource, dry_run_resource_manager (RAII), and dry_run_execute() helper.
- dry_run_flag.hpp: boolean dry-run flag as a raft resource, allowing algorithms to skip kernel execution during profiling.
- tests/util/dry_run_memory_resource.cpp: unit tests.

The dry_run_allocator probes the upstream once to obtain a base address, then atomically bumps a pointer for each allocation: no mutex, no map, and no real memory held after the initial probe.
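The probe-then-bump idea described in this commit can be illustrated with a minimal, CPU-only sketch. It is not the code from this PR: `bump_allocator_sketch`, its members, the use of `std::malloc` as the upstream, and the 256-byte alignment are all assumptions made for illustration. The returned addresses are fake and must never be dereferenced.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Hypothetical stand-in for a lock-free bump allocator (not the PR's class).
class bump_allocator_sketch {
 public:
  // Probe the upstream (here: std::malloc, an assumption) once to learn a
  // plausible base address, then release it so no real memory is held.
  bump_allocator_sketch()
  {
    void* probe = std::malloc(kAlign);
    auto addr   = reinterpret_cast<std::uintptr_t>(probe);
    std::free(probe);
    base_ = (addr + kAlign - 1) / kAlign * kAlign;  // round the base up to kAlign
  }

  // Hand out a fake, aligned address by atomically bumping an offset.
  void* allocate(std::size_t bytes)
  {
    const std::size_t padded = pad(bytes);
    const std::size_t offset = next_.fetch_add(padded, std::memory_order_relaxed);
    const std::size_t now    = in_use_.fetch_add(padded, std::memory_order_relaxed) + padded;
    // Raise the recorded peak with a compare-exchange loop (no mutex needed).
    std::size_t prev = peak_.load(std::memory_order_relaxed);
    while (prev < now && !peak_.compare_exchange_weak(prev, now, std::memory_order_relaxed)) {}
    return reinterpret_cast<void*>(base_ + offset);
  }

  // Nothing is freed; only the bookkeeping of live bytes is updated.
  void deallocate(void* /*ptr*/, std::size_t bytes)
  {
    const std::size_t padded = pad(bytes);
    const std::size_t before = in_use_.fetch_sub(padded, std::memory_order_relaxed);
    assert(before >= padded);
    (void)before;
  }

  std::size_t peak_bytes() const { return peak_.load(std::memory_order_relaxed); }

 private:
  static constexpr std::size_t kAlign = 256;  // assumed allocation granularity
  static std::size_t pad(std::size_t bytes) { return (bytes + kAlign - 1) / kAlign * kAlign; }

  std::uintptr_t base_ = 0;
  std::atomic<std::size_t> next_{0};    // next fake offset to hand out
  std::atomic<std::size_t> in_use_{0};  // bytes currently "allocated"
  std::atomic<std::size_t> peak_{0};    // high-water mark of in_use_
};
```

In this sketch the bump offset only ever grows, so fake addresses are never reused, while the separate `in_use_` counter is what makes the peak reflect deallocations.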
…pinned_memory_resource Add pinned and managed resources to the raft::resources handle to make it possible to customize / temporarily replace these resources
…aking change due to transitive includes in downstream libraries
Merges Remove deprecated headers (rapidsai#2939).

Conflict resolutions:
- rsvd.cuh: Use new mdspan-based raft::matrix::sqrt and reciprocal APIs (they have internal dry-run guards); kept cudaMemsetAsync guard
- svd.cuh: Use raft::matrix::weighted_sqrt (has internal dry-run guard)
- matrix.cuh: Accept deletion (deprecated, removed in main)

Co-authored-by: Cursor <cursoragent@cursor.com>
tfeher pushed a commit to Stardust-SJF/cuvs_rabitq that referenced this pull request on Mar 3, 2026
Non-breaking, source-only changes to modernize the use of raft primitives across the cuVS source code. The general rule applied here is to prefer raft helpers that take `raft::resources` as an argument over other raft helpers, and raft helpers over third-party libraries.

- thrust::fill / thrust::fill_n → raft::matrix::fill
- thrust::transform → raft::linalg::map
- thrust::sequence / thrust::tabulate → raft::linalg::map_offset
- raft::linalg::unaryOp / raft::linalg::binaryOp → raft::linalg::map
- raft::linalg::add (pointer-based) → raft::linalg::add (mdspan-based)
- raft::copy (pointer-based) → raft::copy (mdspan-based)
- raft::update_device / raft::update_host → raft::copy (mdspan-based)
- raft::linalg::rowNorm → raft::linalg::norm
- raft::linalg::reduce (pointer-based) → raft::linalg::reduce (mdspan-based)
- cudaMemsetAsync → raft::matrix::fill

The purpose of this PR is to make the use of library code more consistent, sometimes at the cost of a bit more auxiliary code. It is also a prerequisite for dry-run compliance in cuVS, if we choose to merge that in rapidsai/raft#2961.

Authors:
- Artem M. Chirkin (https://github.com/achirkin)

Approvers:
- Dante Gama Dessavre (https://github.com/dantegd)

URL: rapidsai#1837
…aw CCCL references
…urrent_device_resource()
Adapt the dry-run protocol to use the unified cuda::mr resource infrastructure from fea-unify-memory-resources.

Key changes:
- Replace dry_run_device_memory_resource (rmm subclass) and dry_run_host_memory_resource (std::pmr subclass) with a single dry_run_resource<Upstream> template using cuda::forward_property, modeled after raft::mr::statistics_adaptor.
- Replace dry_run_resource_manager (which modified the passed-in resources handle) with dry_run_resources, a standalone class that copies the resources object and provides implicit conversion to const resources&, enabling composability with other resource wrappers.
- dry_run_allocator uses probe-once semantics: a single real allocation from the upstream is kept alive for the allocator's lifetime, and all subsequent allocations return the same valid pointer.
- Remove obsolete pmr/pinned_memory_resource.hpp (superseded by cuda::mr::legacy_pinned_memory_resource in the unified branch).
- Adapt tests to use unified resource APIs (host_resource_ref, host_device_resource_ref, get_default_host_resource, etc.).

Made-with: Cursor
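Two of the changes above, probe-once allocation and a wrapper that copies the handle and converts implicitly to a const reference, can be sketched without CUDA. Everything below is a hypothetical illustration: `allocator_iface`, `malloc_allocator`, `probe_once_allocator`, `resources_sketch`, `dry_run_resources_sketch`, and `toy_algorithm` are invented names, and the real classes are built on cuda::mr rather than on a virtual interface.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical minimal allocator interface (the PR uses cuda::mr instead).
struct allocator_iface {
  virtual ~allocator_iface() = default;
  virtual void* allocate(std::size_t bytes) = 0;
  virtual void deallocate(void* ptr, std::size_t bytes) = 0;
};

// A "real" upstream for the sketch.
struct malloc_allocator final : allocator_iface {
  void* allocate(std::size_t bytes) override { return std::malloc(bytes); }
  void deallocate(void* ptr, std::size_t) override { std::free(ptr); }
};

// Probe-once: one real upstream allocation is kept alive for the allocator's
// lifetime, and every allocate() returns that same valid pointer.
class probe_once_allocator final : public allocator_iface {
 public:
  explicit probe_once_allocator(allocator_iface& upstream)
    : upstream_(upstream), probe_(upstream.allocate(kProbeBytes)) {}
  ~probe_once_allocator() override { upstream_.deallocate(probe_, kProbeBytes); }

  void* allocate(std::size_t bytes) override
  {
    const std::size_t now = in_use_.fetch_add(bytes) + bytes;
    std::size_t prev      = peak_.load();
    while (prev < now && !peak_.compare_exchange_weak(prev, now)) {}
    return probe_;
  }
  void deallocate(void*, std::size_t bytes) override { in_use_.fetch_sub(bytes); }
  std::size_t peak_bytes() const { return peak_.load(); }

 private:
  static constexpr std::size_t kProbeBytes = 256;  // assumed probe size
  allocator_iface& upstream_;
  void* probe_;
  std::atomic<std::size_t> in_use_{0};
  std::atomic<std::size_t> peak_{0};
};

// Hypothetical stand-in for a resources handle.
struct resources_sketch {
  allocator_iface* allocator = nullptr;
};

// Copies the handle instead of modifying it, and converts implicitly to
// `const resources_sketch&` so it can be passed wherever a handle is expected.
class dry_run_resources_sketch {
 public:
  explicit dry_run_resources_sketch(const resources_sketch& original)
    : fake_(*original.allocator), copy_(original)
  {
    assert(original.allocator != nullptr);
    copy_.allocator = &fake_;  // only the copy points at the fake allocator
  }
  operator const resources_sketch&() const { return copy_; }
  std::size_t peak_bytes() const { return fake_.peak_bytes(); }

 private:
  probe_once_allocator fake_;
  resources_sketch copy_;
};

// Toy algorithm with two overlapping temporary buffers.
inline void toy_algorithm(const resources_sketch& res, std::size_t n)
{
  void* a = res.allocator->allocate(n);
  void* b = res.allocator->allocate(2 * n);
  res.allocator->deallocate(b, 2 * n);
  res.allocator->deallocate(a, n);
}
```

Passing the wrapper directly, as in `toy_algorithm(dry, 1000)`, relies on the implicit conversion; the original handle is left untouched, which is the composability point made in the commit message.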
…ep/restore the state of resources
The smaller diff without #2968 is viewable here: achirkin#4
The dry run protocol defines a mechanism to simulate the execution of algorithms to get a precise estimate of the memory requirements for a real execution with the same parameters.
This PR:
- Adds a dry-run mechanism (raft::util::dry_run_execute, a tracking memory resource, resource::get_dry_run_flag) that lets callers estimate peak memory usage of any RAFT algorithm without executing GPU work.

Depends on (and includes all changes of) #2968
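As a rough, CPU-only illustration of how a dry-run flag and an execute helper could fit together, the sketch below guards the "kernel" with the flag and reports the peak of tracked bytes. The source does not show the signatures of raft::util::dry_run_execute or resource::get_dry_run_flag, so `handle_sketch`, `workspace_sketch`, `toy_scale`, and `dry_run_execute_sketch` are hypothetical names and the shape of the helper is an assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical handle: carries the dry-run flag and memory bookkeeping.
struct handle_sketch {
  bool dry_run         = false;
  std::size_t in_use   = 0;  // bytes currently tracked as allocated
  std::size_t peak     = 0;  // high-water mark of in_use
  int kernels_launched = 0;
};

// RAII workspace: always tracked, but only backed by real memory in a real run.
class workspace_sketch {
 public:
  workspace_sketch(handle_sketch& h, std::size_t n) : h_(h), bytes_(n * sizeof(float))
  {
    h_.in_use += bytes_;
    h_.peak = std::max(h_.peak, h_.in_use);
    if (!h_.dry_run) { data_.resize(n); }
  }
  ~workspace_sketch()
  {
    assert(h_.in_use >= bytes_);
    h_.in_use -= bytes_;
  }
  float* data() { return data_.data(); }

 private:
  handle_sketch& h_;
  std::size_t bytes_;
  std::vector<float> data_;
};

// Toy algorithm: allocations happen in both modes, the "kernel" only in a real run.
inline void toy_scale(handle_sketch& h, std::size_t n)
{
  workspace_sketch tmp(h, n);
  if (h.dry_run) { return; }  // dry-run guard: skip the actual work
  for (std::size_t i = 0; i < n; ++i) {
    tmp.data()[i] = 2.0f * static_cast<float>(i);
  }
  ++h.kernels_launched;
}

// Runs `fn` against a dry-run copy of the handle and returns the peak bytes.
template <typename F>
std::size_t dry_run_execute_sketch(const handle_sketch& h, F&& fn)
{
  handle_sketch probe = h;  // work on a copy; the caller's handle is untouched
  probe.dry_run       = true;
  probe.in_use        = 0;
  probe.peak          = 0;
  std::forward<F>(fn)(probe);
  return probe.peak;
}
```

A caller would first ask for the estimate, for example `dry_run_execute_sketch(h, [](handle_sketch& r) { toy_scale(r, 1000); })`, and then run `toy_scale(h, 1000)` for real with the same parameters.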