Fixes these `pre-commit` errors blocking CI:

```text
verify-hardcoded-version.................................................Failed
- hook id: verify-hardcoded-version
- exit code: 1

In file RAPIDS_BRANCH:1:9:
release/26.04
warning: do not hard-code version, read from VERSION file instead

In file cpp/examples/versions.cmake:8:21:
set(RMM_TAG release/26.04)
warning: do not hard-code version, read from VERSION file instead
```

It does this by updating the `verify-hardcoded-version` configuration and by updating the C++ examples to read `RMM_TAG` from the `RAPIDS_BRANCH` file. See rapidsai/pre-commit-hooks#121 for details.

Authors:
- James Lamb (https://github.com/jameslamb)

Approvers:
- Bradley Dice (https://github.com/bdice)

URL: #2293
FAILURE - Unable to forward-merge due to an error; a manual merge is necessary. IMPORTANT: When merging this PR, do not use the auto-merger.
Contributes to rapidsai/build-planning#256. Broken out from #2270.

Proposes a stricter pattern for installing `torch` wheels, to prevent bugs of the form "accidentally used a CPU-only `torch` from pypi.org". This should help us catch compatibility issues, improving release confidence.

Other small changes:

* splits torch wheel testing into "oldest" (PyTorch 2.9) and "latest" (PyTorch 2.10)
* introduces a `require_gpu_pytorch` matrix filter so conda jobs can explicitly request `pytorch-gpu` (to similarly ensure solvers don't fall back to the CPU-only variant)
* appends `rapids-generate-pip-constraint` output to the file `PIP_CONSTRAINT` points to *(to reduce duplication and the risk of failing to apply constraints)*

Authors:
- James Lamb (https://github.com/jameslamb)

Approvers:
- Bradley Dice (https://github.com/bdice)

URL: #2279
…adaptor (#2304)

So that the tracking resource adaptor is thread safe, the modification of the tracked allocations should be sandwiched by the "acquire-release" pair `upstream.allocate` / `upstream.deallocate`. Previously this was not the case: the upstream allocation correctly occurred before updating the tracked allocations, but the upstream deallocation did not occur after removing the tracking entry. In multi-threaded use this could lead to a logged error claiming a deallocated pointer was not tracked. To solve this, actually use the correct pattern. Moreover, ensure that we don't observe ABA issues by using `try_emplace` when tracking an allocation.

- Closes #2303

Authors:
- Lawrence Mitchell (https://github.com/wence-)
- Bradley Dice (https://github.com/bdice)

Approvers:
- Bradley Dice (https://github.com/bdice)

URL: #2304
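The ordering described above can be illustrated with a minimal host-only sketch (not the actual RMM implementation): `std::malloc`/`std::free` stand in for the upstream resource, and all names here (`tracking_allocator`, `tracked_count`) are illustrative. The key points are that the tracking-map update sits between the upstream allocate and the upstream deallocate, and that `try_emplace` refuses to clobber an existing entry if the allocator hands back a recycled pointer value (the ABA case).

```cpp
#include <cstddef>
#include <cstdlib>
#include <mutex>
#include <unordered_map>

// Hypothetical sketch of a tracking adaptor; not RMM's tracking_resource_adaptor.
class tracking_allocator {
 public:
  void* allocate(std::size_t bytes) {
    void* p = std::malloc(bytes);  // stand-in for upstream.allocate
    {
      std::lock_guard<std::mutex> lock(mtx_);
      // try_emplace guards against ABA: if a stale entry for this pointer
      // value somehow still exists, it is not silently overwritten.
      allocations_.try_emplace(p, bytes);
    }
    return p;
  }

  void deallocate(void* p) {
    {
      std::lock_guard<std::mutex> lock(mtx_);
      allocations_.erase(p);  // untrack BEFORE the upstream free,
    }                          // so another thread can't reuse and
    std::free(p);              // re-track the pointer in between
  }                            // (stand-in for upstream.deallocate)

  std::size_t tracked_count() const {
    std::lock_guard<std::mutex> lock(mtx_);
    return allocations_.size();
  }

 private:
  mutable std::mutex mtx_;
  std::unordered_map<void*, std::size_t> allocations_;
};
```

With the untrack-before-free ordering, a concurrent `allocate` on another thread cannot observe a pointer that is simultaneously "freed upstream" and "still tracked", which is exactly the scenario that produced the spurious "pointer was not tracked" log message.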
…E 754 -0.0 (#2302)

## Description

`device_uvector::set_element_async` had a zero-value optimization that used `cudaMemsetAsync` when `value == value_type{0}`. For IEEE 754 floating-point types, `-0.0 == 0.0` is `true` per the standard, so `-0.0` was incorrectly routed through `cudaMemsetAsync(..., 0, ...)`, which clears all bits, including the sign bit, normalizing `-0.0` to `+0.0`.

This corrupts the in-memory representation of `-0.0` for any downstream library that creates scalars through RMM (`cudf::fixed_width_scalar::set_value` → `rmm::device_scalar::set_value_async` → `device_uvector::set_element_async`), causing observable behavioral divergence in spark-rapids (e.g., `cast(-0.0 as string)` returns `"0.0"` on GPU instead of `"-0.0"`).

### Fix

Per the discussion in #2298, remove all `constexpr` special casing in `set_element_async` (both the `bool` `cudaMemsetAsync` path and the `is_fundamental_v` zero-detection path) and always use `cudaMemcpyAsync`. This preserves exact bit-level representations for all types, which is the correct contract for a memory management library that sits below cuDF, cuML, and cuGraph.

`set_element_to_zero_async` is unchanged; its explicit "set to zero" semantics make `cudaMemsetAsync` the correct implementation.

### Testing

Added `NegativeZeroTest.PreservesFloatNegativeZero` and `NegativeZeroTest.PreservesDoubleNegativeZero` regression tests that verify the sign bit of `-0.0f` / `-0.0` survives a round-trip through `set_element_async` → `element`. All 122 tests pass locally (CUDA 13.0, RTX 5880).

Closes #2298

## Checklist

- [x] I am familiar with the [Contributing Guidelines](https://github.com/rapidsai/rmm/blob/HEAD/CONTRIBUTING.md).
- [x] New or existing tests cover these changes.
- [x] The documentation is up to date with these changes.

Made with [Cursor](https://cursor.com)

---------

Signed-off-by: Allen Xu <allxu@nvidia.com>
## Description

I found that the `ulimit` settings for CUDA 13.1 devcontainers were missing. This fixes it.

## Checklist

- [x] I am familiar with the [Contributing Guidelines](https://github.com/rapidsai/rmm/blob/HEAD/CONTRIBUTING.md).
- [x] New or existing tests cover these changes.
- [x] The documentation is up to date with these changes.
This PR sets an upper bound on the `numba-cuda` dependency to `<0.29.0`.

Authors:
- https://github.com/brandon-b-miller

Approvers:
- Bradley Dice (https://github.com/bdice)

URL: #2306
Closed by #2310. |
Forward-merge triggered by push to release/26.04 that creates a PR to keep main up-to-date. If this PR is unable to be immediately merged due to conflicts, it will remain open for the team to manually merge. See forward-merger docs for more info.