
Forward-merge release/26.04 into main (#2297)

Merged
bdice merged 6 commits into main from release/26.04
Mar 16, 2026
Conversation

rapids-bot (bot) commented Mar 12, 2026

Forward-merge triggered by push to release/26.04 that creates a PR to keep main up-to-date. If this PR is unable to be immediately merged due to conflicts, it will remain open for the team to manually merge. See forward-merger docs for more info.

Fixes these `pre-commit` errors blocking CI:

```text
verify-hardcoded-version.................................................Failed
- hook id: verify-hardcoded-version
- exit code: 1

In file RAPIDS_BRANCH:1:9:
 release/26.04
warning: do not hard-code version, read from VERSION file instead

In file RAPIDS_BRANCH:1:9:
 release/26.04

In file cpp/examples/versions.cmake:8:21:
 set(RMM_TAG release/26.04)
warning: do not hard-code version, read from VERSION file instead

In file cpp/examples/versions.cmake:8:21:
 set(RMM_TAG release/26.04)
```

These errors are fixed by updating the `verify-hardcoded-version` configuration and by updating the C++ examples to read `RMM_TAG` from the `RAPIDS_BRANCH` file.

See rapidsai/pre-commit-hooks#121 for details

Authors:
  - James Lamb (https://github.com/jameslamb)

Approvers:
  - Bradley Dice (https://github.com/bdice)

URL: #2293
@rapids-bot rapids-bot bot requested a review from a team as a code owner March 12, 2026 22:04
@rapids-bot rapids-bot bot requested a review from gforsyth March 12, 2026 22:05
rapids-bot (bot) commented Mar 12, 2026

FAILURE - Unable to forward-merge due to an error, manual merge is necessary. Do not use the Resolve conflicts option in this PR, follow these instructions https://docs.rapids.ai/maintainers/forward-merger/

IMPORTANT: When merging this PR, do not use the auto-merger (i.e. the /merge comment). Instead, an admin must manually merge by changing the merging strategy to Create a Merge Commit. Otherwise, history will be lost and the branches become incompatible.

Contributes to rapidsai/build-planning#256

Broken out from #2270 

Proposes a stricter pattern for installing `torch` wheels, to prevent bugs of the form "accidentally used a CPU-only `torch` from pypi.org". This should help us to catch compatibility issues, improving release confidence.

Other small changes:

* splits torch wheel testing into "oldest" (PyTorch 2.9) and "latest" (PyTorch 2.10)
* introduces a `require_gpu_pytorch` matrix filter so conda jobs can explicitly request `pytorch-gpu` (to similarly ensure solvers don't fall back to the CPU-only variant)
* appends `rapids-generate-pip-constraint` output to the file that `PIP_CONSTRAINT` points to
  - *(to reduce duplication and the risk of failing to apply constraints)*

Authors:
  - James Lamb (https://github.com/jameslamb)

Approvers:
  - Bradley Dice (https://github.com/bdice)

URL: #2279
@rapids-bot rapids-bot bot requested a review from a team as a code owner March 13, 2026 21:46
…adaptor (#2304)

For the tracking resource adaptor to be thread safe, modifications of the tracked allocations must be sandwiched inside the `upstream.allocate`/`upstream.deallocate` pair, analogous to an "acquire-release" pair: track after the upstream allocation, untrack before the upstream deallocation. Previously only half of this held: the upstream allocation did occur before updating the tracked allocations, but the upstream deallocation did not occur after removing the entry. In multi-threaded use this could produce a logged error claiming that a deallocated pointer was not tracked.

To solve this, apply the correct ordering. Moreover, guard against ABA issues (the same address being returned by a subsequent allocation) by using `try_emplace` when tracking an allocation.
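The ordering can be illustrated with a minimal host-side sketch (hypothetical class and member names; the real adaptor wraps a device memory resource, with `std::malloc`/`std::free` standing in for the upstream resource here):

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <mutex>
#include <unordered_map>

// Hypothetical sketch: bookkeeping is sandwiched inside the
// upstream allocate/deallocate pair.
class tracking_allocator {
 public:
  void* allocate(std::size_t bytes) {
    void* p = std::malloc(bytes);  // upstream allocation happens first
    std::lock_guard<std::mutex> lock{mtx_};
    // try_emplace detects ABA reuse: if this address were still tracked
    // from a racing deallocation, insertion would fail instead of
    // silently overwriting the stale entry.
    bool const inserted = allocations_.try_emplace(p, bytes).second;
    assert(inserted && "address already tracked (ABA)");
    return p;
  }

  void deallocate(void* p) {
    {
      std::lock_guard<std::mutex> lock{mtx_};
      allocations_.erase(p);  // untrack *before* freeing, so the address
    }                         // cannot be reused and re-tracked concurrently
    std::free(p);             // upstream deallocation happens last
  }

  std::size_t tracked() const {
    std::lock_guard<std::mutex> lock{mtx_};
    return allocations_.size();
  }

 private:
  mutable std::mutex mtx_;
  std::unordered_map<void*, std::size_t> allocations_;
};
```

With the old ordering (free first, erase second), another thread could receive the just-freed address from `allocate` and track it before the erase ran, producing exactly the "deallocated pointer was not tracked" log described above.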

- Closes #2303

Authors:
  - Lawrence Mitchell (https://github.com/wence-)
  - Bradley Dice (https://github.com/bdice)

Approvers:
  - Bradley Dice (https://github.com/bdice)

URL: #2304
@rapids-bot rapids-bot bot requested a review from a team as a code owner March 16, 2026 16:11
@rapids-bot rapids-bot bot requested review from davidwendt and wence- March 16, 2026 16:11
wjxiz1992 and others added 3 commits March 16, 2026 12:49
…E 754 -0.0 (#2302)

## Description

`device_uvector::set_element_async` had a zero-value optimization that
used `cudaMemsetAsync` when `value == value_type{0}`. For IEEE 754
floating-point types, `-0.0 == 0.0` is `true` per the standard, so
`-0.0` was incorrectly routed through `cudaMemsetAsync(..., 0, ...)`
which clears all bits — including the sign bit — normalizing `-0.0` to
`+0.0`.
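The misfire is easy to reproduce on the host; `via_memset` and `via_memcpy` below are hypothetical stand-ins for the two device paths (`cudaMemsetAsync(..., 0, ...)` versus `cudaMemcpyAsync`):

```cpp
#include <cmath>
#include <cstring>

// Stand-in for the old cudaMemsetAsync path: writes all-zero bytes,
// which clears every bit, including the sign bit of a negative zero.
double via_memset(double /*value*/) {
  double dst;
  std::memset(&dst, 0, sizeof(dst));
  return dst;
}

// Stand-in for the cudaMemcpyAsync fix: copies the exact bit pattern.
double via_memcpy(double value) {
  double dst;
  std::memcpy(&dst, &value, sizeof(dst));
  return dst;
}
```

Because `-0.0 == 0.0` holds, a `value == value_type{0}` guard routes `-0.0` through the memset path, and `std::signbit` on the result is false; the memcpy path preserves the sign bit.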

This corrupts the in-memory representation of `-0.0` for any downstream
library that creates scalars through RMM
(`cudf::fixed_width_scalar::set_value` →
`rmm::device_scalar::set_value_async` →
`device_uvector::set_element_async`), causing observable behavioral
divergence in spark-rapids (e.g., `cast(-0.0 as string)` returns `"0.0"`
on GPU instead of `"-0.0"`).

### Fix

Per the discussion in #2298, remove all `constexpr` special casing in
`set_element_async` — both the `bool` `cudaMemsetAsync` path and the
`is_fundamental_v` zero-detection path — and always use
`cudaMemcpyAsync`. This preserves exact bit-level representations for
all types, which is the correct contract for a memory management library
that sits below cuDF, cuML, and cuGraph.

`set_element_to_zero_async` is unchanged — its explicit "set to zero"
semantics make `cudaMemsetAsync` the correct implementation.

### Testing

Added `NegativeZeroTest.PreservesFloatNegativeZero` and
`NegativeZeroTest.PreservesDoubleNegativeZero` regression tests that
verify the sign bit of `-0.0f` / `-0.0` survives a round-trip through
`set_element_async` → `element`. All 122 tests pass locally (CUDA 13.0,
RTX 5880).

Closes #2298

## Checklist
- [x] I am familiar with the [Contributing
Guidelines](https://github.com/rapidsai/rmm/blob/HEAD/CONTRIBUTING.md).
- [x] New or existing tests cover these changes.
- [x] The documentation is up to date with these changes.

Made with [Cursor](https://cursor.com)

---------

Signed-off-by: Allen Xu <allxu@nvidia.com>

---

## Description
I found that the `ulimit` settings for CUDA 13.1 devcontainers were
missing. This fixes it.

## Checklist
- [x] I am familiar with the [Contributing
Guidelines](https://github.com/rapidsai/rmm/blob/HEAD/CONTRIBUTING.md).
- [x] New or existing tests cover these changes.
- [x] The documentation is up to date with these changes.

---

This PR sets an upper bound on the `numba-cuda` dependency to `<0.29.0`.

Authors:
  - https://github.com/brandon-b-miller

Approvers:
  - Bradley Dice (https://github.com/bdice)

URL: #2306
@bdice bdice merged commit 7ddf10f into main Mar 16, 2026
31 of 32 checks passed
bdice commented Mar 16, 2026

Closed by #2310.
