
Upgrade to Scala 2.13.18 and modernize unused warnings configuration #5

Merged
res-life merged 16 commits into res-life:spark-41-shim from gerashegalov:gera-jdk8
Jan 28, 2026
Conversation


@gerashegalov commented Jan 24, 2026

  • Upgrade Scala 2.13 from version 2.13.14 to 2.13.18
  • Modernize compiler warning flags by replacing the deprecated -Ywarn-unused:locals,patvars,privates
    with more granular -Wconf and -Wunused syntax for better control over unused code detection
  • Remove unused imports across Delta Lake and SQL plugin files identified by stricter compiler settings
  • Simplify Scala 2.13 build profile handling in buildall script by consolidating POM file selection
    and removing redundant profile-specific version collection logic
  • Update documentation references from "unshimmed-common-from-spark320.txt" to
    "unshimmed-common-from-single-shim.txt" to reflect generalized shim naming
  • Add --scala213 command-line option to buildall for explicit Scala 2.13 builds
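For reference, the flag migration looks roughly like this; the exact `-Wconf` filters chosen in the PR are not shown here, so treat the replacement half as a sketch:

```
# Deprecated Scala 2.12-era form:
-Ywarn-unused:locals,patvars,privates

# Scala 2.13 replacement (the -Wconf category shown is an assumption):
-Wunused:locals,patvars,privates
-Wconf:cat=unused:warning
```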

Signed-off-by: Gera Shegalov <gshegalov@nvidia.com>

sameerz and others added 16 commits January 20, 2026 08:36
### Description
Update authorized users 

### Checklists

- [ ] This PR has added documentation for new or modified features or
behaviors.
- [ ] This PR has added new tests or modified existing tests to cover
new code paths.
(Please explain in the PR description how the new code paths are tested,
such as names of the new/existing tests that cover them.)
- [ ] Performance testing has been performed and its results are added
in the PR description. Or, an issue has been filed with a link in the PR
description.

Signed-off-by: Sameer Raheja <sraheja@nvidia.com>
Co-authored-by: Sameer Raheja <sraheja@nvidia.com>
…e to safe (NVIDIA#14166)

Contributes to NVIDIA#14135

### Description
Starting with Spark 4.1.x, the default conversion mode changes from unsafe to safe, so some cases failed.
This PR disables `spark.sql.execution.pandas.convertToArrowArraySafely` to pass the ITs.
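For reference, disabling the setting looks like this (a sketch; the ITs may apply it through their own conf plumbing instead of `spark-submit`):

```
--conf spark.sql.execution.pandas.convertToArrowArraySafely=false
```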

### Checklists

- [ ] This PR has added documentation for new or modified features or
behaviors.
- [ ] This PR has added new tests or modified existing tests to cover
new code paths.
(Please explain in the PR description how the new code paths are tested,
such as names of the new/existing tests that cover them.)
- [ ] Performance testing has been performed and its results are added
in the PR description. Or, an issue has been filed with a link in the PR
description.

Signed-off-by: Chong Gao <res_life@163.com>
Co-authored-by: Chong Gao <res_life@163.com>
Contributes to NVIDIA#13672

### Description

This PR:
Add retry support to GpuBatchedBoundedWindowIterator to handle OOM:
- Protect the following 3 operations with OOM retry support:
  - Window computation (by `computeWindowWithRetry`)
  - Input batch concatenation with the cache (by `getNextInputBatchWithRetry`)
  - Batch trim (by `trimWithRetry`)
- Add unit tests for retry and split-and-retry OOM scenarios
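The retry wrappers all follow the same basic pattern: catch the retryable OOM signal, give the framework a chance to free memory, and re-attempt the operation. A minimal sketch in Java (names like `RetryOOM` and `withRetry` are illustrative, not the plugin's actual API):

```java
import java.util.function.Supplier;

public class RetryDemo {
    // Stand-in for the plugin's retryable OOM signal.
    static class RetryOOM extends RuntimeException {}

    // Re-attempt `op` until it succeeds or maxAttempts is exhausted.
    static <T> T withRetry(Supplier<T> op, int maxAttempts) {
        for (int attempt = 0; ; attempt++) {
            try {
                return op.get();
            } catch (RetryOOM e) {
                if (attempt + 1 >= maxAttempts) {
                    throw e;
                }
                // A real implementation would release or spill GPU memory
                // here before retrying.
            }
        }
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Fails twice with a simulated OOM, then succeeds.
        int result = withRetry(() -> {
            if (++calls[0] < 3) throw new RetryOOM();
            return 42;
        }, 5);
        System.out.println(result); // prints 42 after two retries
    }
}
```

The split-and-retry variant additionally halves the input batch when a plain retry keeps failing; that part is omitted here for brevity.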

NDS numbers (in seconds) at the 10k data size show no perf regressions.

`rapids-4-spark_2.12-26.02.0-20260102.073925-32-cuda12.jar` was used for the nightly runs; not sure why the nightly is a little slower. Will try running this more.

|ID|with PR| Nightly|
|--|--|--| 
|1| 1302 | 1369|
|2| 1315 | 1389|
|avg| 1308.5|1379|


### Checklists

- [x] This PR has added documentation for new or modified features or
behaviors.
- [x] This PR has added new tests or modified existing tests to cover
new code paths.
(Please explain in the PR description how the new code paths are tested,
such as names of the new/existing tests that cover them.)
- [x] Performance testing has been performed and its results are added
in the PR description. Or, an issue has been filed with a link in the PR
description.

---------

Signed-off-by: Firestarman <firestarmanllc@gmail.com>
Co-authored-by: Firestarman <firestarmanllc@gmail.com>
Fixes NVIDIA#14179

## Problem

When a Spark task is killed, the merger thread can remain stuck in
`Object.wait()` indefinitely, becoming a "zombie" that blocks subsequent
tasks assigned to the same merger slot.

**Root cause**: The wait loops in `mergerTask` only check `hasNewWork`
flag but don't properly respond to thread interruption. When
`cancel(true)` is called, there's a race condition where the interrupt
may not be handled.

**Impact**: Observed 26+ minute executor hang in production NDS
benchmark.

## Fix

Add interrupt flag checking in wait loops and catch
`InterruptedException`:

```scala
while (!hasNewWork.get() && !Thread.currentThread().isInterrupted) {
  try {
    // Requires the caller to hold the mergerCondition monitor.
    mergerCondition.wait()
  } catch {
    case _: InterruptedException =>
      // wait() cleared the interrupt status; restore it and exit.
      Thread.currentThread().interrupt()
      return
  }
}
// Covers the race where the interrupt flag was set without an
// InterruptedException being thrown.
if (Thread.currentThread().isInterrupted) {
  return
}
```

This ensures the merger thread exits gracefully when cancelled, even in
edge cases.

---------

Signed-off-by: Hongbin Ma (Mahone) <mahongbin@apache.org>
…DIA#14182)

Follow-up on the migration to remove anonymous access from Artifactory.

Please migrate the pre-merge scripts to access the new Maven Artifactory
using a user and token.

The test script spark-premerge-build.sh →
[hybrid_execution.sh](https://github.com/NVIDIA/spark-rapids/blob/main/jenkins/hybrid_execution.sh#L41-L43)
downloads JARs from the Artifactory repository via `wget`.

Provide the .netrc credentials for downloading JARs from the Artifactory
repository.
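A `.netrc` entry for the Artifactory host generally looks like this (host, user, and token are placeholders):

```
machine <artifactory-host>
login <username>
password <api-token>
```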

Note: The pre-merge CI already covers tests for this change.

---------

Signed-off-by: Tim Liu <timl@nvidia.com>
)

Fixes NVIDIA#7520

### Description
Calls the JNI utility to do overflow checks for the round/bround operators.
Added cases covering overflow checks for byte/short/int/long types.
Note: Only Spark 340+ supports ANSI mode for round/bround.
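To see why the check is needed: rounding a byte value at a negative scale can produce a result outside the byte range, which ANSI mode must reject rather than silently wrap. A hypothetical sketch (HALF_UP is shown for simplicity; Spark's `bround` uses half-even rounding):

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

public class RoundOverflowDemo {
    // Round `v` to `scale` decimal places; a negative scale rounds to
    // tens, hundreds, and so on.
    static long round(long v, int scale, RoundingMode mode) {
        return BigDecimal.valueOf(v).setScale(scale, mode).longValueExact();
    }

    public static void main(String[] args) {
        long r = round(127, -1, RoundingMode.HALF_UP); // round(127y, -1)
        System.out.println(r);                  // 130: outside the byte range
        System.out.println(r > Byte.MAX_VALUE); // true -> ANSI mode must error
    }
}
```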

### depends on
* NVIDIA/spark-rapids-jni#4174

### Checklists
- [ ] This PR has added documentation for new or modified features or
behaviors.
- [x] This PR has added new tests or modified existing tests to cover
new code paths.
- [ ] Performance testing has been performed and its results are added
in the PR description. Or, an issue has been filed with a link in the PR
description.

---------

Signed-off-by: Chong Gao <res_life@163.com>
Co-authored-by: Chong Gao <res_life@163.com>
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
Fixes NVIDIA#13389.

This commit adds support for [Iceberg's "identity" partition
transform](https://iceberg.apache.org/spec/#partition-transforms). This
allows for an Iceberg table to be partitioned on a column's values, with
no modification to the column's row values.

The implementation is trivial. In the interest of not increasing the
test runtime too much, a sampling of column types has been included in
the coverage tests.
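For context, identity partitioning in Iceberg SQL looks like this (table and column names here are made up):

```sql
-- Partition by the raw values of `category`; identity(x) = x,
-- so no transform is applied to the row values.
CREATE TABLE demo.db.events (id BIGINT, category STRING)
USING iceberg
PARTITIONED BY (category);
```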

---------

Signed-off-by: MithunR <mithunr@nvidia.com>
Signed-off-by: Gera Shegalov <gshegalov@nvidia.com>
Signed-off-by: Gera Shegalov <gshegalov@nvidia.com>
Another test case uses `RelationalGroupedDataset.toString` and expects
the correct type string, but on JDK 11+ it returns an empty string.
Created testRapids cases that check the JDK version to adjust the
expected string.
Close NVIDIA#14188

---------

Signed-off-by: Gary Shen <gashen@nvidia.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
…4189)

Fixes NVIDIA#14037.

### Description
Cherry-picks code from Spark.
`BroadcastExchangeExec`, once materialized, won't be materialized again,
so we should not reset the metrics.
Cherry-picked into `GpuBroadcastExchangeExecBase`.
The upstream commit apache/spark@a823f95c522 targets Spark 4.1, but
because it's a common fix we do not add it only to the 411 shim; the
change applies to all Spark versions.

### Checklists

- [ ] This PR has added documentation for new or modified features or
behaviors.
- [ ] This PR has added new tests or modified existing tests to cover
new code paths.
      It's only related to metrics, so it does not impact any feature.
- [ ] Performance testing has been performed and its results are added
in the PR description. Or, an issue has been filed with a link in the PR
description.

Signed-off-by: Chong Gao <res_life@163.com>
Co-authored-by: Chong Gao <res_life@163.com>
)

close NVIDIA#14099

According to the debugging logs below, the root cause is that
`GpuProjectExec` was trying to produce a string column (`payload#3`)
that was too large (~12.8 GiB). Even though the GPU splits it into 5
parts via the pre-split in `GpuProjectExec`, each part is still
~2.56 GiB (12.8/5), larger than the 2 GiB size limit of a cudf column
that requires an offset buffer. The pre-split computation only accounts
for the total output size, ignoring the per-column size limit.
```
===> got 1073741760 bytes for out column of expr: input[0, bigint, true](join_key#2)
===> got 13743894532 bytes for out column of expr: AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AS payload#3
==> got 5 splits for output size: 14817636292, split unit size: 3.221225472E9
```

So this PR improves the pre-split to take the column limit into account
when calculating the splits number.
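Plugging the logged numbers in shows the gap: splitting by total output size gives 5 parts, but the 2 GiB (2^31 - 1) offset limit on the big string column alone requires 7. A sketch of the improved split count (variable names are illustrative, not the plugin's):

```java
public class PreSplitDemo {
    // Ceiling division for positive longs.
    static long ceilDiv(long a, long b) {
        return (a + b - 1) / b;
    }

    public static void main(String[] args) {
        long totalOut = 14817636292L;         // total output bytes from the log
        long splitUnit = 3221225472L;         // split unit size from the log
        long bigColumn = 13743894532L;        // the ~12.8 GiB string column
        long columnLimit = Integer.MAX_VALUE; // cudf limit for offset-buffer columns

        long byTotal = ceilDiv(totalOut, splitUnit);     // 5: each part still ~2.56 GiB
        long byColumn = ceilDiv(bigColumn, columnLimit); // 7: what the column limit needs
        System.out.println(Math.max(byTotal, byColumn)); // prints 7
    }
}
```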

Verified this PR locally with the case from the linked issue; it fixes
the "CUDF String column overflow".

NOTE: this PR only fixes the literal case mentioned in the linked
issue. To also fix non-literal cases for GpuProjectExec, we need to
address issue NVIDIA#14191.

---------

Signed-off-by: Firestarman <firestarmanllc@gmail.com>
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
Signed-off-by: nvauto <70000568+nvauto@users.noreply.github.com>
Fixes NVIDIA#14196


### Description

We already support the identity transform in Iceberg, so we should
remove all fallback tests for it. This PR continues
NVIDIA#14183 to clean up those tests.

### Checklists


- [ ] This PR has added documentation for new or modified features or
behaviors.
- [x] This PR has added new tests or modified existing tests to cover
new code paths.
(Please explain in the PR description how the new code paths are tested,
such as names of the new/existing tests that cover them.)
- [ ] Performance testing has been performed and its results are added
in the PR description. Or, an issue has been filed with a link in the PR
description.

---------

Signed-off-by: Ubuntu <ubuntu@ip-172-31-50-247.us-west-2.compute.internal>
Co-authored-by: Ubuntu <ubuntu@ip-172-31-50-247.us-west-2.compute.internal>
This commit includes the following changes:

- Upgrade Scala 2.13 from version 2.13.14 to 2.13.18
- Modernize compiler warning flags by replacing the deprecated -Ywarn-unused:locals,patvars,privates
  with more granular -Wconf and -Wunused syntax for better control over unused code detection
- Remove unused imports across Delta Lake and SQL plugin files identified by stricter compiler settings
- Simplify Scala 2.13 build profile handling in buildall script by consolidating POM file selection
  and removing redundant profile-specific version collection logic
- Update documentation references from "unshimmed-common-from-spark320.txt" to
  "unshimmed-common-from-single-shim.txt" to reflect generalized shim naming
- Add --scala213 command-line option to buildall for explicit Scala 2.13 builds

Signed-off-by: Gera Shegalov <gshegalov@nvidia.com>
@gerashegalov changed the title from "Gera jdk8" to "Upgrade to Scala 2.13.18 and modernize unused warnings configuration" on Jan 27, 2026
@res-life

There is only a Squash and merge option, so I'll merge this PR manually. Please do not use Squash and merge;
I want to retain the commits without squashing.

@res-life merged commit 0d5157a into res-life:spark-41-shim on Jan 28, 2026
1 check failed