bump spark rapids jar versions [skip ci] #1003

Merged
eordentlich merged 1 commit into NVIDIA:main from eordentlich:eo_spark_rapids_version
Jan 9, 2026

Conversation

@eordentlich
Collaborator

No description provided.

Signed-off-by: Erik Ordentlich <eordentlich@gmail.com>
Contributor

@greptile-apps greptile-apps bot left a comment

Greptile Overview

Greptile Summary

This PR bumps the Spark RAPIDS JAR version from 25.08.0 to 25.12.0 across all Databricks-related configuration files and scripts. The changes are straightforward and consistent:

  • Updated SPARK_RAPIDS_VERSION variable definitions in initialization scripts
  • Updated hardcoded JAR paths in documentation and cluster specifications
  • All changes align with the RAPIDS Python package version (25.12.0) already set in the codebase

The version format follows the project's convention for JAR versions (a zero-padded, two-digit month field, as in 25.08.0). All affected files have been updated consistently, with no remaining references to the old version.
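The "no remaining references" claim above can be checked mechanically. The following is a minimal sketch, not part of this PR: `find_stale_refs` is a hypothetical helper name, and the search root is whatever checkout directory you point it at.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the PR): print every line under a directory
# that still mentions an old version string. Exits non-zero when none remain.
find_stale_refs() {
    local old_version="$1"   # e.g. 25.08.0
    local root="${2:-.}"     # directory to search; defaults to the current one
    # -r: recurse, -n: show line numbers, -F: match the version literally so
    # the dots are not treated as regex wildcards; skip git metadata.
    grep -rnF --exclude-dir=.git -- "$old_version" "$root"
}

# Example (run from a repository checkout):
#   if find_stale_refs "25.08.0" .; then
#       echo "stale version references found" >&2
#   fi
```

Using `-F` matters here: without it, the pattern `25.08.0` would also match unrelated strings such as `25x08y0`.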

Confidence Score: 5/5

  • This PR is safe to merge with minimal risk; it is a straightforward version bump
  • Score reflects the simplicity and consistency of the changes. All version updates are mechanical string replacements that maintain internal consistency. The PR changes only version strings in documentation, configuration, and scripts, with no logic changes. The version format follows established conventions, and no references to the old version remain in the codebase.
  • No files require special attention

Important Files Changed

File Analysis

| Filename | Score | Overview |
|---|---|---|
| notebooks/databricks/README.md | 5/5 | Updated hardcoded jar path in Spark configuration example from 25.08.0 to 25.12.0 |
| notebooks/databricks/init-pip-cuda-12.sh | 5/5 | Updated SPARK_RAPIDS_VERSION variable from 25.08.0 to 25.12.0, jar will be downloaded from Maven |
| python/benchmark/databricks/gpu_etl_cluster_spec.sh | 5/5 | Updated hardcoded jar path in cluster spec JSON from 25.08.0 to 25.12.0 |
| python/benchmark/databricks/init-pip-cuda-12.sh | 5/5 | Updated SPARK_RAPIDS_VERSION variable from 25.08.0 to 25.12.0, jar will be downloaded from Maven |
| python/run_benchmark.sh | 5/5 | Updated SPARK_RAPIDS_VERSION variable from 25.08.0 to 25.12.0 for gpu_etl cluster type |

Sequence Diagram

sequenceDiagram
    participant User as User/Developer
    participant README as notebooks/databricks/README.md
    participant InitScript1 as notebooks/databricks/init-pip-cuda-12.sh
    participant InitScript2 as python/benchmark/databricks/init-pip-cuda-12.sh
    participant ClusterSpec as python/benchmark/databricks/gpu_etl_cluster_spec.sh
    participant BenchmarkScript as python/run_benchmark.sh
    participant Maven as Maven Repository
    participant Databricks as Databricks Cluster

    User->>README: Reads documentation for setup
    Note over README: spark.executorEnv.PYTHONPATH<br/>updated to rapids-4-spark_2.12-25.12.0.jar

    User->>Databricks: Creates cluster with init script
    Databricks->>InitScript1: Executes init-pip-cuda-12.sh
    Note over InitScript1: SPARK_RAPIDS_VERSION=25.12.0
    InitScript1->>Maven: Downloads rapids-4-spark_2.12-25.12.0-cuda12.jar
    Maven-->>InitScript1: Returns JAR file
    InitScript1->>Databricks: Installs JAR to /databricks/jars/

    User->>BenchmarkScript: Runs benchmark with gpu_etl mode
    Note over BenchmarkScript: SPARK_RAPIDS_VERSION=25.12.0
    BenchmarkScript->>Maven: Downloads rapids-4-spark_2.12-25.12.0-cuda12.jar
    Maven-->>BenchmarkScript: Returns JAR file
    BenchmarkScript->>Databricks: Configures Spark with JAR

    User->>ClusterSpec: Uses cluster spec for benchmarking
    Note over ClusterSpec: spark.executorEnv.PYTHONPATH<br/>configured with rapids-4-spark_2.12-25.12.0.jar
    ClusterSpec->>InitScript2: References init-pip-cuda-12.sh
    Note over InitScript2: SPARK_RAPIDS_VERSION=25.12.0
    InitScript2->>Maven: Downloads rapids-4-spark_2.12-25.12.0-cuda12.jar
    Maven-->>InitScript2: Returns JAR file
    InitScript2->>Databricks: Installs JAR for benchmark cluster
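The diagram uses two JAR names: a `-cuda12` name on the Maven side and a plain name under `/databricks/jars/`. The sketch below shows how both could be derived from a single `SPARK_RAPIDS_VERSION` value. It is an illustration only, not the actual content of the init scripts: the function names are hypothetical, and the Maven Central base URL and the rename on install are assumptions inferred from the names shown in the diagram.

```shell
#!/usr/bin/env bash
# Assumed values, inferred from the JAR names in the sequence diagram.
SCALA_VERSION=2.12
MAVEN_BASE="https://repo1.maven.org/maven2/com/nvidia"   # assumption: Maven Central
JAR_DIR="/databricks/jars"

# Name of the artifact as the diagram shows it on the Maven side.
rapids_jar_name() {
    printf 'rapids-4-spark_%s-%s-cuda12.jar\n' "$SCALA_VERSION" "$1"
}

# Download URL built from the version (hypothetical layout: group/artifact/version/file).
rapids_jar_url() {
    printf '%s/rapids-4-spark_%s/%s/%s\n' \
        "$MAVEN_BASE" "$SCALA_VERSION" "$1" "$(rapids_jar_name "$1")"
}

# Local path without the cuda12 suffix, matching the name used in
# spark.executorEnv.PYTHONPATH in the diagram notes.
rapids_local_path() {
    printf '%s/rapids-4-spark_%s-%s.jar\n' "$JAR_DIR" "$SCALA_VERSION" "$1"
}

# Example download step (commented out so the sketch has no side effects):
#   SPARK_RAPIDS_VERSION=25.12.0
#   curl -L "$(rapids_jar_url "$SPARK_RAPIDS_VERSION")" \
#        -o "$(rapids_local_path "$SPARK_RAPIDS_VERSION")"
```

Deriving every name from one variable is what makes a bump like this one a single-line change per script; the hardcoded paths in the README and the cluster spec still have to be edited by hand.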

@eordentlich
Collaborator Author

build

@eordentlich eordentlich merged commit 81b6ee9 into NVIDIA:main Jan 9, 2026
4 checks passed
@eordentlich eordentlich deleted the eo_spark_rapids_version branch January 9, 2026 17:06