Support Spark 4.x #450

Open
pang-wu wants to merge 34 commits into ray-project:master from pang-wu:pang/spark4

@pang-wu (Collaborator) commented Dec 8, 2025

This PR adapts RayDP to Spark 4.x but leaves the following work for future improvement:

  1. Support TensorFlow 2.16+ (see https://keras.io/getting_started/#tensorflow--keras-2-backwards-compatibility) and NumPy 2.x.
  2. Support Python 3.12. We deprecated Python 3.9 because Spark no longer supports it. The Python build system also needs to be modernized.
  3. Deprecate Ray AIR.

To make the tests pass, this PR is based on #458. Once #458 is merged, this PR should be rebased again.

@pang-wu changed the title from "Support SPark 4.0.0" to "Support Spark 4.0.0" on Dec 8, 2025
@pang-wu force-pushed the pang/spark4 branch 7 times, most recently from 21be2c9 to 1f04b26, on February 16, 2026 17:47
@pang-wu changed the title from "Support Spark 4.0.0" to "Support Spark 4.x" on Feb 16, 2026
@rexminnis (Contributor) left a comment

Thanks for putting this together — the CommandLineUtilsBridge pattern and the SparkSubmit rework are clean solutions to the cross-version API drift. A few things I noticed:

  1. Bug: spark340/SparkSqlUtils.toArrowRDD has infinite recursion (see inline comment)
  2. Java target: maven.compiler.source is still 1.8 — worth bumping to 17?
  3. Spark version: spark410.version targets 4.1.0 — consider 4.1.1 (current release)

Happy to help with testing or any of the shim work. I have a working Spark 4.1.1 setup locally and have been validating the Arrow conversion paths end-to-end.
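
The inline comment for item 1 is not reproduced in this thread, so the actual cause in spark340/SparkSqlUtils.toArrowRDD is not shown here. As a generic illustration only, the Python sketch below shows one common way a version shim can recurse forever: a wrapper method that shares its name with the routine it is meant to delegate to ends up calling itself. All names in it (ArrowConverter, BuggyShim, FixedShim, to_arrow_rdd) are hypothetical and are not RayDP or Spark APIs.

```python
class ArrowConverter:
    """Hypothetical stand-in for the underlying conversion routine."""

    def to_arrow_rdd(self, rows):
        # Pretend conversion: tag each row so the result is easy to inspect.
        return [("arrow", row) for row in rows]


class BuggyShim:
    """Hypothetical shim whose wrapper calls itself instead of delegating."""

    def to_arrow_rdd(self, rows):
        # Bug: this resolves to the same method, so the call never terminates
        # on its own and Python eventually raises RecursionError.
        return self.to_arrow_rdd(rows)


class FixedShim:
    """Hypothetical shim that delegates to an explicitly held implementation."""

    def __init__(self, impl):
        self._impl = impl

    def to_arrow_rdd(self, rows):
        # Delegate to the wrapped object, not to this shim.
        return self._impl.to_arrow_rdd(rows)


if __name__ == "__main__":
    try:
        BuggyShim().to_arrow_rdd([1, 2])
    except RecursionError:
        print("buggy shim recursed without terminating")

    print(FixedShim(ArrowConverter()).to_arrow_rdd([1, 2]))
```

On the JVM the equivalent mistake would typically surface as a StackOverflowError at call time rather than at compile time, which is why a test that exercises the shimmed path end-to-end is a reasonable guard.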
