
[CORE] Remove legacy Spark 3.2 compatibility code#11495

Closed
QCLyu wants to merge 0 commits into apache:main from QCLyu:qingchuanlyu

Conversation

@QCLyu
Contributor

@QCLyu QCLyu commented Jan 27, 2026

What changes are proposed in this pull request?

This PR removes remaining Spark 3.2-specific compatibility code from the codebase, completing the Spark 3.2 deprecation.
Changes:

  • Removed lteSpark32 from SparkVersionUtil.scala and updated lteSpark33 to use direct version comparison
  • Removed Spark 3.2-specific code paths from:
    • SparkTaskUtil.scala - Spark 3.2 TaskContext constructor path
    • SparkPlanUtil.scala - Spark 3.2-specific supportsRowBased implementation
    • GlutenCostEvaluator.scala - Spark 3.2-specific CostEvaluator instantiation
    • Convention.scala - Spark 3.2-specific row type handling
  • Removed Spark 3.2 test case from MiscOperatorSuite.scala
  • Deleted entire shims/spark32 directory including:
    • ColumnarArrayShim.java
    • ParquetFooterReaderShim.scala
  • Cleaned up unused imports (SparkVersionUtil, SparkShimLoader, AnalysisException)

The codebase now only supports Spark 3.3 and later versions.

How was this patch tested?

  • Verified that compilation succeeds with all unused imports removed
  • Existing unit tests should continue to pass (Spark 3.3+ only)
  • Manually verified that no references to lteSpark32 or Spark 3.2-specific code remain in the codebase

Fixes #11379
Related #8960

@github-actions github-actions bot added the CORE (works for Gluten Core) and VELOX labels Jan 27, 2026
@github-actions

Run Gluten Clickhouse CI on x86

@QCLyu QCLyu marked this pull request as ready for review January 29, 2026 00:34
@QCLyu
Contributor Author

QCLyu commented Jan 29, 2026

Hi, could someone help review this PR? Per the Git bot, the failed step was "Node Copy file from S3", which is a CI infra issue: a 403 Forbidden from AWS S3 during a Jenkins pipeline step that downloads a file from S3.

A few import statements were deleted because they were only used in Spark 3.2-related code (removed in #8960).

@FelixYBW FelixYBW changed the title Qingchuanlyu Cleanup of Spark 3.2 code Jan 29, 2026
@FelixYBW FelixYBW changed the title Cleanup of Spark 3.2 code [CORE] Cleanup of Spark 3.2 code Jan 29, 2026
@PHILO-HE PHILO-HE self-requested a review January 29, 2026 02:31
@QCLyu
Contributor Author

QCLyu commented Jan 29, 2026

Thanks @PHILO-HE. I'll get back to you later this week.


@PHILO-HE PHILO-HE left a comment


Thanks for your continued efforts.

Could you also help check the APIs declared in SparkShims? Some of them may have been introduced due to code differentiation between Spark 3.2 and the later versions. If so, we can do a cleanup there too. Thank you.

-      } else {
-        rowType0()
-      }
+      rowType0()
Member

It seems we can do a more thorough cleanup. My understanding is that KnownRowTypeForSpark33OrLater was introduced specially to handle Spark 3.2. Now that Spark 3.2 has been deprecated, can we remove this trait and directly use KnownRowType instead?
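For illustration, a hypothetical Java sketch of the collapse being suggested (the real code is Scala traits in Convention.scala; the names below and the String stand-in for Convention.RowType are assumptions, not the actual API):

```java
// Hypothetical sketch of folding a version-specific subtype back into its base.
// "Before": KnownRowTypeForSpark33OrLater existed only because Spark 3.2
// needed different row-type handling than 3.3+.
interface KnownRowType {
    String rowType(); // String stands in for Convention.RowType
}

interface KnownRowTypeForSpark33OrLater extends KnownRowType {
    // Spark 3.3+ hook; with 3.2 gone, this indirection adds nothing.
    String rowType0();

    @Override
    default String rowType() {
        return rowType0();
    }
}

// "After": implementations can declare KnownRowType directly.
public class RowTypeCleanupSketch implements KnownRowType {
    @Override
    public String rowType() {
        return "VanillaRowType";
    }

    public static void main(String[] args) {
        System.out.println(new RowTypeCleanupSketch().rowType());
    }
}
```

Both shapes produce the same row type for 3.3+, which is why the intermediate trait can be deleted without behavior changes.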

   private val comparedWithSpark35 = compareMajorMinorVersion((3, 5))
   val eqSpark33: Boolean = comparedWithSpark33 == 0
-  val lteSpark33: Boolean = lteSpark32 || eqSpark33
+  val lteSpark33: Boolean = comparedWithSpark33 <= 0
Member

Nit: Maybe, we can just remove this and use eqSpark33 on the caller side instead.
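As a sketch of the comparison semantics involved (hypothetical Java; the real helper is compareMajorMinorVersion in the Scala SparkVersionUtil, whose exact signature this does not claim to reproduce):

```java
// Hypothetical sketch of major/minor version comparison, mirroring the
// assumed contract of SparkVersionUtil.compareMajorMinorVersion: negative,
// zero, or positive when the running version is below, equal to, or above
// the given (major, minor).
public class VersionCompareSketch {
    static int compareMajorMinor(int runMajor, int runMinor, int major, int minor) {
        if (runMajor != major) {
            return Integer.compare(runMajor, major);
        }
        return Integer.compare(runMinor, minor);
    }

    public static void main(String[] args) {
        // With Spark 3.2 support removed, lteSpark33 no longer needs the
        // lteSpark32 term: a single comparison against (3, 3) suffices.
        int[][] runningVersions = {{3, 3}, {3, 4}, {3, 5}};
        for (int[] v : runningVersions) {
            boolean lteSpark33 = compareMajorMinor(v[0], v[1], 3, 3) <= 0;
            boolean eqSpark33 = compareMajorMinor(v[0], v[1], 3, 3) == 0;
            System.out.println(v[0] + "." + v[1] + ": lteSpark33=" + lteSpark33
                + ", eqSpark33=" + eqSpark33);
        }
    }
}
```

This also illustrates the reviewer's point: once 3.3 is the minimum supported version, lteSpark33 and eqSpark33 always agree, so the former can be dropped in favor of eqSpark33 at call sites.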

@PHILO-HE PHILO-HE changed the title [CORE] Cleanup of Spark 3.2 code [CORE] Remove legacy Spark 3.2 compatibility code Jan 30, 2026
@PHILO-HE
Member

FYI: ColumnarArrayShim was refactored by #11525, whose main purpose is to reduce duplicate code, and the Spark 3.2 version of this class was removed in that PR.

@QCLyu QCLyu marked this pull request as draft February 1, 2026 21:51
@github-actions

github-actions bot commented Feb 1, 2026

Run Gluten Clickhouse CI on x86

@QCLyu QCLyu marked this pull request as ready for review February 1, 2026 23:23
@QCLyu
Contributor Author

QCLyu commented Feb 1, 2026

Hi @PHILO-HE, please check again. The CI failure was unrelated.

@zzcclp
Contributor

zzcclp commented Feb 2, 2026

Run Gluten Clickhouse CI on x86


@PHILO-HE PHILO-HE left a comment


Some new minor comments. Please also rebase the code. Thanks.

override def rowType0(): Convention.RowType
override def rowType(): Convention.RowType = rowType0()

def rowType0(): Convention.RowType
Member

Can we remove this?


   def isPlannedV1Write(plan: DataWritingCommandExec): Boolean = {
-    if (SparkVersionUtil.lteSpark33) {
+    if (SparkVersionUtil.compareMajorMinorVersion((3, 3)) <= 0) {
Member

Suggest using eqSpark33 instead, since there is no need to consider earlier versions.

<configuration>
<!-- Ensure Scala compiles to the same output dir so Java can see Scala classes -->
<outputDirectory>${project.build.outputDirectory}</outputDirectory>
</configuration>
Member

Did you see errors without this new code? I am wondering why we need it. If it is indeed necessary, can we move the plugin configuration into the root pom for consistency?

Contributor Author

Thanks @PHILO-HE. Yes, the offline build failures were "cannot find symbol" errors in gluten-substrait's Java code (e.g. ConverterUtils, SubstraitContext, GlutenConfig). Those can also be caused by:

  • An incremental/stale build (e.g. mvn clean compile fixes it)
  • Scala and Java writing to different output dirs when a module overrides project.build.outputDirectory (e.g. target/scala-${scala.binary.version}/classes)

I'll move the configuration to the root POM and remove it from gluten-substrait.
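As a sketch, the root-pom change being proposed might look like the following (hypothetical placement; the pluginManagement wrapping and omitted version/executions are assumptions about how the actual root pom is organized):

```xml
<!-- Hypothetical sketch: pin scala-maven-plugin's output to the shared
     classes dir so javac can resolve Scala-compiled symbols in
     mixed Scala/Java modules. -->
<build>
  <pluginManagement>
    <plugins>
      <plugin>
        <groupId>net.alchim31.maven</groupId>
        <artifactId>scala-maven-plugin</artifactId>
        <configuration>
          <!-- Same dir javac uses, so Java sees Scala classes -->
          <outputDirectory>${project.build.outputDirectory}</outputDirectory>
        </configuration>
      </plugin>
    </plugins>
  </pluginManagement>
</build>
```

Placing it under pluginManagement in the root pom would let every module inherit the setting without repeating it per module.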

@github-actions

github-actions bot commented Feb 3, 2026

Run Gluten Clickhouse CI on x86

@QCLyu QCLyu marked this pull request as draft February 3, 2026 17:54
@PHILO-HE
Member

@QCLyu, could you please spare some time to continue updating this PR? We would like to include it in 1.6 release. If you need help to identify issues, please let me know. Thanks.

@QCLyu
Contributor Author

QCLyu commented Feb 11, 2026 via email


@PHILO-HE
Member

@QCLyu, it seems this PR should not change ArrowColumnarArray. Maybe, you need to keep its related code unchanged to pass CI.


@github-actions github-actions bot added the DOCS label Feb 13, 2026

@github-actions github-actions bot added the BUILD label Feb 13, 2026

@QCLyu
Contributor Author

QCLyu commented Feb 14, 2026

Hi @PHILO-HE, are you aware of any recent migration from Scala to Java regarding TreeMemoryConsumers? The challenge is that I couldn't compile locally after making changes, and pushing blind commits is very inefficient. The build always failed with something similar to the following error:
[ERROR] Failed to execute goal net.alchim31.maven:scala-maven-plugin:4.9.2:compile (scala-compile-first) on project gluten-core: Execution scala-compile-first of goal net.alchim31.maven:scala-maven-plugin:4.9.2:compile failed: Failed to find name hashes for org.apache.gluten.memory.memtarget.spark.TreeMemoryConsumers

After switching to the main branch, the same error persisted. My guess is that a migration from Scala to Java happened recently; I checked that TreeMemoryConsumers exists only as a .java file on the main branch (cc @zhztheplayer).

What I have tried:

  • Standard clean build (mvn clean package)
  • Bypassing the Zinc server (-Dscala.useZincServer=false)
  • Manual file search (checking for ghost .scala files)
  • Nuke and pave (manual rm -rf target/)
  • Forcing the Scala compiler to look at Java source files in the same pass (added configuration in pom.xml)

So far, all the above methods have failed. How I generally test locally before pushing a PR:
mvn clean package -Pbackends-velox -Pspark-3.5 -DskipTests
or simply mvn clean package -DskipTests
These local tests were generally useful before I took a break.

Would appreciate guidance or contexts.

@QCLyu
Contributor Author

QCLyu commented Feb 15, 2026


Created a separate issue: #11616. Please correct me if I'm wrong (or overthinking).

@QCLyu
Contributor Author

QCLyu commented Feb 15, 2026

Will re-open.

@QCLyu
Contributor Author

QCLyu commented Feb 15, 2026


Likely a local problem; working on it. I closed the separate issue #11616. This abandoned PR is still cited in issue #11379 for reference purposes. I will create a separate PR to clean up the Spark 3.2 compatibility code for the sake of a clean history, targeting release 1.7 in May 2026.

cc @PHILO-HE @zhztheplayer


Labels

CORE (works for Gluten Core), INFRA, VELOX

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Remove remaining Spark 3.2-specific compatibility code

3 participants