[CORE] Remove legacy Spark 3.2 compatibility code #11495
QCLyu wants to merge into apache:main
Conversation
Run Gluten Clickhouse CI on x86
Hi, could someone help review this PR? Per the Git bot, the failed step was "Node Copy file from S3", which is a CI infra issue: a 403 Forbidden from AWS S3 during a Jenkins pipeline step that downloads a file from S3. A few import statements were deleted because they were only used in Spark 3.2-related code (removed in #8960).
Thanks @PHILO-HE. Getting back to you later this week.
PHILO-HE left a comment
Thanks for your continued efforts.
Could you also help check the APIs declared in SparkShims? Maybe some of them were introduced due to code differentiation between Spark 3.2 and later versions. If so, we can also do a cleanup. Thank you.
```diff
-} else {
-  rowType0()
-}
+rowType0()
```
It seems we can do a more thorough cleanup. My understanding is, KnownRowTypeForSpark33OrLater was introduced for specially handling Spark 3.2. Now that Spark 3.2 has been deprecated, can we remove this trait and directly use KnownRowType instead?
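For illustration, a minimal Scala sketch of the cleanup being suggested here, using simplified stand-ins for Gluten's `Convention.RowType` and the two traits (the real definitions differ; this only demonstrates the trait collapse):

```scala
// Simplified stand-ins; in Gluten these are Convention.RowType and the
// KnownRowType / KnownRowTypeForSpark33OrLater traits.
object RowTypes {
  sealed trait RowType
  case object VanillaRow extends RowType
}

// Before: an extra trait whose only job was to special-case Spark 3.2 by
// routing rowType() through rowType0().
trait KnownRowType {
  def rowType(): RowTypes.RowType
}
trait KnownRowTypeForSpark33OrLater extends KnownRowType {
  override def rowType(): RowTypes.RowType = rowType0()
  def rowType0(): RowTypes.RowType
}

// After: with Spark 3.2 support gone, implementers can extend
// KnownRowType directly and drop rowType0() entirely.
class SomeExec extends KnownRowType {
  override def rowType(): RowTypes.RowType = RowTypes.VanillaRow
}
```

The point of the suggestion is that the `rowType0()` indirection carries no information once every supported Spark version takes the same path.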
```diff
 private val comparedWithSpark35 = compareMajorMinorVersion((3, 5))
 val eqSpark33: Boolean = comparedWithSpark33 == 0
-val lteSpark33: Boolean = lteSpark32 || eqSpark33
+val lteSpark33: Boolean = comparedWithSpark33 <= 0
```
Nit: Maybe we can just remove this and use eqSpark33 on the caller side instead.
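To make the suggestion concrete, a standalone Scala sketch of the version-comparison semantics (`compareMajorMinorVersion` here is a hypothetical reimplementation that takes the version as a parameter; the real `SparkVersionUtil` reads the running Spark version):

```scala
object VersionCheck {
  // Returns a negative, zero, or positive number as `version` sorts
  // before, equal to, or after `target` by (major, minor).
  def compareMajorMinorVersion(version: (Int, Int), target: (Int, Int)): Int = {
    val byMajor = version._1.compareTo(target._1)
    if (byMajor != 0) byMajor else version._2.compareTo(target._2)
  }
}
```

Since Spark 3.3 is now the minimum supported version, `compareMajorMinorVersion(v, (3, 3)) <= 0` can only hold when `v` is exactly (3, 3), which is why an `eqSpark33` flag on the caller side expresses the same condition more directly.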
FYI.
Hi @PHILO-HE, please check again. The CI failure was unrelated.
PHILO-HE left a comment
Some new minor comments. Please also rebase the code. Thanks.
```scala
override def rowType0(): Convention.RowType
override def rowType(): Convention.RowType = rowType0()
```

```scala
def rowType0(): Convention.RowType
```
```diff
 def isPlannedV1Write(plan: DataWritingCommandExec): Boolean = {
-  if (SparkVersionUtil.lteSpark33) {
+  if (SparkVersionUtil.compareMajorMinorVersion((3, 3)) <= 0) {
```
Suggest using eqSpark33 instead, since no need to consider earlier versions.
gluten-substrait/pom.xml
Outdated
```xml
<configuration>
  <!-- Ensure Scala compiles to the same output dir so Java can see Scala classes -->
  <outputDirectory>${project.build.outputDirectory}</outputDirectory>
</configuration>
```
Did you see errors without this new code? I am wondering why we need it. If it is indeed necessary, can we move the plugin configuration into the root POM for consistency?
Thanks @PHILO-HE. Yes, the offline build failures were "cannot find symbol" errors in gluten-substrait's Java code (e.g. ConverterUtils, SubstraitContext, GlutenConfig). Those can also be caused by:
- An incremental/stale build (e.g. mvn clean compile fixing it)
- Scala and Java writing to different output dirs when a module overrides project.build.outputDirectory (e.g. target/scala-${scala.binary.version}/classes)
I'll move the configuration to the root POM and remove it from gluten-substrait.
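A sketch of what the moved configuration might look like in the root pom.xml, assuming the project uses scala-maven-plugin and a pluginManagement section (the exact plugin and existing executions in Gluten's root POM may differ):

```xml
<!-- Sketch only: a pluginManagement entry in the root pom.xml so every
     mixed Scala/Java module compiles Scala into the shared output dir. -->
<build>
  <pluginManagement>
    <plugins>
      <plugin>
        <groupId>net.alchim31.maven</groupId>
        <artifactId>scala-maven-plugin</artifactId>
        <configuration>
          <!-- Java compilation must see Scala classes, so both go to the same dir -->
          <outputDirectory>${project.build.outputDirectory}</outputDirectory>
        </configuration>
      </plugin>
    </plugins>
  </pluginManagement>
</build>
```

Centralizing it in pluginManagement means child modules inherit the setting without repeating it, which addresses the reviewer's consistency concern.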
@QCLyu, could you please spare some time to continue updating this PR? We would like to include it in the 1.6 release. If you need help identifying issues, please let me know. Thanks.
Sorry, I'll take a look later today, targeting to close by the end of this Saturday. Please let me know if you need it earlier; happy to adjust.
Cheers,
Qingchuan
@QCLyu, it seems this PR should not change
Hi @PHILO-HE, are you aware of any recent migration from Scala to Java regarding TreeMemoryConsumers? The challenge is that I couldn't test the compile locally after making changes, and pushing blind commits is very inefficient. It always failed with an error similar to the one below, and after switching to the main branch the same error persisted. My guess is that a migration from Scala to Java happened recently; I checked that TreeMemoryConsumers exists only as a .java file on the main branch (cc @zhztheplayer). What I have tried:
So far, all of the above methods failed. How I generally test locally before pushing a PR: Would appreciate guidance or context.
Created a separate issue, #11616. Please correct me if I'm wrong (or overthinking).
Will re-open.
Likely a local problem; working on it. Closed the separate issue #11616. This abandoned PR is still cited in issue #11379 for reference purposes. I will create a separate PR to clean up the Spark 3.2 compatibility code for the sake of a clean history. Target: release 1.7 in May 2026.
What changes are proposed in this pull request?
This PR removes remaining Spark 3.2-specific compatibility code from the codebase, completing the Spark 3.2 deprecation.
Changes:
The codebase now only supports Spark 3.3 and later versions.
How was this patch tested?
- Verified that compilation succeeds with all unused imports removed
- Existing unit tests should continue to pass (Spark 3.3+ only)
- Manual verification that no references to lteSpark32 or other Spark 3.2-specific code remain in the codebase
Fixes #11379
Related #8960