[GLUTEN-10707] Improve SparkDirectoryUtil by reduce synchronized by beliefer · Pull Request #10708 · apache/gluten

beliefer · 2025-09-15T12:36:52Z

What changes are proposed in this pull request?

This PR proposes to improve SparkDirectoryUtil by reducing its use of synchronized.
Fixes #10707

How was this patch tested?

GA tests.

github-actions · 2025-09-15T12:37:20Z

Run Gluten Clickhouse CI on x86

github-actions · 2025-09-15T12:37:20Z

#10707

zhztheplayer · 2025-09-15T17:00:14Z

gluten-core/src/main/scala/org/apache/spark/util/SparkDirectoryUtil.scala

-      INSTANCE = new SparkDirectoryUtil(roots)
+  private def init(roots: Array[String]): Unit = {
+    if (INSTANCE.get() == null) {
+      INSTANCE.compareAndSet(null, new SparkDirectoryUtil(roots))


The code is workable but the constructor of SparkDirectoryUtil may be unnecessarily called in concurrent invocations.

If we cannot avoid this, we can use AtomicBoolean instead.

I'm sorry, I didn't understand what you said. Could you tell me more?

For example, thread 1 and thread 2 can both reach line 80 at the same time. Although only one thread will initialize INSTANCE, the other thread will still call new SparkDirectoryUtil(roots), which is not necessary.
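
To make the interleaving described above concrete, here is a minimal single-threaded sketch that replays it step by step. It is an illustration only, not the code from this PR: CasInitSketch, Util, and simulateRace are hypothetical names, and Util stands in for SparkDirectoryUtil with a counter added so that constructor calls can be observed.

```scala
import java.util.concurrent.atomic.{AtomicInteger, AtomicReference}

// Hypothetical stand-in for the AtomicReference-based init under review.
object CasInitSketch {
  // Counts constructor calls so the wasted construction is visible.
  private val constructed = new AtomicInteger(0)

  final class Util(val roots: Array[String]) {
    constructed.incrementAndGet()
  }

  private val instance = new AtomicReference[Util](null)

  // Replays the problematic interleaving deterministically:
  // both "threads" pass the null check before either one publishes.
  def simulateRace(roots: Array[String]): (Int, Boolean, Boolean) = {
    require(instance.get() == null) // thread 1 passes the null check
    require(instance.get() == null) // thread 2 passes the null check
    val candidate1 = new Util(roots) // thread 1 constructs
    val candidate2 = new Util(roots) // thread 2 constructs as well
    val won1 = instance.compareAndSet(null, candidate1) // succeeds
    val won2 = instance.compareAndSet(null, candidate2) // fails, candidate2 is discarded
    (constructed.get(), won1, won2)
  }
}
```

In this replay only one candidate is ever published, so the result is still correct, but the constructor runs twice. That is harmless only if constructing the object has no side effects.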

beliefer · 2025-09-16T02:23:52Z

Run Gluten Clickhouse CI on x86

github-actions · 2025-09-16T08:19:07Z

Run Gluten Clickhouse CI on x86

github-actions · 2025-09-16T09:08:22Z

Run Gluten Clickhouse CI on x86

github-actions · 2025-09-16T09:35:26Z

Run Gluten Clickhouse CI on x86

github-actions · 2025-09-16T09:55:49Z

Run Gluten Clickhouse CI on x86

zhztheplayer

The get operation is still not synchronized... I am inclined to keep the current code on main, because this is not a hot code path and the code is not complicated. Do you agree?
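
For context, a fully lock-based singleton in which the setter and the getter share one monitor could look roughly like the sketch below. This is an assumption made for illustration and is not the code on main: SyncSingletonSketch and Util are hypothetical names, and throwing IllegalArgumentException on mismatched roots is an assumed behavior. Only the roots.toSet comparison is taken from the diff in this PR.

```scala
// Hypothetical sketch of a fully synchronized singleton holder.
object SyncSingletonSketch {
  final class Util(val roots: Array[String])

  // Guarded by this object's monitor in both init and get.
  private var instance: Util = null

  def init(roots: Array[String]): Unit = synchronized {
    if (instance == null) {
      // The constructor runs at most once because the lock is held.
      instance = new Util(roots)
    } else if (instance.roots.toSet != roots.toSet) {
      // Assumed behavior: reject re-initialization with different roots.
      throw new IllegalArgumentException("Instance already initialized with different roots")
    }
  }

  def get(): Util = synchronized {
    // Taking the same lock guarantees that a completed init is visible here.
    assert(instance != null, "Default instance was not set yet")
    instance
  }
}
```

The cost is one uncontended lock acquisition per call, which is usually negligible when the path is not hot.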

zhztheplayer · 2025-09-16T14:45:22Z

gluten-core/src/main/scala/org/apache/spark/util/SparkDirectoryUtil.scala

 }

 object SparkDirectoryUtil extends Logging {
+  private val state = new AtomicBoolean(false)


Let's rename it. E.g., INSTANCE_INITIALIZED

zhztheplayer · 2025-09-16T14:48:33Z

gluten-core/src/main/scala/org/apache/spark/util/SparkDirectoryUtil.scala

+      if (state.compareAndSet(false, true)) {
+        INSTANCE = new SparkDirectoryUtil(roots)
+      }
      return


It's not the hot code path, let's remove the if (INSTANCE == null) condition.

Just directly call compareAndSet.

Removing if (INSTANCE == null) would cause an NPE, so I removed if (INSTANCE.roots.toSet != roots.toSet).

zhztheplayer · 2025-09-16T14:49:07Z

gluten-core/src/main/scala/org/apache/spark/util/SparkDirectoryUtil.scala

+  def get(): SparkDirectoryUtil = {
    assert(INSTANCE != null, "Default instance of SparkDirectoryUtil was not set yet")
    INSTANCE
  }


The get operation is not synchronized.

Now, it is guarded with INSTANCE_INITIALIZED.

github-actions · 2025-09-17T02:31:45Z

Run Gluten Clickhouse CI on x86

github-actions · 2025-09-17T02:33:21Z

Run Gluten Clickhouse CI on x86

github-actions · 2025-09-17T02:39:41Z

Run Gluten Clickhouse CI on x86

github-actions · 2025-09-17T02:40:59Z

Run Gluten Clickhouse CI on x86

beliefer · 2025-09-17T02:44:42Z

The get operation is still not synchronized... I am inclined to keep the current code on main, because this is not a hot code path and the code is not complicated. Do you agree?

Now, the issue has been fixed.

zhztheplayer · 2025-09-17T13:48:49Z

gluten-core/src/main/scala/org/apache/spark/util/SparkDirectoryUtil.scala

+  def get(): SparkDirectoryUtil = {
+    assert(INSTANCE_INITIALIZED.get(), "Default instance of SparkDirectoryUtil was not set yet")
    INSTANCE
  }


Hi @beliefer, thank you for continuing to iterate on the code, but it is still problematic :(

When thread 1 reaches line 81 but hasn't yet set INSTANCE, thread 2 can pass line 88 and access INSTANCE, which may give an unexpected result to the caller.

I know it's a corner case, but we should make sure the new code covers what is covered by the old code completely.

Thank you for the explanation. I will try another way.
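
The window described in this thread can be replayed deterministically in a single thread. The sketch below is illustrative only and is not the PR's code: FlagRaceSketch, Util, observe, and simulate are hypothetical names, and the flag plays the role of INSTANCE_INITIALIZED.

```scala
import java.util.concurrent.atomic.AtomicBoolean

// Hypothetical replay of the flag-then-assign window.
object FlagRaceSketch {
  final class Util(val roots: Array[String])

  private val initialized = new AtomicBoolean(false)
  @volatile private var instance: Util = null

  // What a concurrent get() would observe at this moment:
  // (flag is set, instance is non-null).
  private def observe(): (Boolean, Boolean) =
    (initialized.get(), instance != null)

  def simulate(roots: Array[String]): ((Boolean, Boolean), (Boolean, Boolean)) = {
    val won = initialized.compareAndSet(false, true) // thread 1 flips the flag first
    val midway = observe()                           // thread 2 calls get() in the gap
    if (won) instance = new Util(roots)              // thread 1 assigns only now
    val after = observe()
    (midway, after)
  }
}
```

On the first call, the midway observation has the flag set while the instance is still null, so an assertion on the flag alone would pass and the getter would return null. Publishing the instance before setting the flag, or guarding both with one lock, closes that window.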

github-actions · 2025-09-18T02:44:56Z

Run Gluten Clickhouse CI on x86

github-actions · 2025-09-18T02:48:51Z

Run Gluten Clickhouse CI on x86

github-actions · 2025-09-18T02:53:54Z

Run Gluten Clickhouse CI on x86

beliefer · 2025-10-10T09:34:53Z

@zhztheplayer Could you help me review this again?

zhztheplayer

Thanks.

zhztheplayer · 2025-10-10T21:13:00Z

gluten-core/src/main/scala/org/apache/spark/util/SparkDirectoryUtil.scala

-    }
-    if (INSTANCE.roots.toSet != roots.toSet) {
+    if (targetRoots == null) {
+      targetRoots = roots


Perhaps just use the same name and this.roots = roots. Just a nit. Both are fine.

github-actions · 2025-10-11T02:39:52Z

Run Gluten Clickhouse CI on x86

beliefer · 2025-10-13T07:56:11Z

@zhztheplayer Thank you!

github-actions bot added the CORE works for Gluten Core label Sep 15, 2025

zhztheplayer reviewed Sep 15, 2025


beliefer force-pushed the 10707 branch from 98e721f to 276f81a Compare September 16, 2025 08:18

beliefer changed the title ~~[GLUTEN-10707] Improve SparkDirectoryUtil with AtomicReference~~ [GLUTEN-10707] Improve SparkDirectoryUtil with AtomicBoolean Sep 16, 2025

beliefer force-pushed the 10707 branch from 555e417 to 12f360b Compare September 16, 2025 09:34

beliefer force-pushed the 10707 branch from 12f360b to 3f16936 Compare September 16, 2025 09:55

zhztheplayer requested changes Sep 16, 2025


beliefer force-pushed the 10707 branch from 3b9b084 to e5cc234 Compare September 17, 2025 02:32

beliefer force-pushed the 10707 branch from e5cc234 to 95170d2 Compare September 17, 2025 02:39

beliefer force-pushed the 10707 branch from 95170d2 to 82667e3 Compare September 17, 2025 02:40

beliefer requested a review from zhztheplayer September 17, 2025 02:45

zhztheplayer reviewed Sep 17, 2025


beliefer force-pushed the 10707 branch from 82667e3 to f3326b0 Compare September 18, 2025 02:44

beliefer changed the title ~~[GLUTEN-10707] Improve SparkDirectoryUtil with AtomicBoolean~~ [GLUTEN-10707] Improve SparkDirectoryUtil by reduce synchronized Sep 18, 2025

beliefer force-pushed the 10707 branch from f3326b0 to 9c56dd1 Compare September 18, 2025 02:48

[GLUTEN-10707] Improve SparkDirectoryUtil by reduce synchronized

4a6c746

beliefer force-pushed the 10707 branch from 9c56dd1 to 4a6c746 Compare September 18, 2025 02:53

beliefer requested a review from zhztheplayer September 18, 2025 08:57

zhztheplayer approved these changes Oct 10, 2025


zhztheplayer added the ready to merge label Oct 10, 2025

Replace targetRoots with roots

bbcc3e4

zhztheplayer merged commit 1030678 into apache:main Oct 13, 2025
99 of 100 checks passed
