[GLUTEN-10707] Improve SparkDirectoryUtil by reduce synchronized #10708

Merged
zhztheplayer merged 2 commits into apache:main from beliefer:10707 on Oct 13, 2025

@beliefer (Contributor) commented Sep 15, 2025

What changes are proposed in this pull request?

This PR proposes to improve SparkDirectoryUtil by reducing its use of synchronized.
Fixes #10707

How was this patch tested?

GA tests.

@github-actions bot added the "CORE works for Gluten Core" label on Sep 15, 2025
@github-actions

Run Gluten Clickhouse CI on x86

@github-actions

#10707

INSTANCE = new SparkDirectoryUtil(roots)
private def init(roots: Array[String]): Unit = {
  if (INSTANCE.get() == null) {
    INSTANCE.compareAndSet(null, new SparkDirectoryUtil(roots))
Member

The code works, but the SparkDirectoryUtil constructor may be called unnecessarily under concurrent invocations.

If that cannot be avoided with this approach, we can use an AtomicBoolean instead.

Contributor Author

Sorry, I didn't understand what you meant. Could you explain in more detail?

Member

For example, thread 1 and thread 2 can both reach line 80 at the same time. Although only one thread will initialize INSTANCE, the other thread will still evaluate new SparkDirectoryUtil(roots), which is unnecessary.
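
To make the scenario above concrete, here is a minimal sketch, not the PR's code. `CasInitSketch`, `Util`, the `constructed` counter, the barrier, and the path `/tmp/a` are all hypothetical names added for illustration; `Util` stands in for SparkDirectoryUtil. The barrier exists only to force both threads past the null check before either attempts the CAS, which is the interleaving the reviewer describes.

```scala
import java.util.concurrent.CyclicBarrier
import java.util.concurrent.atomic.{AtomicInteger, AtomicReference}

object CasInitSketch {
  // Counts constructor calls (illustration only).
  val constructed = new AtomicInteger(0)

  // Hypothetical stand-in for SparkDirectoryUtil.
  final class Util(val roots: Array[String]) {
    constructed.incrementAndGet()
  }

  private val INSTANCE = new AtomicReference[Util](null)

  // Illustration only: makes both threads pass the null check before the CAS.
  private val barrier = new CyclicBarrier(2)

  private def init(roots: Array[String]): Unit = {
    if (INSTANCE.get() == null) {
      barrier.await()
      // The argument is evaluated before compareAndSet runs, so the thread
      // that loses the CAS still builds an instance that is discarded.
      INSTANCE.compareAndSet(null, new Util(roots))
    }
  }

  // Runs two racing initializers; returns (constructor calls, published instance).
  def run(): (Int, Util) = {
    val threads = (1 to 2).map(_ => new Thread(() => init(Array("/tmp/a"))))
    threads.foreach(_.start())
    threads.foreach(_.join())
    (constructed.get(), INSTANCE.get())
  }
}
```

Under this forced interleaving the constructor runs twice while only one instance is published. Whether the extra construction matters depends on how expensive or side-effecting the real constructor is.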

Contributor Author

Got it.

@beliefer
Contributor Author

Run Gluten Clickhouse CI on x86

@beliefer changed the title from "[GLUTEN-10707] Improve SparkDirectoryUtil with AtomicReference" to "[GLUTEN-10707] Improve SparkDirectoryUtil with AtomicBoolean" on Sep 16, 2025

@zhztheplayer (Member) left a comment

The get operation is still not synchronized... I am inclined to keep the current main code, because this is not a hot code path and the code is not complicated. Do you agree?

}

object SparkDirectoryUtil extends Logging {
  private val state = new AtomicBoolean(false)
Member

Let's rename it, e.g., to INSTANCE_INITIALIZED.

if (state.compareAndSet(false, true)) {
  INSTANCE = new SparkDirectoryUtil(roots)
}
return
Member

This is not a hot code path, so let's remove the if (INSTANCE == null) condition.

Just call compareAndSet directly.
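
As a rough sketch of what calling compareAndSet directly could look like (not the code in this PR): only the caller that flips the flag from false to true runs the constructor. `FlagInitSketch`, `Util`, the `constructed` counter, and the path `/tmp/a` are hypothetical, and marking INSTANCE as `@volatile` is an assumption the thread does not confirm.

```scala
import java.util.concurrent.atomic.{AtomicBoolean, AtomicInteger}

object FlagInitSketch {
  // Counts constructor calls (illustration only).
  val constructed = new AtomicInteger(0)

  // Hypothetical stand-in for SparkDirectoryUtil.
  final class Util(val roots: Array[String]) {
    constructed.incrementAndGet()
  }

  private val INSTANCE_INITIALIZED = new AtomicBoolean(false)

  // Assumption: volatile, so the write is visible to other threads.
  @volatile private var INSTANCE: Util = null

  def init(roots: Array[String]): Unit = {
    // No null check up front: the CAS alone decides who constructs.
    if (INSTANCE_INITIALIZED.compareAndSet(false, true)) {
      INSTANCE = new Util(roots)
    }
  }

  // Starts eight concurrent initializers and returns the constructor call count.
  def run(): Int = {
    val threads = (1 to 8).map(_ => new Thread(() => init(Array("/tmp/a"))))
    threads.foreach(_.start())
    threads.foreach(_.join())
    constructed.get()
  }
}
```

This only addresses the single-construction property. It says nothing about whether a concurrent reader of INSTANCE sees the assigned value, which is the separate concern raised about get().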

Contributor Author

Removing the if (INSTANCE == null) check would cause an NPE, so I removed the if (INSTANCE.roots.toSet != roots.toSet) check instead.

Comment on lines 94 to 97
def get(): SparkDirectoryUtil = {
  assert(INSTANCE != null, "Default instance of SparkDirectoryUtil was not set yet")
  INSTANCE
}
Member

The get operation is not synchronized.

Contributor Author

It is now guarded by INSTANCE_INITIALIZED.

@beliefer
Contributor Author

Quoting @zhztheplayer: "The get operation is still not synchronized... I am inclined to keep the current main code, because this is not a hot code path and the code is not complicated. Do you agree?"

The issue has now been fixed.

Comment on lines 87 to 90
def get(): SparkDirectoryUtil = {
  assert(INSTANCE_INITIALIZED.get(), "Default instance of SparkDirectoryUtil was not set yet")
  INSTANCE
}
@zhztheplayer (Member) commented Sep 17, 2025

Hi @beliefer, thank you for continuing to iterate on the code, but it is still problematic :(

When thread 1 has reached line 81 but has not yet set INSTANCE, thread 2 can pass line 88 and read INSTANCE, which may return an unexpected result to the caller.

I know it's a corner case, but we should make sure the new code fully covers what the old code covers.
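
The window described above can be pinned down with a small sketch, assuming an init/get pair shaped like the snippets in this thread. `FlagRaceSketch`, `Util`, the two latches, and the path `/tmp/a` are hypothetical; the latches exist only to hold the writer between setting the flag and assigning INSTANCE so that the reader lands inside the gap.

```scala
import java.util.concurrent.CountDownLatch
import java.util.concurrent.atomic.AtomicBoolean

object FlagRaceSketch {
  // Hypothetical stand-in for SparkDirectoryUtil.
  final class Util(val roots: Array[String])

  private val INSTANCE_INITIALIZED = new AtomicBoolean(false)
  @volatile private var INSTANCE: Util = null

  // Illustration only: these latches force the problematic interleaving.
  private val flagSet = new CountDownLatch(1)
  private val readerDone = new CountDownLatch(1)

  private def init(roots: Array[String]): Unit = {
    if (INSTANCE_INITIALIZED.compareAndSet(false, true)) {
      flagSet.countDown() // the flag is true, INSTANCE is still null
      readerDone.await()  // keep the window open until the reader has run
      INSTANCE = new Util(roots)
    }
  }

  // Same shape as the get() under review, without the message string.
  private def get(): Util = {
    assert(INSTANCE_INITIALIZED.get())
    INSTANCE
  }

  // Returns (value read inside the window, value read after init finished).
  def run(): (Util, Util) = {
    val writer = new Thread(() => init(Array("/tmp/a")))
    writer.start()
    flagSet.await()
    val duringWindow = get() // the assert passes, yet INSTANCE is unset
    readerDone.countDown()
    writer.join()
    (duringWindow, get())
  }
}
```

In this forced interleaving the first read passes the assertion and still returns null, while a read after the writer finishes returns the instance. A real run would hit the window only rarely, which is why it is a corner case.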

Contributor Author

Thank you for the explanation. I will try another approach.

@beliefer changed the title from "[GLUTEN-10707] Improve SparkDirectoryUtil with AtomicBoolean" to "[GLUTEN-10707] Improve SparkDirectoryUtil by reduce synchronized" on Sep 18, 2025

@beliefer
Contributor Author

@zhztheplayer Could you review this again?

@zhztheplayer (Member) left a comment

Thanks.

}
if (INSTANCE.roots.toSet != roots.toSet) {
if (targetRoots == null) {
targetRoots = roots
Member

Perhaps just use the same name and write this.roots = roots. Just a nit; both are fine.

@zhztheplayer merged commit 1030678 into apache:main on Oct 13, 2025
99 of 100 checks passed
@beliefer
Contributor Author

@zhztheplayer Thank you!

Labels

CORE works for Gluten Core ready to merge

Development

Successfully merging this pull request may close these issues.

Improve performance for SparkDirectoryUtil

2 participants