
Conversation

@dreamlike-ocean
Contributor

No description provided.

@franz1981
Owner

I have to use some of the Shipilev loom builds to run the CI.
Building the JDK on it would likely kill it.

@dreamlike-ocean dreamlike-ocean marked this pull request as draft October 12, 2025 16:24
@dreamlike-ocean
Contributor Author

Sorry, I seem to have accidentally pushed a draft earlier.
It should be more complete now — I’ve replaced all the internal Chinese comments.

To be frank, I don’t really like the design of GlobalDelegateThreadNettyScheduler, because some parts of it only exist to make Thread::startVirtualThread inherit the parent scheduler, and the ConcurrentHashMap there is bound to become a bottleneck.
Also, it’s quite hard for me to distinguish whether a virtual thread belongs to a poller.
long vtCount = globalDelegateThreadNettyScheduler.internalSchedulerMappings
.keySet().stream().map(Thread::getName)
.filter(s -> !s.contains("Poller"))
.count();

@franz1981
Owner

I will take a look in the next two days (thanks for the PR!), but I already love the idea: it is much in line with what I wanted to implement to reintroduce inheritance while keeping the single-scheduler-per-event-loop abstraction in place, without too many lookups to emulate the missing attachment API.

I have work stealing on my to-do list, which will clash a bit with this model, but I have no proof yet that it is worth it.

Regarding this PR instead: I have a custom scheduler branch, but at this point I would make master the branch that tracks the very latest loom changes.
I see little point in a JDK 21 integration without the nice poller-per-carrier mode implemented by @AlanBateman.

WDYT?

@franz1981
Owner

franz1981 commented Oct 12, 2025

Answers inline

To be frank, I don’t really like the design of GlobalDelegateThreadNettyScheduler

I know and agree with you: tbh, having both inheritance and attachment back would have made all of this more "natural", without external data structures.
But there is a nice JMH module, and if we can build a benchmark which proves this to be a bottleneck, we have something to give as feedback on these choices IMO.

Also, it’s quite hard for me to distinguish whether a virtual thread belongs to a poller.

Agree on this as well: exposing the type of vThread in an enum when the submit happens would help, or being aware whether it is pinned, or whatever other hint..

@dreamlike-ocean
Contributor Author

Added a benchmark comparing GlobalDelegateThreadNettyScheduler with the default behavior of inheriting the scheduler from the parent thread.

My hardware:
13th Gen Intel(R) Core(TM) i5-13600KF

Linux dreamlike-MS-7E01 6.14.0-33-generic #33-Ubuntu SMP PREEMPT_DYNAMIC Wed Sep 17 23:22:02 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux

loom build from openjdk/loom@50be4eb

Benchmark                                                      (tasks)   Mode  Cnt  Score   Error   Units
GlobalDelegateThreadNettySchedulerBenchmark.global                1000  thrpt   10  4.833 ± 0.088  ops/ms
GlobalDelegateThreadNettySchedulerBenchmark.global              100000  thrpt   10  0.049 ± 0.003  ops/ms
GlobalDelegateThreadNettySchedulerBenchmark.inheritFromParent     1000  thrpt   10  5.854 ± 0.133  ops/ms
GlobalDelegateThreadNettySchedulerBenchmark.inheritFromParent   100000  thrpt   10  0.061 ± 0.002  ops/ms

@dreamlike-ocean dreamlike-ocean marked this pull request as ready for review October 13, 2025 17:29
@OutputTimeUnit(TimeUnit.NANOSECONDS)
@BenchmarkMode(Mode.AverageTime)
@Fork(value = 2, jvmArgs = {"--add-opens=java.base/java.lang=ALL-UNNAMED", "-XX:+UnlockExperimentalVMOptions"})
public class GetScheduler {
Owner


Can you add a few lines to describe the purpose of this benchmark? 🙏

Contributor Author


I’d like to compare the performance difference between the new Thread.VirtualThreadScheduler.current() API and directly accessing the scheduler.
The new API appears to be slightly more complex, so I’d like to see whether it introduces any performance overhead.

vtFactory.newThread(
      () -> {
         for (int i = 0; i < tasks; i++) {
            Thread.startVirtualThread(countDown::countDown);
Owner


Thread.startVirtualThread should:

  1. use determineScheduler and VirtualThreadNettyScheduler::current (querying the scoped value)
  2. update the CHM in the global scheduler
  3. allocate a lambda to handle removing the vthread once completed
  4. submit the lambda to the right VirtualThreadNettyScheduler

To me it seems that the main relevant differences from what we have are:

  • allocation rate (can be measured with -prof gc)
  • number of atomic operations (the CHM::get but most importantly CHM::put)

Which makes me think that maybe we should have a special MPSC queue in VirtualThreadNettyScheduler which is drained round-robin with the existing one and always automatically checks the thread state, removing the vthread from the CHM if completed.
This would save allocating the tiny wrapper lambda around the vthread continuation and likely an additional type (the lambda has a different type from the Continuation).
Clearly it means that VirtualThreadNettyScheduler should:

  • have a new executeFromGlobal (better named: I'm bad w names!) to submit via this new MPSC queue
  • a reference to the global executor instance to remove the vthread continuation once completed

This can be done later too, eh: just brainstorming on how this could differ from the existing behaviour.
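The round-robin drain idea could be sketched roughly like this. A minimal, single-threaded sketch only: `executeFromGlobal` is the hypothetical name from the comment above, `ConcurrentLinkedQueue` stands in for a proper MPSC queue (e.g. from JCTools), and the TERMINATED check plus CHM removal is only indicated by a comment since it needs the real scheduler's state.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

public class RoundRobinDrainSketch {
   // stand-in for the scheduler's existing task queue
   private final Queue<Runnable> taskQueue = new ConcurrentLinkedQueue<>();
   // the proposed extra queue fed by the global scheduler with raw continuations
   private final Queue<Runnable> fromGlobalQueue = new ConcurrentLinkedQueue<>();

   public void execute(Runnable task) {
      taskQueue.offer(task);
   }

   // hypothetical name, as suggested in the comment above
   public void executeFromGlobal(Runnable continuation) {
      fromGlobalQueue.offer(continuation);
   }

   // drains both queues round-robin, one element from each per iteration
   public int drain() {
      int drained = 0;
      boolean progress = true;
      while (progress) {
         progress = false;
         Runnable task = taskQueue.poll();
         if (task != null) {
            task.run();
            drained++;
            progress = true;
         }
         Runnable continuation = fromGlobalQueue.poll();
         if (continuation != null) {
            continuation.run();
            // here the real scheduler would check the vthread state and,
            // if TERMINATED, remove it from the global CHM
            drained++;
            progress = true;
         }
      }
      return drained;
   }

   public static void main(String[] args) {
      RoundRobinDrainSketch s = new RoundRobinDrainSketch();
      StringBuilder order = new StringBuilder();
      s.execute(() -> order.append("a"));
      s.execute(() -> order.append("b"));
      s.executeFromGlobal(() -> order.append("X"));
      System.out.println(s.drain() + ":" + order); // prints 3:aXb
   }
}
```

The point of the interleaving is that continuations handed over by the global scheduler cannot starve the event loop's own tasks, and vice versa.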

@Fork(value = 2, jvmArgs = {"--add-opens=java.base/java.lang=ALL-UNNAMED", "-XX:+UnlockExperimentalVMOptions",
"-XX:-DoJVMTIVirtualThreadTransitions", "-Djdk.trackAllThreads=false", "-Djdk.virtualThreadScheduler.implClass=io.netty.loom.GlobalDelegateThreadNettyScheduler"})
@State(Scope.Thread)
public class GlobalDelegateThreadNettySchedulerBenchmark {
Owner


Another interesting benchmark could be to measure the difference in cost for virtual threads which are NOT managed by Netty, when the GlobalDelegateThreadNettyScheduler is used vs. not used.

@franz1981 franz1981 added the enhancement New feature or request label Oct 14, 2025
@franz1981
Owner

I'll give this a shot on my machine as well @dreamlike-ocean 🙏
It will take a few days, but this is already a very good job, and I plan to merge it.

@franz1981
Owner

franz1981 commented Oct 15, 2025

@dreamlike-ocean I will soon send a PR to your branch with some changes (if you agree) so we can proceed on this: I'm currently adding more fine-grained tests to shut down/verify pollers based on the poller mode, etc.

@dreamlike-ocean
Contributor Author

@dreamlike-ocean I will soon send a PR to your branch with some changes (if you agree) so we can proceed on this: I'm currently adding more fine-grained tests to shut down/verify pollers based on the poller mode, etc.

You can submit code directly to my branch — there should be no protection on it. I’d be very happy to collaborate with you to improve this work.

@franz1981
Owner

@dreamlike-ocean I added very few comments and split the tests, really 🙏
Check if they look ok; I'll still bang my head on this a bit more, but regardless we're ready to go.

I'll add as a follow-up:

  • some additional benchmarks
  • eventually optimizations (if I can find any relevant ones)
  • test coverage for coexistence with "alien" custom schedulers, to make sure they behave well (they should, looking at the code)

@franz1981
Owner

franz1981 commented Oct 15, 2025

@dreamlike-ocean
I'm currently playing with this idea...

   @Override
   public void execute(Thread vthread, Runnable task) {
      // we can have 3 types of executions:
      // 1. vthreads which belong to some Netty scheduler
      // 2. vthreads which belong to the JDK built-in scheduler
      // 3. vthreads which belong to some unknown scheduler
      // We want to preserve the inheritance of the scheduler just for vthreads spawned from Netty vthreads.
      // For all other vthreads, we simply use the JDK built-in scheduler
      // It means we are not forced to track ALL vthreads, including the ones running on the JDK built-in scheduler
      // or from unknown schedulers.
      // We can just fail to recognize them and perform an informed decision at scheduling time.
      VirtualThreadNettyScheduler mappedScheduler = inheritedNettyVthreads.get(vthread);
      if (mappedScheduler != null) {
         mappedScheduler.execute(vthread, () -> {
            try {
               task.run();
            } finally {
               if (vthread.getState() == Thread.State.TERMINATED) {
                  inheritedNettyVthreads.remove(vthread);
               }
            }
         });
         return;
      }
      Thread.VirtualThreadScheduler scheduler = determineScheduler();
      if (scheduler == jdkBuildinScheduler) {
         // we don't track vthreads running on the JDK built-in scheduler
         scheduler.execute(vthread, task);
      } else if (scheduler instanceof VirtualThreadNettyScheduler nettyScheduler) {
         inheritedNettyVthreads.put(vthread, nettyScheduler);
         nettyScheduler.execute(vthread, () -> {
            try {
               task.run();
            } finally {
               if (vthread.getState() == Thread.State.TERMINATED) {
                  inheritedNettyVthreads.remove(vthread);
               }
            }
         });
      } else {
         scheduler.execute(vthread, task);
      }
   }

Which, if it works, will enable me to find a slightly different way to perform the mapping for the inherited-only vthreads (maybe using just primitive values), but as usual, maybe I'm making some terrible logic mistake, so I'm taking some time to think about whether it is just wrong.

I see some io.netty.loom.MultithreadVirtualEventExecutorGroupTest#testVirtualThreadSpawnsVirtualThreads() stack overflow errors on this (meaning you already tried?)

and can be fixed by

    private Thread.VirtualThreadScheduler determineScheduler() {
        Thread callerThread = Thread.currentThread();
        // platform thread
        if (!callerThread.isVirtual()) {
            return jdkBuildinScheduler;
        }
        VirtualThreadNettyScheduler current = VirtualThreadNettyScheduler.current();
        // The current thread was spawned from a specific VirtualThreadNettyScheduler,
        // so we continue using that scheduler.
        if (current != null) {
            return current;
        }
        Thread.VirtualThreadScheduler parentScheduler = inheritedNettyVthreads.get(callerThread);
        if (parentScheduler != null) {
            return parentScheduler;
        }

        // The current thread was spawned from an unknown scheduler that is not managed by GlobalDelegateThreadNettyScheduler,
        // so we directly use the parent’s scheduler instead to avoid potential stack overflow.
        var currentScheduler = Thread.VirtualThreadScheduler.current();
        if (currentScheduler == this) {
            // if the caller thread is not known we assume it won't benefit from inheriting, so its children
            // will use the JDK built-in scheduler
            return jdkBuildinScheduler;
        }
        return currentScheduler;
    }

@franz1981
Owner

To be precise on determineScheduler at #27 (comment):
we don't need to query Thread.VirtualThreadScheduler.current(), but can just return the built-in one: if a virtual thread is neither tracked nor belongs to anything managed by the global scheduler, we could just enforce the default policy, i.e. not make it inherit the parent scheduler. WDYT?
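The simplification suggested here would shrink determineScheduler to something like the sketch below. A sketch only, not the PR's actual code: VirtualThreadNettyScheduler, inheritedNettyVthreads, and jdkBuildinScheduler are the PR's own types/fields, and the only change is that the untracked case falls straight through to the built-in scheduler instead of querying Thread.VirtualThreadScheduler.current().

```java
private Thread.VirtualThreadScheduler determineScheduler() {
    Thread callerThread = Thread.currentThread();
    // platform threads never inherit a Netty scheduler
    if (!callerThread.isVirtual()) {
        return jdkBuildinScheduler;
    }
    VirtualThreadNettyScheduler current = VirtualThreadNettyScheduler.current();
    if (current != null) {
        return current;
    }
    Thread.VirtualThreadScheduler inherited = inheritedNettyVthreads.get(callerThread);
    if (inherited != null) {
        return inherited;
    }
    // untracked and not Netty-managed: enforce the default policy,
    // i.e. do NOT inherit the parent scheduler
    return jdkBuildinScheduler;
}
```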

@franz1981
Owner

I am still not convinced by my proposal at #27 (comment) and I would like to write a test which:

  • creates a virtual thread from the test harness thread (which is a normal platform thread) waiting on a countdown latch (it should not be assigned to run on the Netty scheduler)
  • from the Netty virtual thread (or an inherited child of it), counts down the latch, unblocking the awaiting one
  • verifies that the unblocked vthread doesn't run on the Netty scheduler

This test should stress the case of submitting a vThread to the global scheduler from a virtual thread which is assigned to the Netty scheduler.
According to the code I have modified, this test should fail. I will check tomorrow.
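The latch choreography in that outline can be sketched with plain JDK virtual threads. This is a sketch only: it shows just the park/unpark choreography, while the real test would create the unparking thread via the PR's Netty scheduler and assert on the carrier, which needs the PR's classes and is omitted here.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicBoolean;

public class UnparkChoreographySketch {
   public static void main(String[] args) throws InterruptedException {
      CountDownLatch latch = new CountDownLatch(1);
      AtomicBoolean unblocked = new AtomicBoolean();
      // 1. vthread started from the (platform) test harness thread, parked on the latch:
      //    it should NOT be assigned to the Netty scheduler
      Thread outside = Thread.startVirtualThread(() -> {
         try {
            latch.await(); // parks here; the countDown below reschedules (unparks) it
            // in the real test: assert the carrier is NOT a Netty event loop thread
            unblocked.set(true);
         } catch (InterruptedException ignored) {
         }
      });
      // 2. stand-in for the Netty-scheduled vthread doing the unpark (in the real test
      //    it would be spawned through the Netty scheduler's thread factory)
      Thread nettyLike = Thread.startVirtualThread(latch::countDown);
      outside.join();
      nettyLike.join();
      // 3. the unblocked vthread completed; the real assertion on its scheduler goes here
      System.out.println("unblocked=" + unblocked.get()); // prints unblocked=true
   }
}
```

The interesting part is step 2: the unpark happens from a thread bound to another scheduler, which is exactly the caller-sensitivity the proposal above may get wrong.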

@franz1981
Owner

franz1981 commented Oct 16, 2025

The concerns re my proposed changes at #27 (comment) are valid: I'm adding this test and merging this.
Any improvement (if possible) will be made as a follow-up.
The "issue" behind the approach I've proposed is that if the scheduling is caller-sensitive and a v thread is unparked from a "known" Netty scheduler, will assume it has to enforce "scheduler inheritance" whilst not required: the sole valid state where we need to consider scheduling inheritance is when a vThread is starting, but AFAIK this information is not available.

This last point makes me think that if the existing VirtualThreadScheduler API were

public void execute(Thread vThread, SchedulingReason reason, Runnable task)

with SchedulingReason

STARTING, UNPARK, TIMED_UNPARK, ... // which mimic the expected previous Thread::state?

the approach I've suggested could work because:

  1. we perform the decision to inherit scheduling only if reason == STARTING, which HAS to happen for ALL vthreads
  2. if reason != STARTING: if it is a known Netty-scheduled vthread we will run it there, and if not we will just run it on the built-in scheduler (because, as said in build GlobalDelegateThreadNettyScheduler #27 (comment), we won't allow alien non-Netty custom schedulers to benefit from scheduler inheritance)

Thanks to this, we could reduce the surface of tracked vthreads in the global scheduler (tracking already performed by the default ThreadContainer within Hotspot) and optimize the queries for previous scheduling decisions, because we can modify them only on STARTING.

And furthermore, the global scheduling code can be much lighter too e.g.

   @Override
   public void execute(Thread vthread, SchedulingReason reason, Runnable task) {
      if (reason == STARTING) {
         // we could early stop here and run on the built-in scheduler if the caller thread is not virtual
         VirtualThreadNettyScheduler nettyScheduler = VirtualThreadNettyScheduler.current();
         // is the starter of vthread bound to run on a specific Netty scheduler?
         if (nettyScheduler == null) {
            // has the starter of vthread inherited a specific Netty scheduler?
            nettyScheduler = inheritedNettyVthreads.get(Thread.currentThread());
         }
         if (nettyScheduler != null) {
            // remember this vthread once it's rescheduled
            inheritedNettyVthreads.put(vthread, nettyScheduler);
            nettyScheduler.execute(vthread, reason, () -> {
               try {
                  task.run();
               } finally {
                  if (vthread.getState() == Thread.State.TERMINATED) {
                     inheritedNettyVthreads.remove(vthread);
                  }
               }
            });
            return;
         }
         jdkBuildinScheduler.execute(vthread, reason, task);
         return;
      }
      // it is not the first time we saw it OR we're not interested in it
      VirtualThreadNettyScheduler mappedScheduler = inheritedNettyVthreads.get(vthread);
      if (mappedScheduler != null) {
         mappedScheduler.execute(vthread, reason, () -> {
            try {
               task.run();
            } finally {
               if (vthread.getState() == Thread.State.TERMINATED) {
                  inheritedNettyVthreads.remove(vthread);
               }
            }
         });
         return;
      }
      jdkBuildinScheduler.execute(vthread, reason, task);
   }

NOTE (for myself of the future):
The approach suggested at #27 (comment) cannot be fixed without the mentioned VirtualThreadScheduler API change, even if we check STATE != TERMINATED while executing on the built-in scheduler, because this case can happen:

  1. on Thread.OfVirtual::start the vThread is scheduled on the built-in scheduler
  2. once the built-in carrier executes it and the scheduling completes with Thread::getState() != TERMINATED, we are ready to add the vThread to a CHM of builtInRunnableVThreads, e.g. if the vThread is parked/descheduled
  3. the same vThread is unparked by a Netty vThread before builtInRunnableVThreads is modified
  4. the vThread is scheduled wrong :"(

@franz1981
Owner

franz1981 commented Oct 16, 2025

@dreamlike-ocean with 2d1a9fb I've added a few tests, including some for the per-carrier (sub)poller,

which seem to fail if I run the tests with

-Djdk.virtualThreadScheduler.implClass=io.netty.loom.GlobalDelegateThreadNettyScheduler 
-Djdk.traceVirtualThreadLocals=false 
--add-opens=java.base/java.lang=ALL-UNNAMED 
-Djdk.pollerMode=3

It looks like there's a problem on terminating the per-carrier read sub-poller in testPerCarrierSubpollersInheritingNettyScheduler:

java.util.concurrent.RejectedExecutionException: event loop is shutting down
	at io.netty.loom.VirtualThreadNettyScheduler.execute(VirtualThreadNettyScheduler.java:190)
	at io.netty.loom.GlobalDelegateThreadNettyScheduler.execute(GlobalDelegateThreadNettyScheduler.java:40)
	at java.base/java.lang.VirtualThread.submitRunContinuation(VirtualThread.java:373)
	at java.base/java.lang.VirtualThread.submitRunContinuation(VirtualThread.java:397)
	at java.base/java.lang.VirtualThread.unpark(VirtualThread.java:894)
	at java.base/java.lang.System$1.unparkVirtualThread(System.java:2304)
	at java.base/java.util.concurrent.locks.LockSupport.unpark(LockSupport.java:181)
	at java.base/sun.nio.ch.Poller.polled(Poller.java:218)
	at java.base/sun.nio.ch.EPollPoller.poll(EPollPoller.java:148)
	at java.base/sun.nio.ch.Poller.pollerLoop(Poller.java:306)
	at java.base/java.lang.Thread.run(Thread.java:1584)
	at java.base/jdk.internal.misc.InnocuousThread.run(InnocuousThread.java:148)

org.opentest4j.AssertionFailedError: 
Expected :true
Actual   :false
<Click to see difference>


	at org.junit.jupiter.api.AssertionFailureBuilder.build(AssertionFailureBuilder.java:151)
	at org.junit.jupiter.api.AssertionFailureBuilder.buildAndThrow(AssertionFailureBuilder.java:132)
	at org.junit.jupiter.api.AssertTrue.failNotTrue(AssertTrue.java:63)
	at org.junit.jupiter.api.AssertTrue.assertTrue(AssertTrue.java:36)
	at org.junit.jupiter.api.AssertTrue.assertTrue(AssertTrue.java:31)
	at org.junit.jupiter.api.Assertions.assertTrue(Assertions.java:183)
	at io.netty.loom.MultithreadVirtualEventExecutorGroupTest.assertContainsJustBuiltinPollers(MultithreadVirtualEventExecutorGroupTest.java:580)
	at io.netty.loom.MultithreadVirtualEventExecutorGroupTest.testPerCarrierSubpollersOnNettyScheduler(MultithreadVirtualEventExecutorGroupTest.java:521)
	at io.netty.loom.MultithreadVirtualEventExecutorGroupTest.testPerCarrierSubpollersInheritingNettyScheduler(MultithreadVirtualEventExecutorGroupTest.java:458)
	at java.base/java.lang.reflect.Method.invoke(Method.java:565)
	at java.base/java.util.ArrayList.forEach(ArrayList.java:1604)
	at java.base/java.util.ArrayList.forEach(ArrayList.java:1604)

which means that the per-carrier inherited subpoller bound to the event loop carrier is not able to complete/run and be removed by the vthread managed/mapped by the global scheduler.
Even removing the check to prevent rejecting it won't fix it: there's something related to the termination of VirtualThreadNettyScheduler that needs to be fixed.
This seems to be another form of the problem reported by #16.
The same problem won't happen with pollerMode != 3.

I can merge this if you agree on the changes, or we can try fixing the failure(s) before merging; let me know what you prefer @dreamlike-ocean!

@dreamlike-ocean
Contributor Author

Please merge it! Thanks!

@dreamlike-ocean
Contributor Author

I checked the related code, and I believe that when pollerMode = 3, although this exception occurs, it will not cause a resource leak: when our eventLoop shuts down, the thread will also exit.

@franz1981
Owner

franz1981 commented Oct 16, 2025

Yep @dreamlike-ocean, the leak is not due to what Hotspot does; rather, the existing termination logic of the virtual thread Netty scheduler is not correct.
As said, this is not a regression introduced by your changes but was likely present before.

@franz1981 franz1981 merged commit 248119f into franz1981:master Oct 16, 2025
@franz1981
Owner

Thanks @dreamlike-ocean 🙏🙏
