Conversation

@Eideren
Collaborator

@Eideren Eideren commented Nov 28, 2025

PR Details

I tested these changes on a private, non-synthetic scene containing around 20k SyncScript instances of different types and priorities.
Out of the box, ScriptSystem.Update takes ~9.5ms (min 9.0, max 10.4).
With this change, it takes ~4.7ms (min 4.6, max 5.25).

Profile capture: https://share.firefox.dev/49LWPGK

Previously, scheduling any script cost O(log N), for a total of O(N log N). Now the O(log N) cost is paid only once per unique priority scheduled; every other scheduling operation is O(1).

The implementation uses a Deque per priority, which provides ordered insertion at the start and end, and removal from the start.
A PriorityQueue then sorts these Deques by priority so that the most prioritized queue is consumed first.
Finally, a Dictionary provides O(1) lookup of the Deque for a particular priority.
A couple of minor details further amortize the cost of all of this.
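
To illustrate the structure described above, here is a minimal sketch. It is not the code from this PR: the class name BucketScheduler and its members are hypothetical, and LinkedList<T> stands in for the Deque so that the example only depends on the .NET base class library (PriorityQueue requires .NET 6 or later). Lower priority values run first, as in the scheduler.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch of per-priority bucketing; not Stride's Scheduler.
public sealed class BucketScheduler
{
    // One bucket per distinct priority. LinkedList<T> stands in for Deque<T>:
    // O(1) insertion at both ends and O(1) removal from the front.
    private readonly Dictionary<long, LinkedList<Action>> buckets = new();

    // Orders the buckets by priority; only touched when a new priority appears.
    private readonly PriorityQueue<LinkedList<Action>, long> order = new();

    public void Schedule(Action action, long priority, bool atFront = false)
    {
        if (!buckets.TryGetValue(priority, out var bucket))
        {
            bucket = new LinkedList<Action>();
            buckets.Add(priority, bucket);
            order.Enqueue(bucket, priority); // O(log P), P = number of distinct priorities
        }

        if (atFront) bucket.AddFirst(action); // O(1)
        else bucket.AddLast(action);          // O(1)
    }

    public void Run()
    {
        while (order.TryPeek(out var bucket, out var priority))
        {
            if (bucket.Count == 0)
            {
                // Bucket drained: drop it from both the heap and the dictionary.
                order.Dequeue();
                buckets.Remove(priority);
                continue;
            }

            var action = bucket.First!.Value;
            bucket.RemoveFirst();
            action(); // may call Schedule again, so the heap is re-peeked every iteration
        }
    }
}
```

With N scripts spread over P distinct priorities, this sketch pays O(log P) only when a bucket is created and O(1) for every other Schedule call. Unlike the real scheduler, it allocates a list node per entry and keeps running until every bucket is empty, so it has no notion of a frame boundary.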

Related Issue

fix: #2942

Types of changes

  • Docs change / refactoring / dependency upgrade
  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)

Checklist

  • My change requires a change to the documentation.
  • I have added tests to cover my changes. - there already are tests covering these areas
  • All new and existing tests passed.
  • I have built and run the editor to try this change out.

@Eideren Eideren added the area-Core Issue of the engine unrelated to other defined areas label Nov 28, 2025

      /// Either a microthread or an action with priority.
      /// </summary>
    - internal struct SchedulerEntry : IComparable<SchedulerEntry>
    + internal class SchedulerEntry

Contributor

I haven't tested this, but just out of curiosity:
How does this change from struct to class fare from the garbage/GC standpoint? In your extreme sample scene with many scripts, doesn't this result in thousands of SchedulerEntry instances?

Collaborator Author

It does, yes. The previous implementation wrapped this struct inside a priority node, so the per-instance memory overhead is about the same in both implementations. In the new implementation, entries are not pooled, so they will be collected once the script or microthread is collected. If creating a script or microthread did not otherwise allocate, I would worry, but given that it does, this is just one more allocation on top of the base one, which is not nearly as problematic.

Right now it was swapped to a class so that it can be marked as unscheduled once it runs, which is required to skip an unnecessary lookup when unscheduling it outside of the scheduler loop. We could do without that and bear one binary search on every SyncScript removal, or build additional logic to conditionally skip the lookup when unscheduling after a scheduler run, but that's a bit too fragile. I'll check on Monday whether there's anything else I could do on that side.

@Eideren
Collaborator Author

Eideren commented Dec 2, 2025

I looked into making SchedulerEntry a struct; I'm not sure it's worth it, given that we would have to pay for an additional hash set lookup whenever we schedule, process or unschedule one. We're somewhat limited in what we can set up, given that the Scheduler is re-entrant and can schedule actions while processing scheduled actions.
You can see the changes required for that over here: cac41bb
I'll move back to using a pool instead.
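
For readers unfamiliar with the pooling approach mentioned here, a minimal sketch follows. The types PooledEntry and EntryPool and their members are hypothetical and do not reflect the pool implemented in this PR; the sketch only shows how a class instance can carry a "scheduled" flag and still avoid a per-schedule allocation.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical entry type; field names are illustrative, not Stride's SchedulerEntry.
public sealed class PooledEntry
{
    public Action? Action;
    public bool IsScheduled; // lets callers skip a lookup when the entry already ran
}

// Hypothetical single-threaded pool backed by a stack of free entries.
public sealed class EntryPool
{
    private readonly Stack<PooledEntry> free = new();

    public PooledEntry Rent(Action action)
    {
        // Reuse a free entry when one is available, otherwise allocate.
        var entry = free.Count > 0 ? free.Pop() : new PooledEntry();
        entry.Action = action;
        entry.IsScheduled = true;
        return entry;
    }

    public void Return(PooledEntry entry)
    {
        // Clear references so the pool does not keep scripts alive.
        entry.Action = null;
        entry.IsScheduled = false;
        free.Push(entry);
    }
}
```

The trade-off with any such pool is that a caller must not keep using an entry after returning it, since the same instance may already have been handed out again.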

@Eideren Eideren marked this pull request as ready for review December 2, 2025 11:38
@Eideren
Collaborator Author

Eideren commented Dec 2, 2025

This PR is ready for review

@xen2
Member

xen2 commented Dec 3, 2025

LGTM!

Contributor

Copilot AI left a comment

Pull request overview

This PR implements a significant performance optimization for the ScriptSystem: the O(log N) scheduling cost is now paid once per unique priority rather than once per script, which previously added up to O(N log N). In a test scene with ~20k SyncScript instances, ScriptSystem.Update execution time improved from ~9.5ms to ~4.7ms (approximately 2x faster).

Key Changes:

  • Replaced per-script priority queue scheduling with batched execution grouped by priority
  • Introduced pooling for SchedulerEntry and ExecutionQueue objects to reduce GC pressure
  • Refactored SchedulerEntry from struct to class to support the new pooling architecture

Reviewed changes

Copilot reviewed 10 out of 10 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| sources/engine/Stride.Engine/Engine/SyncScript.cs | Added ScriptSystem reference and priority change tracking via PriorityUpdated() override |
| sources/engine/Stride.Engine/Engine/StartupScript.cs | Changed StartSchedulerNode from PriorityQueueNode to nullable SchedulerEntry for pooling |
| sources/engine/Stride.Engine/Engine/Processors/ScriptSystem.cs | Core refactoring: batches sync scripts by priority, implements object pooling, new batch execution method ExecuteBatchOfSyncScripts |
| sources/core/Stride.Core/Collections/Dequeue.cs | Made public, moved to Collections namespace, added power-of-2 capacity requirement, optimized with bitwise mask operations, added BinarySearch method |
| sources/core/Stride.Core.Tests/TestDeque.cs | New comprehensive test suite for Deque functionality including wrapping, range insertion, and binary search |
| sources/core/Stride.Core.MicroThreading/SchedulerEntry.cs | Converted from struct to class, removed priority/counter fields (moved to queue level), added queue tracking fields |
| sources/core/Stride.Core.MicroThreading/Scheduler.cs | Complete scheduler rewrite: priority-based bucketing with Deque per priority, object pooling for entries and buckets, new locking strategy |
| sources/core/Stride.Core.MicroThreading/MicroThread.cs | Simplified priority handling, removed internal Reschedule method, delegates to Scheduler.Reschedule |
| sources/core/Stride.Core.MicroThreading/MicroThreadYieldAwaiter.cs | Updated to use new HasNoEntriesScheduled() method instead of direct queue access |
| sources/core/Stride.Core.Design/Threading/Internal/DefaultAsyncWaitQueue.cs | Added import for Stride.Core.Collections namespace (Deque moved) |

Comments suppressed due to low confidence (1)

sources/core/Stride.Core/Collections/Dequeue.cs:540

  • The comment refers to the non-existent method DequeIndexToBufferIndex. This method was renamed to DequeIndexToBufferRef. Please update the comment to reflect the current method name.
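
As background for the index mapping that this comment and the file summary refer to, the following is a minimal sketch of how a power-of-two capacity lets a circular buffer wrap indices with a bitwise mask instead of a modulo. The RingDeque type and its ToBufferIndex helper are hypothetical and are not the Stride Deque or its DequeIndexToBufferRef method.

```csharp
using System;

// Hypothetical minimal ring buffer; not the Stride Deque<T>.
public sealed class RingDeque<T>
{
    private T[] buffer;
    private int head; // buffer index of logical element 0

    public int Count { get; private set; }

    public RingDeque(int capacity = 8)
    {
        // A power of two has exactly one bit set, so (capacity & (capacity - 1)) is 0.
        if (capacity < 1 || (capacity & (capacity - 1)) != 0)
            throw new ArgumentException("Capacity must be a power of two.", nameof(capacity));
        buffer = new T[capacity];
    }

    // With a power-of-two length, "& (Length - 1)" is equivalent to "% Length"
    // and also wraps negative offsets correctly.
    private int ToBufferIndex(int logicalIndex) => (head + logicalIndex) & (buffer.Length - 1);

    public T this[int index]
    {
        get
        {
            if ((uint)index >= (uint)Count) throw new ArgumentOutOfRangeException(nameof(index));
            return buffer[ToBufferIndex(index)];
        }
    }

    public void AddToBack(T value)
    {
        GrowIfFull();
        buffer[ToBufferIndex(Count)] = value;
        Count++;
    }

    public void AddToFront(T value)
    {
        GrowIfFull();
        head = ToBufferIndex(-1); // steps back one slot, wrapping from 0 to Length - 1
        buffer[head] = value;
        Count++;
    }

    public T RemoveFromFront()
    {
        if (Count == 0) throw new InvalidOperationException("The deque is empty.");
        var value = buffer[head];
        buffer[head] = default!;
        head = ToBufferIndex(1);
        Count--;
        return value;
    }

    private void GrowIfFull()
    {
        if (Count < buffer.Length) return;
        // Doubling keeps the capacity a power of two; elements are copied in logical order.
        var bigger = new T[buffer.Length * 2];
        for (int i = 0; i < Count; i++) bigger[i] = buffer[ToBufferIndex(i)];
        buffer = bigger;
        head = 0;
    }
}
```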

Member

@Kryptos-FR Kryptos-FR left a comment

LGTM. If possible improve the documentation on some types.

@VaclavElias
Contributor

@Kryptos-FR, are you testing Copilot? Would you like Copilot to create the XML documentation?

@Kryptos-FR
Member

> @Kryptos-FR, are you testing Copilot? Would you like Copilot to create the XML documentation?

I was trying to have it make a review but apparently I can't. Thanks for doing it.

My other review comments were directed at the original author but if copilot can do it, why not?

@VaclavElias
Contributor

This is how I do it: click the icon at the top right to request a review and select Copilot. You can certainly request a review as well, but can you also see Copilot there?

@VaclavElias
Contributor

We need to update these instructions, https://github.com/stride3d/stride/blob/master/.github/copilot-instructions.md, so that Copilot always suggests XML docs for public APIs when they are missing or need to be updated or corrected.

@VaclavElias
Contributor

Copilot can't do it on an existing PR at the moment, so I am listing it below. @Eideren, use any of these if they are helpful.

    /// <summary>
    /// Base class for scripts that are updated synchronously every frame on the main thread.
    /// </summary>
    /// <remarks>
    /// Sync scripts are batched and executed in priority order during <see cref="Stride.Engine.Processors.ScriptSystem.Update(Stride.Games.GameTime)"/>.
    /// </remarks>
    public abstract class SyncScript : StartupScript
    {
        /// <summary>
        /// Called every frame to update the script.
        /// </summary>
        public abstract void Update();

        /// <summary>
        /// Notifies the system that the script priority has changed and triggers rescheduling within the update batch.
        /// </summary>
        /// <remarks>
        /// This ensures the script moves to the correct priority bucket for synchronous updates.
        /// </remarks>
        protected internal override void PriorityUpdated();
    }
    /// <summary>
    /// Base class for scripts that run a startup action before their update loop begins.
    /// </summary>
    public abstract class StartupScript : ScriptComponent
    {
        /// <summary>
        /// Called before the script enters its update loop.
        /// </summary>
        /// <remarks>
        /// This method is scheduled by <see cref="Stride.Engine.Processors.ScriptSystem"/> based on the script’s priority.
        /// </remarks>
        public abstract void Start();

        /// <summary>
        /// Notifies the system that the script priority has changed and should be rescheduled if necessary.
        /// </summary>
        /// <remarks>
        /// Scripts changing their priority should call or rely on this to update scheduling buckets accordingly.
        /// </remarks>
        protected internal virtual void PriorityUpdated() { /* default implementation in base */ }
    }
    /// <summary>
    /// Manages the lifecycle and execution of scripts in a game: startup, synchronous updates, and asynchronous micro threads.
    /// </summary>
    public sealed class ScriptSystem : GameSystemBase
    {
        /// <summary>
        /// Initializes a new instance of the <see cref="ScriptSystem"/> class.
        /// </summary>
        /// <param name="registry">The service registry.</param>
        public ScriptSystem(IServiceRegistry registry);

        /// <summary>
        /// Gets the scheduler used to run micro threads and scheduled script actions.
        /// </summary>
        public Scheduler Scheduler { get; }

        /// <summary>
        /// Schedules all pending script starts, batches synchronous script updates by priority,
        /// and runs the scheduler for the current frame.
        /// </summary>
        /// <param name="gameTime">Time information for the current frame.</param>
        /// <remarks>
        /// - Startup scripts are scheduled according to their priority.
        /// - Synchronous scripts are grouped by priority and executed in batches to reduce scheduling overhead.
        /// - Asynchronous scripts run on micro threads via the scheduler.
        /// </remarks>
        public override void Update(GameTime gameTime);

        /// <summary>
        /// Returns an awaiter that completes on the next frame boundary.
        /// </summary>
        /// <returns>An awaiter to resume on the next frame.</returns>
        public ChannelMicroThreadAwaiter<int> NextFrame();

        /// <summary>
        /// Adds a new micro thread to the scheduler.
        /// </summary>
        /// <param name="microThreadFunction">The micro thread function to execute.</param>
        /// <param name="priority">Lower values run the associated micro thread sooner.</param>
        /// <returns>The created and scheduled <see cref="MicroThread"/>.</returns>
        public MicroThread AddTask(Func<Task> microThreadFunction, long priority = 0);

        /// <summary>
        /// Registers a script for execution and schedules its startup and (if applicable) asynchronous execution.
        /// </summary>
        /// <param name="script">The script component to add.</param>
        /// <remarks>
        /// - Startup scripts are scheduled to run <see cref="StartupScript.Start"/> with their priority.
        /// - Async scripts create a micro thread for <c>Execute()</c>.
        /// - Sync scripts will be batched by priority and executed during <see cref="Update"/>.
        /// </remarks>
        public void Add(ScriptComponent script);

        /// <summary>
        /// Removes a script from execution and unschedules any pending actions associated with it.
        /// </summary>
        /// <param name="script">The script component to remove.</param>
        public void Remove(ScriptComponent script);

        /// <summary>
        /// Replaces a live script instance with a new instance and flags it for live reload startup execution.
        /// </summary>
        /// <param name="oldScript">The current script instance.</param>
        /// <param name="newScript">The new script instance.</param>
        public void LiveReload(ScriptComponent oldScript, ScriptComponent newScript);

        /// <summary>
        /// A bit that indicates the “Update” scheduling band for scripts.
        /// </summary>
        /// <remarks>
        /// Script priorities can be combined with this bit (by OR) to ensure sync updates are separated from other work.
        /// </remarks>
        internal const long UpdateBit = 1L << 32;
    }
    /// <summary>
    /// Coordinates the execution of micro threads and scheduled actions according to priority.
    /// </summary>
    /// <remarks>
    /// - Uses per-priority execution queues and a priority heap to select the next work item.
    /// - Provides channel-based frame synchronization and exception propagation options.
    /// </remarks>
    public class Scheduler : IDisposable
    {
        /// <summary>
        /// Executes scheduled work until the current frame’s queues are drained, yields frame channels, and finalizes completed micro threads.
        /// </summary>
        /// <remarks>
        /// Reentrant-safe. On reentry it will avoid re-pooling unused priority buckets until the outermost call returns.
        /// </remarks>
        public void Run();

        /// <summary>
        /// Creates a new micro thread associated with this scheduler.
        /// </summary>
        /// <returns>A new <see cref="MicroThread"/> that can be started and scheduled.</returns>
        public MicroThread Create();

        /// <summary>
        /// Returns a task that completes when all provided micro threads have completed.
        /// </summary>
        /// <param name="microThreads">The micro threads to wait for completion.</param>
        /// <returns>A task that completes when all micro threads have completed.</returns>
        public Task WhenAll(params MicroThread[] microThreads);

        // Note: Public events (MicroThreadStarted/Ended/CallbackStart/CallbackEnd) should retain their existing docs or add them if missing.

        // Internal methods and types introduced by this PR are not part of the public API surface and thus don’t need public XML docs.
    }
    public readonly struct MicroThreadYieldAwaiter
    {
        /// <summary>
        /// Gets a value indicating whether the awaiter has completed.
        /// </summary>
        /// <remarks>
        /// Returns <c>true</c> if the micro thread has finished or if the scheduler currently has no entries scheduled.
        /// </remarks>
        public bool IsCompleted { get; }
    }
    public class MicroThread
    {
        /// <summary>
        /// Gets or sets the scheduling priority of this micro thread.
        /// </summary>
        /// <remarks>
        /// - Lower values run earlier; higher values run later.
        /// - Changing priority while scheduled will reschedule the micro thread at the new priority.
        /// - Priority can be combined with system-specific flags (e.g. <see cref="Stride.Engine.Processors.ScriptSystem.UpdateBit"/>).
        /// </remarks>
        public long Priority { get; set; }

        /// <summary>
        /// Starts the micro thread by scheduling its function at the specified <paramref name="scheduleMode"/>.
        /// </summary>
        /// <param name="microThreadFunction">The micro thread function to run.</param>
        /// <param name="scheduleMode">
        /// Whether to schedule at the front or back of the priority queue for the current priority band.
        /// </param>
        /// <exception cref="ArgumentNullException">Thrown when <paramref name="microThreadFunction"/> is null.</exception>
        public void Start(Func<Task> microThreadFunction, ScheduleMode scheduleMode = ScheduleMode.Last);

        /// <summary>
        /// Runs the micro thread’s queued callbacks and yields appropriately within the scheduler.
        /// </summary>
        /// <remarks>
        /// This method is intended to be called by the scheduler and will process pending continuations.
        /// </remarks>
        public Task Run();

        // Other existing public members should retain or get XML docs if missing in source (not shown here).
    }
    /// <summary>
    /// A double-ended queue (deque), which provides O(1) indexed access, O(1) removals from the front and back,
    /// amortized O(1) insertions to the front and back, and O(N) insertions and removals anywhere else (with the
    /// operations getting slower as the index approaches the middle).
    /// </summary>
    /// <typeparam name="T">The type of elements contained in the deque.</typeparam>
    /// <remarks>
    /// - Capacity is always a power of two and grows by rounding up to the next power of two.
    /// - Indices are stable and contiguous from 0 to <see cref="Count"/> - 1.
    /// - The implementation uses a circular buffer and bit-masked index arithmetic for speed.
    /// </remarks>
    public sealed class Deque<T> : IList<T>, System.Collections.IList
    {
        /// <summary>
        /// Initializes a new instance of the <see cref="Deque{T}"/> class with the specified capacity.
        /// </summary>
        /// <param name="capacity">The initial capacity. Must be a power of two greater than <c>0</c>.</param>
        /// <exception cref="ArgumentOutOfRangeException">Thrown when <paramref name="capacity"/> is less than 1.</exception>
        /// <exception cref="InvalidOperationException">Thrown when <paramref name="capacity"/> is not a power of two.</exception>
        public Deque(int capacity) { /* existing code */ }

        /// <summary>
        /// Gets or sets the total number of elements the internal data structure can hold without resizing.
        /// </summary>
        /// <remarks>
        /// - Setting <see cref="Capacity"/> to a value less than <see cref="Count"/> throws.
        /// - The value must be a power of two; otherwise an <see cref="InvalidOperationException"/> is thrown.
        /// - When increased, existing elements are preserved in logical order.
        /// </remarks>
        /// <exception cref="InvalidOperationException">Thrown if the new capacity is not a power of two or less than <see cref="Count"/>.</exception>
        public int Capacity { get; set; }

        /// <summary>
        /// Gets the number of elements contained in the <see cref="Deque{T}"/>.
        /// </summary>
        public int Count { get; private set; }

        /// <summary>
        /// Performs a binary search for <paramref name="item"/> in the deque assuming elements are sorted according to <paramref name="comparer"/>.
        /// </summary>
        /// <param name="item">The item to locate.</param>
        /// <param name="comparer">
        /// The comparer used to order elements. If <c>null</c>, <see cref="Comparer{T}.Default"/> is used.
        /// </param>
        /// <returns>
        /// The zero-based index of <paramref name="item"/> in the deque if found; otherwise, a negative number that is the bitwise complement
        /// of the index of the next element that is larger than <paramref name="item"/> or, if there is no larger element, the bitwise complement
        /// of <see cref="Count"/>.
        /// </returns>
        /// <remarks>
        /// This method follows <see cref="Array.BinarySearch{T}(T[], T)"/> semantics, adapted for the deque’s split circular buffer.
        /// Elements must already be sorted by <paramref name="comparer"/>.
        /// </remarks>
        public int BinarySearch(T item, IComparer<T>? comparer = null);

        /// <summary>
        /// Adds an item to the end of the deque.
        /// </summary>
        /// <param name="value">The element to insert.</param>
        public void AddToBack(T value);

        /// <summary>
        /// Adds an item to the front of the deque.
        /// </summary>
        /// <param name="value">The element to insert.</param>
        public void AddToFront(T value);

        /// <summary>
        /// Removes and returns the item at the end of the deque.
        /// </summary>
        /// <returns>The former last element.</returns>
        /// <exception cref="InvalidOperationException">Thrown when the deque is empty.</exception>
        public T RemoveFromBack();

        /// <summary>
        /// Removes and returns the item at the front of the deque.
        /// </summary>
        /// <returns>The former first element.</returns>
        /// <exception cref="InvalidOperationException">Thrown when the deque is empty.</exception>
        public T RemoveFromFront();

        /// <summary>
        /// Inserts the elements of a collection into the deque at the specified <paramref name="index"/>.
        /// </summary>
        /// <param name="index">The zero-based index at which the new elements should be inserted.</param>
        /// <param name="collection">The collection whose elements should be inserted into the deque.</param>
        /// <exception cref="ArgumentOutOfRangeException">Thrown when <paramref name="index"/> is less than 0 or greater than <see cref="Count"/>.</exception>
        /// <remarks>
        /// - If needed, capacity grows to the next power of two that can fit <paramref name="collection"/> and existing elements.
        /// - Insertion near ends is optimized; insertion near middle is O(N).
        /// </remarks>
        public void InsertRange(int index, IEnumerable<T> collection);

        // Note: Other IList<T> members already have typical semantics; ensure they have or retain XML docs elsewhere if exposed.
    }
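
The bitwise-complement return value documented for BinarySearch above follows the same convention as the .NET List<T>.BinarySearch and Array.BinarySearch methods. As a small illustration, here is a sketch that uses List<long> rather than the Deque; the SortedInsert helper is hypothetical and only shows how a negative result is turned into an insertion index.

```csharp
using System.Collections.Generic;

// Hypothetical helper demonstrating the "negative result = ~insertionIndex" convention.
public static class SortedInsert
{
    public static int InsertSorted(List<long> sorted, long value)
    {
        int index = sorted.BinarySearch(value);

        // Not found: the result is the bitwise complement of the index of the
        // first larger element, or of Count when no element is larger.
        if (index < 0) index = ~index;

        sorted.Insert(index, value);
        return index;
    }
}
```

Applying ~ to a negative result gives the position at which the value can be inserted while keeping the list sorted.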

@Kryptos-FR
Member

> This is how I do it: click the icon at the top right to request a review and select Copilot. You can certainly request a review as well, but can you also see Copilot there?

I don't have that option. Copilot is not listed for me.

@Eideren
Collaborator Author

Eideren commented Dec 6, 2025

@VaclavElias Well, it somehow missed every single case that should actually be documented, aside from the binary search ... :D

@VaclavElias
Contributor

I guess that's all we can get from it today 🤣.

@Eideren Eideren merged commit 158f200 into stride3d:master Dec 7, 2025
9 checks passed
@Eideren Eideren deleted the script_system_perf branch December 7, 2025 10:52

Labels

area-Core Issue of the engine unrelated to other defined areas

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Script Prioritization has significant overhead

5 participants