Add NVIDIA Reflex #35678
As I've invested time and money into my Reflex Latency Analyzer setup, I'm hoping I can help out!
Testing method:
Tests conducted on your reflex branch for osu and framework.
ALL tests have pre-rendered frames set to 1 (NVIDIA low latency mode in driver).
Expected results:
Findings:
Tools used for analysis: https://eskezje.github.io/Frametime-Analysis/
D3D11 in-game 2x limiter, multithreading, Reflex OFF/ON/BOOST
Latency:
As expected there is a minor decrease in latency when using the Reflex Boost option.
Frame pacing:
(Dataset A and B are Reflex Off and On respectively)
Pairing the in-game frame limiter with Reflex seems to promote increased consistency for frame pacing.
D3D11 Multithreading GSYNC Scenario # 1: Naive setup ("enable vsync+gsync and call it a day")
People love giving this advice for some reason, even though it's completely wrong.
Latency:
When ignoring driver settings and just enabling vsync in-game, gsync is not doing anything to help reduce latency. Framepacing results are ignored for now, as vsync is expected to behave well in this case anyway.
D3D11 Multithreading GSYNC Scenario # 2: "Optimal" setup (Driver level vsync and Ultra Low Latency Mode)
In this scenario, NVIDIA limits our fps to a lower rate to support low latency vsync (327fps in my case).
It's quite apparent that gsync is only worth using when set up properly. This scenario is how players currently have to set up gsync in osu!lazer; its test results serve as a comparison for our new Reflex behaviour.
D3D11 Multithreading GSYNC Scenario # 3: Reflex setup
Latency:
Behaviour is as expected: players don't have to mess with any driver settings to engage low latency gsync when enabling the in-game Reflex option. Boost seems to support a slightly lower latency average with an increase in deviation.
Frame pacing:
Reflex Boost seems to support more consistent frame pacing even if we are already vsync'd.
Conclusion:
Reflex seems incredibly useful for guiding players into using gsync correctly, without having them mess with driver level changes which they might not understand. |
|
@Spok5508 Thank you so much for the tests!!! Especially the GSYNC ones, as I hadn't even considered Reflex's impact on GSYNC.
First off I have a question: When testing, did you run the game in the
Questions aside, I did do some basic testing of my own using CapFrameX. My testing focused on frametime behaviour rather than latency. I found that overall, Reflex traded frame pacing and consistency for frame stability, meaning Reflex reduced worst-case stutters but introduced slightly more micro stutters as a result of its overhead. I'm currently on the subway so I'm not able to pull up my charts, but I'll send them when I can. |
Yes! I forgot to mention this as well. |
A Report of Reflex's Effects on osu!(lazer)
Testing specs:
Glossary:
TLDR
NVIDIA Reflex improves frame consistency by 6.27% in the "On" mode, making the game feel smoother, at the expense of a 0.42% increase in system latency. Reflex reduces system latency by 2.92% in the "Boost" mode, making the game feel more responsive, at the expense of a 4.41% reduction in frame consistency, meaning more stutters. Overall a net positive, with most benefits in the "On" mode. (Percentages sourced from @Spok5508)
My Thoughts on Reflex and Latency
While the effects of Reflex are going to vary from machine to machine, I believe that for osu! specifically, a game that is almost entirely CPU bottlenecked, the perceivable effects of Reflex are going to be slim, if not imperceptible for most, when looking at render latency alone. After playing the game with Reflex (boost) and the Reflex testing overlay on, I noticed an average decrease of 500 microseconds (0.5ms) on render latency, a difference of a couple hundred microseconds compared to Reflex (off). Going off of render latency alone, I doubt that difference will accomplish much.
When including the testing done by @Spok5508 on system latency across the Reflex modes without gsync, we don't see a sizable change in system latency either: we're still looking at a difference of hundreds of microseconds, and single digits in percentages. The Reflex (boost) mode has the lowest system latency, a reduction of roughly 265 microseconds (0.265ms), or 2.92%. This is STATISTICALLY better than Reflex (on), which adds 38 microseconds (0.038ms) of latency, a 0.42% increase in system latency. However, based on humanity's current understanding of human physiology and psychology, there is no plausible way an increase of 38 microseconds in latency should be at all perceivable to any human being. This number is so low that I would go so far as to say it is completely impossible to detect.
The decrease of 265 microseconds from Reflex (boost) is a bit more plausible to notice, but the number is still low enough that I would only call it marginally better than the baseline Reflex (off). Regardless, I would not say the latency decrease in Reflex (boost) is beneficial considering the mode's decrease in frame consistency (4.41%) and overall FPS (~10fps).
Overall: Reflex doesn't really change much in terms of latency.
What Does Reflex Do Then?
Reflex (on) reduces larger frame drops, the ones that would be perceivable. Going off of Spok's tests, we see a 6.27% reduction in standard deviation from the Reflex (on) mode, roughly meaning a 6.27% reduction in stuttering compared to Reflex (off). My own testing also confirms this:
This chart shows the frametime variability across the 3 different reflex states. Top is off, middle is on, bottom is boost. Reflex (on) reduces the frames within the red section (stutters), but in turn has more frames in the orange section (micro stutters). Reflex (boost) has more variability and more stutters overall. This confirms Spok's findings, and confirms Reflex's improvements in frame pacing and consistency.
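For anyone wanting to run the same kind of comparison on their own captures, the consistency figures above come down to the standard deviation of frametimes and a low-percentile FPS value. Below is a minimal sketch, assuming a plain list of frametimes in milliseconds. The function names are hypothetical, no data from either tester is used, and tools such as CapFrameX may define their 1% metric differently.

```python
import statistics

def frametime_stddev(frametimes_ms):
    # Population standard deviation of frametimes; lower means steadier pacing.
    return statistics.pstdev(frametimes_ms)

def low_percentile_fps(frametimes_ms, worst_fraction=0.01):
    # FPS corresponding to the fastest frame among the slowest `worst_fraction`
    # of frames (one common reading of a "1% low" figure; an assumption here).
    ordered = sorted(frametimes_ms)
    worst_count = max(1, round(len(ordered) * worst_fraction))
    return 1000.0 / ordered[len(ordered) - worst_count]

def relative_change_percent(baseline, candidate):
    # Negative result means the candidate value is lower than the baseline.
    return (candidate - baseline) / baseline * 100.0
```

Feeding the standard deviations of two captures (for example Reflex off and Reflex on) into `relative_change_percent` gives a percentage in the same form as the ones quoted in this thread.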
This chart shows the FPS differences with the different Reflex modes. First off we observe a gradual reduction in FPS as we go through the 3 modes, consistent with their presumable overhead. The 1% percentile is higher in the Reflex (on) state, which tells us that Reflex (on) reduced dropped and uneven frames, overall improving worst case stutters. Reflex (boost) is slightly worse than Reflex (on) when it comes to reducing dropped and uneven frames, but is seemingly still better than Reflex (off), likely because the Reflex (boost) mode has the most reductions to system latency, which is better than stuttering on high latency.
Overall: Reflex makes frames more consistent, reduces dropped frames, and reduces stuttering.
Overview
Overall it can be said that Reflex DOES improve the game's performance. Despite its overhead, we observe improvements in frame consistency, frame pacing, and reduced stuttering. This comes at the cost of possibly more micro stutters (however these are not likely to be noticeable), and reduced FPS. A decent trade off. I want to make the case that even though Reflex slightly increases latency (except in Boost mode, where it slightly reduces latency), the improved frame consistency alone should assist players. Better frame consistency allows players to accurately predict when new frames will be presented. Players should find it easier to predict when approach circles will meet the circles, and I would additionally expect improvements in Unstable Rate given the improvement in frame pacing, allowing the game to feel more smooth and less jittery. Big thanks to @Spok5508 for the system latency testing which I'm incapable of testing myself. Reflex is pretty promising and I hope it finds its way into players' hands, increasing their performance and their love for the game. |
|
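As a sanity check on the figures in the report above, the quoted deltas and percentages can be combined to back out the baseline system latency they imply. This is only a sketch using the numbers already stated (265 µs at 2.92%, 38 µs at 0.42%). The function name is made up, and the rounding in the quoted percentages limits the precision of the result.

```python
def implied_baseline_us(delta_us, percent_change):
    # If delta_us is percent_change % of the baseline, the baseline is delta / fraction.
    return delta_us / (percent_change / 100.0)

# Figures quoted in the report (originally from @Spok5508's measurements).
baseline_from_boost = implied_baseline_us(265, 2.92)
baseline_from_on = implied_baseline_us(38, 0.42)

print(f"Implied baseline from Boost figures: {baseline_from_boost / 1000:.2f} ms")
print(f"Implied baseline from On figures:    {baseline_from_on / 1000:.2f} ms")
```

Both pairs of figures point at a baseline of roughly 9 ms, so the two percentages are at least consistent with each other.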
I assume most of this stuff is targeted at click to display latency. How does this affect click to audio latency? I would expect it to not change, but depending on how the syncing of the update and render threads and the GPU is implemented, it might increase audio latency.
If I have a 60 Hz monitor and use Reflex, would this entail locking the update and draw threads at 60 FPS? If that's the case, it's really bad for audio latency: I click my mouse, and instead of the update thread quickly responding by playing a hitsound, it's waiting for some rendering synchronisation mechanism. I think discussion about the impact of Reflex on audio latency is missing in your conversation. |
No, Reflex should never directly touch audio whatsoever. The main goal of Reflex is to effectively sync the render submission phase, which is performed by the CPU, so that it finishes just in time for the GPU to render. Any audio and input work happens on its own threads, and is left completely untouched. The only way I could possibly see any impact on audio latency would be if Reflex increased CPU contention or load in a manner that slows down the audio thread, but this is a massive what-if and highly unlikely to occur in the real world. I agree with measuring audio latency alongside Reflex to test your theory rather than dismissing it without any empirical basis, but I do remain firm in my belief that Reflex should have no measurable impact on audio latency.
I will clarify that most of my testing and explanations in this PR have been assuming a multi-threaded context, as I haven't gotten around to testing and understanding Reflex's effects on single-threaded games. I do believe that Reflex can negatively impact game performance on the single-threaded mode. To be honest, when implementing Reflex, I did get the feeling that Reflex was built with games being multi-threaded in mind. I don't think allowing Reflex on single-threaded environments would be beneficial, I'm skeptical at best. |
Then why is |
|
I explained it incorrectly, allow me to revise: According to the Reflex Docs:
I placed To put things another way, you fear this to be the pipeline: When in reality it should look more like this: What I was attempting to convey in the original message is that no, Reflex does not lock the Update thread mid-input. It delays the start of the frame so that when input is sampled, it is as fresh as possible (as close to the rendering time as possible). I was definitely incorrect in saying it only locks the Draw thread, my bad 😅 As for the placement of |
This will cause up to a 16.67 ms delay in processing input, judgements and hitsound playback (compare with 4.17 ms when the update thread is running at the default 240 Hz on a 60 Hz display). I would say that this is a noticeable difference and undesirable in a rhythm game like osu!. It's a trade-off between input to display and input to audio (+judgement) latency. To avoid this problem, I propose not calling NVIDIA Reflex SDK Integration Guide.pdf, pp. 14–15
cc @smoogipoo for your opinion |
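For reference, the 16.67 ms and 4.17 ms figures above follow directly from the update rate: an input that arrives just after an update tick has to wait up to one full tick before it is processed. A minimal sketch of that arithmetic (the function name is hypothetical):

```python
def worst_case_input_wait_ms(update_hz):
    # An input landing right after a tick waits one full update period.
    return 1000.0 / update_hz

for hz in (60, 240):
    print(f"{hz} Hz update thread: up to {worst_case_input_wait_ms(hz):.2f} ms")
```

Assuming inputs arrive at uniformly random times within a tick, the average wait is half of the worst case.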
I personally believe this is overkill. While the 16.67ms delay would exist if we locked both the update thread and draw thread to 60 FPS, we don't have to lock the update thread to be that low. We don't have to have the GPU driver touch framerate at all. I don't believe reducing the effectiveness of Reflex across the board is justifiable just to be able to include what was meant to be a cool side-project stemming from this feature. I propose we scale down the goals of what this PR is trying to achieve for now. Instead of locking the Update & Draw threads to users' refresh rates, we can instead keep |
|
I've gone through and fixed some of the issues mentioned. To push this PR forwards, I decided not to add FPS limiting via Reflex. Instead, I used our built-in FPS limiter to limit the Draw thread to the monitor refresh rate, as anything over that is unnecessary given Reflex's just-in-time rendering. I've edited the "FPS Limiting" section in the OP to reflect this. This PR should now be ready for review. |










Prerequisites:
Resources:
NVIDIA Reflex SDK Integration Guide.pdf. Due to the copyright license attached to the SDK I can't provide a direct link to the implementation docs.
Introduction
NVIDIA Reflex is an API for Windows + NVIDIA GPUs that allows the game to report its internal render loop timings to the GPU driver, allowing it to sync the render queue with the CPU. This enables just-in-time rendering, which results in lower input-to-image latency, improved frame consistency, and reduced stuttering. Additionally, Reflex allows us to gather detailed analytics on the game's render latency, paving the way for future optimization, both in terms of game performance, and for end users attempting to lower system latency.
This video does a good job of explaining how Reflex works.
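To make "just-in-time rendering" a little more concrete, here is a toy model of the idea: instead of sampling input at the start of a frame slot and then waiting for the slot to end, the frame start is delayed so the work finishes right as the frame is needed. This is only a simplified illustration with made-up function names and numbers. It is not the driver's actual algorithm, which this PR does not describe.

```python
def frame_start_delay_ms(frame_interval_ms, estimated_work_ms):
    # Delay the start of the frame so that the work ends near the end of the slot.
    return max(0.0, frame_interval_ms - estimated_work_ms)

def input_age_at_present_ms(frame_interval_ms, work_ms, delay_ms):
    # Input is sampled after the delay; the frame is presented at the end of the
    # slot, or later if the work overruns the slot.
    present_time = max(frame_interval_ms, delay_ms + work_ms)
    return present_time - delay_ms

# Hypothetical numbers: a 10 ms frame slot and 2 ms of CPU + GPU work per frame.
interval, work = 10.0, 2.0
delay = frame_start_delay_ms(interval, work)
print("input age without delay:", input_age_at_present_ms(interval, work, 0.0), "ms")
print("input age with delay:   ", input_age_at_present_ms(interval, work, delay), "ms")
```

In this model the displayed frame always contains input that is only as old as the work itself, which is the effect the introduction describes as lower input-to-image latency.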
Implementation & Notes
Frame Limiting
From the NVIDIA Reflex Docs:
What this means is that we can use NVIDIA Reflex to limit FPS instead of limiting FPS ourselves on NVIDIA + Windows systems. This allows the GPU driver to automatically apply the lowest FPS limit. Say the user sets an FPS limit in their NVIDIA settings for when their PC is running on battery: offloading frame limiting to the GPU driver allows the game to respect that limit. The same applies if the user set a limit for when the game is out of focus, or a global frame limit in their settings. Importantly, it also lets the GPU driver automatically cap the FPS to just under the user's refresh rate if they're using GSYNC, so it functions much like an "Optimal" or otherwise intelligent FPS limiting mode.
Also, speaking of capping FPS to just under the user's refresh rate, a goal of mine when it comes to this implementation is to completely disable our built-in FPS limiting system on NVIDIA + Windows systems. It has been stated that going forwards, the goal is to lock FPS to the monitor refresh rate with ideally no impact on latency. This feature of NVIDIA Reflex should hopefully achieve that.
As a result, this PR will try to move that vision forwards by completely disabling the FPS limit setting when NVIDIA Reflex is detected to be on and enabled. When Reflex is on, the game will limit its Draw FPS to the monitor's refresh rate. With Reflex off, the game will allow modification of the FPS limit (to what is currently available), but still attempt to use the GPU driver to set that limit, rather than our own FPS limiting logic.
Note: For any concerned players, Lazer already does this on macOS (Metal renderer) and achieves low latency despite it. More frames does not mean better performance.
Update: After some testing on my part, and discussion on the topic, using NVIDIA Reflex's built-in frame limiter would introduce audio latency as a result of limiting the refresh rate of the Update thread in addition to the Draw thread. This is sub-optimal, and is a good reason not to offload frame limiting to Reflex.
The most likely direction with this feature going forwards is:
Markers & Cross Platform Behaviour
From the NVIDIA Reflex Docs:
The last sentence is heavily significant, as continuing to report the markers even if Reflex is off enables the functionality described in the Front-End Render Latency Telemetry section below.
As for cross-platform behaviour, there is a rather clever way to ensure that the markers are only reported if the game is running on Windows. We initialize and set the default LowLatencyProvider in GameHost as a No-Op implementation. Then, in osu.Desktop, only if NVAPI is available, do we switch the LowLatencyProvider to the NVAPILowLatencyProvider. This basically means that the marker code does absolutely nothing on non-NVIDIA and non-Windows environments, and therefore has no overhead.
As a small side note, if lazer's Vulkan API is brought back from deprecation, NVIDIA Reflex can (theoretically) be made to work on Linux by leveraging the Reflex Vulkan API.
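The provider swap described above is essentially the null object pattern. The real code is C# in osu-framework and osu.Desktop; the Python sketch below only illustrates the shape of the idea. The method name `set_marker`, the marker strings, the `create_desktop_host` helper and the recording stand-in are all hypothetical and not taken from the PR.

```python
class LowLatencyProvider:
    """Default provider: every call does nothing."""

    def set_marker(self, marker, frame_id):
        pass


class RecordingLowLatencyProvider(LowLatencyProvider):
    """Stand-in for NVAPILowLatencyProvider; records markers instead of calling NVAPI."""

    def __init__(self):
        self.markers = []

    def set_marker(self, marker, frame_id):
        self.markers.append((marker, frame_id))


class GameHost:
    def __init__(self):
        # Safe default on every platform: marker calls are no-ops.
        self.low_latency_provider = LowLatencyProvider()

    def run_frame(self, frame_id):
        # Call sites never check the platform; they always report markers.
        self.low_latency_provider.set_marker("simulation_start", frame_id)
        self.low_latency_provider.set_marker("simulation_end", frame_id)


def create_desktop_host(nvapi_available):
    host = GameHost()
    if nvapi_available:
        # Only the desktop project swaps in the real provider.
        host.low_latency_provider = RecordingLowLatencyProvider()
    return host
```

The benefit of this layout is that the game loop contains no platform checks at all: the decision is made once at startup, and unsupported systems simply keep the default.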
Boost Mode
Boost Mode is an NVIDIA Reflex feature which aims to optimize the game in CPU-bound scenarios.
When a game is CPU-bound, the GPU will automatically lower its clocks to save on power consumption. This is typically fine, but it becomes an issue when the game transitions to being GPU-bound. The GPU now has to increase its clocks to keep up with the CPU, which causes stutters while the GPU spools up. Reflex's Boost Mode aims to fix this by constantly running the GPU at the maximum possible clock. This comes at the cost of significantly higher power consumption & possibly more latency (due to the overhead of running the GPU at max performance), but usually less stuttering as the GPU is always ready and in a high-performance state.
In this implementation, I added the option to use Boost as is recommended by the NVIDIA Reflex SDK Docs, but I added a warning about power draw and potential for it to actually backfire and reduce performance.
Render Latency Telemetry
Front-End
Reflex, even in its Off mode, will continue to set markers and gather telemetry on latency. This allows players to use the NVIDIA Overlay to measure render latency, which is a better metric for players to obsess about than the current frametime metric in the "Show FPS" panel. Aside from the obvious benefit of transparency, this allows players to measure total system latency if they have a compatible 360 Hz GSYNC monitor.
Back-End
The Reflex latency telemetry is highly accurate, and according to NVIDIA, completely replaces the need for a high speed camera to measure latency. This can be useful for internal analysis for the core team, and can help shed light on areas of the game that drive up latency, and might require optimization.
Testing
Our own methodological testing can be found here
Warning
Please back up your osu!(lazer) data and read the contributing docs before doing this. Running lazer in the Release configuration can brick your typical lazer installation, but is necessary for a change like this.
osu and osu-framework folders are under the same parent directory
UseLocalFramework.(ps1|sh) script in osu according to your system
dotnet run osu.Desktop -c Release in osu
DirectX renderer
To find the render latency metric, you can either use the NVIDIA Overlay that comes with the NVIDIA app or GeForce Experience app, or you can use the Reflex Testing HUD found in the Reflex SDK linked above.
What's Left?
AMD has a Reflex-like SDK for reducing latency on AMD cards. This SDK is called AMD Radeon Anti-Lag 2. While I would like to implement this beside Reflex (in a different PR), I don't own an AMD GPU to be able to test it out and make sure it works.