Profiler: Gaussian Splatting Loop Tiling #380
Draft
The Bottleneck: The `blend_gaussians` function was bandwidth-bound, reloading 88 bytes of Gaussian data for every single pixel (100M loads for 100 frames).

The Boost: Before: 1.60s -> After: 1.56s (~2.7% improvement).

Technical Detail: Implemented `blend_gaussians_block_2x2`, which processes 2x2 pixel blocks. This reuses the loaded Gaussian data (mean, conic, color) for 4 pixels, reducing memory loads by 4x. Also added `#[inline]` to hot math functions.

Verification: Verified correctness with `cargo test` and benchmark output comparison (results match). PDR entry added to `.jules/profiler.md`.

PR created automatically by Jules for task 4031989690945417743 started by @fderuiter
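For readers who want to see the tiling idea in isolation, here is a minimal sketch in Rust. It is not the code from this PR: the `Gaussian` struct is a simplified, hypothetical layout (the real one is 88 bytes and is not shown here), and `blend_per_pixel`, `blend_block_2x2`, `accumulate`, `weight`, and `gaussian_loads` are illustrative names. The sketch assumes simple front-to-back alpha blending and shows how one Gaussian load can be shared by the four pixels of a 2x2 block.

```rust
/// Hypothetical, simplified 2D Gaussian (not the PR's actual 88-byte layout).
#[derive(Clone, Copy)]
struct Gaussian {
    mean: [f32; 2],
    conic: [f32; 3], // inverse covariance [[a, b], [b, c]] stored as (a, b, c)
    color: [f32; 3],
    opacity: f32,
}

/// Opacity-scaled Gaussian falloff at pixel (x, y).
#[inline]
fn weight(g: &Gaussian, x: f32, y: f32) -> f32 {
    let dx = x - g.mean[0];
    let dy = y - g.mean[1];
    let power = -0.5 * (g.conic[0] * dx * dx + g.conic[2] * dy * dy) - g.conic[1] * dx * dy;
    g.opacity * power.exp()
}

/// Front-to-back alpha blend of one Gaussian into one pixel.
#[inline]
fn accumulate(px: &mut [f32; 3], transmittance: &mut f32, g: &Gaussian, x: f32, y: f32) {
    let alpha = weight(g, x, y).min(0.99);
    for c in 0..3 {
        px[c] += *transmittance * alpha * g.color[c];
    }
    *transmittance *= 1.0 - alpha;
}

/// Baseline: every pixel walks the whole Gaussian list on its own.
fn blend_per_pixel(gs: &[Gaussian], w: usize, h: usize) -> Vec<[f32; 3]> {
    let mut out = vec![[0.0f32; 3]; w * h];
    for y in 0..h {
        for x in 0..w {
            let mut px = [0.0f32; 3];
            let mut t = 1.0f32;
            for g in gs {
                accumulate(&mut px, &mut t, g, x as f32, y as f32);
            }
            out[y * w + x] = px;
        }
    }
    out
}

/// Tiled: each Gaussian is read once per 2x2 block and reused for up to 4 pixels.
fn blend_block_2x2(gs: &[Gaussian], w: usize, h: usize) -> Vec<[f32; 3]> {
    let mut out = vec![[0.0f32; 3]; w * h];
    for by in (0..h).step_by(2) {
        for bx in (0..w).step_by(2) {
            let mut px = [[0.0f32; 3]; 4];
            let mut t = [1.0f32; 4];
            for g in gs {
                let g = *g; // single load, shared by the whole block
                for i in 0..4 {
                    let (x, y) = (bx + (i & 1), by + (i >> 1));
                    accumulate(&mut px[i], &mut t[i], &g, x as f32, y as f32);
                }
            }
            // Write back only the pixels that fall inside the image (odd sizes).
            for i in 0..4 {
                let (x, y) = (bx + (i & 1), by + (i >> 1));
                if x < w && y < h {
                    out[y * w + x] = px[i];
                }
            }
        }
    }
    out
}

/// Number of Gaussian reads for a w x h image with `n` Gaussians and square blocks.
fn gaussian_loads(w: usize, h: usize, n: usize, block: usize) -> usize {
    ((w + block - 1) / block) * ((h + block - 1) / block) * n
}
```

Each pixel still sees the Gaussians in the same order, so both functions should produce matching images. Only the number of times each Gaussian is fetched changes: for even image dimensions, `gaussian_loads` with `block = 2` is one quarter of the `block = 1` count.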