Description
Goal
Implement a memory-efficient buffer management system in the Rust backend so algorithms reuse two buffers in ping-pong style instead of allocating a fresh pixel buffer for every step. The system must support integer pixel buffers (u8/u16) and floating-point pixel buffers (f32), because downstream algorithms may change the pixel type.
Benefits:
- Reduce peak memory usage from N × image_size to 2 × image_size (N = number of algorithms chained)
- Improve performance by reusing allocations instead of repeatedly creating/dropping large Vecs
- Support chaining many algorithms (blur → clip → linear → median → ...) without peak memory blowup
Current situation
At the moment, many algorithms in api/src/algorithms.rs create new allocations for their output buffers. When users chain multiple operations, each step allocates a new Vec for the modified pixel data and drops the previous one. This leads to:
- Linear growth in total allocations while chaining algorithms
- Higher peak memory usage when many operations are applied in sequence
- Unnecessary runtime overhead due to repeated allocations/deallocations
Additionally, some algorithms may convert the image representation from integer to floating point (e.g., for linear transforms or higher-precision convolution). The buffer-management system must therefore support both integer and float pixel representations.
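To make the integer-to-float case concrete, here is a minimal sketch of a widening step that writes into a float buffer the caller has already allocated. The names `widen_into` and `widen_and_scale` are hypothetical and do not exist in api/src/algorithms.rs. How the float buffers would be owned and pre-allocated is left open here.

```rust
/// Hypothetical widening step: copies integer pixels into a float buffer
/// that the caller allocated beforehand. `f32: From<u8>` and
/// `f32: From<u16>` are lossless conversions from the standard library.
pub fn widen_into<S: Copy, D: From<S>>(src: &[S], dst: &mut [D]) {
    assert_eq!(src.len(), dst.len(), "src and dst must have equal length");
    for (d, &s) in dst.iter_mut().zip(src) {
        *d = D::from(s);
    }
}

/// Example: widen u8 pixels to f32, then scale by 0.5 in float precision
/// so fractional results are not truncated.
pub fn widen_and_scale(pixels: &[u8]) -> Vec<f32> {
    let mut float_pixels = vec![0.0f32; pixels.len()];
    widen_into(pixels, &mut float_pixels);
    for p in float_pixels.iter_mut() {
        *p *= 0.5;
    }
    float_pixels
}
```

Because the element type changes, the integer buffers cannot simply be reused for the float result; a real design would need to decide when the float pair is allocated.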
Proposal
Introduce a simple, generic ping-pong buffer abstraction and update algorithms to write into a provided scratch buffer instead of always allocating new output buffers.
Core idea:
- Maintain two buffers of equal length (image_size): `active` and `scratch`.
- Each algorithm reads from the `active` buffer and writes its result into `scratch`.
- After an algorithm finishes, call a swap method so `scratch` becomes the new `active` buffer and the previous `active` becomes the new `scratch`.
- Repeat for the next algorithm.
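The core idea above could be sketched roughly as follows. The type name `ImageBuffers<T>` comes from this issue, but its fields, the `apply` method, and the `example_chain` function are assumptions made for illustration, not an existing API.

```rust
/// Hypothetical ping-pong pair: two equally sized, pre-allocated buffers.
pub struct ImageBuffers<T> {
    active: Vec<T>,
    scratch: Vec<T>,
}

impl<T: Copy + Default> ImageBuffers<T> {
    /// Takes ownership of the initial pixels and allocates the scratch
    /// buffer once, at the same length.
    pub fn new(pixels: Vec<T>) -> Self {
        let scratch = vec![T::default(); pixels.len()];
        Self { active: pixels, scratch }
    }

    /// Current pixel data.
    pub fn active(&self) -> &[T] {
        &self.active
    }

    /// Runs one step: `step` reads `active` and writes `scratch`, then the
    /// two buffers trade roles. Only the Vec handles are swapped; no pixel
    /// data is copied and nothing is allocated.
    pub fn apply(&mut self, step: impl FnOnce(&[T], &mut [T])) {
        step(&self.active[..], &mut self.scratch[..]);
        std::mem::swap(&mut self.active, &mut self.scratch);
    }
}

/// Example chain on u8 pixels: invert, then halve.
pub fn example_chain() -> Vec<u8> {
    let mut buffers = ImageBuffers::new(vec![0u8, 100, 255]);
    buffers.apply(|src, dst| {
        for (d, s) in dst.iter_mut().zip(src) {
            *d = 255 - *s;
        }
    });
    buffers.apply(|src, dst| {
        for (d, s) in dst.iter_mut().zip(src) {
            *d = *s / 2;
        }
    });
    buffers.active().to_vec()
}
```

Folding the swap into `apply` is one possible design choice: callers cannot forget to swap after a step. A separate public `swap` method, as described above, would work equally well.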
Key requirements:
- The system must be generic over pixel element types so it supports `u8`, `u16`, and `f32` (and `i16` if used elsewhere).
- The ping-pong buffers must be reusable and pre-allocated to the image size to avoid reallocation between steps.
- Public algorithm functions should be refactored to accept an output buffer or to use an Image buffer-pool API that supplies scratch buffers.
- Maintain compatibility with wasm builds. Avoid heavy platform-specific allocations or threading assumptions in the core design.
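As an illustration of the "accept an output buffer" requirement, a clip step generic over the pixel type might look like the sketch below. The name `clip_into` and its parameters are hypothetical; the existing `clip` in api/src/algorithms.rs may have a different signature.

```rust
/// Hypothetical out-parameter signature: reads `src`, writes `dst`, and
/// allocates nothing. The bounds are satisfied by u8, u16, i16 and f32.
/// For f32, a NaN input compares false against both limits and is passed
/// through unchanged.
pub fn clip_into<T: Copy + PartialOrd>(src: &[T], dst: &mut [T], lo: T, hi: T) {
    assert_eq!(src.len(), dst.len(), "src and dst must have equal length");
    for (d, &s) in dst.iter_mut().zip(src) {
        *d = if s < lo {
            lo
        } else if s > hi {
            hi
        } else {
            s
        };
    }
}

/// Example on integer pixels: clamp to the range 10..=200.
pub fn clip_u8_example() -> Vec<u8> {
    let src = vec![0u8, 50, 255];
    let mut dst = vec![0u8; src.len()];
    clip_into(&src, &mut dst, 10, 200);
    dst
}

/// Example on float pixels: clamp to the range 0.0..=1.0.
pub fn clip_f32_example() -> Vec<f32> {
    let src = vec![-1.0f32, 0.5, 2.0];
    let mut dst = vec![0.0f32; src.len()];
    clip_into(&src, &mut dst, 0.0, 1.0);
    dst
}
```

The function uses only slices and standard-library traits, with no threads or platform-specific allocation, so nothing in it should conflict with a wasm build.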
Required changes (✅)
- Refactor algorithms to accept an output buffer to write into instead of always allocating a new Vec internally. This can be done incrementally; start with core algorithms (gaussian_blur_image, median_blur_image, clip, linear_function, etc.).
- Add a ping-pong buffer abstraction (either helper or full `ImageBuffers<T>` type) which maintains two pre-allocated Vecs and swaps.
- Ensure the buffer abstraction (and algorithm signatures) support u8, u16, and f32 pixel types. If `i16` is used elsewhere, support it too.
- Update higher-level pipeline code (if present) to use the ping-pong buffers and swap between steps instead of creating new allocations.
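For the pipeline-level change, one possible shape is a runner that allocates the scratch buffer once and swaps after every step. This is a sketch under assumptions: `Step`, `run_pipeline`, `add_ten`, and `double` are hypothetical names, and the real pipeline code (if present) may be organized differently.

```rust
/// One pipeline step: reads the first slice, writes the second.
/// Plain function pointers keep the sketch free of boxing.
pub type Step<T> = fn(&[T], &mut [T]);

/// Hypothetical runner: the input Vec becomes the first active buffer and
/// one scratch Vec is allocated up front. Each step then reuses the same
/// two allocations.
pub fn run_pipeline<T: Copy + Default>(input: Vec<T>, steps: &[Step<T>]) -> Vec<T> {
    let mut active = input;
    let mut scratch = vec![T::default(); active.len()];
    for step in steps {
        step(&active[..], &mut scratch[..]);
        // Swap roles: the freshly written buffer becomes the input of the
        // next step, and the old input becomes the next output target.
        std::mem::swap(&mut active, &mut scratch);
    }
    active
}

/// Illustrative step: add 10, saturating at u16::MAX.
fn add_ten(src: &[u16], dst: &mut [u16]) {
    for (d, &s) in dst.iter_mut().zip(src) {
        *d = s.saturating_add(10);
    }
}

/// Illustrative step: multiply by 2, saturating at u16::MAX.
fn double(src: &[u16], dst: &mut [u16]) {
    for (d, &s) in dst.iter_mut().zip(src) {
        *d = s.saturating_mul(2);
    }
}

/// Example: chain two steps over u16 pixels.
pub fn example_pipeline() -> Vec<u16> {
    let steps: [Step<u16>; 2] = [add_ten, double];
    run_pipeline(vec![0u16, 5, 65_535], &steps)
}
```

With this shape the number of large allocations stays at two regardless of how many steps are chained, which is the property the acceptance criteria below ask for.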
Acceptance criteria
- Algorithms are refactored to write their output into a provided scratch buffer (no per-algorithm Vec reallocation).
- Both integer buffers (`u8`, `u16`) and floating buffers (`f32`) are supported and tested.
- A buffer-swap mechanism exists so algorithms can be chained without allocating new buffers per step.
- No regression in correctness for existing algorithms.