Memory-efficient ping-pong buffer system for image algorithms #17

@Marek55S

Goal

Implement a memory-efficient buffer management system in the Rust backend so algorithms reuse two buffers in ping-pong style instead of allocating a fresh pixel buffer for every step. The system must support integer pixel buffers (u8/u16) and floating-point pixel buffers (f32), because downstream algorithms may change the pixel type.

Benefits:

  • Reduce peak memory usage from N × image_size to 2 × image_size (N = number of algorithms chained)
  • Improve performance by reusing allocations instead of repeatedly creating/dropping large Vecs
  • Support chaining many algorithms (blur → clip → linear → median → ...) without peak memory blowup
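To make the N × image_size versus 2 × image_size comparison concrete, here is a small arithmetic sketch. The helper names and the width/height/channels parameters are assumptions for illustration, not part of the existing codebase, and the "naive" figure simply uses the N × image_size bound stated above.

```rust
use std::mem::size_of;

/// Bytes needed for one pixel buffer with elements of type `T`
/// (hypothetical helper, not an existing API).
pub fn buffer_bytes<T>(width: usize, height: usize, channels: usize) -> usize {
    width * height * channels * size_of::<T>()
}

/// The N × image_size bound from the issue text: one buffer per chained step.
pub fn naive_peak_bytes<T>(width: usize, height: usize, channels: usize, steps: usize) -> usize {
    steps * buffer_bytes::<T>(width, height, channels)
}

/// The 2 × image_size bound: one active and one scratch buffer,
/// independent of how many steps are chained.
pub fn ping_pong_peak_bytes<T>(width: usize, height: usize, channels: usize) -> usize {
    2 * buffer_bytes::<T>(width, height, channels)
}
```

For a 1920 × 1080 image with 4 channels of u8, one buffer is about 8.3 MB, so five chained steps are bounded by about 41.5 MB in the naive scheme and about 16.6 MB with two reused buffers. The same image in f32 needs four times as much per buffer.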

Current situation

At the moment, many algorithms in api/src/algorithms.rs create new allocations for their output buffers. When users chain multiple operations, each step allocates a new Vec for the modified pixel data and drops the previous one. This leads to:

  • Linear growth in total allocations while chaining algorithms
  • Higher peak memory usage when many operations are applied in sequence
  • Unnecessary runtime overhead due to repeated allocations/deallocations

Additionally, some algorithms may convert the image representation from integer to floating point (e.g., linear transforms or convolutions that need higher precision). The buffer-management system must therefore support both integer and float pixel representations.
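One possible way to handle a pixel-type change without a fresh allocation is to convert into a float buffer that was allocated up front. The sketch below assumes 8-bit input and a caller-provided f32 buffer of the same length; both function names are hypothetical and the rounding/saturation policy is an assumption, not something the existing algorithms are known to use.

```rust
/// Hypothetical helper: widens u8 pixels into an existing f32 buffer.
/// `dst` must already have the same length as `src`, so nothing is allocated here.
pub fn widen_u8_to_f32(src: &[u8], dst: &mut [f32]) {
    assert_eq!(src.len(), dst.len(), "buffers must have equal length");
    for (d, &s) in dst.iter_mut().zip(src) {
        *d = f32::from(s);
    }
}

/// Hypothetical inverse: rounds to the nearest integer and saturates
/// to the 0..=255 range before storing into an existing u8 buffer.
pub fn narrow_f32_to_u8(src: &[f32], dst: &mut [u8]) {
    assert_eq!(src.len(), dst.len(), "buffers must have equal length");
    for (d, &s) in dst.iter_mut().zip(src) {
        *d = s.round().clamp(0.0, 255.0) as u8;
    }
}
```

Whether the float pair should be allocated eagerly or only when the first float-producing algorithm runs is a design choice left open here.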

Proposal

Introduce a simple, generic ping-pong buffer abstraction and update algorithms to write into a provided scratch buffer instead of always allocating new output buffers.

Core idea:

  • Maintain two buffers of equal length (image_size): active and scratch.
  • Each algorithm reads from the active buffer and writes its result into scratch.
  • After an algorithm finishes, call a swap method so scratch becomes the new active buffer and the previous active becomes the new scratch.
  • Repeat for the next algorithm.
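The core idea above could be sketched roughly as follows. `PingPong<T>` and all of its methods are hypothetical names chosen for illustration; the real type could just as well be the ImageBuffers<T> mentioned later in this issue. The sketch assumes element types that are `Copy + Default`, which holds for u8, u16, i16, and f32.

```rust
/// Hypothetical ping-pong pair: two equally sized, pre-allocated buffers.
pub struct PingPong<T> {
    active: Vec<T>,
    scratch: Vec<T>,
}

impl<T: Copy + Default> PingPong<T> {
    /// Takes ownership of the initial pixels and allocates the scratch buffer once.
    pub fn new(initial: Vec<T>) -> Self {
        let scratch = vec![T::default(); initial.len()];
        Self { active: initial, scratch }
    }

    /// Read-only view of the current image data.
    pub fn active(&self) -> &[T] {
        &self.active
    }

    /// Borrows both buffers at once: read from the first, write into the second.
    pub fn split(&mut self) -> (&[T], &mut [T]) {
        (self.active.as_slice(), self.scratch.as_mut_slice())
    }

    /// Exchanges the roles of the two buffers; no pixel data is copied.
    pub fn swap(&mut self) {
        std::mem::swap(&mut self.active, &mut self.scratch);
    }

    /// Runs one step: `step` reads `active`, writes `scratch`, then the roles swap.
    pub fn apply(&mut self, step: impl FnOnce(&[T], &mut [T])) {
        let (src, dst) = self.split();
        step(src, dst);
        self.swap();
    }
}
```

A chain then becomes a sequence of `apply` calls, for example `buffers.apply(|src, dst| some_algorithm_into(src, dst))`, where `some_algorithm_into` stands for any algorithm refactored to write into a provided slice. Because `swap` only exchanges the two Vec handles, its cost does not depend on the image size.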

Key requirements:

  • The system must be generic over pixel element types so it supports u8, u16, and f32 (and i16 if used elsewhere).
  • The ping-pong buffers must be reusable and pre-allocated to the image size to avoid reallocation between steps.
  • Public algorithm functions should be refactored to accept an output buffer or to use an Image buffer-pool API that supplies scratch buffers.
  • Maintain compatibility with wasm builds. Avoid heavy platform-specific allocations or threading assumptions in the core design.

Required changes (✅)

  1. Refactor algorithms to accept an output buffer to write into instead of always allocating a new Vec internally. This can be done incrementally; start with core algorithms (gaussian_blur_image, median_blur_image, clip, linear_function, etc.).
  2. Add a ping-pong buffer abstraction (either a helper or a full ImageBuffers<T> type) that maintains two pre-allocated Vecs and swaps them between steps.
  3. Ensure the buffer abstraction (and algorithm signatures) support u8, u16, and f32 pixel types. If i16 is used elsewhere, support it too.
  4. Update higher-level pipeline code (if present) to use the ping-pong buffers and swap between steps instead of creating new allocations.
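As an illustration of item 1, the sketch below shows one way a clipping algorithm could be migrated incrementally. The current signatures in api/src/algorithms.rs are not shown in this issue, so `clip_alloc` and `clip_into` are hypothetical names and shapes: the allocating variant is kept as a thin wrapper so existing callers keep working while pipeline code switches to the buffer-reusing variant.

```rust
/// Buffer-reusing style (hypothetical "after" shape): the caller supplies `output`.
/// Generic over any copyable, comparable pixel type (u8, u16, i16, f32, ...).
pub fn clip_into<T: Copy + PartialOrd>(input: &[T], output: &mut [T], lo: T, hi: T) {
    assert_eq!(input.len(), output.len(), "buffers must have equal length");
    for (o, &p) in output.iter_mut().zip(input) {
        *o = if p < lo {
            lo
        } else if p > hi {
            hi
        } else {
            p
        };
    }
}

/// Allocating style (hypothetical "before" shape), kept as a compatibility
/// wrapper: it allocates one Vec and delegates to `clip_into`.
pub fn clip_alloc<T: Copy + PartialOrd>(input: &[T], lo: T, hi: T) -> Vec<T> {
    let mut out = input.to_vec();
    clip_into(input, &mut out, lo, hi);
    out
}
```

Taking `&[T]` and `&mut [T]` rather than `&Vec<T>` keeps the algorithm independent of how the buffers are owned, so the same function works with a ping-pong pair, a plain Vec, or a slice of a larger allocation.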

Acceptance criteria

  • Algorithms are refactored to write their output into a provided scratch buffer (no per-algorithm Vec reallocation).
  • Both integer buffers (u8, u16) and floating buffers (f32) are supported and tested.
  • A buffer-swap mechanism exists so algorithms can be chained without allocating new buffers per step.
  • No regression in correctness for existing algorithms.
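The "no per-step allocation" criterion could be checked in a test by comparing buffer addresses before and after a chain. The following is a minimal sketch under assumptions: `invert_into` is a hypothetical stand-in for any refactored algorithm, and `chain_reuses_allocations` is an illustrative helper, not an existing function.

```rust
/// Hypothetical stand-in for a refactored algorithm: inverts 8-bit pixels.
fn invert_into(input: &[u8], output: &mut [u8]) {
    for (o, &p) in output.iter_mut().zip(input) {
        *o = 255 - p;
    }
}

/// Runs `steps` chained passes over two pre-allocated buffers and reports
/// whether the same two allocations were used throughout.
pub fn chain_reuses_allocations(pixels: Vec<u8>, steps: usize) -> (Vec<u8>, bool) {
    let mut active = pixels;
    let mut scratch = vec![0u8; active.len()];
    // Record the addresses of both allocations before the chain starts.
    let before = [active.as_ptr(), scratch.as_ptr()];

    for _ in 0..steps {
        invert_into(&active, &mut scratch);
        // Swapping exchanges the Vec handles only; the heap blocks stay put.
        std::mem::swap(&mut active, &mut scratch);
    }

    // After any number of swaps, both handles must still point at the
    // original two allocations (possibly in exchanged order).
    let after = [active.as_ptr(), scratch.as_ptr()];
    let reused = after.iter().all(|p| before.contains(p));
    (active, reused)
}
```

A pointer comparison only shows that the two top-level buffers were reused; it does not detect temporary allocations made inside an algorithm, which would need an allocation-counting approach instead.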
