Images are always encoded as PNG → excessive memory usage and file size (need JPEG artifact + conversion operator) #2

@nathabee

Issue: Images are always encoded as PNG → excessive memory usage and file size (need JPEG artifact + conversion operator)

Problem

Currently, all images inside the pipeline are effectively treated as PNG (RGBA) when:

  • stacking images (imageList → image)
  • converting ImageData to bytes (e.g. before PDF generation)
  • exporting intermediate results

This causes extreme memory and size inflation.

Example:

  • Input: 5 JPEG files (6 + 3 + 4 + 6 + 4 MB ≈ 23 MB total)
  • Output after stacking and PNG re-encoding: ~128 MB

This happens because:

  • ImageData is uncompressed RGBA (4 bytes per pixel)
  • canvas.toDataURL("image/png") produces lossless PNG
  • PNG is not appropriate for photographic images
  • We always convert to PNG regardless of original format

This is not scalable for large image sets.
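
The inflation can be estimated with simple arithmetic. The sketch below is illustrative only: the function names and the 4000 × 3000 example dimensions are assumptions, not values taken from the pipeline.

```typescript
// Bytes held by a decoded RGBA buffer (ImageData.data): 4 bytes per pixel.
function rawRgbaBytes(width: number, height: number): number {
  return width * height * 4;
}

// Length of the base64 payload for `byteLength` bytes:
// every 3 input bytes become 4 characters, with padding.
function base64Length(byteLength: number): number {
  return Math.ceil(byteLength / 3) * 4;
}

// Hypothetical example: one 12-megapixel photo (4000 × 3000).
const raw = rawRgbaBytes(4000, 3000);
console.log(raw); // 48000000
console.log(base64Length(30)); // 40
```

Whatever the size of the source JPEG, a decoded 12-megapixel frame occupies 48 MB in memory, and a data URL adds roughly a third on top of the encoded bytes it carries.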


Root Cause

The pipeline currently models images as:

{ type: "image", width, height, image: ImageData }

There is no concept of:

  • Original format (jpeg/png/webp)
  • Compression quality
  • Lossy vs lossless encoding
  • Byte representation separate from ImageData

As a result, every export step defaults to PNG.


Required Direction

We need format-aware image artifacts and a conversion operator.


Proposed Solution

1️⃣ Extend Artifact model

Add a new artifact type:

type ImageJpegArtifact = {
  type: "imageJpeg";
  width: number;
  height: number;
  image: ImageData;
  quality?: number; // 0–1
};

Or alternatively:

type ImageArtifact = {
  type: "image";
  width: number;
  height: number;
  image: ImageData;
  format: "png" | "jpeg";
  quality?: number; // 0–1, only used when format is "jpeg"
};

The second option is cleaner long-term (format as metadata).
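
One possible way to cover the "byte representation separate from ImageData" gap listed under Root Cause is to keep the original encoded bytes together with a format tag. This is a minimal sketch, not the project's model: `EncodedImage`, `sniffFormat`, `toEncodedImage`, and the `bytes` field are hypothetical names. The signatures checked are the standard leading bytes of JPEG, PNG, and WebP files.

```typescript
type ImageFormat = "png" | "jpeg" | "webp";

// Hypothetical companion to the artifact: compressed bytes plus metadata.
type EncodedImage = {
  bytes: Uint8Array;
  format: ImageFormat;
  quality?: number; // 0–1, only meaningful for lossy formats
};

// True when `bytes` contains `sig` starting at `offset`.
function startsWith(bytes: Uint8Array, sig: number[], offset = 0): boolean {
  if (bytes.length < offset + sig.length) return false;
  return sig.every((b, i) => bytes[offset + i] === b);
}

// Detect the format from the file signature; null when it is not recognized.
function sniffFormat(bytes: Uint8Array): ImageFormat | null {
  // JPEG: FF D8 FF
  if (startsWith(bytes, [0xff, 0xd8, 0xff])) return "jpeg";
  // PNG: 89 "PNG" 0D 0A 1A 0A
  if (startsWith(bytes, [0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a])) return "png";
  // WebP: "RIFF" <4-byte size> "WEBP"
  if (
    startsWith(bytes, [0x52, 0x49, 0x46, 0x46]) &&
    startsWith(bytes, [0x57, 0x45, 0x42, 0x50], 8)
  ) {
    return "webp";
  }
  return null;
}

// Wrap raw file bytes so later steps can reuse them without re-encoding.
function toEncodedImage(bytes: Uint8Array): EncodedImage | null {
  const format = sniffFormat(bytes);
  return format ? { bytes, format } : null;
}
```

Tagging the bytes at load time would let later steps know the source format without decoding pixels first.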


2️⃣ Add operator: Convert Image → JPEG

New op:

op.image.toJpeg
io: image → image

Parameters:

  • quality (default 0.85)
  • optional subsampling control (future)

Implementation:

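  • Use canvas.toBlob(callback, "image/jpeg", quality) (the callback is the first argument; OffscreenCanvas offers convertToBlob({ type: "image/jpeg", quality }) instead)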
  • Return an artifact flagged as JPEG
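
The encoding step could be wrapped in a Promise roughly as follows. This is a sketch under assumptions: `BlobCanvas`, `canvasToJpeg`, and `clampQuality` are hypothetical names, and the structural interface exists only so the helper does not depend on DOM typings. An HTMLCanvasElement matches it, since its `toBlob` takes the callback first, then the MIME type, then the quality.

```typescript
// Minimal structural type covering the single canvas method used here.
interface BlobCanvas<B> {
  toBlob(callback: (blob: B | null) => void, type?: string, quality?: number): void;
}

// Keep quality inside 0–1; fall back to the proposed default (0.85)
// when the value is not a finite number.
function clampQuality(q: number): number {
  if (!Number.isFinite(q)) return 0.85;
  return Math.min(1, Math.max(0, q));
}

// Promise wrapper around the callback-based toBlob API.
function canvasToJpeg<B>(canvas: BlobCanvas<B>, quality = 0.85): Promise<B> {
  return new Promise<B>((resolve, reject) => {
    canvas.toBlob(
      (blob) => {
        if (blob) resolve(blob);
        else reject(new Error("JPEG encoding failed"));
      },
      "image/jpeg",
      clampQuality(quality),
    );
  });
}
```

In a browser, a canvas that already holds the pixels (for example after `putImageData`) can be passed directly: `const blob = await canvasToJpeg(canvas, 0.85)`. The resulting bytes, not a fresh ImageData, are what the JPEG-flagged artifact would need to carry to avoid the inflation described above.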

3️⃣ Improve PDF generation

imagesToPdf should:

  • Accept already-compressed JPEG bytes when available
  • Avoid re-encoding PNG unnecessarily
  • Prefer JPEG for photographic content
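
The decision inside imagesToPdf might look like the sketch below. PDF can store JPEG data as-is (the DCTDecode filter), so reusing existing JPEG bytes avoids a decode and re-encode round trip. `PdfImageSource`, `EmbedPlan`, `planPdfEmbed`, and the `bytes` field are hypothetical names, not part of the current code.

```typescript
// Hypothetical input: the format tag from the artifact plus optional encoded bytes.
type PdfImageSource = {
  format: "png" | "jpeg";
  bytes?: Uint8Array; // encoded bytes kept from the source or an earlier op
};

type EmbedPlan = {
  action: "reuse" | "encode"; // reuse existing bytes, or encode from pixels
  format: "png" | "jpeg";
};

// Reuse already-compressed bytes when present; otherwise encode once,
// in the format the artifact is tagged with.
function planPdfEmbed(src: PdfImageSource): EmbedPlan {
  if (src.bytes && src.bytes.length > 0) {
    return { action: "reuse", format: src.format };
  }
  return { action: "encode", format: src.format };
}

console.log(planPdfEmbed({ format: "jpeg", bytes: new Uint8Array([0xff, 0xd8, 0xff]) }));
console.log(planPdfEmbed({ format: "png" }));
```

Keeping the decision in a small pure function makes the "no unnecessary re-encoding" rule easy to unit test separately from the PDF writer.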

4️⃣ Optional future operator

op.image.optimizeForPdf

Automatically chooses:

  • JPEG for photos
  • PNG for masks / flat graphics
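
How the operator tells photos from flat graphics is left open here. One possible heuristic, shown purely as an assumption, samples the RGBA buffer: any transparency forces PNG (JPEG has no alpha channel), a small number of distinct colors suggests flat graphics, and anything else is treated as a photo. `suggestFormat`, `maxFlatColors`, and `step` are hypothetical names and thresholds.

```typescript
// Suggest an output format from decoded RGBA pixels (e.g. ImageData.data).
// `step` controls sampling: every `step`-th pixel is inspected.
function suggestFormat(
  rgba: Uint8ClampedArray,
  maxFlatColors = 256,
  step = 16,
): "png" | "jpeg" {
  const colors = new Set<number>();
  const pixelCount = Math.floor(rgba.length / 4);
  for (let p = 0; p < pixelCount; p += step) {
    const i = p * 4;
    // Any non-opaque sample needs an alpha channel, which JPEG lacks.
    if (rgba[i + 3] !== 255) return "png";
    // Pack R, G, B into one integer so the Set can count distinct colors.
    colors.add((rgba[i] << 16) | (rgba[i + 1] << 8) | rgba[i + 2]);
    // Many distinct colors: treat as photographic content.
    if (colors.size > maxFlatColors) return "jpeg";
  }
  // Few distinct colors in the sample: treat as a mask or flat graphic.
  return "png";
}
```

The thresholds would need tuning on real inputs. A small photo sampled too sparsely never exceeds `maxFlatColors` and falls back to PNG, so `step` should shrink for small images.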

Acceptance Criteria

  • Stacking 5 JPEG images does not inflate to 100+ MB
  • Image artifacts can carry format metadata
  • Builder includes an “Image → JPEG” conversion operator
  • PDF generation prefers JPEG when available
  • Memory usage remains proportional to input size

Notes

  • This is not just about file size — it affects:

    • memory pressure
    • performance
    • mobile viability (Android wrapper)
  • PNG should remain available for:

    • masks
    • SVG previews
    • lossless workflows

Labels: enhancement
