Commit 169dd9a
staging-devin-ai-integration[bot], streamkit-devin, and streamer45 authored
fix(compositor): improve image overlay quality, caching, aspect ratio, and selectability (#78)

* fix(compositor): improve image overlay quality, caching, aspect ratio, and selectability

  - Replace nearest-neighbor prescaling with bilinear (image crate Triangle filter) for much better rendering of images containing text or fine detail
  - Cache decoded image overlays across UpdateParams calls — only re-decode when data_base64 or target rect dimensions change, reusing existing Arc<DecodedOverlay> otherwise
  - Lock aspect ratio for image layers during resize (same as video layers)
  - Show actual image thumbnail in compositor canvas UI for easier selection; switch border from dotted to solid, remove crosshatch pattern

  Co-Authored-By: Claudio Costa <cstcld91@gmail.com>

* fix(compositor): guard against index mismatch in image overlay cache

  Use old_imgs.get(i) instead of old_imgs[i] to avoid a panic when a previous decode_image_overlay call failed, leaving old_imgs shorter than old_cfgs.

  Co-Authored-By: Claudio Costa <cstcld91@gmail.com>

* fix(compositor): address review — proper index mapping for cache, broader MIME detection

  - Build a HashMap<usize, &Arc<DecodedOverlay>> by walking old configs and decoded overlays in tandem, so cache lookups use config index rather than assuming positional alignment (which breaks when a previous decode failed)
  - Add WebP and GIF magic-byte detection for image thumbnail data URIs

  Co-Authored-By: Claudio Costa <cstcld91@gmail.com>

* style(compositor): apply cargo fmt formatting

  Co-Authored-By: Claudio Costa <cstcld91@gmail.com>

* fix(compositor): fix HashMap type and double-deref in overlay cache

  Co-Authored-By: Claudio Costa <cstcld91@gmail.com>

* fix(compositor): content-keyed overlay cache with dimension-based matching

  Replace incorrect positional index mapping with a content-keyed cache that matches decoded overlays to configs by comparing prescaled bitmap dimensions against the config's target rect.

  This correctly handles the case where a mid-list decode failure makes the decoded slice shorter than the config vec — failed configs are skipped (not consumed) because their target dimensions won't match the next decoded overlay.

  Co-Authored-By: Claudio Costa <cstcld91@gmail.com>

* fix(compositor): default image overlay z-index to 200 so it renders above video layers

  Co-Authored-By: Claudio Costa <cstcld91@gmail.com>

* style(compositor): add rationale comment for clippy::expect_used suppression

  Co-Authored-By: Claudio Costa <cstcld91@gmail.com>

* style: apply formatting fixes (cargo fmt + prettier)

  Co-Authored-By: Claudio Costa <cstcld91@gmail.com>

* fix(compositor): address review findings #1-#4, #7

  - Replace dimension-matching cache heuristic with index-based mapping using image_overlay_cfg_indices (finding #1)
  - Only update x/y position on cache hit, not full rect clone (finding #2)
  - Fix MIME sniffing comment wording to 'base64-encoded magic bytes', add BMP detection (finding #3)
  - Switch from data-URI to URL.createObjectURL with cleanup for image overlay thumbnails (finding #4)
  - Change SAFETY comment to Invariant in prescale_rgba (finding #7)

  Co-Authored-By: Claudio Costa <cstcld91@gmail.com>

* fix(compositor): preserve image aspect ratio, add image layer controls, optimize base64 decode

  - Backend: prescale images with aspect-ratio preservation (scale-to-fit instead of stretch-to-fill) and centre within the target rect.
  - Backend: re-centre cached overlays on position update.
  - Frontend: detect natural image dimensions on add and set initial rect to match source aspect ratio.
  - Frontend: add opacity/rotation slider controls for selected image overlays (matching video and text layer controls).
  - Frontend: fix findAnyLayer to pass through rotationDegrees and zIndex for image overlays instead of hardcoding 0.
  - Frontend: replace O(n) atob + byte-by-byte loop with fetch(data-URI) for more efficient base64-to-blob conversion.
  - Frontend: remove BMP MIME detection (inconsistent browser support).
  - Frontend: add z-index band allocation comments (video 0-99, text 100-199, image 200+).

  Co-Authored-By: Claudio Costa <cstcld91@gmail.com>

* fix(compositor): apply rotation transform to image overlay layer in canvas preview

  Co-Authored-By: Claudio Costa <cstcld91@gmail.com>

* fix(compositor): include rotationDegrees and zIndex in overlay sync change detection

  Add rotationDegrees and zIndex to the image overlay change-detection comparisons in the params sync effect so that YAML or backend changes to these fields are reflected in the UI. Also add the missing zIndex check to the text overlay change detection for consistency.

  Co-Authored-By: Claudio Costa <cstcld91@gmail.com>

---------

Co-authored-by: StreamKit Devin <devin@streamkit.dev>
Co-authored-by: Claudio Costa <cstcld91@gmail.com>
1 parent 787e0cb commit 169dd9a

5 files changed, +300 -87 lines changed
crates/nodes/src/video/compositor/mod.rs

Lines changed: 84 additions & 5 deletions
@@ -34,6 +34,7 @@ use config::CompositorConfig;
 use kernel::{CompositeResult, CompositeWorkItem, LayerSnapshot};
 use overlay::{decode_image_overlay, rasterize_text_overlay, DecodedOverlay};
 use schemars::schema_for;
+use std::collections::HashMap;
 use std::sync::Arc;
 use streamkit_core::control::NodeControlMessage;
 use streamkit_core::pins::PinManagementMessage;
@@ -252,8 +253,16 @@ impl ProcessorNode for CompositorNode {

         // Decode image overlays (once). Wrap in Arc so per-frame clones
         // into the work item are cheap reference-count bumps.
+        //
+        // `image_overlay_cfg_indices` records, for each successfully decoded
+        // overlay, the index of the originating `ImageOverlayConfig` in
+        // `config.image_overlays`. This allows the cache in
+        // `apply_update_params` to map decoded bitmaps back to their configs
+        // without relying on dimension-matching heuristics.
         let mut image_overlays_vec: Vec<Arc<DecodedOverlay>> =
             Vec::with_capacity(self.config.image_overlays.len());
+        let mut image_overlay_cfg_indices: Vec<usize> =
+            Vec::with_capacity(self.config.image_overlays.len());
         for (i, img_cfg) in self.config.image_overlays.iter().enumerate() {
             match decode_image_overlay(img_cfg) {
                 Ok(overlay) => {
@@ -268,6 +277,7 @@ impl ProcessorNode for CompositorNode {
                         overlay.rect.height,
                     );
                     image_overlays_vec.push(Arc::new(overlay));
+                    image_overlay_cfg_indices.push(i);
                 },
                 Err(e) => {
                     tracing::warn!("Failed to decode image overlay {}: {}", i, e);
@@ -408,6 +418,7 @@ impl ProcessorNode for CompositorNode {
                             Self::apply_update_params(
                                 &mut self.config,
                                 &mut image_overlays,
+                                &mut image_overlay_cfg_indices,
                                 &mut text_overlays,
                                 params,
                                 &mut stats_tracker,
@@ -489,6 +500,7 @@ impl ProcessorNode for CompositorNode {
                             Self::apply_update_params(
                                 &mut self.config,
                                 &mut image_overlays,
+                                &mut image_overlay_cfg_indices,
                                 &mut text_overlays,
                                 params,
                                 &mut stats_tracker,
@@ -529,6 +541,7 @@ impl ProcessorNode for CompositorNode {
                             Self::apply_update_params(
                                 &mut self.config,
                                 &mut image_overlays,
+                                &mut image_overlay_cfg_indices,
                                 &mut text_overlays,
                                 params,
                                 &mut stats_tracker,
@@ -651,6 +664,7 @@ impl CompositorNode {
     fn apply_update_params(
         config: &mut CompositorConfig,
         image_overlays: &mut Arc<[Arc<DecodedOverlay>]>,
+        image_overlay_cfg_indices: &mut Vec<usize>,
         text_overlays: &mut Arc<[Arc<DecodedOverlay>]>,
         params: serde_json::Value,
         stats_tracker: &mut NodeStatsTracker,
@@ -666,17 +680,82 @@ impl CompositorNode {
                     "Updating compositor config"
                 );

-                // Always re-decode image overlays (content may have changed
-                // even if the count is the same).
-                let mut new_image_overlays =
+                // Re-decode image overlays only when their content or
+                // target rect changed. When only video-layer positions
+                // are updated (the common case) the existing decoded
+                // bitmaps are reused via Arc, avoiding redundant base64
+                // decode + bilinear prescale work.
+                //
+                // The cache is keyed by (data_base64, width, height).
+                // `image_overlay_cfg_indices` provides an exact mapping
+                // from each decoded overlay back to its originating
+                // config index, eliminating any heuristic guessing
+                // about which decoded bitmap belongs to which config.
+                let old_imgs = image_overlays.clone();
+                let old_cfgs = &config.image_overlays;
+
+                let mut cache: HashMap<(&str, u32, u32), Vec<Arc<DecodedOverlay>>> =
+                    HashMap::new();
+
+                // Each decoded overlay has a recorded config index in
+                // `image_overlay_cfg_indices`. Use this to look up
+                // the originating config directly — no dimension
+                // matching needed.
+                for (dec_idx, decoded) in old_imgs.iter().enumerate() {
+                    if let Some(&cfg_idx) = image_overlay_cfg_indices.get(dec_idx) {
+                        if let Some(old_cfg) = old_cfgs.get(cfg_idx) {
+                            let key = (
+                                old_cfg.data_base64.as_str(),
+                                old_cfg.transform.rect.width,
+                                old_cfg.transform.rect.height,
+                            );
+                            cache.entry(key).or_default().push(Arc::clone(decoded));
+                        }
+                    }
+                }
+
+                let mut new_image_overlays: Vec<Arc<DecodedOverlay>> =
+                    Vec::with_capacity(new_config.image_overlays.len());
+                let mut new_cfg_indices: Vec<usize> =
                     Vec::with_capacity(new_config.image_overlays.len());
-                for img_cfg in &new_config.image_overlays {
+                for (new_idx, img_cfg) in new_config.image_overlays.iter().enumerate() {
+                    let key = (
+                        img_cfg.data_base64.as_str(),
+                        img_cfg.transform.rect.width,
+                        img_cfg.transform.rect.height,
+                    );
+                    if let Some(entries) = cache.get_mut(&key) {
+                        if let Some(existing) = entries.pop() {
+                            // Content and target dimensions unchanged —
+                            // reuse the decoded bitmap. The overlay's
+                            // rect may be smaller than the config rect
+                            // due to aspect-ratio-preserving prescale,
+                            // so re-centre within the new config rect.
+                            let mut ov = (*existing).clone();
+                            let cfg_w = img_cfg.transform.rect.width.cast_signed();
+                            let cfg_h = img_cfg.transform.rect.height.cast_signed();
+                            let ov_w = ov.rect.width.cast_signed();
+                            let ov_h = ov.rect.height.cast_signed();
+                            ov.rect.x = img_cfg.transform.rect.x + (cfg_w - ov_w) / 2;
+                            ov.rect.y = img_cfg.transform.rect.y + (cfg_h - ov_h) / 2;
+                            ov.opacity = img_cfg.transform.opacity;
+                            ov.rotation_degrees = img_cfg.transform.rotation_degrees;
+                            ov.z_index = img_cfg.transform.z_index;
+                            new_image_overlays.push(Arc::new(ov));
+                            new_cfg_indices.push(new_idx);
+                            continue;
+                        }
+                    }
                     match decode_image_overlay(img_cfg) {
-                        Ok(ov) => new_image_overlays.push(Arc::new(ov)),
+                        Ok(ov) => {
+                            new_image_overlays.push(Arc::new(ov));
+                            new_cfg_indices.push(new_idx);
+                        },
                         Err(e) => tracing::warn!("Image overlay decode failed: {e}"),
                     }
                 }
                 *image_overlays = Arc::from(new_image_overlays);
+                *image_overlay_cfg_indices = new_cfg_indices;

                 // Re-rasterize text overlays.
                 let new_text_overlays: Vec<Arc<DecodedOverlay>> = new_config
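
The caching scheme in the hunk above can be sketched in isolation. This is a minimal illustration, not the crate's code: `Cfg`, `Decoded`, `decode`, and `rebuild` are hypothetical stand-ins for `ImageOverlayConfig`, `DecodedOverlay`, `decode_image_overlay`, and the body of `apply_update_params`. Unlike the real code, a cache hit here reuses the `Arc` directly instead of cloning the overlay to refresh position, opacity, rotation, and z-index.

```rust
use std::collections::HashMap;
use std::sync::Arc;

/// Hypothetical stand-in for `ImageOverlayConfig` (cache-relevant fields only).
#[derive(Clone)]
pub struct Cfg {
    pub data_base64: String,
    pub width: u32,
    pub height: u32,
}

/// Hypothetical stand-in for `DecodedOverlay`.
#[derive(Debug)]
pub struct Decoded {
    pub width: u32,
    pub height: u32,
}

/// Placeholder decoder: fails on empty data, otherwise "decodes" to the target size.
pub fn decode(cfg: &Cfg) -> Result<Decoded, String> {
    if cfg.data_base64.is_empty() {
        return Err("empty image data".to_string());
    }
    Ok(Decoded { width: cfg.width, height: cfg.height })
}

/// Rebuilds the decoded list for `new_cfgs`, reusing any entry of `old_decoded`
/// whose originating config has the same (data, width, height).
/// `old_indices[i]` is the index in `old_cfgs` that produced `old_decoded[i]`.
pub fn rebuild(
    old_cfgs: &[Cfg],
    old_decoded: &[Arc<Decoded>],
    old_indices: &[usize],
    new_cfgs: &[Cfg],
) -> (Vec<Arc<Decoded>>, Vec<usize>) {
    // Map each old decoded overlay back to its config through the recorded index,
    // so a failed decode in the middle of the list cannot shift the pairing.
    let mut cache: HashMap<(&str, u32, u32), Vec<Arc<Decoded>>> = HashMap::new();
    for (decoded, &cfg_idx) in old_decoded.iter().zip(old_indices) {
        if let Some(cfg) = old_cfgs.get(cfg_idx) {
            let key = (cfg.data_base64.as_str(), cfg.width, cfg.height);
            cache.entry(key).or_default().push(Arc::clone(decoded));
        }
    }

    let mut out = Vec::with_capacity(new_cfgs.len());
    let mut indices = Vec::with_capacity(new_cfgs.len());
    for (i, cfg) in new_cfgs.iter().enumerate() {
        let key = (cfg.data_base64.as_str(), cfg.width, cfg.height);
        if let Some(hit) = cache.get_mut(&key).and_then(|entries| entries.pop()) {
            // Cache hit: no decode or prescale work.
            out.push(hit);
            indices.push(i);
            continue;
        }
        // Cache miss: decode. A failure leaves a gap that `indices` records.
        if let Ok(decoded) = decode(cfg) {
            out.push(Arc::new(decoded));
            indices.push(i);
        }
    }
    (out, indices)
}
```

With three configs where the middle one fails to decode, the returned index list is `[0, 2]`; passing that list back in on the next update is what lets the second decoded overlay be paired with config 2 rather than config 1.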

crates/nodes/src/video/compositor/overlay.rs

Lines changed: 51 additions & 25 deletions
@@ -47,18 +47,49 @@ pub fn decode_image_overlay(config: &ImageOverlayConfig) -> Result<DecodedOverla
     let target_w = config.transform.rect.width;
     let target_h = config.transform.rect.height;

-    // Pre-scale the decoded image to the target rect dimensions so that
-    // the per-frame `scale_blit_rgba_rotated` call hits the identity-scale
-    // fast path (direct memcpy) instead of doing nearest-neighbor scaling
-    // every frame.
+    // Pre-scale the decoded image to fit within the target rect while
+    // preserving the source aspect ratio. This ensures the per-frame
+    // `scale_blit_rgba_rotated` call hits the identity-scale fast path
+    // (direct memcpy) and the image is never stretched.
     if target_w > 0 && target_h > 0 && (w != target_w || h != target_h) {
+        // Aspect-ratio-preserving fit: scale so the image fits inside
+        // the target box without distortion.
+        #[allow(clippy::cast_precision_loss)]
+        let scale = {
+            let sw = w as f32;
+            let sh = h as f32;
+            (target_w as f32 / sw).min(target_h as f32 / sh)
+        };
+        #[allow(
+            clippy::cast_possible_truncation,
+            clippy::cast_sign_loss,
+            clippy::cast_precision_loss
+        )]
+        let fit_w = ((w as f32 * scale).round() as u32).max(1);
+        #[allow(
+            clippy::cast_possible_truncation,
+            clippy::cast_sign_loss,
+            clippy::cast_precision_loss
+        )]
+        let fit_h = ((h as f32 * scale).round() as u32).max(1);
+
         let raw = rgba.into_raw();
-        let scaled = prescale_rgba(&raw, w, h, target_w, target_h);
+        let scaled = prescale_rgba(&raw, w, h, fit_w, fit_h);
+
+        // Adjust the rect to match the fitted dimensions so the blit
+        // stays on the identity-scale path and the image is centred
+        // within the originally requested area.
+        let mut rect = config.transform.rect.clone();
+        rect.x += (target_w.cast_signed() - fit_w.cast_signed()) / 2;
+        rect.y += (target_h.cast_signed() - fit_h.cast_signed()) / 2;
+        rect.width = fit_w;
+        rect.height = fit_h;
+
         Ok(DecodedOverlay {
             rgba_data: scaled,
-            width: target_w,
-            height: target_h,
-            rect: config.transform.rect.clone(),
+            width: fit_w,
+            height: fit_h,
+            rect,
             opacity: config.transform.opacity,
             rotation_degrees: config.transform.rotation_degrees,
             z_index: config.transform.z_index,
@@ -76,24 +107,19 @@ pub fn decode_image_overlay(config: &ImageOverlayConfig) -> Result<DecodedOverla
     }
 }

-/// Nearest-neighbor scale an RGBA8 buffer from `(sw, sh)` to `(dw, dh)`.
-/// Used once at config time so the per-frame blit is a 1:1 copy.
+/// Bilinear-filtered scale of an RGBA8 buffer from `(sw, sh)` to `(dw, dh)`.
+/// Uses the `image` crate's `resize` with `Triangle` (bilinear) filter for
+/// high-quality prescaling — much better than nearest-neighbor for images
+/// containing text or fine detail. Called once at config time so the
+/// per-frame blit is a 1:1 copy.
 fn prescale_rgba(src: &[u8], sw: u32, sh: u32, dw: u32, dh: u32) -> Vec<u8> {
-    let sw = sw as usize;
-    let sh = sh as usize;
-    let dw = dw as usize;
-    let dh = dh as usize;
-    let mut out = vec![0u8; dw * dh * 4];
-    for dy in 0..dh {
-        let sy = dy * sh / dh;
-        for dx in 0..dw {
-            let sx = dx * sw / dw;
-            let si = (sy * sw + sx) * 4;
-            let di = (dy * dw + dx) * 4;
-            out[di..di + 4].copy_from_slice(&src[si..si + 4]);
-        }
-    }
-    out
+    // Invariant: caller guarantees src.len() == sw * sh * 4.
+    #[allow(clippy::expect_used)]
+    // from_raw only fails if buffer length != w*h*4; caller guarantees this
+    let src_img = image::RgbaImage::from_raw(sw, sh, src.to_vec())
+        .expect("prescale_rgba: source dimensions do not match buffer length");
+    let resized = image::imageops::resize(&src_img, dw, dh, image::imageops::FilterType::Triangle);
+    resized.into_raw()
 }

 // ── Bundled default font ────────────────────────────────────────────────────
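
The scale-to-fit and centring arithmetic from `decode_image_overlay` can be checked on its own. The helper below is a hypothetical extraction (the name `fit_within` and the tuple return are assumptions, not part of the crate); it uses plain `as` casts where the diff uses `cast_signed()`, and it assumes non-zero source dimensions.

```rust
/// Hypothetical helper mirroring the fit logic in the diff above.
/// Returns `(dx, dy, fit_w, fit_h)`: the fitted size and the offset that
/// centres it inside a `target_w` x `target_h` box.
/// Assumes `src_w` and `src_h` are non-zero.
pub fn fit_within(src_w: u32, src_h: u32, target_w: u32, target_h: u32) -> (i32, i32, u32, u32) {
    // Use the smaller of the two axis ratios so the image fits on both axes.
    let scale = (target_w as f32 / src_w as f32).min(target_h as f32 / src_h as f32);
    // Round to the nearest pixel and never collapse to zero.
    let fit_w = ((src_w as f32 * scale).round() as u32).max(1);
    let fit_h = ((src_h as f32 * scale).round() as u32).max(1);
    // Split the leftover space evenly (integer division, rounds toward zero).
    let dx = (target_w as i32 - fit_w as i32) / 2;
    let dy = (target_h as i32 - fit_h as i32) / 2;
    (dx, dy, fit_w, fit_h)
}
```

For example, a 1920x1080 source in a 400x400 target gives a 400x225 bitmap offset by `(0, 87)`, and a 100x200 source in a 50x50 target gives 25x50 offset by `(12, 0)`. The offsets are added to the config rect's `x` and `y`, which is also what the cache-hit path in `apply_update_params` recomputes when only the position changes.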

ui/src/components/CompositorCanvas.tsx

Lines changed: 58 additions & 38 deletions
@@ -111,18 +111,6 @@ const InlineTextInput = styled.textarea`
   font-family: inherit;
 `;

-/** Icon badge for image overlay layers */
-const ImageBadge = styled.div`
-  position: absolute;
-  inset: 0;
-  display: flex;
-  align-items: center;
-  justify-content: center;
-  pointer-events: none;
-  z-index: 1;
-  opacity: 0.5;
-`;
-
 const LayerDimensions = styled.div`
   position: absolute;
   bottom: 2px;
@@ -201,26 +189,6 @@ const ResizeHandles: React.FC<{
 ));
 ResizeHandles.displayName = 'ResizeHandles';

-// ── Image icon SVG ──────────────────────────────────────────────────────────
-
-const ImageIcon: React.FC<{ size?: number }> = ({ size = 24 }) => (
-  <svg
-    width={size}
-    height={size}
-    viewBox="0 0 24 24"
-    fill="none"
-    stroke="currentColor"
-    strokeWidth="1.5"
-    strokeLinecap="round"
-    strokeLinejoin="round"
-    style={{ color: 'rgba(255,255,255,0.5)' }}
-  >
-    <rect x="3" y="3" width="18" height="18" rx="2" ry="2" />
-    <circle cx="8.5" cy="8.5" r="1.5" />
-    <polyline points="21 15 16 10 5 21" />
-  </svg>
-);
-
 // ── Video input layer ───────────────────────────────────────────────────────

 const VideoLayer: React.FC<{
@@ -437,6 +405,46 @@ const ImageOverlayLayer: React.FC<{
   const borderColor = isSelected ? 'var(--sk-primary)' : `hsla(${hue}, 70%, 65%, 0.8)`;
   const bgColor = isSelected ? `hsla(${hue}, 60%, 50%, 0.25)` : `hsla(${hue}, 60%, 50%, 0.12)`;

+  // Build a blob URL for the image thumbnail. Using fetch() with a
+  // data-URI lets the browser decode the base64 natively, which is
+  // more efficient than the manual atob() + byte-by-byte Uint8Array
+  // copy for large images.
+  //
+  // MIME detection: we inspect the base64-encoded magic bytes at the
+  // start of the string to pick the correct MIME type. The fallback
+  // is JPEG, which covers the common `/9j/` prefix and variants.
+  const [imgSrc, setImgSrc] = useState<string | undefined>();
+
+  useEffect(() => {
+    if (!overlay.dataBase64) {
+      setImgSrc(undefined);
+      return;
+    }
+    let mime = 'image/jpeg'; // default fallback
+    if (overlay.dataBase64.startsWith('iVBOR')) mime = 'image/png';
+    else if (overlay.dataBase64.startsWith('R0lGOD')) mime = 'image/gif';
+    else if (overlay.dataBase64.startsWith('UklGR')) mime = 'image/webp';
+
+    let cancelled = false;
+    let url: string | undefined;
+
+    fetch(`data:${mime};base64,${overlay.dataBase64}`)
+      .then((r) => r.blob())
+      .then((blob) => {
+        if (cancelled) return;
+        url = URL.createObjectURL(blob);
+        setImgSrc(url);
+      })
+      .catch(() => {
+        // Ignore decode failures — no thumbnail shown.
+      });
+
+    return () => {
+      cancelled = true;
+      if (url) URL.revokeObjectURL(url);
+    };
+  }, [overlay.dataBase64]);
+
   return (
     <LayerBox
       ref={layerRef}
@@ -447,19 +455,31 @@ const ImageOverlayLayer: React.FC<{
         width: overlay.width,
         height: overlay.height,
         opacity: overlay.visible ? overlay.opacity : 0.2,
+        transform:
+          overlay.rotationDegrees !== 0 ? `rotate(${overlay.rotationDegrees}deg)` : undefined,
         zIndex: overlay.zIndex ?? 200 + index,
-        border: `2px dotted ${borderColor}`,
+        border: `2px solid ${borderColor}`,
         background: bgColor,
         filter: overlay.visible ? undefined : 'grayscale(0.6)',
-        backgroundImage:
-          'repeating-linear-gradient(45deg, transparent, transparent 6px, rgba(255,255,255,0.04) 6px, rgba(255,255,255,0.04) 12px)',
       }}
       onPointerDown={handlePointerDown}
     >
+      {imgSrc && (
+        <img
+          src={imgSrc}
+          alt={`Image overlay ${index}`}
+          style={{
+            position: 'absolute',
+            inset: 0,
+            width: '100%',
+            height: '100%',
+            objectFit: 'contain',
+            pointerEvents: 'none',
+            opacity: 0.85,
+          }}
+        />
+      )}
       <LayerLabel>IMG #{index}</LayerLabel>
-      <ImageBadge>
-        <ImageIcon size={24} />
-      </ImageBadge>
       {isSelected && <ResizeHandles layerId={overlay.id} onResizeStart={onResizeStart} />}
     </LayerBox>
   );
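
The MIME sniffing in the thumbnail effect relies on the fact that a file's leading magic bytes always encode to the same leading base64 characters. The sketch below restates that check in Rust, to stay in one language with the backend sketches; the function name `sniff_mime` is hypothetical and the component itself does this inline in TypeScript.

```rust
/// Hypothetical helper: guess an image MIME type from the first characters
/// of its base64 encoding, using the same prefixes as the component above.
pub fn sniff_mime(data_base64: &str) -> &'static str {
    if data_base64.starts_with("iVBOR") {
        // PNG signature bytes 0x89 'P' 'N' 'G' encode to "iVBOR...".
        "image/png"
    } else if data_base64.starts_with("R0lGOD") {
        // "GIF8" (GIF87a / GIF89a) encodes to "R0lGOD...".
        "image/gif"
    } else if data_base64.starts_with("UklGR") {
        // "RIFF" container header; only the container is checked here,
        // not the "WEBP" tag that follows the size field.
        "image/webp"
    } else {
        // Fallback, which also covers the JPEG prefix "/9j/" (bytes FF D8 FF).
        "image/jpeg"
    }
}
```

One caveat of matching on `UklGR` alone is that any RIFF container would be labelled WebP. For a thumbnail preview that is a low-stakes miss, since a wrong guess only results in no thumbnail being shown.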