Description
The chunked and non-chunked streaming layout paths (layoutNextLine, layoutNextLineRange, and the internal simple layout functions) contain several edge-case bugs in how line starts are calculated, both when a non-default whiteSpace mode (like pre-wrap) is used and when ZWSP, spaces, and soft hyphens appear next to each other. Additionally, these streaming layout functions drift out of sync with the batched layoutWithLines function when simpleLineWalkFastPath is disabled (e.g. whiteSpace: 'pre-wrap' or wordBreak: 'keep-all').
Reproducing the bug
Save the following code as test_bug.ts and run it with bun test_bug.ts:
import { prepareWithSegments, layoutWithLines, layoutNextLine } from './src/layout.ts';
// Mock document and canvas so the reproduction works outside a browser.
// The fake measureText reports 10px per UTF-16 code unit.
globalThis.document = {
  body: { appendChild: () => {}, removeChild: () => {} },
  createElement: () => ({
    style: {},
    getContext: () => ({ measureText: (t: string) => ({ width: t.length * 10 }) }),
    getBoundingClientRect: () => ({ width: 10 }),
  }),
} as any;
const text = "بام \u200DB bا \u00ADb\u060C b\f \u061F\uD83D\uDE80\u061F\u0639 \u0631 \u672C \u061F\na a A\u200B \u8A9E \u8A9E\u200D\u062D";
const font = "16px Inter";
const p1 = prepareWithSegments(text, font, {whiteSpace: 'normal', wordBreak: 'normal'}); // normal
const lwl1 = layoutWithLines(p1, 56.57, 20);
// Stream the same text line by line, resuming from each line's end cursor.
const streamingLines = [];
let cursor = { segmentIndex: 0, graphemeIndex: 0 };
while (true) {
  const line = layoutNextLine(p1, cursor, 56.57);
  if (!line) break;
  streamingLines.push(line);
  cursor = line.end;
}
// Prints true when the batched and streaming paths disagree on the line count.
console.log("layoutWithLines vs layoutNextLine length mismatch:", lwl1.lines.length !== streamingLines.length);
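Because one of the reported symptoms is an infinite loop, the bare while (true) in the reproduction can hang. The sketch below is one way to guard such a loop. It is not part of the library: collectLinesGuarded, the aborted flag, and the maxLines cap are hypothetical names, and the only thing it assumes about a streamed line is the end cursor with segmentIndex and graphemeIndex that the reproduction already uses.

```typescript
// Hypothetical shapes: only `end`, `segmentIndex`, and `graphemeIndex`
// appear in the report; everything else here is illustrative.
type Cursor = { segmentIndex: number; graphemeIndex: number };
type StreamedLine = { end: Cursor };

// Collects lines from any "next line" function. Stops early and sets
// `aborted` if a line's end cursor does not move past the start cursor,
// or if more than `maxLines` lines are produced.
function collectLinesGuarded<L extends StreamedLine>(
  nextLine: (cursor: Cursor) => L | null,
  maxLines = 10_000,
): { lines: L[]; aborted: boolean } {
  const lines: L[] = [];
  let cursor: Cursor = { segmentIndex: 0, graphemeIndex: 0 };
  while (lines.length < maxLines) {
    const line = nextLine(cursor);
    if (!line) return { lines, aborted: false };
    lines.push(line);
    const advanced =
      line.end.segmentIndex > cursor.segmentIndex ||
      (line.end.segmentIndex === cursor.segmentIndex &&
        line.end.graphemeIndex > cursor.graphemeIndex);
    if (!advanced) return { lines, aborted: true };
    cursor = line.end;
  }
  return { lines, aborted: true };
}
```

In the reproduction this would be called as collectLinesGuarded((c) => layoutNextLine(p1, c, 56.57)), with the resulting lines.length compared against lwl1.lines.length.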
Root Cause
normalizeSimpleLineStartSegmentIndex and normalizeLineStartInChunk are hardcoded to skip past 'space', 'zero-width-break', and 'soft-hyphen' by advancing segmentIndex unconditionally:
while (segmentIndex < prepared.widths.length) {
  const kind = prepared.kinds[segmentIndex]!
  if (kind !== 'space' && kind !== 'zero-width-break' && kind !== 'soft-hyphen') break
  segmentIndex++
}
- However, other whiteSpace configurations introduce different kinds (e.g. 'preserved-space', 'hard-break', 'tab') that these normalization functions do not handle properly. With { whiteSpace: 'pre-wrap' }, spaces become 'preserved-space' instead of 'space'.
- This skipping logic causes layoutNextLine to "eat" the wrong tokens when looking for the beginning of the next line, or conversely to leave in place tokens it should skip. layoutNextLine then starts on the wrong segment/grapheme offset, which causes infinite loops, dropped lines, or missing or extra content.
- These functions also mishandle zero-width-break characters followed by spaces when determining where the next streamed line should begin, which causes a mismatch with the batch API.
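To make the first point concrete, here is a standalone model of the skip loop quoted above, run over a plain array instead of prepared.kinds. It is only a sketch: skipLineStart, the Kind union, and the 'text' kind are hypothetical stand-ins, and it shows how the hardcoded check treats the two modes differently without claiming which start index the batch path considers correct.

```typescript
// Kinds named in the report, plus a hypothetical 'text' kind for ordinary
// content. The real library may define more or different kinds.
type Kind =
  | 'text'
  | 'space'
  | 'zero-width-break'
  | 'soft-hyphen'
  | 'preserved-space'
  | 'hard-break'
  | 'tab';

// Same condition as the quoted loop, applied to a plain array of kinds.
function skipLineStart(kinds: Kind[], segmentIndex: number): number {
  while (segmentIndex < kinds.length) {
    const kind = kinds[segmentIndex]!;
    if (kind !== 'space' && kind !== 'zero-width-break' && kind !== 'soft-hyphen') break;
    segmentIndex++;
  }
  return segmentIndex;
}

// The same token sequence as it might be classified in each mode.
const normalKinds: Kind[] = ['text', 'space', 'zero-width-break', 'text'];
const preWrapKinds: Kind[] = ['text', 'preserved-space', 'zero-width-break', 'text'];

console.log(skipLineStart(normalKinds, 1));  // 3: space and zero-width-break are both skipped
console.log(skipLineStart(preWrapKinds, 1)); // 1: the loop stops at 'preserved-space'
console.log(skipLineStart(preWrapKinds, 2)); // 3: but a zero-width-break is still skipped
```

The last two calls show that under pre-wrap the result depends on exactly where the previous line ended, so the streamed start can differ from the batched one if the two paths disagree on that position.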