perf(text-buffer-view): stream word wrap for large chunks #471
Note: I wasn't sure whether to open this as an issue or a PR, so I went with a PR,
but feel free to close it if you want to discuss the approach first.
The gist of this change is to improve word wrapping performance on large
single-line chunks by switching to a streaming approach instead of
precomputing all word boundaries.
Before: main.mp4
After: word-wrap.mp4
Word wrapping large single-line files (minified JavaScript, continuous
logs) was slow. The old getWrapOffsets() precomputed all word boundary
positions for the entire chunk before wrapping began. Multi-megabyte
files produced arrays with tens of thousands of entries, most unused
since wrapping only needs boundaries within the current wrap width.
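To illustrate the cost described above, here is a minimal Python sketch of a precompute-then-wrap strategy. It is not the project's code: the names precompute_wrap_offsets and wrap_with_offsets are hypothetical, boundaries are assumed to be whitespace only, and every character is assumed to be one column wide.

```python
from bisect import bisect_right

def precompute_wrap_offsets(text):
    """Offsets just past every whitespace character in the whole chunk.

    Hypothetical stand-in for a precomputed boundary table; the list grows
    with the chunk, no matter how few entries each wrapped line needs.
    """
    return [i + 1 for i, ch in enumerate(text) if ch.isspace()]

def wrap_with_offsets(text, wrap_width):
    """Wrap `text` using the precomputed table (one column per character)."""
    offsets = precompute_wrap_offsets(text)  # full pass before the first line
    lines, pos = [], 0
    while pos < len(text):
        limit = pos + wrap_width
        if limit >= len(text):
            nxt = len(text)  # the remainder fits on one line
        else:
            k = bisect_right(offsets, limit) - 1  # last boundary <= limit
            # Fall back to a hard break when no boundary lies in the window.
            nxt = offsets[k] if k >= 0 and offsets[k] > pos else limit
        lines.append(text[pos:nxt])
        pos = nxt
    return lines
```

Each wrapped line consults only the boundaries near its own window, yet the whole table is built up front.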
Added a hybrid strategy based on chunk size. Chunks larger than
64KB now use findWordWrapPosition(), which scans only up to wrap_width
columns per line and returns the last word boundary within that
window. This stops early instead of walking the full chunk. Smaller
chunks keep the cached approach where the upfront cost pays off
through cache locality.
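The hybrid strategy can be sketched roughly as follows. This is an illustrative Python sketch under simplifying assumptions, not the actual implementation: is_break, choose_strategy, and wrap_streaming are hypothetical names, the break characters are a guess, and each character is treated as one column wide. Only the 64KB cutoff and the idea behind findWordWrapPosition() come from the description above.

```python
STREAMING_THRESHOLD = 64 * 1024  # bytes; the 64KB cutoff from the description

def choose_strategy(chunk_len):
    """Pick a wrapping strategy by chunk size (hypothetical helper)."""
    return "streaming" if chunk_len > STREAMING_THRESHOLD else "cached"

def is_break(ch):
    """Assumed word-boundary test: whitespace plus a few punctuation marks."""
    return ch.isspace() or ch in "-/.,;:"

def find_word_wrap_position(text, start, wrap_width):
    """Return the index where the line beginning at `start` should end.

    Scans at most `wrap_width` characters and remembers the last boundary
    seen in that window, so the rest of the chunk is never touched.
    """
    end = min(start + wrap_width, len(text))
    if end == len(text):
        return end  # the remainder fits on one line
    last_break = -1
    for i in range(start, end):
        if is_break(text[i]):
            last_break = i + 1  # break just after the boundary character
    # Hard break at the window edge when no boundary was found.
    return last_break if last_break > start else end

def wrap_streaming(text, wrap_width):
    """Wrap `text` line by line using only the bounded scan above."""
    lines, pos = [], 0
    while pos < len(text):
        nxt = find_word_wrap_position(text, pos, wrap_width)
        lines.append(text[pos:nxt])
        pos = nxt
    return lines
```

In this sketch, the work per wrapped line is bounded by wrap_width rather than by the chunk length, which is the property the streaming path relies on.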
Note: wrap-break detection now honors width_method in the cached path.
This changes semantics for .wcwidth and .no_zwj (per-codepoint breaks;
ZWJ forces a break), while .unicode behavior is unchanged. This aligns
cached offsets with the streaming path and cursor movement.
The streaming path uses per-codepoint widths without full grapheme
state, so complex emoji or Indic sequences may wrap differently than
cached, but only in chunks over 64KB containing such sequences at wrap
boundaries.
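A small Python sketch of why per-codepoint widths can disagree with grapheme-aware measurement. The width rule here is a simplification built on the standard unicodedata module, not the project's width tables, and codepoint_width and codepoints_that_fit are hypothetical names.

```python
import unicodedata

def codepoint_width(ch):
    """Simplified column width of a single codepoint (assumed rule)."""
    if unicodedata.category(ch) in ("Mn", "Me", "Cf"):
        return 0  # combining marks and format characters such as ZWJ
    return 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1

def per_codepoint_width(s):
    """Sum of individual codepoint widths, with no grapheme state."""
    return sum(codepoint_width(ch) for ch in s)

def codepoints_that_fit(s, wrap_width):
    """How many leading codepoints fit within `wrap_width` columns."""
    total = 0
    for count, ch in enumerate(s):
        w = codepoint_width(ch)
        if total + w > wrap_width:
            return count
        total += w
    return len(s)

# Family emoji: man + ZWJ + woman + ZWJ + girl, i.e. five codepoints.
family = "\U0001F468\u200d\U0001F469\u200d\U0001F467"
print(per_codepoint_width(family))     # summed per-codepoint width
print(codepoints_that_fit(family, 4))  # window can end inside the sequence
```

A grapheme-aware measurement would typically treat the whole ZWJ sequence as a single cluster, so a window that ends inside the sequence is where the two paths can diverge.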
Benchmarks:
Baseline 31a5cc2 (main) -> Current c688b4a (perf/word-wrap)