@bglw bglw commented Dec 29, 2025

  1. When searching a short enough prefix, such as doc, it was possible for Pagefind to only retrieve the index chunks for dock* and not docu*, meaning you could get a result for docker but not document. This has been fixed to load all extensions.
  2. Fixed some UTF-8 handling in how chunks were compared against the search query, which could cause chunks to not be loaded for certain words.
  3. Pagefind's chunks span between a from word and a to word, for example apple -> cinnamon. These are then trimmed down based on neighbors, so it might be stored as app -> cin. It was possible, though, for this to end up in a bad state like apple -> app, which caused Pagefind to struggle to match this chunk (it would, but less efficiently). Fixed.
  4. Searching with multiple words could cause the chunk loading to miss some of the words. Chunks are loaded either strictly (preferred) or loosely (as a fallback). If any of the words in the query got their chunk through the strict path, Pagefind didn't correctly fall back to the loose path for the other words, and their chunks would remain unloaded and produce weird results. Fixed.
  5. Pagefind loaded chunks before stemming, but this was faulty logic. This bug was seen when indexing with the words "rebase" and "rebasemerge", where the chunks unluckily split betwixt the two. With stemming, this means the earlier chunk ends at "rebas". Then, when searching and loading the chunks for "rebase", the chunk with "rebasemerge" would be loaded, but not the chunk with "rebas". Now both will be loaded.
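
To make items 1 and 5 concrete, here is a minimal Python sketch of range-based chunk selection. It is not Pagefind's actual implementation: the `(first, last)` tuple layout, the `chunks_for_prefix` helper, and the sample words are all hypothetical, and the bounds are assumed to be full, untrimmed words. The idea is that a chunk can hold a word starting with the query only if the query falls between the chunk's bounds once both bounds are cut to the query's length.

```python
from typing import List, Tuple

# Hypothetical chunk layout: (first word, last word), both inclusive and untrimmed.
Chunk = Tuple[str, str]


def chunks_for_prefix(chunks: List[Chunk], prefix: str) -> List[int]:
    """Return the index of every chunk that could hold a word starting with `prefix`."""
    n = len(prefix)
    hits = []
    for i, (first, last) in enumerate(chunks):
        # Cut both bounds to the prefix length before comparing, so a chunk
        # ending at "document" still qualifies for the shorter query "doc".
        # Python slices by code point, so multi-byte characters are not split.
        if first[:n] <= prefix <= last[:n]:
            hits.append(i)
    return hits


# Item 1: a short prefix should load every chunk that extends it.
chunks = [("apple", "dockyard"), ("docs", "document"), ("documentary", "zebra")]
print(chunks_for_prefix(chunks, "doc"))   # [0, 1, 2]
print(chunks_for_prefix(chunks, "dock"))  # [0]
print(chunks_for_prefix(chunks, "docu"))  # [1, 2]

# Item 5: bounds as described above, with the earlier chunk ending at the stem "rebas".
stemmed = [("quick", "rebas"), ("rebasemerge", "zebra")]
print(chunks_for_prefix(stemmed, "rebase"))  # [1]     unstemmed query misses chunk 0
print(chunks_for_prefix(stemmed, "rebas"))   # [0, 1]  stemmed query reaches both
```

Under these assumptions, looking up the stemmed form of the query is what brings the chunk ending at "rebas" back into the result.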

This should address some of the hard-to-reproduce reports, i.e:

@bglw bglw merged commit 5c52301 into main Dec 30, 2025
5 checks passed