[Bug]: Search is broken in desktop apps for non-latin and non-CJK words #14595
Open
Description
What happened?
Hi!
Affine Electron apps have an issue that leads to zero search results for document contents when you search with non-Latin, non-CJK keywords. Affected languages include Russian, Ukrainian (and other Slavic languages written in non-Latin scripts), Greek, Armenian, Arabic, Hebrew, etc.
How to reproduce:
In desktop app:
- Add the phrase "Hello world" to any document
- Add the phrase "Привет мир" to any document
- Search for "Hello world": the document is found, as intended
- Search for "Привет мир": no results are found, although a match is expected
The problem is related to the memory-indexer Rust component. As far as I understood from my investigation, Affine splits the search request into individual words and sends them to memory-indexer, which is unable to handle single-word non-Latin, non-CJK searches correctly.
Here's a simple test for memory-indexer that triggers the bug.
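For comparison, here is a minimal self-contained sketch of the behavior I would expect from a word-level index. It is a hypothetical toy (`ToyIndex`, `tokenize`, and `search` are made-up names), not the memory-indexer implementation, and it makes no claim about where the actual bug lives. It only shows that splitting on Unicode-aware word boundaries lets single Cyrillic words match.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical toy inverted index; NOT the memory-indexer code.
#[derive(Default)]
struct ToyIndex {
    // token -> ids of the documents that contain it
    postings: HashMap<String, HashSet<String>>,
}

/// Split text into lowercase tokens on every non-alphanumeric character.
/// `char::is_alphanumeric` is Unicode-aware, so Cyrillic, Greek, Armenian,
/// Arabic, and Hebrew letters count as word characters, not separators.
fn tokenize(text: &str) -> Vec<String> {
    text.split(|c: char| !c.is_alphanumeric())
        .filter(|t| !t.is_empty())
        .map(|t| t.to_lowercase())
        .collect()
}

impl ToyIndex {
    fn add_doc(&mut self, doc_id: &str, text: &str) {
        for token in tokenize(text) {
            self.postings
                .entry(token)
                .or_default()
                .insert(doc_id.to_string());
        }
    }

    /// Return the ids of documents that contain every token of the query.
    fn search(&self, query: &str) -> HashSet<String> {
        let mut tokens = tokenize(query).into_iter();
        let Some(first) = tokens.next() else {
            return HashSet::new();
        };
        let mut result = self.postings.get(&first).cloned().unwrap_or_default();
        for token in tokens {
            match self.postings.get(&token) {
                Some(ids) => result.retain(|id| ids.contains(id)),
                None => return HashSet::new(),
            }
        }
        result
    }
}

fn main() {
    let mut index = ToyIndex::default();
    index.add_doc("doc-en", "Hello world");
    index.add_doc("doc-ru", "Привет, мир!");

    // Single-word queries match regardless of script.
    assert!(index.search("hello").contains("doc-en"));
    assert!(index.search("привет").contains("doc-ru"));
    assert!(index.search("мир").contains("doc-ru"));
    println!("all single-word searches matched");
}
```

The real tokenizer may need more than this (for example n-grams for CJK), so treat the sketch as a statement of expected results, not as a proposed patch.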
#[test]
fn bug_cyrillic_word_inside_phrase_not_found() {
    let mut index = InMemoryIndex::default();
    index.add_doc(INDEX, "doc-ru", "привет, мир!", true);
    // Searching for the exact full phrase finds the document.
    assert!(index.search_hits(INDEX, "привет, мир!").iter().any(|h| h.doc_id == "doc-ru"));
    // Each single word returns no hits. These assertions pass only while
    // the bug exists and should start failing once it is fixed.
    assert!(
        index.search_hits(INDEX, "привет").is_empty(),
        "BUG: single Cyrillic word should be found, but is not"
    );
    assert!(
        index.search_hits(INDEX, "мир").is_empty(),
        "BUG: single Cyrillic word should be found, but is not"
    );
}
Distribution version
macOS ARM 64 (Apple Silicon)
App Version
0.26.3
What browsers are you seeing the problem on if you're using web version?
No response
Are you self-hosting?
- Yes
Self-hosting Version
No response
Relevant log output
Anything else?
No response
Is your content generated by AI?
- I confirm that the content I submitted was not generated by AI / merely contained minimal AI edits.
Labels
No labels
Status
🆕 Untriaged