[Bug]: Search is broken in desktop apps for non-latin and non-CJK words #14595

@Konf

What happened?

Hi!
Affine Electron apps have an issue that leads to zero search results for document contents when you search with non-Latin, non-CJK keywords. Affected languages include Russian, Ukrainian (and other Slavic languages written in non-Latin scripts), Greek, Armenian, Arabic, Hebrew, etc.

How to reproduce:
In the desktop app:

  1. Add the phrase "Hello world" to any document
  2. Add the phrase "Привет мир" to any document
  3. Search for "Hello world": the document is found, as intended
  4. Search for "Привет мир": no results are found, although the document should match

The problem is related to the memory-indexer Rust component. As far as I could tell from my investigation, Affine splits a search request into individual words and sends them to memory-indexer, and memory-indexer does not handle single-word non-Latin, non-CJK searches correctly.
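
To make the word-splitting step concrete, here is a minimal sketch of how a query could be broken into the single-word searches described above. It is only an illustration under assumptions: `split_query` is a hypothetical helper, not Affine's or memory-indexer's actual code, and the real splitting logic may differ. It uses only the Rust standard library, whose `char::is_alphanumeric` and `str::to_lowercase` are Unicode-aware.

```rust
/// Hypothetical helper (not part of Affine or memory-indexer):
/// splits a query into lowercase words, treating every
/// non-alphanumeric character as a separator.
fn split_query(query: &str) -> Vec<String> {
    query
        // `char::is_alphanumeric` follows Unicode, so Cyrillic, Greek,
        // Hebrew, etc. letters count as word characters.
        .split(|c: char| !c.is_alphanumeric())
        // Consecutive separators (", " or "!") produce empty pieces.
        .filter(|word| !word.is_empty())
        // `str::to_lowercase` also lowercases non-Latin scripts.
        .map(|word| word.to_lowercase())
        .collect()
}

#[test]
fn splits_latin_and_cyrillic_queries() {
    assert_eq!(split_query("Hello world"), vec!["hello", "world"]);
    assert_eq!(split_query("Привет, мир!"), vec!["привет", "мир"]);
}
```

If the splitting works roughly like this, "Привет мир" reaches the indexer as the two single-word queries "привет" and "мир", which are exactly the searches that return nothing.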

Here's a simple test for memory-indexer that triggers the bug.

#[test]
fn bug_cyrillic_word_inside_phrase_not_found() {
    let mut index = InMemoryIndex::default();
    index.add_doc(INDEX, "doc-ru", "привет, мир!", true);

    // Searching for the exact indexed phrase finds the document.
    assert!(index.search_hits(INDEX, "привет, мир!").iter().any(|h| h.doc_id == "doc-ru"));
    // The assertions below pass on the affected version, which demonstrates the bug:
    // each single-word query returns no hits, although "doc-ru" should match.
    assert!(
        index.search_hits(INDEX, "привет").is_empty(),
        "BUG: single Cyrillic word should be found, but is not"
    );
    assert!(
        index.search_hits(INDEX, "мир").is_empty(),
        "BUG: single Cyrillic word should be found, but is not"
    );
}

Distribution version

macOS ARM 64 (Apple Silicon)

App Version

0.26.3

What browsers are you seeing the problem on if you're using web version?

No response

Are you self-hosting?

  • Yes

Self-hosting Version

No response

Relevant log output

Anything else?

No response

Is your content generated by AI?

  • I confirm that the content I submitted was not generated by AI / merely contained minimal AI edits.
