
Query words checking in X relevance model #316

@okradze

The query relevance check in check_tweet_content fails for queries that contain only search operators without content keywords.

The current implementation splits the entire query string into words and checks if any of them appear in the tweet text, username, or name:

query_words = synapse.get("query", "").strip().lower().split(" ")

This includes search operators like from:, min_faves:, since:, filter:, etc. These operators are query instructions, not content terms, and will never appear in tweet text.

For queries containing only operators, valid tweets will always fail the relevance check.

Example:

  • Query: from:elonmusk
  • query_words = ["from:elonmusk"]
  • Tweet text: "Bitcoin is the future", username: "elonmusk"
  • "from:elonmusk" is not found in text, username, or name
  • Valid tweet gets score 0
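The failure above can be reproduced with a minimal sketch. The body of check_tweet_content is not shown in this issue, so `naive_is_relevant` is a hypothetical stand-in that only mirrors the behaviour described here (split on spaces, then substring-match against text, username, and name); the display name "Elon Musk" is likewise an assumed value.

```python
def naive_is_relevant(query: str, text: str, username: str, name: str) -> bool:
    """Hypothetical stand-in for the described check, not the real implementation."""
    # Same split as in the issue: operators end up in the word list unchanged.
    query_words = query.strip().lower().split(" ")
    fields = [text.lower(), username.lower(), name.lower()]
    # A tweet passes if any query word is a substring of any field.
    return any(word in field for word in query_words for field in fields)


# Operator-only query: the single "word" is "from:elonmusk", which matches nothing.
operator_only = naive_is_relevant(
    "from:elonmusk", "Bitcoin is the future", "elonmusk", "Elon Musk"
)

# Adding a content keyword makes the same tweet pass.
with_keyword = naive_is_relevant(
    "bitcoin from:elonmusk", "Bitcoin is the future", "elonmusk", "Elon Musk"
)
```

Here `operator_only` is False even though the tweet satisfies the query, while `with_keyword` is True because "bitcoin" appears in the tweet text.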

Proposed Solution
Parse the search query properly and exclude operators when comparing query words against the tweet.
We may also need an LLM to check relevance.
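One possible shape for the operator-exclusion part, as a rough sketch rather than the project's actual fix. The names `OPERATOR_RE`, `extract_content_words`, and `is_relevant` are hypothetical, and two assumptions are baked in: an operator is any token shaped like `key:value` (optionally prefixed with `-`), and a query with no content keywords is treated as passing the keyword check.

```python
import re
from typing import List

# Assumption: an operator is a single token such as "from:elonmusk",
# "min_faves:100", or "-filter:replies". Tokens like "https://..." also match
# this pattern, and quoted phrases are not handled; a real parser would need both.
OPERATOR_RE = re.compile(r"^-?[a-z_]+:\S*$")


def extract_content_words(query: str) -> List[str]:
    """Return the lowercased query tokens that are not search operators."""
    tokens = query.strip().lower().split()
    return [token for token in tokens if not OPERATOR_RE.match(token)]


def is_relevant(query: str, text: str, username: str, name: str) -> bool:
    """Keyword check that ignores operators (hypothetical helper)."""
    words = extract_content_words(query)
    if not words:
        # Assumption: with no content keywords there is nothing to compare,
        # so the keyword check does not reject the tweet.
        return True
    fields = [text.lower(), username.lower(), name.lower()]
    return any(word in field for word in words for field in fields)


content_words = extract_content_words("bitcoin from:elonmusk min_faves:100")
operator_only_ok = is_relevant(
    "from:elonmusk", "Bitcoin is the future", "elonmusk", "Elon Musk"
)
```

This sketch only removes operators from the comparison. It does not verify that the tweet actually satisfies them (for example, that the author matches from:), which would need separate checks or the LLM-based relevance step mentioned above.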

Metadata

Labels: bug (Something isn't working)
Status: Backlog