11 changes: 8 additions & 3 deletions main.go
@@ -3556,13 +3556,18 @@ func (se *SearchEngine) searchAllTables(tokens []string, limit int) ([]SearchRes
slog.Debug("Built FTS query", "fts_query", ftsQuery)

// Use pure FTS5 search with bm25() column weights for title prioritization
- // bm25(search, 1.0, 3.0, 1.0, 1.0, 1.0, 1.0) weights: type, title(3x), body, url, repository, author
- // Multiply by boost to prioritize user's authored content (2x boost)
+ // bm25(search, 1.0, 5.0, 1.0, 1.0, 1.0, 1.0) weights: type, title(5x), body, url, repository, author
+ // Multiply by boost for user's contributed repos (2x), state_boost for open items (1.5x),
+ // and recency_boost based on created_at age (<30d: 1.0, 30-180d: 0.85, >180d: 0.7)
query := `
SELECT type, title, body, url, repository, author, created_at, state
FROM search
WHERE search MATCH ?
- ORDER BY (bm25(search, 1.0, 3.0, 1.0, 1.0, 1.0, 1.0) * boost)
+ ORDER BY (bm25(search, 1.0, 5.0, 1.0, 1.0, 1.0, 1.0) * boost *
+ CASE WHEN state = 'open' THEN 1.5 ELSE 1.0 END *
+ CASE WHEN julianday('now') - julianday(created_at) < 30 THEN 1.0
+ WHEN julianday('now') - julianday(created_at) < 180 THEN 0.85
+ ELSE 0.7 END)
LIMIT ?`

slog.Debug("Executing FTS query", "sql", query, "search_table", "search", "fts_query", ftsQuery, "limit", limit)
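
The ordering in this hunk relies on the sign convention of FTS5's `bm25()`: better matches get numerically smaller (typically negative) scores, so multiplying by a factor above 1.0 moves a row up under the default ascending `ORDER BY`, and a factor below 1.0 moves it down. The following is a minimal Go sketch of that behavior, not code from this PR; the `result` type, the helper names, and the raw scores are hypothetical values chosen for illustration.

```go
package main

import (
	"fmt"
	"sort"
)

// result is a hypothetical stand-in for one row returned from the search table.
type result struct {
	Title string
	BM25  float64 // raw bm25() value; FTS5 reports better matches as smaller numbers
	Mult  float64 // combined multiplier: boost * state factor * recency factor
}

// rankKey mirrors the shape of the ORDER BY expression: raw score times multiplier.
func rankKey(r result) float64 { return r.BM25 * r.Mult }

// sortByRank orders rows as an ascending ORDER BY would: smallest key first.
func sortByRank(rs []result) {
	sort.SliceStable(rs, func(i, j int) bool { return rankKey(rs[i]) < rankKey(rs[j]) })
}

func main() {
	// Illustrative scores only; real bm25() values depend on the indexed corpus.
	rs := []result{
		{"closed, older than 180 days", -4.0, 1.0 * 1.0 * 0.7},
		{"open, recent, contributed repo", -3.0, 2.0 * 1.5 * 1.0},
		{"open, recent", -3.0, 1.0 * 1.5 * 1.0},
	}
	sortByRank(rs)
	for _, r := range rs {
		fmt.Printf("%6.2f  %s\n", rankKey(r), r.Title)
	}
}
```

In this sketch the closed, older row has the best raw score but ends up last, because its 0.7 multiplier shrinks the magnitude of its negative score while the other rows are scaled up.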
19 changes: 13 additions & 6 deletions main.md
@@ -1201,7 +1201,7 @@ Where `<invalid_fields>` is a comma-separated list of invalid fields, and `<avai
Next, prepare the FTS5 search query using the `search` table. Build the query with:

- Use FTS5 MATCH operator for the search query
- - Order by `bm25(search)` for optimal relevance ranking (titles are weighted 3x higher)
+ - Order by `bm25(search)` for optimal relevance ranking (titles are weighted 5x higher in BM25 base scoring, then additional multipliers are applied for state and recency)
- Limit to 20 results
- Use the unified SearchEngine implementation shared with the UI

@@ -1434,11 +1434,18 @@ const SCHEMA_GUID = "550e8400-e29b-41d4-a716-446655440001" // Change this GUID o
- FTS5 virtual table for full-text search across discussions, issues, and pull requests
- Indexed columns: `type`, `title`, `body`, `url`, `repository`, `author`
- Unindexed columns: `created_at`, `state`, `boost`
- - `boost`: Numeric value (e.g., `1.0`, `2.0`) used to multiply BM25 scores for ranking
- - Uses `bm25(search, 1.0, 2.0, 1.0, 1.0, 1.0, 1.0)` ranking with 2x title weight for relevance scoring
- - Search results should be ordered by: `(bm25(search) * boost)` for optimal relevance
- - Items from user's repositories get 2x boost, ensuring they appear higher in results
- - This approach is more flexible than boolean flags and allows for future ranking adjustments
+ - `boost`: Numeric value (1.0 or 2.0) used to multiply BM25 scores for ranking, stored at index time (2.0 for user's contributed repos, 1.0 otherwise)
+ - Uses `bm25(search, 1.0, 5.0, 1.0, 1.0, 1.0, 1.0)` ranking with 5x title weight for relevance scoring
+ - Search results should be ordered by: `(bm25(search) * boost * state_boost * recency_boost)` for optimal relevance
+ - **Ranking factors:**
+   - **Title weight (5x):** Title matches are weighted 5x higher than other fields in BM25 scoring (column order: type=1.0, title=5.0, body=1.0, url=1.0, repository=1.0, author=1.0)
+   - **User-contributed repos (2x):** Items from repositories where the user has contributed get 2x boost (stored in `boost` column at index time)
+   - **Open state (1.5x):** Open items get 1.5x boost at query time: `CASE WHEN state = 'open' THEN 1.5 ELSE 1.0 END`
+   - **Recency decay:** Time-based decay calculated at query time based on `created_at`:
+     - Recent (<30 days): 1.0 (full score)
+     - Medium (30-180 days): 0.85
+     - Older (>180 days): 0.7
+     - SQL: `CASE WHEN julianday('now') - julianday(created_at) < 30 THEN 1.0 WHEN julianday('now') - julianday(created_at) < 180 THEN 0.85 ELSE 0.7 END`

#### table:schema_version
