refactor SearchService to optimize candidate note retrieval by PrivateGER · Pull Request #33 · PrivateGER/Sharkey

PrivateGER · 2026-01-06T13:21:30Z

…ly use indexes

What

Why

Additional info (optional)

Checklist

Read the contribution guide
Test working in a local environment
(If needed) Add story of storybook
(If needed) Update CHANGELOG.md
(If possible) Add tests

Summary by CodeRabbit

Refactor
- Optimized search functionality to deliver faster and more efficient results through improved database query processing, reducing load times for note searches.

coderabbitai · 2026-01-06T13:21:47Z

Walkthrough

The search service refactors note search from a single monolithic query into a two-phase approach: first selecting note IDs via indexed conditions and text filters, then fetching complete note records with visibility, blocking, and muting enforcement applied in the second query.
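The two-phase flow described in the walkthrough can be sketched with in-memory data. This is only an illustrative model, not the code in `SearchService.ts`: the types `SketchNote` and `Viewer`, the function names, and the simplified visibility rule (public, home, or the viewer's own note) are all assumptions, and the real implementation issues two SQL queries through TypeORM.

```typescript
// Hypothetical in-memory stand-ins for the note table and the requesting user.
type Visibility = 'public' | 'home' | 'followers' | 'specified';

interface SketchNote {
	id: string;
	userId: string;
	text: string;
	visibility: Visibility;
}

interface Viewer {
	id: string;
	mutedUserIds: Set<string>;
	blockedByUserIds: Set<string>;
}

// Phase 1: cheap candidate selection. Only the text filter and ordering are
// applied, and only IDs are returned.
function selectCandidateIds(notes: SketchNote[], term: string, limit: number): string[] {
	const needle = term.toLowerCase();
	return notes
		.filter(n => n.text.toLowerCase().includes(needle))
		.map(n => n.id)
		.sort((a, b) => (a < b ? 1 : a > b ? -1 : 0)) // largest ID first
		.slice(0, limit);
}

// Phase 2: load full records for the candidates and enforce visibility,
// block, and mute rules (simplified here).
function fetchVisibleNotes(notes: SketchNote[], ids: string[], viewer: Viewer | null, limit: number): SketchNote[] {
	if (ids.length === 0) return []; // early exit for empty candidates
	const byId = new Map(notes.map(n => [n.id, n] as const));
	const result: SketchNote[] = [];
	for (const id of ids) {
		if (result.length >= limit) break;
		const note = byId.get(id);
		if (!note) continue;
		const own = viewer?.id === note.userId;
		const visible = note.visibility === 'public' || note.visibility === 'home' || own;
		const hidden = viewer !== null
			&& (viewer.mutedUserIds.has(note.userId) || viewer.blockedByUserIds.has(note.userId));
		if (visible && !hidden) result.push(note);
	}
	return result;
}

// Over-fetch candidates (5x, as in the PR) because phase 2 may discard some.
function searchTwoPhase(notes: SketchNote[], term: string, viewer: Viewer | null, limit: number): SketchNote[] {
	const ids = selectCandidateIds(notes, term, limit * 5);
	return fetchVisibleNotes(notes, ids, viewer, limit);
}
```

Iterating over the candidate IDs in their phase-one order keeps the final result sorted the same way without a second sort.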

Changes

| Cohort / File(s) | Summary |
|---|---|
| Search Query Refactoring `packages/backend/src/core/SearchService.ts` | Restructures `searchNoteByLike` flow with two-phase query pattern: candidate selection phase (applies full-text filters, user/channel/host/filetype conditions, sorting) followed by full-note retrieval phase (enforces visibility, block, and mute rules); adds early exit for empty candidates. |

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Description check | ⚠️ Warning | The PR description follows the required template structure but has all substantive sections (What, Why, Additional info) left completely empty with no explanation of changes. | Fill in the What section describing the two-phase query approach, the Why section explaining the optimization rationale, and Additional info with relevant testing considerations. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title directly and clearly describes the main change: refactoring SearchService to optimize candidate note retrieval, which aligns with the code summary. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |

coderabbitai

Actionable comments posted: 0

🧹 Nitpick comments (2)

packages/backend/src/core/SearchService.ts (2)
328-334: Consider adding basic visibility filtering to reduce false candidates.

The 5x multiplier accounts for filtering, but if the database has many follower-only or specified-visibility notes, this could still result in underfetching. Adding a simple pre-filter on the candidate query could improve hit rates without adding complex joins:
🔎 Proposed enhancement
 		if (opts.filetype) {
 			candidateQuery.andWhere('note."attachedFileTypes" && :types', { types: fileTypes[opts.filetype] });
 		}
+
+		// Pre-filter to searchable visibility levels to reduce false candidates
+		candidateQuery.andWhere('note.visibility IN (:...visibilities)', { 
+			visibilities: ['public', 'home'] 
+		});

 		// Fetch more candidates than needed since some will likely be filtered by visibility checks
 		const candidateRows = await candidateQuery.limit(pagination.limit * 5).getRawMany();
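This pre-filter is safe only for viewers who cannot see restricted notes: generateVisibilityQuery in the second phase already limits non-authenticated and non-follower users to these visibility levels. For an authenticated user it would also drop their own follower-only and specified notes, and those of accounts they follow, before the second phase can admit them, so it trades recall for a higher hit rate.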
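The underfetching concern can be made concrete with a small simulation. This is a sketch under stated assumptions: `fetchWithMultiplier` and `isVisible` are hypothetical names, and the predicate stands in for the visibility, block, and mute checks that the second query applies.

```typescript
// Takes the first `limit * multiplier` candidates, applies the phase-two
// filter, and returns at most `limit` survivors.
function fetchWithMultiplier<T>(
	candidates: T[],
	isVisible: (candidate: T) => boolean,
	limit: number,
	multiplier: number,
): T[] {
	return candidates
		.slice(0, limit * multiplier)
		.filter(isVisible)
		.slice(0, limit);
}

// 100 matching candidates, of which only every tenth is visible to the viewer.
const allCandidates = Array.from({ length: 100 }, (_, i) => i);
const onlyEveryTenth = (i: number): boolean => i % 10 === 0;

// Without a pre-filter, the first 25 candidates contain too few visible notes,
// so the page comes back short even though more visible matches exist.
const shortPage = fetchWithMultiplier(allCandidates, onlyEveryTenth, 5, 5);

// With the same predicate applied during candidate selection, the page fills.
const fullPage = fetchWithMultiplier(allCandidates.filter(onlyEveryTenth), onlyEveryTenth, 5, 5);
```

Whenever fewer than one in five candidates passes the second phase, a fixed 5x multiplier returns a short page; moving part of the filter into the candidate query raises the pass rate instead of the multiplier.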

425-428: Pre-existing: MeiliSearch result ordering doesn't match SQL path behavior.

Not introduced by this PR, but worth noting: the MeiliSearch path hardcodes descending sort (a.id > b.id ? -1 : 1) regardless of pagination direction, while the refactored SQL path correctly uses sortOrder based on sinceId/untilId. Consider aligning this behavior for consistency in a follow-up.
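One way such a follow-up could look is a comparator that takes the pagination direction into account. The sketch below is hypothetical: `resolveSortOrder` and `sortNotesById` are illustrative names, and the rule "ascending only when `sinceId` is given without `untilId`" is an assumption about how the SQL path picks its order, not something confirmed by this PR.

```typescript
type SortOrder = 'ASC' | 'DESC';

// Assumed rule: page forward (ascending) only when sinceId is set and untilId is not.
function resolveSortOrder(pagination: { sinceId?: string; untilId?: string }): SortOrder {
	return pagination.sinceId && !pagination.untilId ? 'ASC' : 'DESC';
}

// Sorts a copy of the notes by ID in the requested direction instead of
// always sorting descending.
function sortNotesById<T extends { id: string }>(notes: T[], order: SortOrder): T[] {
	const direction = order === 'ASC' ? 1 : -1;
	return [...notes].sort((a, b) => {
		if (a.id === b.id) return 0;
		return a.id > b.id ? direction : -direction;
	});
}
```

Returning 0 for equal IDs also makes the comparator well-behaved, which the hardcoded `a.id > b.id ? -1 : 1` form is not.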

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7b4ae91 and b9559c2.

📒 Files selected for processing (1)

packages/backend/src/core/SearchService.ts

🧰 Additional context used

🧠 Learnings (1)

📚 Learning: 2025-12-19T14:14:22.397Z

Learnt from: CR
Repo: PrivateGER/hydrus-nextbooru PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-19T14:14:22.397Z
Learning: Applies to src/app/api/tags/search/**/*.{ts,tsx} : Implement progressive tag search filtering - only show tags that co-occur with already-selected tags

Applied to files:

packages/backend/src/core/SearchService.ts

🔍 Remote MCP

Summary of Relevant Context for PR Review

Project Context

Sharkey is a Misskey fork that follows upstream changes when possible while adding its own features. It is ActivityPub software that can interface with the fediverse—an interconnected social network connecting with other software such as Akkoma, Mastodon, and Pixelfed. Sharkey uses TypeORM as a database tool to make database migrations easier, and the PR targets the SearchService component in the backend.

Database Search Architecture Context

This PR optimizes note search through a two-phase query approach. The refactor is strategically aligned with PostgreSQL full-text search best practices:

Full-Text Search Technologies Referenced:

- PostgreSQL uses the tsvector and tsquery data types for full-text search, and GIN indexes are recommended for tsvector columns.
- A tsvector is a special data type that represents text in a form optimized for full-text search: a vector of lexemes (the basic units of text), normalized so that search operations are simpler.
- Combining tsvector and tsquery provides flexible search, and a GIN index on a generated tsvector column significantly improves search performance.
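To illustrate the tsvector, tsquery, and GIN pieces mentioned above, here is a minimal sketch of a GIN expression index and a parameterized candidate-ID query. The index name, the `'english'` text search configuration, and the use of the `note."text"` column are assumptions for illustration; they are not taken from this PR or from Sharkey's migrations.

```typescript
// Hypothetical GIN expression index. The query below must use the same
// two-argument to_tsvector expression for the planner to consider this index.
const CREATE_GIN_INDEX_SQL = `
CREATE INDEX IF NOT EXISTS idx_note_text_fts
	ON note USING gin (to_tsvector('english', "text"))`;

interface ParameterizedQuery {
	text: string;
	values: unknown[];
}

// Builds a candidate-ID query: only note.id is selected, and the search term
// is passed as a bind parameter rather than interpolated into the SQL.
function buildCandidateIdQuery(searchTerm: string, limit: number): ParameterizedQuery {
	return {
		text: `SELECT note.id FROM note
WHERE to_tsvector('english', note."text") @@ plainto_tsquery('english', $1)
ORDER BY note.id DESC
LIMIT $2`,
		values: [searchTerm, limit],
	};
}
```

`plainto_tsquery` is used here because it accepts unstructured user input; other tsquery constructors would work with the same index.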

Query Optimization Principles

The PR's two-phase candidate selection pattern aligns with recognized PostgreSQL optimization techniques:

- PostgreSQL generates various execution plans and the optimizer evaluates these with the intention of choosing the most efficient plan, considering factors like available indexes, join strategies, and data distribution.
- The planner/optimizer's task is to create an optimal execution plan. A given SQL query can be executed in many different ways, and if computationally feasible, the query optimizer will examine each possible execution plan, selecting the one expected to run fastest.
- Reducing the amount of data to be sorted is vital to performance and important to the user experience.

Relevant Indexing Considerations

For Sharkey/Akkoma instances, additional indexes can be created on note and user tables to provide much faster search at the cost of additional disk space, with the PostgreSQL pg_trgm extension enabled to support queries using LIKE and ILIKE operators.

Key Review Points for This PR

The refactor decouples candidate selection (using indexed conditions with full-text filters) from full data retrieval (with visibility/blocking/muting enforcement), which reduces query planner complexity by narrowing the candidate set before expensive joins are performed. This approach leverages indexed access patterns identified as performance optimizations in PostgreSQL documentation and aligns with best practices for full-text search implementations using tsvector and GIN indexes.

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)

GitHub Check: Analyze (javascript-typescript)
GitHub Check: semgrep-cloud-platform/scan

🔇 Additional comments (2)

packages/backend/src/core/SearchService.ts (2)

279-308: Candidate query setup looks solid.

The two-phase approach correctly separates ID selection from full data retrieval, allowing PostgreSQL to utilize indexes more effectively. The pagination logic properly mirrors makePaginationQuery behavior, and the provider-specific full-text search filters are appropriate.

336-354: Full notes retrieval with visibility checks is well-structured.

The second query correctly fetches complete note data with all necessary joins and applies comprehensive visibility, blocking, and muting checks. The sort order is preserved via the orderBy clause matching the candidate query's order.

refactor SearchService to optimize candidate note retrieval to actual…

b9559c2

…ly use indexes

coderabbitai bot reviewed Jan 6, 2026

coderabbitai bot approved these changes Jan 6, 2026

PrivateGER merged commit 1ca64e4 into develop Jan 6, 2026
5 checks passed

PrivateGER merged 1 commit into develop from fix/tsvector
