Draft
- Add pnpm override for preact >=10.28.2 to fix high severity JSON VNode Injection vulnerability (GHSA-36hm-qxxp-pg3m)
- Add missing rel="noopener noreferrer" to external links in PageFrame.astro and MoveReferenceDisabled.astro to prevent potential tabnapping attacks
- Add query preprocessing to split concatenated words (e.g., 'indexertable' → 'indexer table')
- Split camelCase/PascalCase words for better matching
- Add dictionary of common Aptos documentation terms for intelligent splitting
- Configure better Algolia search parameters:
  - Enable typo tolerance with smaller word size thresholds
  - Use 'allOptional' for removeWordsIfNoResults to improve partial matches
  - Use 'prefixAll' queryType for prefix search on all words

This addresses the search issues reported in Slack where queries like 'indexertable' or 'indexertablerefefence' (with or without spaces) would fail to find the expected 'Indexer Table Reference' page.
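The preprocessing described above could look roughly like the sketch below. It is not the PR's actual code: `KNOWN_TERMS`, `splitCamelCase`, `splitConcatenated`, and `preprocessQuery` are hypothetical names, the term list is a tiny stand-in for the real dictionary, and the word size thresholds are illustrative values (the commit only says "smaller").

```typescript
// Hypothetical stand-in for the dictionary of Aptos documentation terms.
const KNOWN_TERMS: string[] = ["indexer", "table", "reference", "account", "transaction"];

// Insert a space at lower-to-upper boundaries: "IndexerTable" -> "Indexer Table".
function splitCamelCase(word: string): string {
  return word.replace(/([a-z0-9])([A-Z])/g, "$1 $2");
}

// Greedy longest-match split of a lowercase word into dictionary terms.
// Returns null unless the whole word is covered by at least two terms.
function splitConcatenated(word: string, terms: string[]): string[] | null {
  const sorted = [...terms].sort((a, b) => b.length - a.length);
  const parts: string[] = [];
  let rest = word;
  while (rest.length > 0) {
    const current = rest;
    const match = sorted.find((t) => current.startsWith(t));
    if (match === undefined) return null;
    parts.push(match);
    rest = rest.slice(match.length);
  }
  return parts.length > 1 ? parts : null;
}

// Rewrite each whitespace-separated word; words that cannot be split pass through unchanged.
function preprocessQuery(query: string): string {
  return query
    .trim()
    .split(/\s+/)
    .filter((w) => w.length > 0)
    .map((word) => {
      const camel = splitCamelCase(word);
      if (camel !== word) return camel;
      const parts = splitConcatenated(word.toLowerCase(), KNOWN_TERMS);
      return parts ? parts.join(" ") : word;
    })
    .join(" ");
}

// Algolia search parameters named in the commit message.
// The two thresholds are assumed values, chosen below Algolia's defaults of 4 and 8.
const searchParameters = {
  typoTolerance: true,
  minWordSizefor1Typo: 3,
  minWordSizefor2Typos: 6,
  removeWordsIfNoResults: "allOptional",
  queryType: "prefixAll",
};
```

In this sketch a misspelled query such as 'indexertablerefefence' is left unsplit, because 'refefence' is not in the dictionary; handling that case would rely on Algolia's typo tolerance or a fuzzier splitter.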
When users search for common short terms like 'CLI', 'SDK', 'API', etc., the search now boosts entry-point pages to appear first:

- Overview, introduction, getting-started pages get highest boost (+20)
- Primary/index pages for topics (e.g., /cli/, /sdk/) get moderate boost (+15)
- Pages where the search term appears in the URL path get boost (+10)
- Shallow pages (less nested) get small boost (+5)

This ensures that searching 'CLI' surfaces the CLI overview and setup pages rather than random pages that just mention the CLI.
Simplified the boosting logic to be more effective:

- +100 points: URL ends with the search term (e.g., /build/cli for 'CLI'). This strongly prioritizes landing/overview pages
- +50 points: URL has a segment exactly matching the search term (e.g., /build/smart-contracts/book/enums for 'enums')
- +20 points: URL contains the search term somewhere
- Depth bonus: shallower pages (fewer path segments) get priority
  - Depth 1-2: +30 points
  - Depth 3: +20 points
  - Depth 4: +10 points

This ensures:

- 'CLI' search → /build/cli landing page appears first
- 'enums' search → /build/smart-contracts/book/enums appears high
- Deeply nested pages don't outrank their parent landing pages
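The scoring rules above can be sketched as a pure function. This is an illustration, not the PR's implementation: `pathSegments` and `boostScore` are hypothetical names, the three URL tiers are assumed to be mutually exclusive (only the highest applies), "URL ends with the search term" is read as "the last path segment equals the term", and the depth bonus is assumed to be added on top.

```typescript
// Split a URL or bare path into lowercase path segments, ignoring origin, query, and hash.
function pathSegments(url: string): string[] {
  const withoutOrigin = url.replace(/^[a-z]+:\/\/[^/]+/i, "");
  const path = withoutOrigin.split(/[?#]/)[0] ?? "";
  return path.toLowerCase().split("/").filter((s) => s.length > 0);
}

// Score a result URL against the query using the tiers from the commit message.
function boostScore(url: string, query: string): number {
  const term = query.trim().toLowerCase();
  if (term.length === 0) return 0;

  const segments = pathSegments(url);
  let score = 0;

  // URL tiers (assumed exclusive): last segment match, any segment match, substring match.
  if (segments.length > 0 && segments[segments.length - 1] === term) {
    score += 100;
  } else if (segments.includes(term)) {
    score += 50;
  } else if (segments.join("/").includes(term)) {
    score += 20;
  }

  // Depth bonus: fewer path segments means a larger bonus.
  const depth = segments.length;
  if (depth >= 1 && depth <= 2) score += 30;
  else if (depth === 3) score += 20;
  else if (depth === 4) score += 10;

  return score;
}

// Sort a list of URLs by descending boost without mutating the input.
function rankByBoost(urls: string[], query: string): string[] {
  return [...urls].sort((a, b) => boostScore(b, query) - boostScore(a, query));
}
```

Under these assumptions, /build/cli scores higher for 'CLI' than a page nested several levels below it, which matches the stated goal that deeply nested pages do not outrank their parent landing pages.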
Added console.log statements to see:

1. If transformItems is being called
2. What boost scores are being calculated
3. What the final sorted order looks like

This will help diagnose why the boosting doesn't appear to be working.
Shows:

- Original order from Algolia
- Boost scores for each item
- Final sorted order

Also enables getRankingInfo to see Algolia's ranking details.
Key discovery: DocSearch's transformItems is called once PER ITEM, not for the entire result set. This means we cannot reorder results there.

New approach - boost at query time using optionalFilters:

- Add optionalFilters based on the search query
- Boost hierarchy.lvl1 and hierarchy.lvl0 that match query terms
- This tells Algolia to rank pages with matching hierarchy higher

For example, searching 'CLI' will add:

- hierarchy.lvl1:cli<score=3>
- hierarchy.lvl0:cli<score=2>

This should boost the CLI landing page above random pages that just mention CLI in their content.
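A minimal sketch of how such filters might be derived from the query follows. `buildOptionalFilters` is a hypothetical helper, not the PR's function; it assumes one pair of filters per whitespace-separated query term and uses Algolia's `attribute:value<score=N>` syntax for scored optional filters. Whether the filters take effect also depends on index settings (for example, the hierarchy attributes being configured for faceting), which this sketch does not cover.

```typescript
// Build scored optionalFilters for each term in the query.
// The scores 3 and 2 mirror the example in the commit message.
function buildOptionalFilters(query: string): string[] {
  const terms = query
    .toLowerCase()
    .split(/\s+/)
    .filter((t) => t.length > 0);

  const filters: string[] = [];
  for (const term of terms) {
    filters.push(`hierarchy.lvl1:${term}<score=3>`);
    filters.push(`hierarchy.lvl0:${term}<score=2>`);
  }
  return filters;
}

// Example of merging the filters into a search parameter object (shape assumed).
function withBoost(params: Record<string, unknown>, query: string): Record<string, unknown> {
  return { ...params, optionalFilters: buildOptionalFilters(query) };
}
```

Because optional filters only influence ranking and never exclude records, a query whose terms match no hierarchy value still returns its normal results.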
Improve documentation search by preprocessing queries to handle concatenated and camelCase words and enhancing Algolia search parameters.
The existing Algolia DocSearch struggled with queries containing concatenated words (e.g., "indexertable") or camelCase terms (e.g., "IndexerTable"), leading to inconsistent results. This PR introduces a `transformSearchClient` function that preprocesses queries, splitting these terms into separate words before sending them to Algolia, which improves search relevance for such queries. Additionally, Algolia's typo tolerance and word removal parameters are adjusted for better fuzzy matching and partial result handling.

Slack Thread
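One way such a wrapper might be structured is sketched below. The types are simplified stand-ins rather than the real Algolia client types, `withQueryPreprocessing` is a hypothetical name, and each request is assumed to carry its query in a top-level `query` field; the actual request shape depends on the client version in use.

```typescript
// Simplified stand-ins for the search client types (assumptions, not Algolia's definitions).
interface SearchRequest {
  query?: string;
  [key: string]: unknown;
}

interface SearchClientLike {
  search(requests: SearchRequest[]): Promise<unknown>;
}

// Return a client whose search() rewrites each request's query before delegating.
// Only search() is forwarded here; a real wrapper would also need to preserve
// the client's other methods.
function withQueryPreprocessing(
  client: SearchClientLike,
  preprocess: (query: string) => string,
): SearchClientLike {
  return {
    search(requests: SearchRequest[]): Promise<unknown> {
      const rewritten = requests.map((request) =>
        typeof request.query === "string"
          ? { ...request, query: preprocess(request.query) }
          : request,
      );
      return client.search(rewritten);
    },
  };
}
```

A function of this shape could be passed as DocSearch's `transformSearchClient` option, for example `transformSearchClient: (client) => withQueryPreprocessing(client, myPreprocess)`, where `myPreprocess` is whatever splitting function the site defines.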