Implement server-side search endpoint for scalable file searching #2

@bryanchriswhite

Summary

Currently, file search is implemented entirely client-side using Fuse.js in the React UI. While this works well for small-to-medium datasets, it has scalability limitations as the dataset grows larger.

This issue proposes implementing a server-side search endpoint to handle search queries on the backend, which will:

  • Reduce client-side data transfer for large datasets
  • Enable pagination of search results
  • Support more advanced search features
  • Improve performance for users with slower devices

Current Implementation

Client-side search in pinshare-ui/src/pages/Browse.jsx:

  • Uses Fuse.js for fuzzy matching
  • Searches across: fileName, ipfsCID, fileType, fileSHA256
  • Field weights: fileName (0.4), ipfsCID (0.3), fileType (0.2), fileSHA256 (0.1)
  • Threshold: 0.3 (fuzzy matching tolerance)
  • All files are fetched via GET /files, then filtered in the browser

Proposed Changes

Backend API

Add new endpoint: GET /files/search

Query Parameters:

  • q (required): Search query string
  • limit (optional): Max results to return (default: 50)
  • offset (optional): Pagination offset (default: 0)
  • fields (optional): Comma-separated fields to search (default: all)
  • threshold (optional): Fuzzy match threshold (default: 0.3)
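
As a rough illustration of how the parameters above might be parsed and defaulted on the backend, here is a minimal Go sketch. The names `SearchParams` and `ParseSearchParams` are hypothetical, and the validation rules (limit of at least 1, threshold between 0 and 1, no check on field names) are assumptions rather than part of this proposal.

```go
package main

import (
	"errors"
	"fmt"
	"net/url"
	"strconv"
	"strings"
)

// SearchParams holds the parsed query parameters of GET /files/search.
// The type name and layout are illustrative only.
type SearchParams struct {
	Query     string
	Limit     int
	Offset    int
	Fields    []string
	Threshold float64
}

// allFields mirrors the fields the current client-side search covers.
var allFields = []string{"fileName", "ipfsCID", "fileType", "fileSHA256"}

// ParseSearchParams parses a raw query string (for example r.URL.RawQuery),
// applies the documented defaults, and rejects malformed values.
func ParseSearchParams(rawQuery string) (SearchParams, error) {
	p := SearchParams{Limit: 50, Offset: 0, Fields: allFields, Threshold: 0.3}

	v, err := url.ParseQuery(rawQuery)
	if err != nil {
		return p, err
	}
	p.Query = strings.TrimSpace(v.Get("q"))
	if p.Query == "" {
		return p, errors.New("q is required")
	}
	if s := v.Get("limit"); s != "" {
		n, err := strconv.Atoi(s)
		if err != nil || n < 1 { // assumption: limit must be positive
			return p, fmt.Errorf("invalid limit %q", s)
		}
		p.Limit = n
	}
	if s := v.Get("offset"); s != "" {
		n, err := strconv.Atoi(s)
		if err != nil || n < 0 {
			return p, fmt.Errorf("invalid offset %q", s)
		}
		p.Offset = n
	}
	if s := v.Get("fields"); s != "" {
		p.Fields = strings.Split(s, ",")
	}
	if s := v.Get("threshold"); s != "" {
		t, err := strconv.ParseFloat(s, 64)
		if err != nil || t < 0 || t > 1 { // assumption: same 0..1 range as Fuse.js
			return p, fmt.Errorf("invalid threshold %q", s)
		}
		p.Threshold = t
	}
	return p, nil
}

func main() {
	p, err := ParseSearchParams("q=report&limit=10&fields=fileName,fileType")
	fmt.Println(p, err)
}
```

A real handler would probably also cap `limit` and reject unknown field names; both are left out here to keep the sketch short.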

Response:

{
  "results": [...],
  "total": 123,
  "limit": 50,
  "offset": 0,
  "query": "example"
}
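
One possible way to build this response shape in Go is sketched below. The element type of `results` is not specified above, so the sketch leaves it generic; `SearchResponse` and `Paginate` are hypothetical names, and clamping out-of-range offsets to an empty page is an assumption.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// SearchResponse mirrors the JSON shape shown above. T stands in for
// whatever file record type the backend actually uses.
type SearchResponse[T any] struct {
	Results []T    `json:"results"`
	Total   int    `json:"total"`
	Limit   int    `json:"limit"`
	Offset  int    `json:"offset"`
	Query   string `json:"query"`
}

// Paginate slices the full match list into one page. Total always reports
// the number of matches before pagination.
func Paginate[T any](matches []T, query string, limit, offset int) SearchResponse[T] {
	total := len(matches)
	if limit < 0 {
		limit = 0
	}
	start := offset
	if start < 0 {
		start = 0
	}
	if start > total {
		start = total
	}
	end := start + limit
	if end > total {
		end = total
	}
	// Copy into a non-nil slice so an empty page encodes as [] and not null.
	page := make([]T, 0, end-start)
	page = append(page, matches[start:end]...)

	return SearchResponse[T]{
		Results: page,
		Total:   total,
		Limit:   limit,
		Offset:  offset,
		Query:   query,
	}
}

func main() {
	resp := Paginate([]string{"a", "b", "c"}, "example", 2, 1)
	b, err := json.Marshal(resp)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(b))
}
```

Returning `total` alongside `limit` and `offset` lets a client compute whether more pages exist without a second request.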

Implementation Options

  1. Go native implementation with string matching libraries
  2. SQLite FTS (Full-Text Search) if/when migrating to SQLite
  3. PostgreSQL full-text search if/when migrating to PostgreSQL
  4. Dedicated search engine (Elasticsearch, Meilisearch, etc.) for large-scale deployments
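
To give a feel for option 1, the sketch below scores files with plain case-insensitive substring matching and the field weights from the current client-side configuration. This is deliberately not the fuzzy algorithm Fuse.js uses, so rankings would differ from today's results and the threshold parameter has no effect here. The `File` struct, `Scored`, and `Search` are hypothetical names for illustration.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// File is a hypothetical record; its fields follow the searched fields
// listed under "Current Implementation".
type File struct {
	FileName   string
	IPFSCID    string
	FileType   string
	FileSHA256 string
}

// Scored pairs a file with its relevance score (higher is better).
type Scored struct {
	File  File
	Score float64
}

// Search adds up the weight of every field that contains the query as a
// case-insensitive substring and returns matches ordered by score.
func Search(files []File, query string) []Scored {
	q := strings.ToLower(strings.TrimSpace(query))
	if q == "" {
		return nil
	}
	var out []Scored
	for _, f := range files {
		fields := []struct {
			value  string
			weight float64
		}{
			{f.FileName, 0.4},
			{f.IPFSCID, 0.3},
			{f.FileType, 0.2},
			{f.FileSHA256, 0.1},
		}
		score := 0.0
		for _, fl := range fields {
			if strings.Contains(strings.ToLower(fl.value), q) {
				score += fl.weight
			}
		}
		if score > 0 {
			out = append(out, Scored{File: f, Score: score})
		}
	}
	// Stable sort keeps the input order among equally scored files.
	sort.SliceStable(out, func(i, j int) bool { return out[i].Score > out[j].Score })
	return out
}

func main() {
	files := []File{
		{FileName: "notes.txt", FileType: "text/plain"},
		{FileName: "scan-001", FileType: "application/pdf"},
		{FileName: "report.pdf", FileType: "application/pdf"},
	}
	for _, s := range Search(files, "pdf") {
		fmt.Printf("%s %.1f\n", s.File.FileName, s.Score)
	}
}
```

A linear scan like this costs time proportional to the number of files on every query, which is one reason the database-backed options may scale better.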

Migration Strategy

The client-side search should remain as a fallback:

  1. Try server-side search first
  2. If the endpoint is not available (older backend), fall back to client-side Fuse.js
  3. This ensures backward compatibility during rollout
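
The fallback steps above can be sketched as follows. The real client is the React UI, so this Go version only illustrates the control flow. It assumes an older backend answers the unknown route with HTTP 404, which is not stated in this issue; `ErrSearchUnsupported`, `StatusToError`, and `SearchWithFallback` are hypothetical names.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
)

// ErrSearchUnsupported signals that the backend has no search endpoint.
var ErrSearchUnsupported = errors.New("server-side search not available")

// SearchFunc is any search strategy: server-side or local.
type SearchFunc func(query string) ([]string, error)

// StatusToError maps an HTTP status code to an error.
// Assumption: an older backend returns 404 for GET /files/search.
func StatusToError(code int) error {
	switch {
	case code == http.StatusNotFound:
		return ErrSearchUnsupported
	case code >= 400:
		return fmt.Errorf("search failed with HTTP status %d", code)
	default:
		return nil
	}
}

// SearchWithFallback tries the server first and uses the local search only
// when the endpoint is missing. Other errors are returned to the caller so
// that real failures are not silently hidden.
func SearchWithFallback(server, local SearchFunc, query string) ([]string, error) {
	results, err := server(query)
	if errors.Is(err, ErrSearchUnsupported) {
		return local(query)
	}
	return results, err
}

func main() {
	// Stubs standing in for an older backend and for the local search.
	oldBackend := func(q string) ([]string, error) { return nil, StatusToError(http.StatusNotFound) }
	localSearch := func(q string) ([]string, error) { return []string{"local match for " + q}, nil }

	fmt.Println(SearchWithFallback(oldBackend, localSearch, "report"))
}
```

Falling back only on the "endpoint missing" case keeps genuine server errors visible instead of masking them with local results.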

Benefits

  • Scalability: Search cost no longer grows with the amount of data the browser must download and index
  • Reduced bandwidth: Only matching results sent to client
  • Pagination: Large result sets can be paginated
  • Advanced features: Can add filters, facets, aggregations, etc.
  • Consistent results: Same search algorithm for all clients

Related Issues

  • #TBD: PostgreSQL migration (will enable PostgreSQL full-text search)

Implementation Priority

Medium Priority - Current client-side search works well for typical use cases. This becomes important as the dataset grows beyond ~1000 files.


Migrated from bryanchriswhite/PinShare#2

Labels: enhancement