Skip to content

Enhance Search Functionality with Dedicated Search Columns #98

@sanchitram1

Description

@sanchitram1

Currently, searching for Package.name, Canon.name, Canon.url, and URL.url relies on basic database indexing and trigram matching (added in migration 20250508_1752-add_trgm_indexes.py). While the trigram indexes improved search performance, the current approach has limitations in terms of:

  1. Case sensitivity handling
  2. Search accuracy
  3. Performance at scale
  4. Consistency across different search fields

Current State

  • Trigram indexes exist on urls.url and canons.name
  • Basic database indexes on other searchable fields
  • No standardized approach to case handling
  • Search operations may require case-sensitive matches

Proposed Enhancement

Add dedicated search columns for key searchable fields:

-- Example schema changes
ALTER TABLE packages ADD COLUMN search_name TEXT GENERATED ALWAYS AS (LOWER(name)) STORED;
ALTER TABLE canons ADD COLUMN search_name TEXT GENERATED ALWAYS AS (LOWER(name)) STORED;
ALTER TABLE canons ADD COLUMN search_url TEXT GENERATED ALWAYS AS (LOWER(url)) STORED;
ALTER TABLE urls ADD COLUMN search_url TEXT GENERATED ALWAYS AS (LOWER(url)) STORED;

Benefits

  1. Consistent case-insensitive search across all fields
  2. Hopefully Better performance through dedicated indexed columns
  3. Cleaner search queries in application code

Implementation Areas

Database Changes

  • Add new generated columns
  • Create appropriate indexes
  • Migration script for existing data

Core Changes

  • Update search-related queries to use new columns
  • Transformers should write to that field as well (or db defaults, gotta figure out which approach is better)
  • Bigram indexes might be better for lower case?

Metadata

Metadata

Assignees

No one assigned

    Labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions