1.1 TF-IDF from Scratch #1

@philberryman

Goal

Build TF-IDF search for a small document corpus (product descriptions from FasterShops).

Learn

  • Term frequency calculation
  • Inverse document frequency and why it matters
  • Sparse vector representation
  • Cosine similarity for ranking
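
The four topics above fit together roughly as in the sketch below. It is a minimal illustration, not the final deliverable: the whitespace tokenizer, the unsmoothed `log(N / df)` form of IDF, the function names, and the three sample product descriptions are all assumptions made for this example (they are not real FasterShops data).

```python
import math
from collections import Counter

# Hypothetical sample corpus, invented for illustration only.
DOCS = [
    "the red running shoes",
    "the blue cotton shirt",
    "the red cotton scarf",
]

def tokenize(text):
    # Assumption: lowercase + whitespace split is enough for this sketch.
    return text.lower().split()

def tfidf_vector(tokens, idf):
    # Sparse vector: a dict holding only terms with non-zero weight.
    counts = Counter(tokens)
    total = len(tokens)
    return {
        term: (count / total) * idf[term]
        for term, count in counts.items()
        if idf.get(term, 0.0) > 0.0
    }

def build_index(docs):
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents containing each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    # Unsmoothed IDF; a term present in every document gets log(1) = 0.
    idf = {term: math.log(n_docs / count) for term, count in df.items()}
    vectors = [tfidf_vector(tokens, idf) for tokens in tokenized]
    return idf, vectors

def cosine(a, b):
    # Dot product over shared terms only, divided by the vector norms.
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def search(query, idf, vectors):
    # Rank documents by cosine similarity to the query, best first,
    # dropping documents that share no weighted term with the query.
    q = tfidf_vector(tokenize(query), idf)
    scored = [(cosine(q, v), i) for i, v in enumerate(vectors)]
    return sorted(((s, i) for s, i in scored if s > 0.0), reverse=True)

idf, vectors = build_index(DOCS)
for score, doc_id in search("red shoes", idf, vectors):
    print(f"{score:.3f}  {DOCS[doc_id]}")
```

With this corpus the query "red shoes" ranks the running shoes first and the scarf second; the shirt shares no weighted term with the query and is dropped. Storing each vector as a dict keeps only the non-zero entries, which is the point of the sparse representation.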

Deliverable

  • Python notebook with implementation
  • TypeScript implementation
  • Comparison on sample queries

Proof Point

Can explain why "the" gets low weight and rare terms get high weight.
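
One way to check this claim numerically is to compute IDF for a common and a rare term. The sketch below assumes the unsmoothed definition `idf(t) = log(N / df(t))` and a hypothetical three-document corpus; other IDF variants (for example, smoothed ones) give different numbers but the same ordering.

```python
import math

# Hypothetical sample corpus, invented for illustration only.
docs = [
    "the red running shoes",
    "the blue cotton shirt",
    "the red cotton scarf",
]

def idf(term, docs):
    # df = number of documents that contain the term at least once.
    n_docs = len(docs)
    df = sum(1 for d in docs if term in d.lower().split())
    # Assumption: unseen terms get weight 0 rather than raising an error.
    return math.log(n_docs / df) if df else 0.0

print(idf("the", docs))    # in 3 of 3 documents: log(3/3) = 0.0
print(idf("red", docs))    # in 2 of 3 documents: log(3/2), about 0.405
print(idf("scarf", docs))  # in 1 of 3 documents: log(3/1), about 1.099
```

Because "the" appears in every document, it cannot help distinguish one document from another, and its IDF is zero. "scarf" appears in only one document, so matching it is strong evidence for that document, and its weight is the highest of the three.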

Directory

search-fundamentals/01-tfidf-from-scratch/
