RAG Learning Project

A simplified Retrieval-Augmented Generation (RAG) system built to understand how the core components work together.

What It Does

Upload markdown documents, then ask questions in natural language. The system retrieves relevant chunks from your documents and generates answers grounded in that content.

Features

Document upload — Upload .md files which are automatically chunked, embedded, and indexed
Natural language search — Ask questions like "What's the refund policy?" instead of keyword matching
Vector similarity search — Finds semantically relevant content even without exact word matches
Reranker toggle — Compare results with and without Cohere reranking to see how it affects retrieval quality
Debug view — Inspect retrieved chunks, similarity scores, reranker scores, and the full prompt sent to the LLM

Debug View

The debug panel lets you see exactly what's happening under the hood. Expand "Retrieved Chunks" to see how each chunk was scored.

Each chunk shows both Vector and Rerank scores side-by-side. This lets you compare how the two ranking methods differ:

Vector score: Cosine similarity between query embedding and chunk embedding (fast, but purely geometric)
Rerank score: Cohere's cross-encoder relevance score (slower, but understands the actual question-answer relationship)

You can sort by either score to see how rankings change. A chunk might rank high by vector similarity but low by the reranker (or vice versa), which illustrates why two-stage retrieval can improve quality.
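To make the vector score concrete, here is a minimal sketch of cosine similarity and of sorting chunks by either score. The `ScoredChunk` shape and both function names are assumptions for illustration, not the project's actual types. In the real system the similarity is computed by pgvector inside PostgreSQL rather than in application code.

```typescript
// Hypothetical shape for a retrieved chunk; field names are illustrative.
interface ScoredChunk {
  text: string;
  vectorScore: number; // cosine similarity from the vector search
  rerankScore: number; // relevance score from the reranker
}

// Cosine similarity: dot(a, b) / (|a| * |b|). Returns 0 for a zero vector.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("Vectors must have the same dimension");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Returns a new array sorted by the chosen score, highest first.
function sortByScore(
  chunks: ScoredChunk[],
  key: "vectorScore" | "rerankScore",
): ScoredChunk[] {
  return [...chunks].sort((x, y) => y[key] - x[key]);
}
```

Sorting the same list once by `vectorScore` and once by `rerankScore` is enough to reproduce the side-by-side comparison described above.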

Reranker Trade-offs

Reranker	Speed	Quality	Best For
Off	~200ms faster	Vector similarity only	Straightforward queries, lower latency requirements
On	Additional API call	More accurate retrieval	Ambiguous queries where multiple chunks seem similar but only some answer the question

Purpose

This project ties together the key components of a RAG system:

Embeddings — Converting text to vectors using OpenAI's embedding API
Vector Search — Storing and querying vectors with pgvector (PostgreSQL extension)
Chunking — Splitting documents into searchable pieces
Reranking — Using Cohere's rerank API to improve retrieval quality
LLM Integration — Generating answers from retrieved context using GPT-4o-mini
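As a rough illustration of the vector search component, the sketch below builds a parameterized top-k query using pgvector's `<=>` cosine distance operator (similarity is 1 minus that distance). The table name `chunks`, the columns `id`, `content`, and `embedding`, and the helper names are all hypothetical; the project's real schema may differ. The returned `{ text, values }` object is the query-config shape that node-postgres accepts, assuming that client is used.

```typescript
// Query-config shape, e.g. for node-postgres: client.query({ text, values }).
interface ParameterizedQuery {
  text: string;
  values: unknown[];
}

// pgvector accepts vectors as a text literal such as '[0.1,0.2,0.3]'.
function toVectorLiteral(embedding: number[]): string {
  return `[${embedding.join(",")}]`;
}

// Builds a top-k cosine similarity query.
// "chunks", "id", "content", and "embedding" are hypothetical names.
function buildSimilarityQuery(
  queryEmbedding: number[],
  limit: number,
): ParameterizedQuery {
  const text = `
    SELECT id, content, 1 - (embedding <=> $1::vector) AS similarity
    FROM chunks
    ORDER BY embedding <=> $1::vector
    LIMIT $2`;
  return { text, values: [toVectorLiteral(queryEmbedding), limit] };
}
```

Passing the embedding as a bound parameter rather than interpolating it into the SQL string keeps the query safe to reuse with any input.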

This is a learning exercise, not production code. Chunking is intentionally basic. The goal was to understand the end-to-end flow, not optimize individual components.
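For a sense of what "intentionally basic" chunking can look like, here is one possible sketch that packs blank-line-separated paragraphs into size-limited chunks. It is an assumption about the general approach, not the project's actual chunker; the function name `chunkMarkdown` and the 500-character default are made up for illustration.

```typescript
// Packs paragraphs (separated by blank lines) into chunks of at most
// maxChars characters. A single paragraph longer than maxChars is kept
// whole as its own chunk rather than being split mid-sentence.
function chunkMarkdown(markdown: string, maxChars = 500): string[] {
  const paragraphs = markdown
    .split(/\n\s*\n/)
    .map((p) => p.trim())
    .filter((p) => p.length > 0);

  const chunks: string[] = [];
  let current = "";
  for (const paragraph of paragraphs) {
    const candidate = current ? `${current}\n\n${paragraph}` : paragraph;
    if (candidate.length <= maxChars || current === "") {
      // Still fits, or this is the first paragraph of a new chunk.
      current = candidate;
    } else {
      // Adding this paragraph would overflow: close the chunk, start anew.
      chunks.push(current);
      current = paragraph;
    }
  }
  if (current) chunks.push(current);
  return chunks;
}
```

A production chunker would typically also respect headings and add overlap between chunks, which this sketch deliberately leaves out.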

Architecture Overview

┌─────────────────────────────────────────────────────────────────┐
│                         Frontend (React)                        │
│                    Search UI with debug info                    │
└─────────────────────────────────────────────────────────────────┘
                                 │
                                 ▼
┌─────────────────────────────────────────────────────────────────┐
│                        Backend (Express)                        │
│                                                                 │
│  ┌─────────────┐    ┌─────────────┐    ┌─────────────────────┐ │
│  │   Upload    │    │   Search    │    │   Chunker Worker    │ │
│  │  Controller │    │  Controller │    │   (pg-boss queue)   │ │
│  └─────────────┘    └─────────────┘    └─────────────────────┘ │
│         │                  │                     │              │
│         ▼                  ▼                     ▼              │
│  ┌─────────────────────────────────────────────────────────────┐│
│  │                      Services Layer                         ││
│  │  ┌──────────────┐  ┌──────────────┐  ┌──────────────────┐  ││
│  │  │UploadService │  │SearchService │  │  ChunkerService  │  ││
│  │  └──────────────┘  └──────────────┘  └──────────────────┘  ││
│  └─────────────────────────────────────────────────────────────┘│
│                              │                                  │
│         ┌────────────────────┼────────────────────┐            │
│         ▼                    ▼                    ▼            │
│  ┌─────────────┐      ┌─────────────┐      ┌─────────────┐    │
│  │   OpenAI    │      │   Cohere    │      │  pgvector   │    │
│  │ (embeddings │      │ (reranker)  │      │ (PostgreSQL)│    │
│  │  + chat)    │      │             │      │             │    │
│  └─────────────┘      └─────────────┘      └─────────────┘    │
└─────────────────────────────────────────────────────────────────┘

The RAG Flow

Upload: Documents are uploaded → chunked → embedded → stored in pgvector
Search: User query is embedded → vector similarity search finds candidates
Rerank (optional): Cohere reranker scores chunks by relevance to the question
Generate: Top chunks + query sent to LLM → answer generated
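The search, rerank, and generate steps above can be sketched as a single function with its external calls injected. Everything here is hypothetical: the `RagDeps` interface, the function names, the prompt wording, and the default candidate and top-k counts are illustrative assumptions, not the project's actual code. In the real system these dependencies would wrap the OpenAI, pgvector, and Cohere calls.

```typescript
interface Chunk {
  text: string;
  score: number;
}

// Hypothetical dependencies standing in for the embedding API,
// the pgvector query, the reranker, and the chat model.
interface RagDeps {
  embed(text: string): Promise<number[]>;
  vectorSearch(embedding: number[], limit: number): Promise<Chunk[]>;
  rerank(query: string, chunks: Chunk[]): Promise<Chunk[]>;
  generate(prompt: string): Promise<string>;
}

// Assembles the prompt from the retrieved context and the question.
function buildPrompt(query: string, chunks: Chunk[]): string {
  const context = chunks.map((c, i) => `[${i + 1}] ${c.text}`).join("\n\n");
  return `Answer using only the context below.\n\nContext:\n${context}\n\nQuestion: ${query}`;
}

async function answerQuestion(
  deps: RagDeps,
  query: string,
  options: { useReranker: boolean; candidates?: number; topK?: number },
): Promise<{ answer: string; prompt: string; chunks: Chunk[] }> {
  const { useReranker, candidates = 20, topK = 5 } = options;

  // Search: embed the query, then fetch candidate chunks by similarity.
  const embedding = await deps.embed(query);
  let chunks = await deps.vectorSearch(embedding, candidates);

  // Rerank (optional): reorder candidates by relevance to the question.
  if (useReranker) {
    chunks = await deps.rerank(query, chunks);
  }

  // Generate: send the top chunks plus the query to the LLM.
  const top = chunks.slice(0, topK);
  const prompt = buildPrompt(query, top);
  const answer = await deps.generate(prompt);
  return { answer, prompt, chunks: top };
}
```

Returning the prompt and the selected chunks alongside the answer is one way to feed a debug view like the one described earlier.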

Additional Tools

pg-boss — PostgreSQL-based job queue for async document processing. Alternative to SQS/RabbitMQ that keeps everything in Postgres.
Inversify — Dependency injection container. Built without NestJS to maintain explicit control over DI and adhere to SOLID principles.

Tech Stack

Component	Technology
Backend	Express + TypeScript
Frontend	React
Database	PostgreSQL + pgvector
Embeddings	OpenAI text-embedding-3-small
Reranker	Cohere rerank-english-v3.0
LLM	GPT-4o-mini
Job Queue	pg-boss
DI	Inversify

Setup

See SETUP.md for installation instructions and required environment variables.
