CLASSMATE-RAG

A Retrieval-Augmented Generation (RAG) system for course materials. It ingests documents (PDF, DOCX, PPTX, EPUB, HTML, CSV, TXT, MD), indexes them with both BM25 and a Chroma vector database, and answers questions with grounded citations using local LLaMA/Mistral models in GGUF format.

✨ Features

CLI-first workflow (rag command)
Ingestion with metadata (course, unit, tags, language, semester, author)
Hybrid retrieval (BM25 keyword + vector embeddings, fused with RRF)
Cited answers generated with local LLMs
Admin tools: stats, preview, backup/restore, vacuum, rebuild embeddings, reingest
Document loaders: PDF, DOCX, PPTX, EPUB, HTML, CSV, TXT, Markdown
Multilingual support with E5 embeddings (intfloat/multilingual-e5-base)
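
The hybrid retrieval step listed above fuses the BM25 and vector rankings with Reciprocal Rank Fusion (RRF). The following is a minimal sketch of that idea, not the project's actual code: the function name `rrf_fuse`, the constant `k=60` (a commonly used default), and the chunk IDs are all assumptions made for illustration.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60):
    """Fuse several ranked lists of chunk IDs with Reciprocal Rank Fusion.

    Each ID receives 1 / (k + rank) from every list it appears in
    (rank starts at 1); IDs are returned by descending total score.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] += 1.0 / (k + rank)
    # Break ties by ID so the output order is deterministic.
    return sorted(scores, key=lambda cid: (-scores[cid], cid))


# Hypothetical results from the two retrievers (best hit first).
bm25_hits = ["c3", "c1", "c7"]
vector_hits = ["c1", "c9", "c3"]

fused = rrf_fuse([bm25_hits, vector_hits])
print(fused)  # ['c1', 'c3', 'c9', 'c7']
```

Chunks found by both retrievers (`c1`, `c3`) rise above chunks found by only one, and because RRF uses ranks rather than raw scores, BM25 scores and embedding similarities never need to be normalized against each other.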

📦 Installation

See docs/installation.md for details. Quick setup (Linux/macOS):

./quicksetup.sh
source .venv/bin/activate
rag --help

Windows (PowerShell):

.\quicksetup.ps1
.\.venv\Scripts\Activate.ps1
rag --help

🚀 Usage

Ingest a document:

rag add path/to/file.pdf --course "Math101" --unit "1" --language "en" --tags exam,week1

Ask a question:

rag ask "What is the chain rule?" --course "Math101"

Preview retrieval (no generation):

rag preview "Explain entropy"

See docs/usage.md for more.

🛠️ Maintenance

Show stats: rag stats
Backup: rag dump --path dumps/corpus.jsonl
Restore: rag restore --path dumps/corpus.jsonl
Vacuum: rag vacuum
Rebuild embeddings: rag rebuild --model intfloat/multilingual-e5-large
Manage entries: rag list, rag show, rag delete, rag reingest

Details in docs/configuration.md.

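📖 Documentation

Installation: docs/installation.md
Usage: docs/usage.md
Configuration and maintenance: docs/configuration.md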

🧩 Project Structure

cli/           # CLI entrypoint
rag/           # Core RAG system
  admin/       # Backup, restore, manage, inspect
  chunking/    # Text splitting into chunks
  embeddings/  # Embedding models & cache
  generation/  # LLM runner, prompting, postprocessing
  loaders/     # File loaders
  retrieval/   # BM25, Chroma, hybrid fusion
  pipeline/    # Ingestion, query orchestration
docs/          # Documentation
tools/         # Benchmark scripts

Repository: taha-kms/CLASSMATE-RAG
