Deleuze AI RAG

Retrieval-augmented generation stack purpose-built for Gilles Deleuze's corpus. The system extracts long-form passages from the PDF books, indexes them in Qdrant, plans dense search queries, and answers questions through Claude while exposing the full trace (terms, passages, quotes, thinking) in a simple web UI.

Prerequisites

  • macOS / Linux with uv installed (pip install uv once, then reuse)
  • Python 3.11+ (managed automatically by uv)
  • Qdrant (cloud URL/API key or local persistence)
  • Anthropic API key for Claude Opus (planner + answer)
  • Hugging Face Inference endpoint(s) for BGE embeddings and reranker, or local GPU with sentence-transformers

Setup

uv venv
source .venv/bin/activate
uv pip install -r requirements.txt
cp config/example.env .env  # fill in the secrets

Environment variables

config/example.env lists the minimal variables:

  • LLMs: ANTHROPIC_API_KEY, ANSWER_PROVIDER, ANSWER_MODEL, ANSWER_MAX_TOKENS
  • Planner: QUERY_PLANNER_PROVIDER, QUERY_PLANNER_MODEL, QUERY_PLANNER_USE_LEXICON, PLANNER_VERBOSE
  • Embeddings/Reranker: HF_API_KEY, HF_EMBEDDING_ENDPOINT, HF_RERANKER_ENDPOINT, EMBED_MODEL_NAME, EMBED_DIM, RERANKER_MODEL, ENABLE_LOCAL_RERANKER
  • Vector store: QDRANT_URL, QDRANT_API_KEY, QDRANT_COLLECTION
  • Retrieval tuning: INITIAL_CANDIDATE_COUNT, RERANK_TOP_K, RETRIEVER_MQ_TERMS, PRF_ENABLE, PRF_TOP_P, PRF_TERMS, PASSAGE_TOKENS, PASSAGE_MAX_TOKENS, RETRIEVAL_CACHE_TTL, SESSIONS_DIR
  • Observability: LOG_LEVEL

Set optional keys (OPENAI_API_KEY, KIMI_API_KEY) if you plan to swap providers; otherwise they can be omitted.
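For orientation, a filled-in .env might look like the sketch below. Every value is a placeholder (the exact variable set and defaults come from config/example.env):

```shell
# Sketch only -- all values are placeholders, copy the real list from config/example.env
ANTHROPIC_API_KEY=sk-ant-...
ANSWER_PROVIDER=anthropic
QDRANT_URL=https://your-cluster.example.qdrant.io
QDRANT_API_KEY=...
QDRANT_COLLECTION=deleuze_corpus
HF_API_KEY=hf_...
HF_EMBEDDING_ENDPOINT=https://your-embedding-endpoint.example
LOG_LEVEL=INFO
```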

Building the corpus

  1. Extract structured text from PDFs

    uv run python -m src.pipeline.pdf_extractor

    PDFs are read from data/raw/pdf_books/ and snapshots write to data/processed/ by default. Override with --pdf-dir and --output-dir if needed.

  2. Embed and index in Qdrant

    uv run python -m src.pipeline.embed_corpus --snapshot data/processed/deleuze_corpus_<date>.jsonl

    Use --recreate to rebuild the collection from scratch and --max-records for smoke tests. Embedding uses Apple Silicon's MPS backend (torch.mps) when it is available.

Snapshots and manifest files are stored under data/processed/; when no remote URL is provided, Qdrant's local persistence lives in data/qdrant/.

Running the API & UI

uv run uvicorn src.main:app --host 0.0.0.0 --port 8000
  • POST /upload-pdf/ ingests a new PDF into the index
  • POST /plan/ returns the latest query-plan terms
  • POST /ask/ returns answer, citations, thinking trace, passages, and raw search terms
  • GET / serves the chat interface from static/chat.html

Console logs include the planner terms, PRF terms, reranker activity, and Claude's full analysis (search terms + thinking).
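As a minimal sketch, /ask/ can also be called from Python with the standard library. The request body key (question) is an assumption here; check the request model in src/main.py for the actual schema:

```python
import json
from urllib import request

BASE_URL = "http://localhost:8000"

def build_ask_request(question: str, base_url: str = BASE_URL) -> request.Request:
    # NOTE: the "question" payload key is a guess; confirm against src/main.py
    payload = json.dumps({"question": question}).encode("utf-8")
    return request.Request(
        f"{base_url}/ask/",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def ask(question: str, base_url: str = BASE_URL) -> dict:
    # Blocks until the server responds; returns the parsed JSON body
    # (answer, citations, thinking trace, passages, search terms)
    with request.urlopen(build_ask_request(question, base_url)) as resp:
        return json.load(resp)
```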

Evaluation (optional)

src/evaluation/retrieval_eval.py measures hit rate / MRR over eval/dataset.jsonl. Run with:

uv run python -m src.evaluation.retrieval_eval
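Hit rate and MRR are the standard retrieval metrics here: hit rate is the fraction of queries whose expected passage appears in the top-k results, and MRR averages the reciprocal of the rank at which it appears. A minimal sketch of the computation (independent of the project's actual implementation in retrieval_eval.py):

```python
def hit_rate_and_mrr(results: list[list[str]], gold: list[str], k: int = 10) -> tuple[float, float]:
    """results[i] is the ranked list of passage ids retrieved for query i;
    gold[i] is the expected passage id. Returns (hit_rate@k, MRR@k)."""
    hits, rr_sum = 0, 0.0
    for ranked, expected in zip(results, gold):
        top_k = ranked[:k]
        if expected in top_k:
            hits += 1
            rr_sum += 1.0 / (top_k.index(expected) + 1)  # reciprocal of 1-based rank
    n = len(gold)
    return hits / n, rr_sum / n
```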

Project layout

  • src/main.py – FastAPI entrypoint and web routes
  • src/rag/ – ingestion, hybrid retrieval, passage builder, vector store
  • src/llm/ – planner & answer clients (Anthropic/Kimi/OpenAI)
  • src/pipeline/ – PDF extraction and embedding utilities
  • src/observability/ – logging and tracing helpers
  • static/ – web UI assets
  • tests/ – smoke tests for retrieval evaluation

With unused training scripts and legacy vector stores removed, only the modules above are required for the current RAG flow.
