A very simple local Retrieval-Augmented Generation (RAG) system for querying PDF documents using LLMs. This tool uses Ollama for running local language models and ChromaDB for vector storage.

Install Python dependencies:

```bash
pip install -r requirements.txt
```

Install Ollama:

- Download and install Ollama from https://ollama.com/download
- Start Ollama:

  ```bash
  ollama serve
  ```

- Pull the required models (see `src/utils.py`):

  ```bash
  ollama pull llama3
  ```
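
As an optional sanity check before ingesting or querying, you can confirm from Python that Ollama is reachable and that `llama3` responds. This is a minimal sketch, not part of the project's scripts, and it assumes the `ollama` Python package is installed (`pip install ollama`):

```python
# Optional sanity check (not part of the project): confirm Ollama is reachable
# and the llama3 model responds. Assumes the `ollama` Python package is installed.
import ollama

try:
    reply = ollama.generate(model="llama3", prompt="Reply with the single word: OK")
    print(reply["response"])
except Exception as exc:
    print(f"Could not reach Ollama -- is `ollama serve` running and llama3 pulled? ({exc})")
```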

Place your PDF files in the `data/` directory, then run:

```bash
python src/db_ingest.py
```

This processes all PDFs in `data/` and stores their embeddings in ChromaDB. Pass `--clean` to clear the existing database before ingestion.
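
For orientation, here is a rough sketch of what a PDF-to-ChromaDB ingestion step typically looks like. It is not the code in `src/db_ingest.py`: the embedding model (`nomic-embed-text`), the chunking scheme, the collection name (`pdf_chunks`), and the `pypdf` dependency are illustrative assumptions; check `src/utils.py` and `src/db_ingest.py` for the project's actual choices.

```python
# Illustrative sketch of PDF ingestion, not the project's actual db_ingest.py.
# Assumes pypdf, chromadb, and ollama are installed; model/collection names are guesses.
from pathlib import Path

import chromadb
import ollama
from pypdf import PdfReader

EMBED_MODEL = "nomic-embed-text"  # assumption; the real models are defined in src/utils.py
CHUNK_SIZE = 800                  # characters per chunk; arbitrary illustration value

client = chromadb.PersistentClient(path="chroma")           # matches the chroma/ directory
collection = client.get_or_create_collection("pdf_chunks")  # hypothetical collection name

for pdf_path in Path("data").glob("*.pdf"):
    text = "".join(page.extract_text() or "" for page in PdfReader(pdf_path).pages)
    chunks = [text[i:i + CHUNK_SIZE] for i in range(0, len(text), CHUNK_SIZE)]
    for idx, chunk in enumerate(chunks):
        embedding = ollama.embeddings(model=EMBED_MODEL, prompt=chunk)["embedding"]
        collection.add(
            ids=[f"{pdf_path.name}:{idx}"],
            documents=[chunk],
            embeddings=[embedding],
            metadatas=[{"source": pdf_path.name, "chunk": idx}],
        )
```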

Run the main script and follow the prompts:

```bash
python main.py
```

Enter your question when prompted. The system retrieves the most relevant document chunks and generates an answer using the local LLM via Ollama.
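
The query step broadly follows a retrieve-then-generate pattern; `main.py` is the authoritative implementation. The sketch below is illustrative only: the model names, prompt wording, collection name, and the way the similarity threshold is applied are assumptions, though `top_k` and `similarity_threshold` mirror the command-line flags mentioned in the notes below.

```python
# Illustrative sketch of the query flow, not the project's actual main.py.
import chromadb
import ollama

EMBED_MODEL = "nomic-embed-text"  # assumption; the real models are defined in src/utils.py
LLM_MODEL = "llama3"


def answer(question: str, top_k: int = 5, similarity_threshold: float = 0.25) -> str:
    collection = chromadb.PersistentClient(path="chroma").get_or_create_collection("pdf_chunks")

    # Embed the question and retrieve the top_k closest chunks from ChromaDB.
    query_embedding = ollama.embeddings(model=EMBED_MODEL, prompt=question)["embedding"]
    results = collection.query(query_embeddings=[query_embedding], n_results=top_k)

    # Chroma returns distances (lower = closer). Treating "similarity" as 1 - distance
    # is an assumption here, not necessarily the project's definition of the threshold.
    context = "\n\n".join(
        doc
        for doc, dist in zip(results["documents"][0], results["distances"][0])
        if (1.0 - dist) >= similarity_threshold
    )

    prompt = (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )
    return ollama.generate(model=LLM_MODEL, prompt=prompt)["response"]


if __name__ == "__main__":
    print(answer(input("Question: ")))
```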

Run the app:

```bash
streamlit run app.py
```

Notes:

- Ensure Ollama is running before querying.
- The database is stored in the `chroma/` directory.
- Models are configured in `src.utils.Models` (a hypothetical sketch of such a configuration follows this list).
- For advanced usage, see the comments in the source files under `src/`. For example:

  ```bash
  python main.py --top_k 5 --similarity_threshold 0.25
  ```
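
The note above points at the project's own `src.utils.Models`; the following is only a hypothetical sketch of what such a central model configuration could look like. Every name and default in it is an assumption, not taken from the source.

```python
# Hypothetical sketch of a central model configuration like src.utils.Models.
# All names and defaults below are assumptions; check src/utils.py for the real values.
from dataclasses import dataclass


@dataclass(frozen=True)
class Models:
    llm: str = "llama3"                  # chat/completion model served by Ollama
    embedding: str = "nomic-embed-text"  # embedding model used for ChromaDB vectors


MODELS = Models()
```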

Planned improvements:

- Interface: add a file browser and `db_ingest`
- Testing & evaluation
- Include metadata

Special thanks to pixegami for the informative tutorial that inspired this project.