This project is a local Retrieval-Augmented Generation (RAG) pipeline built with LangChain, ChromaDB, and Ollama to extract structured information from invoice PDFs using LLaMA3.2:latest running on CPU.
✅ No GPU required.
✅ Entirely offline and private.
✅ Powered by sentence-transformers (all-mpnet-base-v2) for embeddings and ChromaDB for vector storage.
- Upload any text-based PDF invoice
- Ask structured or natural-language questions like "What is the invoice number?" or "Who is the client?"
- Built on LLaMA3.2:latest (3B by default) via Ollama
- Uses Sentence Transformers for semantic chunking and vector similarity
- Fully offline, privacy-preserving, and works on CPU
- UI built with Streamlit for PDF upload and querying
- 🧠 LLM: LLaMA3.2:latest via Ollama
- 🧲 Embedding model: sentence-transformers/all-mpnet-base-v2
- 📚 LangChain: RAG pipeline and chaining
- 📦 ChromaDB: fast and simple vector store
- 📄 PDF loader: LangChain’s PyPDFLoader
- 🎛️ Streamlit: UI interface
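As a concrete picture of how these pieces chain together, here is a minimal sketch of the retrieval chain in the style of rag/pipeline.py. It assumes the langchain-community integrations are installed; the persist directory mirrors the repo layout, and the top-k value is illustrative.

```python
# Minimal RAG-chain sketch (assumes langchain + langchain-community installed).
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.llms import Ollama
from langchain_community.vectorstores import Chroma
from langchain.chains import RetrievalQA

# Same embedding model used at ingest time, so query vectors match the store.
embeddings = HuggingFaceEmbeddings(
    model_name="sentence-transformers/all-mpnet-base-v2"
)

# Re-open the persisted ChromaDB collection ("vectorstore" mirrors the repo layout).
vectordb = Chroma(persist_directory="vectorstore", embedding_function=embeddings)

# Local LLM served by Ollama; runs entirely on CPU.
llm = Ollama(model="llama3.2:latest")

# Stuff the top-k retrieved chunks into the prompt and ask the LLM.
qa_chain = RetrievalQA.from_chain_type(
    llm=llm,
    retriever=vectordb.as_retriever(search_kwargs={"k": 4}),  # k=4 is illustrative
)

result = qa_chain.invoke({"query": "What is the invoice number?"})
print(result["result"])
```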
Install the Python dependencies:

```bash
pip install -r requirements.txt
```

Install Ollama by following the instructions at https://ollama.com, then pull the model:

```bash
ollama pull llama3.2:latest
```

Place your text-based invoice PDFs in the data/ directory, then build the vector store and query it from the command line:

```bash
python ingest.py
python main.py "What is the client name?"
```
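Under the hood, ingest.py plausibly follows the standard LangChain ingestion recipe: load each PDF with PyPDFLoader, split pages into overlapping chunks, embed them with all-mpnet-base-v2, and persist the vectors to ChromaDB. The chunk sizes and paths below are illustrative assumptions, not the repo's exact values.

```python
# Sketch of the ingestion step; chunk sizes and paths are assumptions.
from pathlib import Path

from langchain_community.document_loaders import PyPDFLoader
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores import Chroma
from langchain.text_splitter import RecursiveCharacterTextSplitter

# Load every PDF in data/ into LangChain Document objects (one per page).
docs = []
for pdf in sorted(Path("data").glob("*.pdf")):
    docs.extend(PyPDFLoader(str(pdf)).load())

# Split pages into overlapping chunks so retrieval stays focused.
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)
chunks = splitter.split_documents(docs)

# Embed the chunks and persist them to the local Chroma store.
embeddings = HuggingFaceEmbeddings(
    model_name="sentence-transformers/all-mpnet-base-v2"
)
Chroma.from_documents(chunks, embeddings, persist_directory="vectorstore")
```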
Project layout:

```
├── data/              # Folder with PDF invoices
├── vectorstore/       # Persisted ChromaDB vectors
├── rag/
│   ├── pipeline.py    # RAG chain builder
├── app.py             # Streamlit interface
├── ingest.py          # PDF-to-vector processor
├── main.py            # CLI for querying
├── config.yml         # Config settings
├── requirements.txt
```
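config.yml centralizes the knobs used above. The keys shown here are a hypothetical shape for the file, not its confirmed contents:

```yaml
# Hypothetical config.yml shape — key names are illustrative, not the repo's actual schema.
llm_model: "llama3.2:latest"
embedding_model: "sentence-transformers/all-mpnet-base-v2"
data_dir: "data"
vectorstore_dir: "vectorstore"
chunk_size: 1000
chunk_overlap: 100
top_k: 4
```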
Example query:

```bash
python main.py "What is the invoice date?"
```

Output:

```json
{
  "invoice_date": "2024-03-18"
}
```
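main.py itself can be a thin CLI wrapper around the chain. In the sketch below, build_chain is a hypothetical name for the constructor exported by rag/pipeline.py, not a confirmed API of this repo:

```python
# Hypothetical shape of main.py; build_chain is an assumed name for the
# chain builder in rag/pipeline.py.
import sys

from rag.pipeline import build_chain  # hypothetical helper


def main() -> None:
    # First CLI argument is the question to ask about the ingested invoices.
    question = sys.argv[1] if len(sys.argv) > 1 else "What is the invoice number?"
    chain = build_chain()
    result = chain.invoke({"query": question})
    print(result["result"])


if __name__ == "__main__":
    main()
```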