🛡️ Insurance FAQ Chatbot

A local Retrieval-Augmented Generation (RAG) chatbot that answers insurance questions based on your own PDF documents. Built with TinyLlama, FAISS, and Gradio.


Screenshot: the chatbot answering an insurance question.

User Experience

A PDF is static: users have to read and search it manually. The chatbot is interactive: users can ask natural-language questions and get direct answers.

For insurance FAQs, this means: “What’s the claim process for car insurance?” → instant answer. No scrolling, no guessing, no searching multiple pages.

How It Works

  1. PDF Ingestion — loads and chunks your insurance FAQ document
  2. Embedding — converts chunks into vector embeddings using all-MiniLM-L6-v2
  3. Vector Store — stores embeddings in a FAISS index for fast similarity search
  4. RAG Retrieval — on each query, retrieves the top-3 most relevant chunks
  5. LLM Generation — feeds retrieved context + question to TinyLlama to generate an answer
  6. Gradio UI — serves a chat interface accessible via browser
User Query
    │
    ▼
Embedding Model ──► FAISS Index ──► Top-K Chunks
                                         │
                                         ▼
                                   TinyLlama LLM
                                         │
                                         ▼
                                      Answer
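The retrieval and generation steps above can be sketched in plain Python. This is a minimal stand-in, not the code in main.py: it scores chunks by cosine similarity and keeps the top-k, where the real app gets its vectors from all-MiniLM-L6-v2 and does the search with a FAISS index. Function names here are illustrative.

```python
# Sketch of steps 4-5: rank chunks by similarity to the query,
# then assemble the retrieved context into a prompt for the LLM.
# Pure Python for illustration; the app uses FAISS for the search.
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def retrieve_top_k(query_vec, chunk_vecs, chunks, k=3):
    """Return the k chunks whose embeddings are closest to the query."""
    scored = sorted(
        zip(chunks, chunk_vecs),
        key=lambda pair: cosine(query_vec, pair[1]),
        reverse=True,
    )
    return [chunk for chunk, _ in scored[:k]]

def build_prompt(question, context_chunks):
    """Retrieved chunks become the context the LLM answers from."""
    context = "\n\n".join(context_chunks)
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
```

Swapping this loop for a FAISS index changes the speed, not the idea: FAISS just finds the same nearest neighbours without comparing the query against every chunk.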

Project Structure

llm_insurance-chatbot/
├── main.py                  # Main application
├── requirements.txt         # Python dependencies
├── data/
│   └── Insurance_FAQs.pdf   # Your source document (add this yourself)
└── embeddings/
    └── vector_store.pkl     # Auto-generated on first run

Setup

1. Clone / download the project

cd llm_insurance-chatbot

2. Create a virtual environment (recommended)

python -m venv venv

# Windows
venv\Scripts\activate

# macOS/Linux
source venv/bin/activate

3. Install dependencies

pip install -r requirements.txt

4. Add your PDF

Place your insurance FAQ document at:

data/Insurance_FAQs.pdf

5. Run the app

python main.py

On first run, the vector store will be built automatically. Subsequent runs will load it from cache.
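The caching behaviour described above follows a common build-or-load pattern. A hedged sketch (the actual logic in main.py may differ; only the pickle path is taken from the project structure above):

```python
import os
import pickle

VECTOR_STORE_PATH = "embeddings/vector_store.pkl"

def load_or_build_vector_store(build_fn, path=VECTOR_STORE_PATH):
    """Load the pickled vector store if it exists; otherwise build it
    once (chunk, embed, index) and cache it for subsequent runs."""
    if os.path.exists(path):
        with open(path, "rb") as f:
            return pickle.load(f)
    store = build_fn()  # expensive: runs the embedding model over all chunks
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "wb") as f:
        pickle.dump(store, f)
    return store
```

Deleting embeddings/vector_store.pkl forces a rebuild, which is what you want after swapping in a new PDF.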


Accessing the UI

After running, you'll see output like:

* Running on local URL:  http://127.0.0.1:7860
* Running on public URL: https://xxxxxx.gradio.live
  • Local URL — open in your browser on the same machine
  • Public URL — share with anyone for 72 hours (enabled via share=True)

System Requirements

| Component | Minimum |
|-----------|---------|
| RAM | 6 GB free |
| Python | 3.10+ |
| Disk | ~3 GB (for model cache) |
| GPU | Optional (runs on CPU) |

Note for Windows users: You may see a symlinks warning from HuggingFace. This is harmless. To fix it, enable Developer Mode in Windows Settings or run Python as Administrator.


Configuration

Key parameters you can tweak in main.py:

| Parameter | Location | Default | Description |
|-----------|----------|---------|-------------|
| `chunk_size` | `chunk_text()` | 500 | Characters per chunk |
| `overlap` | `chunk_text()` | 50 | Overlap between chunks |
| `top_k` | `retrieve_relevant()` | 3 | Number of chunks retrieved |
| `max_new_tokens` | `generate_answer()` | 300 | Max response length |
| `temperature` | `generate_answer()` | 0.7 | Response creativity (0 = deterministic) |
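`chunk_size` and `overlap` control the sliding-window chunking from step 1. A minimal sketch of how such a chunker typically works — the real `chunk_text()` in main.py may differ in details:

```python
def chunk_text(text, chunk_size=500, overlap=50):
    """Split text into fixed-size character windows that overlap, so a
    sentence cut at one chunk boundary still appears whole in a neighbour."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]
```

Larger chunks give the LLM more context per retrieval but make each match less precise; the overlap exists so answers spanning a boundary are not lost.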

Troubleshooting

Segmentation fault during model load → Not enough RAM. The model needs ~4GB free. Close other applications and retry.

localhost is not accessible error → Add share=True to iface.launch() and use the public URL instead.

Slow responses → Expected on CPU — TinyLlama takes 30–90 seconds per response without a GPU.

Poor answer quality → Check that your PDF text is extractable (not a scanned image). Try increasing top_k to 5.


Dependencies

| Package | Purpose |
|---------|---------|
| pdfplumber | PDF text extraction |
| sentence-transformers | Generating embeddings |
| faiss-cpu | Vector similarity search |
| transformers | Loading the TinyLlama LLM |
| torch | Model inference backend |
| gradio | Chat web interface |
| accelerate | Optimized model loading |

About

A locally-running AI chatbot that answers insurance questions from your own PDF documents. Built with RAG (Retrieval-Augmented Generation) — no fine-tuning, no cloud dependency. Drop in your policy PDF, run one command, and get a chat interface powered by TinyLlama and FAISS vector search.
