A simple and fast local RAG chatbot built with Python, FAISS, and Ollama. It reads your personal documents (PDFs), finds the most relevant info, and gives clear answers using a local AI model. No API keys, no internet, everything runs on your machine.
I built this because I wanted a chatbot that could read my own data without cloud access or paid APIs. At first I just wanted to understand how RAG works, but it became something I could actually use and show. The project taught me how retrieval and generation connect in practice. It is simple, fast, and works offline, which was exactly the goal.
- Reads and processes PDF files locally
- Converts text into vector embeddings using multilingual models
- Uses FAISS for fast and accurate similarity search
- Answers questions through local AI models on Ollama (Phi-3, Mistral, Gemma)
- Works fully offline
- Code is clean and modular, split into Retrieval and Generation parts
- Load a document
- Split the text into smaller chunks
- Convert each chunk into an embedding
- Store everything inside FAISS
- When you ask a question, it finds the most similar chunks
- Sends them to the local AI model and returns the answer
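The steps above can be sketched in a few lines of plain Python. The chunking step splits text into overlapping windows so context is not cut mid-sentence, and the search step is a brute-force version of what FAISS does at scale (the chunk size and overlap values here are illustrative, not the project's actual settings):

```python
import math

def chunk_text(text, chunk_size=500, overlap=50):
    """Split text into overlapping character windows so context isn't cut mid-sentence."""
    chunks, step = [], chunk_size - overlap
    for start in range(0, len(text), step):
        piece = text[start:start + chunk_size]
        if piece.strip():
            chunks.append(piece)
    return chunks

def top_k(query_vec, chunk_vecs, k=2):
    """Brute-force cosine-similarity search -- the same job FAISS does, much faster, at scale."""
    def cos(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))
    scored = sorted(enumerate(chunk_vecs), key=lambda iv: cos(query_vec, iv[1]), reverse=True)
    return [i for i, _ in scored[:k]]
```

In the real pipeline the vectors come from a SentenceTransformers model and the search runs against a FAISS index, but the logic is the same: embed, compare, take the top matches.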
Two main sections:
- Retrieval – handles reading, embedding, and searching chunks
- Generation – builds the prompt and generates the final answer
Designed for clarity — simple enough to extend with Chroma or LangChain later.
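That split can be pictured as two small classes. The names and method signatures below are an illustrative sketch of the structure, not the project's actual code:

```python
class Retrieval:
    """Reads the document, embeds chunks, and searches the FAISS index."""
    def __init__(self, chunks, index):
        self.chunks = chunks   # raw text chunks from the PDF
        self.index = index     # FAISS index over their embeddings

    def search(self, question, k=3):
        # embed the question, query the index, return the top-k chunks
        ...

class Generation:
    """Builds the prompt and asks the local model for the final answer."""
    def build_prompt(self, question, context_chunks):
        context = "\n\n".join(context_chunks)
        return f"Answer using only this context:\n{context}\n\nQuestion: {question}"
```

Because the two halves only meet at the prompt, either side can be swapped out, for example replacing FAISS with Chroma, without touching the other.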
```bash
python main.py
# Choose a PDF file
# Ask: What is Pokémon?
# Bot: Pokémon are creatures that inhabit the world of the Pokémon universe. The core idea revolves around friendship, adventure, and growth, both for the Pokémon themselves and their trainers.
```
- Python
  - The main programming language that connects every part of the system.
- FAISS
  - Used for fast vector similarity search.
  - Stores and retrieves embeddings efficiently.
- SentenceTransformers
  - Converts text and questions into embeddings.
  - Works well with multiple languages including Bahasa Indonesia.
- NumPy
  - Handles numerical operations and converts embeddings into the right format for FAISS.
- PyPDF2
  - Reads and extracts text from PDF files before they are processed.
- Requests
  - Sends the formatted question and context to Ollama’s local API for generating responses.
- Ollama
  - Runs the local AI models like Phi-3, Mistral, or Gemma.
  - Generates the final answer directly on your machine.
- Tkinter
  - Opens a simple file picker so you can select the document to analyze.
- Dotenv
  - Keeps model names and settings clean and separate inside a .env file.
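For the generation step, Ollama exposes a local HTTP API on port 11434. The project uses the Requests library; the sketch below uses only the standard library so it runs without extra dependencies, but the request body is the same either way (the model name and URL default are illustrative):

```python
import json
import urllib.request

def build_payload(model, prompt):
    """Request body for Ollama's /api/generate endpoint; stream=False returns one JSON object."""
    return {"model": model, "prompt": prompt, "stream": False}

def ask_ollama(prompt, model="phi3", url="http://localhost:11434/api/generate"):
    """Send the prompt to a locally running Ollama server and return its answer text."""
    data = json.dumps(build_payload(model, prompt)).encode("utf-8")
    req = urllib.request.Request(url, data=data, headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

Since everything talks to localhost, no traffic ever leaves the machine, which is what makes the fully offline setup possible.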
- A good embedding model changes everything
- FAISS makes search extremely fast
- Keeping Retrieval and Generation separate makes the code easier to manage
- You can build a real RAG chatbot without relying on APIs

