This project is a local implementation of a Retrieval-Augmented Generation (RAG) chatbot, originally based on the tutorial from Hugging Face. The implementation has been modified to run entirely locally using Ollama for model inference.
- Fully Local: No API keys or internet connection required after initial setup
- Cat Facts Knowledge Base: Comes pre-loaded with interesting cat facts
- Semantic Search: Finds relevant information using vector similarity
- Interactive Chat: Ask questions and get answers based on the knowledge base
- Python 3.8+
- Ollama installed and running
- Required Python packages (install via `pip install -r requirements.txt`)
- Install Ollama: download and install Ollama from [ollama.ai](https://ollama.ai)
- Start the Ollama server:

  ```bash
  ollama serve
  ```

- Pull the required models (in a new terminal):

  ```bash
  ollama pull nomic-embed-text
  ollama pull llama3
  ```

- Install the Python dependencies:

  ```bash
  pip install -r requirements.txt
  ```

- Run the chatbot:

  ```bash
  python main.py
  ```
- When prompted, type your question about cats and press Enter
- Type 'quit', 'exit', or 'q' to exit the program
- Data Loading: The script loads cat facts from `cat-facts.txt`
- Embedding Generation: Each fact is converted into a vector embedding using `nomic-embed-text`
- Query Processing: When you ask a question:
  - The question is converted to an embedding
  - The system finds the most similar facts using cosine similarity
  - The relevant context is sent to the language model (`llama3`)
  - The model generates a response based on the retrieved context
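The retrieval step above can be sketched in plain Python. This is a simplified illustration, not the code in `main.py`: the function names here (`cosine_similarity`, `retrieve`) are hypothetical, and in the real chatbot the vectors come from the `nomic-embed-text` model rather than the toy values shown in the comments.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def retrieve(query_embedding, fact_embeddings, top_n=3):
    """Return the indices of the top_n facts most similar to the query."""
    scores = [
        (i, cosine_similarity(query_embedding, emb))
        for i, emb in enumerate(fact_embeddings)
    ]
    # Highest similarity first
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return [i for i, _ in scores[:top_n]]
```

The indices returned by `retrieve` select the facts whose text is then placed into the prompt as context for the language model.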
You can change the models in main.py by modifying these lines:
```python
EMBEDDING_MODEL = 'nomic-embed-text'  # Other options: 'all-minilm', 'bge-small', etc.
LANGUAGE_MODEL = 'llama3'             # Other options: 'mistral', 'llama2', etc.
```

- Replace `cat-facts.txt` with your own text file
- Each line should contain a single fact or piece of information
- The system will automatically process the new file when you run `main.py`
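Since the knowledge base is one fact per line, the loading step can be sketched as below. This is a hypothetical helper for illustration (`load_knowledge_base` is not necessarily the name used in `main.py`), assuming blank lines and surrounding whitespace should be ignored:

```python
def load_knowledge_base(path):
    """Read one fact per line from a text file, skipping blank lines."""
    with open(path, encoding='utf-8') as f:
        return [line.strip() for line in f if line.strip()]
```

Each returned string is then embedded individually, so keeping one self-contained fact per line gives the cleanest retrieval results.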
- If you get "model not found" errors, make sure you've pulled the models with `ollama pull`
- Ensure the Ollama server is running before starting the script
- For large knowledge bases, embedding generation might take some time on the first run
Based on the tutorial: Make Your Own RAG with Hugging Face
This project is open source and available under the MIT License.