Local RAG Document Chat

KIRA - Knowledge Interface Retrieval Agent - is a local RAG system that lets you chat with your documents using open-source LLMs running entirely on your own machine. Think of it as a small, self-hosted, private alternative to services like Google NotebookLM.

Features

  • Private & Local - No data leaves your machine, and no API keys are needed
  • Multi-format Support - Upload .pdf and .txt files
  • Open Source - Runs Mistral or Llama 3.2 locally via Ollama
  • Interactive Chat - Simple web-based UI built with Gradio
  • Semantic Search - Retrieves the passages most relevant to your question

Prerequisites

  • Python 3.8 or higher
  • Ollama installed and running
  • 8GB RAM at minimum, 16GB RAM recommended

Installation

  1. Clone this repository
git clone https://github.com/BVoermann/kira.git
cd kira
  2. Create a virtual environment and install the requirements

Linux

python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt

Windows

python -m venv venv
venv\Scripts\activate
pip install -r requirements.txt
  3. Install and set up Ollama

Download the installer from https://ollama.com

Then pull at least one model (llama3.2 or mistral):

ollama pull llama3.2
ollama pull mistral

Usage

  1. Start the application

Linux

python3 app.py

Windows

python app.py
  2. Open your browser

Navigate to http://127.0.0.1:7860

  3. Upload documents
  • Select one or more PDF or TXT files
  • Click "Process Documents" and wait for them to be processed
  • Ask questions in the chat
  • The AI will answer based on the content of the documents
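Under the hood, each chat turn follows the standard RAG pattern: the question is embedded, the most similar chunks are retrieved from the vector store, and both are assembled into a grounded prompt for the LLM. A minimal sketch of the prompt-assembly step (hypothetical helper name, not the actual rag_engine.py code):

```python
def build_rag_prompt(question, retrieved_chunks):
    """Assemble a grounded prompt from retrieved document chunks.

    Illustrative only; the real rag_engine.py may format its prompt differently.
    """
    context = "\n\n".join(retrieved_chunks)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )

chunks = ["KIRA is a local RAG system.", "It supports .pdf and .txt files."]
prompt = build_rag_prompt("What file types are supported?", chunks)
```

The LLM then completes the prompt, so its answer is constrained to the retrieved document content rather than its general training data.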

Project Structure

kira/
├── app.py                    # Main Gradio interface
├── document_processor.py     # Document loading and vectorization
├── rag_engine.py            # RAG query engine with LLM
└── chroma_db/               # Vector database storage (created on first run)

Configuration

Change the LLM Model

Edit app.py line 19:

rag_engine = RAGEngine(doc_processor.vectorstore, model_name="mistral")
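If you switch models often, one possible tweak (not part of the current code) is to read the model tag from an environment variable instead of editing the file, using a hypothetical KIRA_MODEL variable:

```python
import os

# Hypothetical variant: pick the Ollama model tag from the environment,
# falling back to "mistral", so no code edit is needed to switch models.
model_name = os.environ.get("KIRA_MODEL", "mistral")
# rag_engine = RAGEngine(doc_processor.vectorstore, model_name=model_name)
```

Whichever model you name here must first be pulled with `ollama pull`.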

Adjust Chunk Size

Edit document_processor.py lines 34-36:

text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=1000,      # Adjust this
    chunk_overlap=200,    # And this
    length_function=len
)
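Larger chunks give the model more context per match but coarser retrieval; the overlap keeps sentences that straddle a chunk boundary retrievable from both sides. A simplified pure-Python sketch of what character splitting with overlap does (the real RecursiveCharacterTextSplitter additionally prefers to break at paragraph and sentence separators):

```python
def split_with_overlap(text, chunk_size=1000, chunk_overlap=200):
    # Each chunk starts (chunk_size - chunk_overlap) characters after the
    # previous one, so adjacent chunks share chunk_overlap characters.
    step = chunk_size - chunk_overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

text = "".join(str(i % 10) for i in range(2500))
chunks = split_with_overlap(text)  # 4 chunks; neighbours share 200 chars
```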

Change Embedding Model

Edit document_processor.py line 12:

self.embeddings = HuggingFaceEmbeddings(
    model_name="all-MiniLM-L6-v2"  # Adjust this
)
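The embedding model maps every chunk to a vector, and "semantic search" is nearest-neighbour search over those vectors, typically by cosine similarity. A stdlib-only illustration with toy 3-dimensional vectors (real all-MiniLM-L6-v2 embeddings have 384 dimensions):

```python
import math

def cosine_similarity(a, b):
    # Cosine of the angle between two vectors: 1.0 = same direction.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy vectors standing in for a query embedding and two chunk embeddings.
query = [1.0, 0.2, 0.0]
chunk_a = [0.9, 0.1, 0.1]   # points the same way -> high similarity
chunk_b = [0.0, 0.1, 1.0]   # points elsewhere -> low similarity
```

Chunks whose vectors point in nearly the same direction as the query vector are the ones returned to the LLM.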

Troubleshooting

Ollama Connection Error

Make sure Ollama is running:

ollama list  # Should show installed models

Memory Issues

  • Reduce chunk_size in document_processor.py
  • Use a smaller model: llama3.2 (3B parameters) instead of mistral (7B)
  • Process fewer documents at once

Slow Performance

  • Keep a compact embedding model such as the default all-MiniLM-L6-v2
  • Reduce the number of retrieved chunks (change k=4 in rag_engine.py)
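A smaller k puts fewer chunks into the prompt, which shortens generation time at the cost of context. Retrieval itself is simply top-k selection by similarity score, sketched here with made-up scores (the real code delegates this to the Chroma vector store):

```python
def top_k(scored_chunks, k=4):
    # Keep only the k chunks with the highest similarity scores.
    return sorted(scored_chunks, key=lambda pair: pair[1], reverse=True)[:k]

scored = [("chunk A", 0.81), ("chunk B", 0.40), ("chunk C", 0.93),
          ("chunk D", 0.65), ("chunk E", 0.12)]
best = top_k(scored, k=2)
```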
