# Mini RAG App

A lightweight Retrieval-Augmented Generation (RAG) application that lets you ask questions about your documents in natural language. Built with FastAPI and designed for local development.
## Features

- **Simple & Lightweight** - Minimal dependencies, easy to set up and run
- **FastAPI Backend** - Built with FastAPI for high performance
- **Document Support** - Works with `.txt`, `.md`, and `.markdown` files
- **REST API** - Fully documented API endpoints
- **Test Coverage** - Comprehensive test suite included
## Prerequisites

- Python 3.9+
- pip (Python package manager)
## Installation

1. Clone the repository:

   ```bash
   git clone https://github.com/yourusername/mini-rag-app.git
   cd mini-rag-app
   ```

2. Create and activate a virtual environment:

   ```bash
   # Windows
   python -m venv venv
   .\venv\Scripts\activate

   # macOS/Linux
   python3 -m venv venv
   source venv/bin/activate
   ```

3. Install dependencies:

   ```bash
   pip install -r requirements.txt
   ```

4. Start the FastAPI server:

   ```bash
   uvicorn app.main:app --reload
   ```

5. Access the application:

   - Web Interface: http://127.0.0.1:8000
   - API Documentation: http://127.0.0.1:8000/docs
   - Alternative Docs: http://127.0.0.1:8000/redoc
## API Endpoints

- `POST /ingest` - Add documents to the knowledge base

  ```json
  {
    "documents": [
      {
        "text": "Your document text here",
        "metadata": {"source": "example.txt"}
      }
    ]
  }
  ```

- `POST /query` - Query the knowledge base

  ```json
  {
    "question": "Your question here",
    "k": 4
  }
  ```

- `GET /health` - Check API status
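The request bodies above can be exercised from Python. A minimal sketch using the `requests` library — it assumes the server from the Installation section is running on 127.0.0.1:8000, and the sample document text and source name are purely illustrative:

```python
import json

# Body for POST /ingest (shape taken from the endpoint description above);
# the sample text and "faiss_notes.txt" source are illustrative.
ingest_payload = {
    "documents": [
        {
            "text": "FAISS builds an index over embeddings for fast similarity search.",
            "metadata": {"source": "faiss_notes.txt"},
        }
    ]
}

# Body for POST /query; "k" is the number of results to retrieve.
query_payload = {"question": "What is FAISS used for?", "k": 4}

if __name__ == "__main__":
    # Requires a running server and `pip install requests`.
    import requests

    print(requests.post("http://127.0.0.1:8000/ingest", json=ingest_payload).json())
    print(requests.post("http://127.0.0.1:8000/query", json=query_payload).json())
```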
## Testing

Run the test suite with:

```bash
pytest tests/ -v
```

Or use the test runner script:

```bash
python run_tests.py
```
## 📚 Usage
### 1. Add Documents
Place your text or markdown files in the `docs/` directory and run the ingestion notebook:
```bash
jupyter notebook notebook/ingest_and_build.ipynb
```

### 2. Query Your Documents

Using curl:

```bash
curl -X POST "http://localhost:8000/query" \
  -H "Content-Type: application/json" \
  -d '{"question": "What is this document about?"}'
```

Using Python:

```python
import requests

response = requests.post(
    "http://localhost:8000/query",
    json={"question": "What is this document about?"}
)
print(response.json())
```
### 3. API Reference

- `POST /query` - Ask a question about your documents

  ```json
  {
    "question": "Your question here",
    "k": 4
  }
  ```

  `k` is optional and sets the number of results to return.

- `POST /ingest` - Add new documents (advanced)
- `GET /health` - Check API status
- `GET /` - API documentation
## Models

- Embeddings: `sentence-transformers/all-MiniLM-L6-v2`
- Language Model: `ggml-gpt4all-j-v1.3-groovy`
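Retrieval with an embedding model works by comparing vectors: the question and each document chunk are embedded (all-MiniLM-L6-v2 produces 384-dimensional vectors), and chunks are ranked by cosine similarity to the question. A toy sketch of the scoring step, using tiny hand-made vectors instead of real embeddings:

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Tiny stand-ins for real 384-dimensional embeddings
question = [1.0, 0.0, 1.0]
chunks = {
    "chunk_a": [1.0, 0.1, 0.9],   # points the same way -> high score
    "chunk_b": [-1.0, 0.5, 0.0],  # points away -> low score
}
ranked = sorted(chunks, key=lambda c: cosine_similarity(question, chunks[c]), reverse=True)
print(ranked)  # chunk_a ranks first
```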
## Project Structure

```
mini-rag-app/
├── README.md                 # This file
├── requirements.txt          # Python dependencies
├── .env.example              # Example environment variables
├── docs/                     # Your documents go here
├── notebook/                 # Jupyter notebook for document processing
│   └── ingest_and_build.ipynb
├── app/
│   ├── __init__.py
│   ├── main.py               # FastAPI application
│   ├── retriever.py          # Document retrieval with FAISS
│   └── utils.py              # Helper functions
├── Dockerfile                # For containerization
└── docker-compose.yml        # For easy Docker deployment
```
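`app/utils.py` holds the helper functions. A typical helper in a pipeline like this is a text chunker that splits documents into overlapping pieces before embedding; the function below is a hypothetical sketch of that idea, not the repository's actual code:

```python
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into overlapping character chunks for embedding.

    Overlap keeps sentences that straddle a chunk boundary visible in
    both neighbouring chunks, which helps retrieval.
    """
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    chunks = []
    step = chunk_size - overlap
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break
    return chunks

print(chunk_text("abcdefghij", chunk_size=4, overlap=1))
# ['abcd', 'defg', 'ghij']
```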
## Deployment

For production, consider using:

- Gunicorn with multiple workers
- Nginx as a reverse proxy
- A process manager such as PM2 or systemd
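For example, Gunicorn can supervise Uvicorn worker processes (the worker count here is only a starting point to tune for your machine, not a recommendation):

```bash
gunicorn app.main:app \
  --workers 4 \
  --worker-class uvicorn.workers.UvicornWorker \
  --bind 0.0.0.0:8000
```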
## Configuration

Copy `.env.example` to `.env` and adjust as needed:

```env
# Model configuration
EMBEDDING_MODEL=all-MiniLM-L6-v2
CHAT_MODEL=ggml-gpt4all-j-v1.3-groovy

# Server configuration
HOST=0.0.0.0
PORT=8000
```
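Inside the app, these variables would typically be read with `os.getenv`, falling back to the `.env.example` defaults. A sketch of that pattern — the actual variable handling in `app/main.py` may differ, and loading the `.env` file itself usually needs a library such as python-dotenv:

```python
import os

# Defaults mirror .env.example; environment variables override them.
EMBEDDING_MODEL = os.getenv("EMBEDDING_MODEL", "all-MiniLM-L6-v2")
CHAT_MODEL = os.getenv("CHAT_MODEL", "ggml-gpt4all-j-v1.3-groovy")
HOST = os.getenv("HOST", "0.0.0.0")
PORT = int(os.getenv("PORT", "8000"))  # ports are numbers; env values are strings

print(f"Serving on {HOST}:{PORT} with embeddings from {EMBEDDING_MODEL}")
```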
## License

This project is licensed under the MIT License - see the LICENSE file for details.
## Acknowledgments

- [sentence-transformers](https://www.sbert.net/) for the embedding model
- [FAISS](https://github.com/facebookresearch/faiss) for efficient similarity search
- [GPT4All](https://gpt4all.io/) for the local language model
- [FastAPI](https://fastapi.tiangolo.com/) for the web framework