A CLaRa-Inspired RAG Chat Application Powered by Mistral AI
Features • Quick Start • Architecture • Usage • Deployment
Lumina is a RAG chat application, similar to NotebookLM, that implements the CLaRa (Classification and Retrieval Augmented) approach for enhanced information retrieval (https://arxiv.org/abs/2511.18659). Built on Mistral AI's state-of-the-art language models, Lumina turns your documents into an intelligent knowledge base that powers contextual conversations.
Unlike traditional RAG systems that embed raw text chunks, Lumina uses CLaRa-style evidence generation:
- Compressed Summaries: Each chunk is distilled into factual summaries
- QA Pairs: Automatically generated question-answer pairs for better retrieval
- Precise Retrieval: Search over optimized evidence instead of raw text
- Source Citations: Every response includes traceable sources
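The evidence-generation step above can be sketched as a prompt builder. The prompt wording and the `build_evidence_prompt` helper are illustrative assumptions, not Lumina's exact code:

```python
# Sketch of CLaRa-style evidence generation: turn a raw chunk into a
# compressed summary plus QA pairs. The prompt wording here is an
# assumption, not Lumina's exact prompt.

def build_evidence_prompt(chunk_text: str, num_qa_pairs: int = 3) -> list[dict]:
    """Build a chat request asking the LLM for a summary and QA pairs."""
    instructions = (
        "Summarize the passage below into compact factual statements, then "
        f"write {num_qa_pairs} question-answer pairs that the passage answers. "
        "Return JSON with keys 'summary' and 'qa_pairs'."
    )
    return [
        {"role": "system", "content": instructions},
        {"role": "user", "content": chunk_text},
    ]

messages = build_evidence_prompt("FAISS is a library for efficient similarity search.")
```

The returned messages would be sent to the chat model once per chunk; the summaries and QA pairs in the response are what gets embedded, not the raw chunk.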
Intelligent Chat Interface
- Real-time streaming responses from Mistral AI
- Toggle between standard and RAG-enhanced modes
- Clean, modern UI with dark mode support
Smart Document Processing
- Support for PDF and TXT files
- Intelligent text chunking with sentence boundary detection
- Automatic evidence generation (summaries + QA pairs)
- Progress tracking for document ingestion
Advanced RAG System
- FAISS vector database for fast similarity search
- Mistral embeddings (1024 dimensions)
- Top-k retrieval with relevance scoring
- Citation tracking with source document references
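Top-k retrieval over embeddings can be shown with plain NumPy; FAISS's `IndexFlatIP` computes the same inner-product scores at much larger scale. The 4-d vectors below are toy stand-ins for the 1024-d Mistral embeddings:

```python
# Brute-force top-k retrieval over unit-normalized embeddings. FAISS
# replaces this linear scan with an optimized index, but the scores are
# the same cosine similarities.
import numpy as np

def top_k(query: np.ndarray, corpus: np.ndarray, k: int = 5):
    """Return (indices, scores) of the k most similar corpus rows."""
    # Normalize so the inner product equals cosine similarity.
    q = query / np.linalg.norm(query)
    c = corpus / np.linalg.norm(corpus, axis=1, keepdims=True)
    scores = c @ q
    idx = np.argsort(-scores)[:k]
    return idx, scores[idx]

corpus = np.array([[1.0, 0, 0, 0], [0, 1.0, 0, 0], [0.9, 0.1, 0, 0]])
idx, scores = top_k(np.array([1.0, 0, 0, 0]), corpus, k=2)
# idx → [0, 2]: the exact match first, then the near-duplicate.
```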
Knowledge Management
- Upload and process multiple documents
- View processing status in real-time
- Delete documents with automatic cleanup
- Persistent storage with SQLite
- Framework: FastAPI (high-performance async API)
- Database: SQLAlchemy + SQLite (easily migrates to PostgreSQL)
- Vector Store: FAISS (Facebook AI Similarity Search)
- Embeddings: Mistral Embed (mistral-embed)
- LLM: Mistral Small (mistral-small-latest)
- Package Manager: uv (ultra-fast Python package installer)
- Framework: Next.js 14 (React 18, App Router)
- Language: TypeScript
- Styling: Tailwind CSS
- UI Components: Custom components with dark mode
- HTTP Client: Fetch API with SSE (Server-Sent Events)
- Python 3.11 or higher
- Node.js 18 or higher
- Mistral AI API key (Get one here)
- uv package manager (Install uv)
git clone https://github.com/yourusername/lumina.git
cd lumina

cd backend
# Create virtual environment and install dependencies
uv venv
source .venv/bin/activate # On Windows: .venv\Scripts\activate
uv pip install -e .
# Configure environment
cp .env.example .env
# Edit .env and add your MISTRAL_API_KEY

cd frontend
# Install dependencies
npm install
# Optional: Configure API URL (defaults to http://localhost:8000)
echo "NEXT_PUBLIC_API_URL=http://localhost:8000" > .env.local

Terminal 1 - Backend:
cd backend
uv run fastapi dev app/main.py
# Backend runs on http://localhost:8000

Terminal 2 - Frontend:
cd frontend
npm run dev
# Frontend runs on http://localhost:3000

Visit http://localhost:3000 to start chatting!
- Navigate to Manage Knowledge from the homepage
- Click Choose File and select a PDF or TXT file (max 1MB for now)
- Click Upload Document
- Wait for processing to complete (status will show "completed")
- Return to the chat interface
- Toggle Use Knowledge Base ON
- Ask questions about your uploaded documents
- View citations below each response showing source documents and relevance scores
- Toggle Use Knowledge Base OFF for standard Mistral AI chat
- Great for general questions not requiring your documents
┌─────────────────┐ ┌──────────────────┐
│ Next.js │ HTTP/SSE│ FastAPI │
│ Frontend │◄────────┤ Backend │
│ (Port 3000) │ │ (Port 8000) │
└─────────────────┘ └──────────────────┘
│
┌────────────────┼────────────────┐
▼ ▼ ▼
┌──────────┐ ┌──────────┐ ┌──────────┐
│ SQLite │ │ FAISS │ │ Mistral │
│ Database │ │ Vectors │ │ API │
└──────────┘ └──────────┘ └──────────┘
Document Ingestion:
Upload → Parse (PDF/TXT) → Chunk (512 chars) → Generate Evidence →
Create Embeddings → Store in FAISS + SQLite
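The chunking step in the ingestion flow can be sketched in a few lines. The 512-char chunk size and 128-char overlap mirror the `CHUNK_SIZE`/`CHUNK_OVERLAP` defaults shown in Configuration, but this helper is an illustration, not Lumina's exact implementation:

```python
# Sketch of sentence-boundary chunking with overlap: sentences are packed
# greedily into chunks, and the tail of each emitted chunk is carried into
# the next one for context continuity.
import re

def chunk_text(text: str, chunk_size: int = 512, overlap: int = 128) -> list[str]:
    """Split text into overlapping chunks, breaking at sentence boundaries."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        if current and len(current) + len(sentence) + 1 > chunk_size:
            chunks.append(current)
            current = current[-overlap:]  # keep a tail as overlap
        current = f"{current} {sentence}".strip() if current else sentence
    if current:
        chunks.append(current)
    return chunks

chunks = chunk_text("First sentence. " * 60)
```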
RAG Chat:
User Query → Embed Query → Search FAISS (top-5) → Retrieve Evidence →
Inject Context → Mistral Chat → Stream Response + Citations
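The "Inject Context" step of the RAG chat flow amounts to formatting the retrieved evidence into a numbered system prompt so the model can cite sources. The field names (`text`, `doc_name`, `score`) and prompt wording are illustrative assumptions:

```python
# Sketch of context injection: retrieved evidence becomes numbered
# sources in the system prompt, enabling [n]-style citations.

def build_rag_prompt(query: str, evidence: list[dict]) -> list[dict]:
    """Assemble a chat request that grounds the answer in retrieved evidence."""
    context = "\n".join(
        f"[{i}] ({e['doc_name']}, score {e['score']:.2f}) {e['text']}"
        for i, e in enumerate(evidence, start=1)
    )
    system = (
        "Answer using only the sources below and cite them as [n].\n\n"
        f"Sources:\n{context}"
    )
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": query},
    ]

msgs = build_rag_prompt(
    "What is FAISS?",
    [{"text": "FAISS is a similarity-search library.",
      "doc_name": "notes.txt", "score": 0.91}],
)
```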
- Documents: Uploaded files with metadata and processing status
- Chunks: Text segments from documents with position tracking
- Evidence: Generated summaries and QA pairs for each chunk
- EmbeddingMetadata: FAISS index mapping and embedding details
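The four models above form a simple chain: document → chunks → evidence → FAISS rows. The DDL below mirrors that shape using stdlib SQLite; the real app defines these tables through SQLAlchemy, and the exact column names here are assumptions:

```python
# Illustrative schema for the four models; column names are assumptions,
# not Lumina's actual SQLAlchemy definitions.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE documents (
    id INTEGER PRIMARY KEY,
    filename TEXT NOT NULL,
    status TEXT NOT NULL DEFAULT 'processing'  -- e.g. 'processing', 'completed'
);
CREATE TABLE chunks (
    id INTEGER PRIMARY KEY,
    document_id INTEGER REFERENCES documents(id),
    position INTEGER NOT NULL,        -- chunk order within the document
    text TEXT NOT NULL
);
CREATE TABLE evidence (
    id INTEGER PRIMARY KEY,
    chunk_id INTEGER REFERENCES chunks(id),
    kind TEXT NOT NULL,               -- 'summary' or 'qa_pair'
    content TEXT NOT NULL
);
CREATE TABLE embedding_metadata (
    id INTEGER PRIMARY KEY,
    evidence_id INTEGER REFERENCES evidence(id),
    faiss_index INTEGER NOT NULL      -- row offset in the FAISS index
);
""")
tables = [r[0] for r in conn.execute(
    "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name")]
```

Deleting a document cascades conceptually down this chain, which is why document deletion triggers cleanup of chunks, evidence, and FAISS entries.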
Create backend/.env:
# Required
MISTRAL_API_KEY=your_mistral_api_key_here
# Optional - Models
MISTRAL_CHAT_MODEL=mistral-small-latest
MISTRAL_EMBEDDING_MODEL=mistral-embed
# Optional - Processing
CHUNK_SIZE=512
CHUNK_OVERLAP=128
EVIDENCE_QA_PAIRS=3
# Optional - Storage
DATABASE_URL=sqlite:///./lumina.db
VECTOR_STORE_PATH=./vector_store
VECTOR_DIMENSION=1024

Create frontend/.env.local:
NEXT_PUBLIC_API_URL=http://localhost:8000

# Build and run with docker-compose
export MISTRAL_API_KEY=your_key_here
docker-compose up --build
# Access at http://localhost:3000

Backend:
- Migrate to PostgreSQL + pgvector for production scale
- Use Redis for caching
- Deploy on Cloud Run, Fly.io, or Railway
- Enable CORS for your frontend domain
Frontend:
- Deploy to Vercel or Netlify
- Set NEXT_PUBLIC_API_URL to your backend URL
- Enable production optimizations
Once the backend is running, visit:
- Interactive Docs: http://localhost:8000/docs
- ReDoc: http://localhost:8000/redoc
- POST /api/chat/stream - Streaming chat with optional RAG
- POST /api/knowledge/upload - Upload document
- GET /api/knowledge/documents - List documents
- DELETE /api/knowledge/documents/{id} - Delete document
lumina/
├── backend/
│ ├── app/
│ │ ├── api/ # API routes
│ │ ├── clients/ # External API clients (Mistral)
│ │ ├── core/ # Config, logging
│ │ ├── db/ # Database models, vector store
│ │ ├── schemas/ # Pydantic models
│ │ ├── services/ # Business logic
│ │ └── utils/ # Utilities (chunking, parsing)
│ ├── pyproject.toml # Python dependencies
│ └── .env.example
│
├── frontend/
│ ├── src/
│ │ ├── app/ # Next.js pages
│ │ ├── components/ # React components
│ │ └── lib/ # API client, utilities
│ ├── package.json # Node dependencies
│ └── tailwind.config.ts
│
├── docker-compose.yml # Docker orchestration
├── LICENSE # Apache 2.0
└── README.md
- Backend: Follow PEP 8
- Frontend: ESLint + Prettier
- Commits: Conventional Commits
Backend won't start:
- Verify Python 3.11+ is installed
- Check .env has a valid MISTRAL_API_KEY
- Ensure port 8000 is not in use
Frontend won't connect:
- Verify backend is running on port 8000
- Check NEXT_PUBLIC_API_URL in .env.local
- Clear browser cache
Document processing stuck:
- Check backend logs for errors
- Verify Mistral API key has sufficient quota
- Try with smaller documents first
No citations appearing:
- Ensure documents are fully processed (status: "completed")
- Verify "Use Knowledge Base" toggle is ON
- Check that your query relates to uploaded documents
- Multi-file upload with batch processing
- Support for additional file formats (DOCX, HTML, Markdown)
- Advanced filtering (search specific documents)
- Chat history persistence
- User authentication
- PostgreSQL + pgvector migration
- Conversation memory and context management
Contributions are welcome! This is a portfolio project, but feel free to:
- Fork the repository
- Create a feature branch (git checkout -b feature/amazing-feature)
- Commit your changes (git commit -m 'Add amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request
This project is licensed under the Apache License 2.0 - see the LICENSE file for details.
Copyright 2026 Lumina Contributors
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
- Mistral AI for providing powerful open-weight models and APIs
- CLaRa paper from Apple research for the evidence generation approach
- FAISS by Facebook AI Research for efficient vector search
- FastAPI and Next.js communities for excellent frameworks
Built by Shashikanth Bokka.
- LinkedIn: linkedin.com/in/shashikanthbokka
- GitHub: @shknth
⭐ Star this repo if you find it useful! ⭐
Made with ❤️ using Mistral AI