python-ai-ragbot is a modular, framework-agnostic Python package for building intelligent chatbots and voicebots with Retrieval-Augmented Generation (RAG), powered by OpenAI and LangChain.

It provides a simple interface for attaching ready-made request handlers to popular frameworks (FastAPI, Flask), so you can quickly add both text chat (`/chat`) and voice (`/voice`) endpoints to your app. Support for additional frameworks is actively in the works.
## Features

- Supported knowledge sources:
  - Local files (`.pdf`, `.docx`, `.txt`, `.md`)
  - Website scraping (URLs, sitemaps)
- `/chat` endpoint (text query → answer)
- `/voice` endpoint (speech-to-text via Whisper, TTS for responses)
- Fully configurable models, embeddings, voices, chunking, logging
- In-memory FAISS vector store via LangChain
- Adapters for:
- FastAPI
- Starlette
- Flask
- Django
- Raw WSGI
- Raw ASGI
- Sync (`init_rag_voice_bot`) and async (`init_rag_voice_bot_async`) APIs
## Requirements

- Python 3.9+
- An OpenAI API key (`OPENAI_API_KEY` in `.env`)
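Because a missing key otherwise only surfaces as a failed OpenAI call at query time, it can help to fail fast at startup. A minimal sketch (`require_api_key` is a hypothetical helper for this example, not part of the package):

```python
# Hypothetical fail-fast helper: verify OPENAI_API_KEY is set before
# initialising the bot, so a missing .env produces a clear error.
import os

def require_api_key(env=os.environ) -> str:
    key = (env.get("OPENAI_API_KEY") or "").strip()
    if not key:
        raise RuntimeError("OPENAI_API_KEY is not set; add it to your .env file")
    return key
```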
## Installation

```bash
pip install python-ai-ragbot
```

For local development:

```bash
git clone https://github.com/your-org/python-ai-ragbot.git
cd python-ai-ragbot
pip install -e .
```

## Quick Start (FastAPI)

```python
# examples/server.py
import os

from dotenv import load_dotenv
from fastapi import FastAPI
from fastapi.middleware.cors import CORSMiddleware

from python_ai_ragbot import init_rag_voice_bot_async
from python_ai_ragbot.http.adapters import use_in_fastapi

load_dotenv()

app = FastAPI()
app.add_middleware(
    CORSMiddleware,
    allow_origins=["*"],
    allow_credentials=True,
    allow_methods=["*"],
    allow_headers=["*"],
)

@app.on_event("startup")
async def startup_event():
    bot = await init_rag_voice_bot_async({
        "sources": {"files": ["examples/knowledge.txt"]},
        "openai": {
            "apiKey": os.getenv("OPENAI_API_KEY"),
            "chat": {"model": "gpt-4o"},
            "stt": {"model": "whisper-1"},
            "tts": {"model": "tts-1-hd", "voice": "nova"},
        },
    })
    use_in_fastapi(app, bot["chat_handler"], bot["voice_handler"], prefix="/api/bot")
```

Run:

```bash
uvicorn examples.server:app --reload --port 3001
```

Endpoints:

- `POST /api/bot/chat`
- `POST /api/bot/voice`
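The chat endpoint can be exercised from any HTTP client. A standard-library sketch, assuming the example server above is running on port 3001 (`build_chat_request` is a hypothetical helper for this example):

```python
# Sketch of a client for the chat endpoint using only the standard library.
import json
import urllib.request

def build_chat_request(base_url: str, question: str) -> urllib.request.Request:
    # The /chat endpoint expects a JSON body with a single "question" field.
    payload = json.dumps({"question": question}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/api/bot/chat",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_chat_request("http://localhost:3001", "What is in the knowledge base?")
# With the server running: urllib.request.urlopen(req).read() returns the answer.
```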
## Quick Start (Flask)

```python
import os

from dotenv import load_dotenv
from flask import Flask

from python_ai_ragbot import init_rag_voice_bot
from python_ai_ragbot.http.adapters import use_in_flask

load_dotenv()

app = Flask(__name__)

bot = init_rag_voice_bot({
    "sources": {"files": ["examples/knowledge.txt"]},
    "openai": {"apiKey": os.getenv("OPENAI_API_KEY")},
})
use_in_flask(app, bot["chat_handler"], bot["voice_handler"], prefix="/api/bot")

if __name__ == "__main__":
    app.run(port=3001)
```

## Configuration

```json
{
  "sources": {
    "files": ["knowledge.txt", "knowledge.pdf"],
    "urls": ["https://docs.example.com"]
  },
  "rag": {
    "textSplit": {"chunkSize": 1000, "chunkOverlap": 200},
    "topK": 3
  },
  "openai": {
    "apiKey": "...",
    "embeddings": {"model": "text-embedding-3-small"},
    "chat": {"model": "gpt-4o", "temperature": 0.3},
    "stt": {"model": "whisper-1"},
    "tts": {"model": "tts-1-hd", "voice": "nova"}
  },
  "logger": "console"
}
```

## Endpoints

- `/chat`: POST JSON, e.g. `{"question": "What is in the knowledge base?"}`
- `/voice`: POST raw audio (`audio/webm`, `audio/wav`, etc.)
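The `chunkSize`/`chunkOverlap` settings in the configuration control how source documents are split before embedding. The package delegates this to LangChain's text splitters; the sliding-window sketch below only illustrates what the two parameters mean:

```python
# Minimal sliding-window illustration of chunkSize/chunkOverlap.
# Each chunk is at most chunk_size characters, and consecutive chunks
# share chunk_overlap characters so context isn't cut mid-thought.
def split_text(text: str, chunk_size: int, chunk_overlap: int) -> list:
    if chunk_overlap >= chunk_size:
        raise ValueError("chunkOverlap must be smaller than chunkSize")
    step = chunk_size - chunk_overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

chunks = split_text("a" * 2500, chunk_size=1000, chunk_overlap=200)
# 2500 chars with step 800 → 4 chunks; adjacent chunks overlap by 200 chars.
```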
## Project Structure

```
my-app/
├── examples/
│   ├── server.py
│   └── knowledge.txt
├── src/
│   └── python_ai_ragbot/
├── .env
└── pyproject.toml
```
## Notes

- Use `init_rag_voice_bot_async` in ASGI frameworks (FastAPI, Starlette).
- Use `init_rag_voice_bot` in WSGI frameworks (Flask, Django, raw WSGI).
- The vector store is in-memory only; data is reloaded on each startup.
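Conceptually, the in-memory store embeds each chunk and answers a query by returning the `topK` most similar chunks. The package uses FAISS via LangChain for this; the toy sketch below (plain Python, made-up 3-dimensional "embeddings") only illustrates the idea:

```python
# Toy top-k retrieval by cosine similarity, illustrating what the
# in-memory vector store does. Not the package's implementation.
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def top_k(query_vec, store, k=3):
    # store: list of (chunk_text, embedding) pairs
    ranked = sorted(store, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [text for text, _ in ranked[:k]]

store = [
    ("pricing page", [0.9, 0.1, 0.0]),
    ("api reference", [0.0, 1.0, 0.2]),
    ("changelog", [0.1, 0.2, 1.0]),
]
hits = top_k([1.0, 0.0, 0.0], store, k=2)
# → ["pricing page", "changelog"]
```

Because the store lives in memory, every restart re-reads and re-embeds the configured sources, which is why startup cost grows with the size of the knowledge base.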