A Flask-based backend for document ingestion, semantic search, and influencer discovery using OpenAI and Google Gemini APIs. This project enables users to upload documents, generate embeddings, perform question answering over their data, and discover influencers for marketing campaigns.
- Document Upload & Embedding: Upload .txt, .pdf, .docx, .csv, or .xlsx files. The system extracts, chunks, and embeds the content using OpenAI’s embedding models and stores the embeddings with FAISS for efficient retrieval.
- Semantic Q&A: Ask questions about your uploaded documents. The system retrieves relevant chunks and generates answers using OpenAI’s chat models.
- Influencer Discovery: Generate a list of influencers matching campaign needs using Google Gemini’s generative AI.
- Async Flask API: All endpoints are asynchronous for high performance.
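
The chunk-and-retrieve flow described in the feature list can be sketched in plain Python. This is only an illustration, not the project's code: the function names, the character-based chunk sizes, and the brute-force cosine search (standing in for FAISS) are all assumptions, and the vectors would in practice come from the embedding model.

```python
import math
from typing import List, Sequence, Tuple


def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> List[str]:
    """Split text into overlapping character windows (sizes are assumed values)."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reaches the end of the text
    return chunks


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec: Sequence[float],
          chunk_vecs: Sequence[Sequence[float]],
          k: int = 3) -> List[Tuple[int, float]]:
    """Return (chunk index, score) pairs for the k most similar chunks.

    Brute-force stand-in for a FAISS index lookup (an assumption made to
    keep the sketch dependency-free).
    """
    scored = [(i, cosine(query_vec, v)) for i, v in enumerate(chunk_vecs)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]
```

The chunks whose indices `top_k` returns would then be passed to the chat model as context for the answer.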
- Python 3.8+
- See requirements.txt for all dependencies.
- Clone the repository:
    git clone <repo-url>
    cd RAG
- Install dependencies:
    pip install -r requirements.txt
- Set environment variables:
  - AZURE_OPENAI_API_KEY, AZURE_OPENAI_API_VERSION, AZURE_OPENAI_ENDPOINT, AZURE_OPENAI_CHAT_DEPLOYMENT_NAME, AZURE_DEPLOYMENT_EMBEDDING for OpenAI.
  - GEMINI_API_KEY for Google Gemini.
  - Optionally, SECRET_KEY and FLASK_ENV.
- Run the server:
    python run.py
  The API will be available at http://localhost:5000.
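
Before starting the server, it can help to check that the environment variables from the setup steps are present. The following is a minimal sketch of such a check. The helper names `missing_vars` and `load_settings` are hypothetical, and treating every non-optional variable as strictly required is an assumption rather than something the project states.

```python
import os
from typing import Dict, List, Mapping

# Variable names taken from the setup steps; treating them all as
# required is an assumption of this sketch.
REQUIRED_VARS = [
    "AZURE_OPENAI_API_KEY",
    "AZURE_OPENAI_API_VERSION",
    "AZURE_OPENAI_ENDPOINT",
    "AZURE_OPENAI_CHAT_DEPLOYMENT_NAME",
    "AZURE_DEPLOYMENT_EMBEDDING",
    "GEMINI_API_KEY",
]
OPTIONAL_VARS = ["SECRET_KEY", "FLASK_ENV"]


def missing_vars(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the required variable names that are unset or empty."""
    return [name for name in REQUIRED_VARS if not env.get(name)]


def load_settings(env: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Collect settings from env, failing early if a required one is missing."""
    missing = missing_vars(env)
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    settings = {name: env[name] for name in REQUIRED_VARS}
    # Optional variables are included only when they are set.
    settings.update({name: env[name] for name in OPTIONAL_VARS if env.get(name)})
    return settings
```

Calling `load_settings()` at startup would surface a missing key immediately instead of on the first API request.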
POST /upload
- Upload a document for embedding and storage.
- Body (form-data): file (the document file)
- Response:
    { "success": true, "message": "...", "uuid": "<doc_id>" }
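
A client-side call to this endpoint might look like the sketch below, which builds a multipart/form-data request using only the standard library. The helper name `build_upload_request` and the `application/octet-stream` part type are assumptions. The form field name `file` and the base URL come from this README.

```python
import urllib.request
import uuid

BASE_URL = "http://localhost:5000"  # address from the setup steps


def build_upload_request(filename: str, content: bytes,
                         base_url: str = BASE_URL) -> urllib.request.Request:
    """Build (but do not send) a multipart POST /upload request."""
    boundary = uuid.uuid4().hex  # random separator between form parts
    body = b"".join([
        f"--{boundary}\r\n".encode(),
        # "file" is the form field name listed for this endpoint.
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'.encode(),
        b"Content-Type: application/octet-stream\r\n\r\n",
        content,
        f"\r\n--{boundary}--\r\n".encode(),
    ])
    return urllib.request.Request(
        base_url + "/upload",
        data=body,
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
        method="POST",
    )


# Sending requires a running server, so it is left as a comment:
# import json
# req = build_upload_request("notes.txt", b"hello world")
# with urllib.request.urlopen(req) as resp:
#     doc_id = json.load(resp)["uuid"]
```

The returned `uuid` is the document ID that the /qna endpoint accepts.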
POST /qna
- Ask a question about an uploaded document.
- Body (JSON):
    {
      "uuid": "<doc_id>",  // optional, if omitted uses default
      "question": "Your question here"
    }
- Response: Answer, context usage, logs, and search parameters.
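
As a rough illustration of the request body above, the sketch below builds the JSON payload and leaves out `uuid` when no document ID is given. The helper name `build_qna_request` is hypothetical, and the response is not parsed here because its exact field names are not listed in this README.

```python
import json
import urllib.request
from typing import Optional


def build_qna_request(question: str, doc_id: Optional[str] = None,
                      base_url: str = "http://localhost:5000") -> urllib.request.Request:
    """Build (but do not send) a JSON POST /qna request."""
    payload = {"question": question}
    if doc_id is not None:
        payload["uuid"] = doc_id  # optional per the endpoint description
    return urllib.request.Request(
        base_url + "/qna",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Sending requires a running server, so it is left as a comment:
# with urllib.request.urlopen(build_qna_request("What is the refund policy?")) as resp:
#     print(json.load(resp))
```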
POST /discover-influencers
- Generate a list of influencers based on campaign/search parameters.
- Body (JSON):
    { "search_parameters": { /* campaign criteria */ } }
- Response: List of influencers, count, and logs.
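
A possible client for this endpoint is sketched below. It wraps whatever criteria the caller supplies under `search_parameters` and includes a small helper to send the request and decode the JSON reply. The function names are hypothetical, and because the README does not list the accepted criteria keys, the keys in the usage comment are placeholders only.

```python
import json
import urllib.request
from typing import Any, Dict


def build_discover_request(criteria: Dict[str, Any],
                           base_url: str = "http://localhost:5000") -> urllib.request.Request:
    """Build (but do not send) a JSON POST /discover-influencers request."""
    payload = {"search_parameters": criteria}
    return urllib.request.Request(
        base_url + "/discover-influencers",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_json(req: urllib.request.Request, timeout: float = 30.0) -> Any:
    """Send a prepared request and decode the JSON response body."""
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Usage against a running server (criteria keys are hypothetical placeholders):
# result = send_json(build_discover_request({"niche": "fitness", "region": "EU"}))
```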
- app/ - Main application code (routes, services)
- embeddings/ - Stores FAISS indices and chunk data
- storage/ - Stores uploaded files
- config.py - Configuration
- run.py - Entrypoint
MIT (or specify your license)