A comprehensive AI agent management system that combines Retrieval-Augmented Generation (RAG) capabilities with real-time voice calling using LiveKit and OpenAI's Realtime API.
- AI Agent Creation & Management: Create custom AI agents with unique personas and behaviors
- RAG Integration: Upload documents (PDF, TXT, DOCX, URLs) to create knowledge bases for agents
- Outbound Voice Calling: Make real-time voice calls using LiveKit SIP integration
- Real-time Conversations: Powered by OpenAI's Realtime API for natural voice interactions
- Vector Database: Pinecone integration for efficient knowledge retrieval
- User Authentication: JWT-based authentication system
- Analytics Dashboard: Call metrics and agent performance tracking
- Document Management: Upload, process, and manage agent knowledge bases
- Multiple Voice Options: Customizable voice actors and tones for calls
The system consists of two main components:
- FastAPI Backend (main.py): RESTful API for agent management, RAG processing, and call orchestration
- LiveKit Agent Worker (agent_worker.py): Real-time voice conversation handler with knowledge base integration
- Python 3.8+
- MongoDB database
- Pinecone account and API key
- OpenAI API key
- LiveKit Cloud account
- SIP trunk for outbound calling
Clone the repository
git clone https://github.com/shubhamprasad318/ai_wao_agent
cd ai_wao_agent
Install dependencies
pip install -r requirements.txt
Environment Setup
Create a .env file in the root directory:
# OpenAI Configuration
OPENAI_API_KEY=your_openai_api_key_here
# MongoDB Configuration
MONGODB_URI=mongodb://localhost:27017
MONGO_DB_NAME=ai_agent_demo
# Pinecone Configuration
PINECONE_API_KEY=your_pinecone_api_key_here
PINECONE_INDEX_NAME=luminous-pine
# LiveKit Configuration
LIVEKIT_URL=wss://your-project.livekit.cloud
LIVEKIT_API_KEY=your_livekit_api_key
LIVEKIT_API_SECRET=your_livekit_api_secret
SIP_OUTBOUND_TRUNK_ID=your_sip_trunk_id
# Security
SECRET_KEY=your-secret-key-change-this-in-production
# Optional
PORT=8000
ENVIRONMENT=development
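A missing variable is easier to diagnose at startup than in the middle of a call. The sketch below is a minimal check, assuming the variables have already been loaded into the process environment (for example by a dotenv loader). Which variables are strictly required is an assumption, and `missing_settings` and `get_port` are hypothetical helper names, not part of the project.

```python
import os

# Names taken from the .env example; treating all of them as required
# is an assumption made for this sketch.
REQUIRED_VARS = [
    "OPENAI_API_KEY",
    "MONGODB_URI",
    "PINECONE_API_KEY",
    "PINECONE_INDEX_NAME",
    "LIVEKIT_URL",
    "LIVEKIT_API_KEY",
    "LIVEKIT_API_SECRET",
    "SIP_OUTBOUND_TRUNK_ID",
    "SECRET_KEY",
]


def missing_settings(env=None):
    """Return the required variable names that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


def get_port(env=None, default=8000):
    """Read the optional PORT variable, falling back to the default."""
    env = os.environ if env is None else env
    return int(env.get("PORT", default))


if __name__ == "__main__":
    missing = missing_settings()
    if missing:
        raise SystemExit("Missing settings: " + ", ".join(missing))
    print(f"Configuration looks complete; port {get_port()}")
```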
Database Setup
Ensure MongoDB is running and accessible. The application will create the necessary collections automatically.
Pinecone Index Setup
Create a Pinecone index with the following specifications:
- Dimension: 1536 (for OpenAI text-embedding-3-small)
- Metric: Cosine similarity
- Index name: Should match PINECONE_INDEX_NAME in your .env
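The index can also be created from Python rather than the Pinecone console. This is a sketch assuming the current Pinecone Python SDK with a serverless index. The cloud and region values are placeholders, and `index_settings` and `create_index` are hypothetical helpers, not functions from this project.

```python
import os


def index_settings(env=None):
    """Collect the index parameters listed above."""
    env = os.environ if env is None else env
    return {
        "name": env.get("PINECONE_INDEX_NAME", "luminous-pine"),
        "dimension": 1536,  # output size of text-embedding-3-small
        "metric": "cosine",
    }


def create_index(api_key, cloud="aws", region="us-east-1"):
    """Create the index if it does not exist yet.

    cloud and region are assumptions; use the values for your account.
    """
    # Imported here so index_settings() works without the SDK installed.
    from pinecone import Pinecone, ServerlessSpec

    pc = Pinecone(api_key=api_key)
    settings = index_settings()
    if settings["name"] not in pc.list_indexes().names():
        pc.create_index(
            spec=ServerlessSpec(cloud=cloud, region=region),
            **settings,
        )
```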
Start the FastAPI backend:
python main.py
The API will be available at http://localhost:8000
Start the LiveKit agent worker in a separate terminal:
python agent_worker.py
Once the server is running, visit:
- Interactive API Docs: http://localhost:8000/docs
- ReDoc Documentation: http://localhost:8000/redoc
- POST /api/register - Register a new user
- POST /api/login - User login
- GET /api/me - Get current user profile

- POST /api/agent - Create a new AI agent
- GET /api/agents - List all agents
- GET /api/agent/{agent_id} - Get specific agent
- PATCH /api/agent/{agent_id} - Update agent configuration
- DELETE /api/agent/{agent_id} - Delete an agent

- POST /api/call - Initiate an outbound call
- GET /api/calls - List all calls
- GET /api/call/{call_id} - Get call details

- GET /api/agent/{agent_id}/query - Query agent's knowledge base
- GET /api/agent/{agent_id}/documents - Get agent documents
- DELETE /api/agent/{agent_id}/documents/{doc_id} - Delete a document
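Because authentication is JWT-based, protected endpoints presumably need the token returned at login. The sketch below builds requests with the standard library only. Sending the token as a Bearer header is an assumption, not something the endpoint list confirms, and `build_request` and `send` are hypothetical helpers.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"


def build_request(method, path, token=None, body=None):
    """Build (but do not send) a request for one of the endpoints above."""
    headers = {"Accept": "application/json"}
    data = None
    if body is not None:
        data = json.dumps(body).encode("utf-8")
        headers["Content-Type"] = "application/json"
    if token:
        # Assumption: the API expects the JWT as a Bearer token.
        headers["Authorization"] = f"Bearer {token}"
    return urllib.request.Request(
        BASE_URL + path, data=data, headers=headers, method=method
    )


def send(request):
    """Send a prepared request and decode the JSON response body."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server running, `send(build_request("GET", "/api/agents", token=my_token))` would list agents, where `my_token` is whatever token the login endpoint returns.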
import requests
agent_data = {
"name": "Customer Support Agent",
"language": "English",
"model": "gpt-4",
"persona": "A helpful and professional customer support representative",
"description": "Handles customer inquiries and support requests",
"temperature": 0.7,
"max_tokens": 1000,
"default_voice": "alloy",
"default_tone": "professional"
}
response = requests.post("http://localhost:8000/api/agent", json=agent_data)
agent = response.json()
agent_data = {
"name": "Product Expert",
"persona": "An expert in our products with deep technical knowledge",
"rag_docs": [
"https://example.com/product-manual.pdf",
"https://example.com/faq.txt",
"/path/to/local/documentation.docx"
],
"instructions": "Always provide accurate product information and help customers understand our offerings"
}
response = requests.post("http://localhost:8000/api/agent", json=agent_data)
call_data = {
"agent_id": "agent_abc123",
"phone_number": "+1234567890",
"voice_actor": "alloy",
"tone": "friendly",
"prompt_vars": {
"customer_name": "John Doe",
"appointment_time": "3 PM today"
},
"metadata": {
"campaign": "appointment_reminders"
}
}
response = requests.post("http://localhost:8000/api/call", json=call_data)
call = response.json()
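The phone numbers in these examples follow the E.164 layout (a leading + followed by digits), which SIP trunks commonly expect. The sketch below rejects malformed numbers before a call is requested. Whether the API itself enforces E.164 is an assumption, and `is_e164` and `build_call_payload` are hypothetical helpers.

```python
import re

# E.164: "+" followed by at most 15 digits, the first of which is non-zero.
E164_PATTERN = re.compile(r"\+[1-9]\d{1,14}")


def is_e164(phone_number):
    """Return True when the whole string is a plausible E.164 number."""
    return E164_PATTERN.fullmatch(phone_number) is not None


def build_call_payload(agent_id, phone_number, **optional):
    """Assemble a /api/call payload, rejecting malformed numbers early."""
    if not is_e164(phone_number):
        raise ValueError(f"not an E.164 number: {phone_number!r}")
    payload = {"agent_id": agent_id, "phone_number": phone_number}
    payload.update(optional)  # e.g. voice_actor, tone, prompt_vars, metadata
    return payload
```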
The system automatically processes documents and creates vector embeddings for intelligent knowledge retrieval:
- Document Processing: Supports PDF, TXT, DOCX files and web URLs
- Text Chunking: Intelligently splits documents into manageable chunks
- Vector Embeddings: Uses OpenAI's text-embedding-3-small model
- Storage: Vectors stored in Pinecone with metadata
- Retrieval: Semantic search during conversations for relevant context
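The chunking and retrieval steps above can be illustrated without any external service. This is a simplified sketch, not the project's implementation: it uses fixed-size character windows (the actual splitter is not described here) and an in-memory cosine-similarity ranking in place of a Pinecone query. The chunk size and overlap values are assumptions.

```python
import math


def chunk_text(text, chunk_size=500, overlap=50):
    """Split text into character windows that overlap by `overlap`."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    step = chunk_size - overlap
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the final window already reaches the end of the text
        start += step
    return chunks


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vector, stored, k=3):
    """Rank (chunk, vector) pairs by similarity to the query vector."""
    ranked = sorted(
        stored,
        key=lambda item: cosine_similarity(query_vector, item[1]),
        reverse=True,
    )
    return [chunk for chunk, _ in ranked[:k]]
```

In the real pipeline the vectors would come from the embedding model and the ranking from the vector database; the toy version only shows how the pieces relate.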
- URLs: Web pages, PDFs hosted online
- PDF Files: Local or remote PDF documents
- Text Files: Plain text documents
- Word Documents: DOCX format files
- Direct Text: Raw text content
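Since a knowledge source may be a URL, a file path, or raw text, some routing step has to decide how to load each entry. The following is a heuristic sketch of that decision based on the URL scheme and file extension. `classify_source` is a hypothetical helper, and the project may distinguish sources differently.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

FILE_SUFFIXES = (".pdf", ".txt", ".docx")


def classify_source(source):
    """Return 'pdf', 'txt', 'docx', 'url', or 'text' for a source string."""
    parsed = urlparse(source)
    is_remote = parsed.scheme in ("http", "https")
    # For remote sources, look at the URL path only (ignore query strings).
    path = parsed.path if is_remote else source
    suffix = PurePosixPath(path).suffix.lower()
    if suffix in FILE_SUFFIXES:
        return suffix[1:]
    # Anything else is either a web page or raw text content.
    return "url" if is_remote else "text"
```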
Each agent comes with intelligent function calling capabilities:
- search_knowledge_base(query): Search the agent's knowledge base
- answer_question(question): Answer questions using knowledge base
- help_with_request(request): General assistance with user requests
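Function calling generally means the model names a function and the worker runs it. The sketch below shows one possible dispatch table for the three functions listed above. The function bodies are placeholders: a keyword match over a small in-memory list stands in for the Pinecone-backed search, and `dispatch` is a hypothetical helper.

```python
# Hypothetical stand-in for the agent's vector-backed knowledge base.
KNOWLEDGE_BASE = [
    "Returns are accepted within 30 days of purchase.",
    "Support is available Monday to Friday.",
]


def search_knowledge_base(query):
    """Return entries that share at least one word with the query."""
    words = set(query.lower().split())
    return [doc for doc in KNOWLEDGE_BASE if words & set(doc.lower().split())]


def answer_question(question):
    """Answer with the first matching entry, or a fallback message."""
    hits = search_knowledge_base(question)
    return hits[0] if hits else "I could not find that in the knowledge base."


def help_with_request(request):
    """General assistance; here it simply reuses the question path."""
    return answer_question(request)


# Map the function names the model may call to the Python callables.
TOOLS = {
    func.__name__: func
    for func in (search_knowledge_base, answer_question, help_with_request)
}


def dispatch(name, argument):
    """Run the named tool with its single string argument."""
    if name not in TOOLS:
        raise KeyError(f"unknown tool: {name}")
    return TOOLS[name](argument)
```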
- Voice Options: alloy, echo, fable, onyx, nova, shimmer
- Tone Settings: professional, friendly, casual, enthusiastic
- Language Support: Multiple languages supported
- Real-time Processing: Natural conversation flow
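A typo in a voice or tone name is cheap to catch before a call is placed. This sketch validates the two settings against the options listed above. It assumes a per-call value overrides the agent's default_voice and default_tone, which the text does not state outright, and `resolve_voice_settings` is a hypothetical helper.

```python
VOICES = ("alloy", "echo", "fable", "onyx", "nova", "shimmer")
TONES = ("professional", "friendly", "casual", "enthusiastic")


def resolve_voice_settings(voice=None, tone=None,
                           default_voice="alloy",
                           default_tone="professional"):
    """Pick per-call values if given, else the agent defaults, and validate."""
    voice = voice or default_voice
    tone = tone or default_tone
    if voice not in VOICES:
        raise ValueError(f"unknown voice {voice!r}; choose from {VOICES}")
    if tone not in TONES:
        raise ValueError(f"unknown tone {tone!r}; choose from {TONES}")
    return {"voice_actor": voice, "tone": tone}
```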
- Total agents created
- Call statistics and success rates
- RAG processing status
- Recent activity tracking
- Call duration and outcomes
- Voice quality metrics
- Agent performance tracking
- Custom metadata analysis
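Metrics such as success rate and call duration reduce to simple aggregates over call records. The sketch below shows one way to compute them. The field names `status` and `duration_seconds` and the value "completed" are assumptions about the stored call documents, not the project's confirmed schema.

```python
def summarize_calls(calls):
    """Aggregate a list of call records into dashboard-style metrics."""
    total = len(calls)
    completed = [call for call in calls if call.get("status") == "completed"]
    durations = [call.get("duration_seconds", 0) for call in completed]
    return {
        "total_calls": total,
        "success_rate": len(completed) / total if total else 0.0,
        "average_duration_seconds": (
            sum(durations) / len(durations) if durations else 0.0
        ),
    }
```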
{
"name": "Agent Name",
"language": "English",
"model": "gpt-4",
"persona": "Agent personality description",
"instructions": "Additional behavioral instructions",
"temperature": 0.7, # 0.0 to 2.0
"max_tokens": 1000,
"default_voice": "alloy",
"default_tone": "professional",
"custom_fields": {
"transfer_to": "+1234567890",
"department": "support"
},
"metadata": {
"version": "1.0",
"created_by": "admin"
}
}
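The temperature comment above implies a range check on agent settings. The backend is described as using Pydantic for this; the sketch below swaps in a standard-library dataclass to show the same idea without extra dependencies. The defaults and the exact rules are assumptions, and `AgentConfig` is a hypothetical name.

```python
from dataclasses import dataclass, field


@dataclass
class AgentConfig:
    """Subset of the agent options, validated on construction."""
    name: str
    persona: str = ""
    language: str = "English"
    model: str = "gpt-4"
    temperature: float = 0.7
    max_tokens: int = 1000
    custom_fields: dict = field(default_factory=dict)

    def __post_init__(self):
        if not self.name.strip():
            raise ValueError("name must not be empty")
        if not 0.0 <= self.temperature <= 2.0:
            raise ValueError("temperature must be between 0.0 and 2.0")
        if self.max_tokens <= 0:
            raise ValueError("max_tokens must be positive")
```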
{
"agent_id": "agent_123",
"phone_number": "+1234567890",
"voice_actor": "nova",
"tone": "friendly",
"prompt_vars": {
"variable_name": "value"
},
"metadata": {
"campaign": "outbound_sales",
"priority": "high"
}
}
The system includes comprehensive error handling:
- Database Connection: Graceful fallbacks when MongoDB/Pinecone unavailable
- API Rate Limits: Automatic retry logic for OpenAI API calls
- Call Failures: SIP error tracking and reporting
- Document Processing: Robust error handling for various file formats
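Automatic retry for rate-limited API calls is usually built on exponential backoff. The sketch below shows that pattern in general form. It is not the project's retry code, the delay values are assumptions, and `retry_with_backoff` is a hypothetical helper.

```python
import random
import time


def retry_with_backoff(func, max_attempts=5, base_delay=1.0, max_delay=30.0,
                       retry_on=(Exception,), sleep=time.sleep):
    """Call func(), retrying on failure with exponentially growing delays."""
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except retry_on:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            # A little random jitter keeps many clients from retrying in sync.
            sleep(delay + random.uniform(0, delay / 10))
```

In practice `retry_on` would be narrowed to the client library's rate-limit exception so that genuine bugs are not retried.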
- JWT Authentication: Secure token-based authentication
- Password Hashing: Passwords hashed with bcrypt
- CORS Configuration: Configurable cross-origin request handling
- Input Validation: Pydantic models for request validation
- Rate Limiting: Built-in protection against abuse
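To make the JWT item concrete, the sketch below signs and verifies an HS256 token with the standard library only. It illustrates the token format; the backend most likely relies on a JWT library instead, and the claim names, lifetime, and helper names here are assumptions.

```python
import base64
import hashlib
import hmac
import json
import time


def _b64url(data):
    """Base64url-encode bytes without padding, as JWTs require."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text):
    """Decode base64url text, restoring the stripped padding."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(signing_input, secret):
    return hmac.new(secret.encode(), signing_input.encode(), hashlib.sha256).digest()


def create_token(subject, secret, ttl_seconds=3600, now=None):
    """Return a signed HS256 token carrying `sub` and `exp` claims."""
    now = int(time.time()) if now is None else now
    header = {"alg": "HS256", "typ": "JWT"}
    payload = {"sub": subject, "exp": now + ttl_seconds}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode())
        for part in (header, payload)
    )
    return f"{signing_input}.{_b64url(_sign(signing_input, secret))}"


def verify_token(token, secret, now=None):
    """Check signature and expiry; return the payload or raise ValueError."""
    now = int(time.time()) if now is None else now
    try:
        header_b64, payload_b64, signature_b64 = token.split(".")
    except ValueError:
        raise ValueError("malformed token") from None
    expected = _sign(f"{header_b64}.{payload_b64}", secret)
    # compare_digest avoids leaking information through timing differences.
    if not hmac.compare_digest(expected, _b64url_decode(signature_b64)):
        raise ValueError("bad signature")
    payload = json.loads(_b64url_decode(payload_b64))
    if payload["exp"] < now:
        raise ValueError("token expired")
    return payload
```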
- Batch Processing: Efficient document processing in batches
- Vector Search: Optimized Pinecone queries with relevance thresholds
- Caching: MongoDB indexing for fast agent retrieval
- Async Processing: Background tasks for RAG document processing
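Two of the items above, batching and relevance thresholds, can be shown in a few lines. This is a sketch under assumptions: the batch size and the 0.75 threshold are illustrative values, each match is assumed to be a dict with a `score` key (as in Pinecone query results), and both helper names are hypothetical.

```python
def batched(items, batch_size):
    """Yield consecutive slices of `items` with at most `batch_size` entries."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def filter_by_score(matches, threshold=0.75):
    """Keep only matches whose similarity score meets the threshold."""
    return [match for match in matches if match["score"] >= threshold]
```

Batching keeps each embedding or upsert request small, while the threshold stops weakly related chunks from being passed to the model as context.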
ai-agent-system/
├── main.py              # FastAPI backend server
├── agent_worker.py      # LiveKit agent worker
├── requirements.txt     # Python dependencies
├── .env                 # Environment configuration
├── README.md            # This file
└── docs/                # Additional documentation
Key dependencies include:
- FastAPI: Web framework
- LiveKit: Real-time communication
- OpenAI: LLM and embeddings
- Pinecone: Vector database
- PyMongo: MongoDB client
- PyPDF2: PDF processing
- python-docx: Word document processing
- Fork the repository
- Create a feature branch (git checkout -b feature/amazing-feature)
- Commit your changes (git commit -m 'Add amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request
- Initial release
- AI agent creation and management
- RAG integration with Pinecone
- Outbound calling with LiveKit
- User authentication system
- Analytics dashboard
Built with ❤️ using FastAPI, LiveKit, OpenAI, Pinecone, and Plivo