AI Agent System with RAG and Outbound Calling

A comprehensive AI agent management system that combines Retrieval-Augmented Generation (RAG) capabilities with real-time voice calling using LiveKit and OpenAI's Realtime API.

🚀 Features

AI Agent Creation & Management: Create custom AI agents with unique personas and behaviors
RAG Integration: Upload documents (PDF, TXT, DOCX, URLs) to create knowledge bases for agents
Outbound Voice Calling: Make real-time voice calls using LiveKit SIP integration
Real-time Conversations: Powered by OpenAI's Realtime API for natural voice interactions
Vector Database: Pinecone integration for efficient knowledge retrieval
User Authentication: JWT-based authentication system
Analytics Dashboard: Call metrics and agent performance tracking
Document Management: Upload, process, and manage agent knowledge bases
Multiple Voice Options: Customizable voice actors and tones for calls

🏗️ Architecture

The system consists of two main components:

FastAPI Backend (main.py): RESTful API for agent management, RAG processing, and call orchestration
LiveKit Agent Worker (agent_worker.py): Real-time voice conversation handler with knowledge base integration

📋 Prerequisites

Python 3.8+
MongoDB database
Pinecone account and API key
OpenAI API key
LiveKit Cloud account
SIP trunk for outbound calling

🛠️ Installation

Clone the repository

git clone https://github.com/shubhamprasad318/ai_wao_agent
cd ai_wao_agent

Install dependencies

pip install -r requirements.txt

Environment Setup

Create a .env file in the root directory:

# OpenAI Configuration
OPENAI_API_KEY=your_openai_api_key_here

# MongoDB Configuration
MONGODB_URI=mongodb://localhost:27017
MONGO_DB_NAME=ai_agent_demo

# Pinecone Configuration
PINECONE_API_KEY=your_pinecone_api_key_here
PINECONE_INDEX_NAME=luminous-pine

# LiveKit Configuration
LIVEKIT_URL=wss://your-project.livekit.cloud
LIVEKIT_API_KEY=your_livekit_api_key
LIVEKIT_API_SECRET=your_livekit_api_secret
SIP_OUTBOUND_TRUNK_ID=your_sip_trunk_id

# Security
SECRET_KEY=your-secret-key-change-this-in-production

# Optional
PORT=8000
ENVIRONMENT=development
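
The variables above could be checked once at startup, before either process begins. The following is a minimal sketch rather than the project's actual loader: `load_settings` and `REQUIRED_KEYS` are hypothetical names, and treating every non-optional key as mandatory is an assumption based on the example file.

```python
import os

# Keys taken from the .env example above. Treating all of them as
# mandatory is an assumption, not something main.py is known to enforce.
REQUIRED_KEYS = (
    "OPENAI_API_KEY",
    "MONGODB_URI",
    "MONGO_DB_NAME",
    "PINECONE_API_KEY",
    "PINECONE_INDEX_NAME",
    "LIVEKIT_URL",
    "LIVEKIT_API_KEY",
    "LIVEKIT_API_SECRET",
    "SIP_OUTBOUND_TRUNK_ID",
    "SECRET_KEY",
)


def load_settings(env=None):
    """Return a settings dict, raising if a required key is missing or empty."""
    env = os.environ if env is None else env
    missing = [key for key in REQUIRED_KEYS if not env.get(key)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    settings = {key: env[key] for key in REQUIRED_KEYS}
    # Optional values fall back to the defaults shown in the example file.
    settings["PORT"] = int(env.get("PORT", "8000"))
    settings["ENVIRONMENT"] = env.get("ENVIRONMENT", "development")
    return settings
```

Failing fast here means a missing key surfaces as one clear error instead of a connection failure later in a request or call.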

Database Setup

Ensure MongoDB is running and accessible. The application will create the necessary collections automatically.

Pinecone Index Setup

Create a Pinecone index with the following specifications:
- Dimension: 1536 (for OpenAI text-embedding-3-small)
- Metric: Cosine similarity
- Index name: Should match PINECONE_INDEX_NAME in your .env
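
The index can also be created from Python instead of the Pinecone console. This is a sketch that assumes a recent `pinecone` client with serverless support; the `cloud` and `region` defaults are example values, not settings taken from this project, and `index_settings` and `ensure_index` are hypothetical helper names.

```python
import os

EMBEDDING_DIMENSION = 1536  # vector size produced by text-embedding-3-small
INDEX_METRIC = "cosine"


def index_settings(env=None):
    """Collect the index specification listed above into one dict."""
    env = os.environ if env is None else env
    return {
        "name": env["PINECONE_INDEX_NAME"],
        "dimension": EMBEDDING_DIMENSION,
        "metric": INDEX_METRIC,
    }


def ensure_index(settings, cloud="aws", region="us-east-1"):
    """Create the index if it does not exist yet.

    cloud and region are assumed example values; use the ones for your account.
    """
    # Imported here so index_settings() works without the pinecone package.
    from pinecone import Pinecone, ServerlessSpec

    pc = Pinecone(api_key=os.environ["PINECONE_API_KEY"])
    if settings["name"] not in pc.list_indexes().names():
        pc.create_index(
            name=settings["name"],
            dimension=settings["dimension"],
            metric=settings["metric"],
            spec=ServerlessSpec(cloud=cloud, region=region),
        )
```

Calling `ensure_index(index_settings())` after loading the .env file would create the index only when it is missing.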

🚀 Running the Application

Start the FastAPI Server

python main.py

The API will be available at http://localhost:8000

Start the LiveKit Agent Worker

python agent_worker.py

📚 API Documentation

Once the server is running, visit:

Interactive API Docs: http://localhost:8000/docs
ReDoc Documentation: http://localhost:8000/redoc

Key Endpoints

Authentication

POST /api/register - Register a new user
POST /api/login - User login
GET /api/me - Get current user profile
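
A client would typically log in first and then send the returned JWT with later requests. The sketch below only builds the requests, using the standard library so it runs without extra packages. The `email` and `password` field names and the `Bearer` header scheme are assumptions; check the interactive docs at /docs for the real request schema.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"


def login_request(email, password):
    """Build (but do not send) a POST /api/login request.

    The "email" and "password" field names are assumptions.
    """
    body = json.dumps({"email": email, "password": password}).encode("utf-8")
    return urllib.request.Request(
        BASE_URL + "/api/login",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def authorized_request(path, token):
    """Build a GET request that carries the JWT as a Bearer token (assumed scheme)."""
    return urllib.request.Request(
        BASE_URL + path,
        headers={"Authorization": "Bearer " + token},
        method="GET",
    )
```

With the server running, each request can be sent with `urllib.request.urlopen(request)`. The name of the token field in the login response is not documented here, so inspect the response before relying on a particular key.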

Agent Management

POST /api/agent - Create a new AI agent
GET /api/agents - List all agents
GET /api/agent/{agent_id} - Get specific agent
PATCH /api/agent/{agent_id} - Update agent configuration
DELETE /api/agent/{agent_id} - Delete an agent

Calling

POST /api/call - Initiate an outbound call
GET /api/calls - List all calls
GET /api/call/{call_id} - Get call details

Knowledge Base

GET /api/agent/{agent_id}/query - Query agent's knowledge base
GET /api/agent/{agent_id}/documents - Get agent documents
DELETE /api/agent/{agent_id}/documents/{doc_id} - Delete a document

🤖 Creating an AI Agent

Basic Agent Creation

import requests

agent_data = {
    "name": "Customer Support Agent",
    "language": "English",
    "model": "gpt-4",
    "persona": "A helpful and professional customer support representative",
    "description": "Handles customer inquiries and support requests",
    "temperature": 0.7,
    "max_tokens": 1000,
    "default_voice": "alloy",
    "default_tone": "professional"
}

response = requests.post("http://localhost:8000/api/agent", json=agent_data)
agent = response.json()

Agent with Knowledge Base

import requests

agent_data = {
    "name": "Product Expert",
    "persona": "An expert in our products with deep technical knowledge",
    # Knowledge sources: remote URLs and local file paths
    "rag_docs": [
        "https://example.com/product-manual.pdf",
        "https://example.com/faq.txt",
        "/path/to/local/documentation.docx"
    ],
    "instructions": "Always provide accurate product information and help customers understand our offerings"
}

response = requests.post("http://localhost:8000/api/agent", json=agent_data)

📞 Making Outbound Calls

call_data = {
    "agent_id": "agent_abc123",
    "phone_number": "+1234567890",
    "voice_actor": "alloy",
    "tone": "friendly",
    "prompt_vars": {
        "customer_name": "John Doe",
        "appointment_time": "3 PM today"
    },
    "metadata": {
        "campaign": "appointment_reminders"
    }
}

response = requests.post("http://localhost:8000/api/call", json=call_data)
call = response.json()
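
SIP trunks generally expect numbers in E.164 form (a leading + followed by up to 15 digits), as in the example above. The sketch below normalizes a number before it is placed in the payload. Whether the API performs its own validation is not stated, so this client-side check and the helper names are assumptions.

```python
import re

# E.164: "+", a non-zero leading digit, at most 15 digits in total.
E164_PATTERN = re.compile(r"^\+[1-9]\d{1,14}$")


def normalize_phone_number(raw):
    """Strip common separators and verify the result looks like E.164."""
    cleaned = re.sub(r"[\s\-().]", "", raw)
    if not E164_PATTERN.match(cleaned):
        raise ValueError(f"Not an E.164 phone number: {raw!r}")
    return cleaned


def build_call_payload(agent_id, phone_number, **extra):
    """Assemble a /api/call payload with a validated phone number."""
    payload = {
        "agent_id": agent_id,
        "phone_number": normalize_phone_number(phone_number),
    }
    payload.update(extra)  # e.g. voice_actor, tone, prompt_vars, metadata
    return payload
```

For example, `build_call_payload("agent_abc123", "+1 (234) 567-890", tone="friendly")` yields a payload whose `phone_number` is `+1234567890`.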

🧠 RAG (Retrieval-Augmented Generation)

The system automatically processes documents and creates vector embeddings for intelligent knowledge retrieval:

Document Processing: Supports PDF, TXT, DOCX files and web URLs
Text Chunking: Intelligently splits documents into manageable chunks
Vector Embeddings: Uses OpenAI's text-embedding-3-small model
Storage: Vectors stored in Pinecone with metadata
Retrieval: Semantic search during conversations for relevant context
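
The chunking step can be illustrated with a simple fixed-size splitter with overlap. This is a deliberately basic stand-in, not the project's chunking logic, which is not shown here; `chunk_text` and its default sizes are assumptions.

```python
def chunk_text(text, chunk_size=1000, overlap=200):
    """Split text into character windows that overlap by `overlap` characters.

    The overlap keeps a sentence that straddles a boundary retrievable
    from at least one chunk.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in the range [0, chunk_size)")

    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunk = text[start:start + chunk_size]
        if chunk.strip():  # skip whitespace-only windows
            chunks.append(chunk)
        if start + chunk_size >= len(text):
            break  # the current window already reaches the end of the text
    return chunks
```

Each chunk would then be embedded with text-embedding-3-small and stored in Pinecone together with metadata such as the agent and source document.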

Supported Document Types

URLs: Web pages, PDFs hosted online
PDF Files: Local or remote PDF documents
Text Files: Plain text documents
Word Documents: DOCX format files
Direct Text: Raw text content
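
Because `rag_docs` can mix URLs, file paths, and raw text, a processor has to decide how to treat each entry. One possible way to route them is sketched below. The `classify_source` helper and its category names are hypothetical and the extension-based rule is an assumption, not the project's actual detection logic.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse


def classify_source(source):
    """Guess how a rag_docs entry should be loaded, based on scheme and extension."""
    parsed = urlparse(source)
    is_url = parsed.scheme in ("http", "https")
    path = parsed.path if is_url else source
    suffix = PurePosixPath(path).suffix.lower()

    if suffix == ".pdf":
        kind = "pdf"
    elif suffix == ".docx":
        kind = "docx"
    elif suffix == ".txt":
        kind = "text_file"
    elif is_url:
        kind = "web_page"
    else:
        kind = "raw_text"  # anything else is treated as direct text content
    return {"kind": kind, "remote": is_url}
```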

🎯 Agent Capabilities

Built-in Functions

Each agent comes with intelligent function calling capabilities:

search_knowledge_base(query): Search the agent's knowledge base
answer_question(question): Answer questions using knowledge base
help_with_request(request): General assistance with user requests
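
When the model requests one of these functions, the worker has to map the function name to a Python callable. The sketch below shows one such dispatch step. It uses keyword overlap over an in-memory list as a stand-in for the Pinecone semantic search, the sample documents are made up for illustration, and the JSON shape of the call is an assumption.

```python
import json

# Made-up sample documents, for illustration only.
SAMPLE_DOCUMENTS = [
    "Support hours are 9 AM to 5 PM on weekdays",
    "Refunds are processed within 7 business days",
]


def search_knowledge_base(query, documents):
    """Keyword-overlap stand-in for semantic search; returns best matches first."""
    terms = set(query.lower().split())
    scored = [(len(terms & set(doc.lower().split())), doc) for doc in documents]
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    return [doc for score, doc in ranked if score > 0]


def dispatch(call_json, documents):
    """Route a function call of the assumed form {"name": ..., "arguments": {...}}."""
    call = json.loads(call_json)
    handlers = {
        "search_knowledge_base": lambda args: search_knowledge_base(args["query"], documents),
    }
    name = call["name"]
    if name not in handlers:
        raise KeyError(f"Unknown function: {name}")
    return handlers[name](call.get("arguments", {}))
```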

Voice Configuration

Voice Options: alloy, echo, fable, onyx, nova, shimmer
Tone Settings: professional, friendly, casual, enthusiastic
Language Support: Multiple languages supported
Real-time Processing: Natural conversation flow
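
A call request can name a voice and tone, and an agent also carries defaults. A small check against the lists above might look like the following sketch. The rule that a per-call value overrides the agent default is an assumption, as is the `resolve_voice` helper itself.

```python
VOICE_OPTIONS = ("alloy", "echo", "fable", "onyx", "nova", "shimmer")
TONE_SETTINGS = ("professional", "friendly", "casual", "enthusiastic")


def resolve_voice(requested_voice=None, requested_tone=None,
                  default_voice="alloy", default_tone="professional"):
    """Pick the per-call value when given, else the agent default, and validate both."""
    voice = requested_voice or default_voice
    tone = requested_tone or default_tone
    if voice not in VOICE_OPTIONS:
        raise ValueError(f"Unsupported voice: {voice!r}")
    if tone not in TONE_SETTINGS:
        raise ValueError(f"Unsupported tone: {tone!r}")
    return voice, tone
```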

📊 Analytics & Monitoring

Dashboard Metrics

Total agents created
Call statistics and success rates
RAG processing status
Recent activity tracking

Call Analytics

Call duration and outcomes
Voice quality metrics
Agent performance tracking
Custom metadata analysis
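
Metrics such as success rate and average duration can be derived from stored call records. The sketch below assumes each record has a `status` field and, for completed calls, a `duration_seconds` field; both field names and the `"completed"` status value are hypothetical and may differ from the real call documents.

```python
def summarize_calls(calls):
    """Compute basic call metrics from a list of call record dicts."""
    total = len(calls)
    completed = [call for call in calls if call.get("status") == "completed"]
    durations = [
        call["duration_seconds"]
        for call in completed
        if call.get("duration_seconds") is not None
    ]
    return {
        "total_calls": total,
        "success_rate": len(completed) / total if total else 0.0,
        "average_duration_seconds": sum(durations) / len(durations) if durations else 0.0,
    }
```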

🔧 Configuration Options

Agent Configuration

{
    "name": "Agent Name",
    "language": "English",
    "model": "gpt-4",
    "persona": "Agent personality description",
    "instructions": "Additional behavioral instructions",
    "temperature": 0.7,  # 0.0 to 2.0
    "max_tokens": 1000,
    "default_voice": "alloy",
    "default_tone": "professional",
    "custom_fields": {
        "transfer_to": "+1234567890",
        "department": "support"
    },
    "metadata": {
        "version": "1.0",
        "created_by": "admin"
    }
}
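
The constraints on these fields, such as the 0.0 to 2.0 temperature range, can be enforced before a request is sent. The backend validates input with Pydantic models; the sketch below uses a standard-library dataclass as a stand-in so it runs without extra packages. The default values mirror the example above and are assumptions, as are the exact validation rules.

```python
from dataclasses import dataclass, field


@dataclass
class AgentConfig:
    """Client-side mirror of the agent configuration shown above."""
    name: str
    persona: str
    language: str = "English"
    model: str = "gpt-4"
    instructions: str = ""
    temperature: float = 0.7
    max_tokens: int = 1000
    default_voice: str = "alloy"
    default_tone: str = "professional"
    custom_fields: dict = field(default_factory=dict)
    metadata: dict = field(default_factory=dict)

    def __post_init__(self):
        if not self.name.strip():
            raise ValueError("name must not be empty")
        if not 0.0 <= self.temperature <= 2.0:
            raise ValueError("temperature must be between 0.0 and 2.0")
        if self.max_tokens <= 0:
            raise ValueError("max_tokens must be positive")
```

`dataclasses.asdict(AgentConfig(name="Agent Name", persona="..."))` produces a dict that can be passed as the JSON body of POST /api/agent.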

Call Configuration

{
    "agent_id": "agent_123",
    "phone_number": "+1234567890",
    "voice_actor": "nova",
    "tone": "friendly",
    "prompt_vars": {
        "variable_name": "value"
    },
    "metadata": {
        "campaign": "outbound_sales",
        "priority": "high"
    }
}
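
The `prompt_vars` mapping suggests that values are substituted into the agent's instructions for each call. How the worker does this is not documented here, so the sketch below is only one plausible approach: it assumes `$name`-style placeholders and uses `string.Template` from the standard library.

```python
from string import Template


def render_prompt(template_text, prompt_vars):
    """Fill $placeholders from prompt_vars, leaving unknown placeholders untouched."""
    return Template(template_text).safe_substitute(prompt_vars)
```

For example, rendering `"Hello $customer_name, your appointment is at $appointment_time."` with the variables from the outbound call example gives `"Hello John Doe, your appointment is at 3 PM today."`. Using `safe_substitute` means a missing variable leaves the placeholder in place instead of raising an error mid-call.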

🚨 Error Handling

The system includes comprehensive error handling:

Database Connection: Graceful fallbacks when MongoDB or Pinecone is unavailable
API Rate Limits: Automatic retry logic for OpenAI API calls
Call Failures: SIP error tracking and reporting
Document Processing: Robust error handling for various file formats
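
Retry logic for rate-limited API calls is commonly implemented as exponential backoff. The following is a generic sketch of that pattern, not the project's actual retry code; the `call_with_retry` helper, its defaults, and the injectable `sleep` parameter are assumptions made for illustration and testing.

```python
import time


def call_with_retry(func, max_attempts=3, base_delay=1.0,
                    retry_on=(Exception,), sleep=time.sleep):
    """Call func(), retrying on the given exceptions with doubling delays.

    Delays are base_delay, 2 * base_delay, 4 * base_delay, and so on.
    The final failure is re-raised to the caller.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except retry_on:
            if attempt == max_attempts:
                raise
            sleep(base_delay * 2 ** (attempt - 1))
```

In practice `retry_on` would be narrowed to the rate-limit or timeout exceptions raised by the client library in use, so that programming errors are not retried.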

🔒 Security Features

JWT Authentication: Secure token-based authentication
Password Hashing: Passwords are hashed with bcrypt
CORS Configuration: Configurable cross-origin request handling
Input Validation: Pydantic models for request validation
Rate Limiting: Built-in protection against abuse
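
To show what JWT-based authentication involves, the sketch below signs and verifies an HS256 token using only the standard library. It is an illustration of the mechanism, not the project's implementation, and it omits checks (such as validating the header algorithm) that a maintained JWT library performs. The function names and the `sub` and `exp` claim choices are assumptions.

```python
import base64
import hashlib
import hmac
import json
import time


def _b64url(data):
    """Base64url-encode bytes without padding, as JWT requires."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text):
    """Decode base64url text, restoring the stripped padding first."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(signing_input, secret):
    digest = hmac.new(secret.encode("utf-8"), signing_input.encode("ascii"), hashlib.sha256)
    return _b64url(digest.digest())


def create_token(subject, secret, ttl_seconds=3600, now=None):
    """Create a signed token with a subject and an expiry time."""
    now = int(time.time()) if now is None else now
    header = _b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode("utf-8"))
    payload = _b64url(json.dumps({"sub": subject, "exp": now + ttl_seconds}).encode("utf-8"))
    signing_input = header + "." + payload
    return signing_input + "." + _sign(signing_input, secret)


def verify_token(token, secret, now=None):
    """Return the claims if the signature is valid and the token has not expired."""
    now = int(time.time()) if now is None else now
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("Malformed token")
    header, payload, signature = parts
    expected = _sign(header + "." + payload, secret)
    if not hmac.compare_digest(signature, expected):  # constant-time comparison
        raise ValueError("Invalid signature")
    claims = json.loads(_b64url_decode(payload))
    if claims["exp"] <= now:
        raise ValueError("Token expired")
    return claims
```

The secret here plays the role of SECRET_KEY from the .env file, which is why that value must be changed before deploying.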

📈 Scaling Considerations

Performance Optimization

Batch Processing: Efficient document processing in batches
Vector Search: Optimized Pinecone queries with relevance thresholds
Indexing: MongoDB indexes for fast agent retrieval
Async Processing: Background tasks for RAG document processing
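
A relevance threshold keeps weakly related chunks out of the prompt. The sketch below filters a list of matches by similarity score. The match shape (dicts with `id` and `score`) loosely mirrors vector query results, and the `min_score` and `top_k` defaults are assumed values, not the project's settings.

```python
def filter_matches(matches, min_score=0.75, top_k=3):
    """Keep the top_k matches whose cosine similarity is at least min_score."""
    relevant = [match for match in matches if match["score"] >= min_score]
    relevant.sort(key=lambda match: match["score"], reverse=True)
    return relevant[:top_k]
```

Returning an empty list when nothing clears the threshold lets the agent say it does not know, rather than answering from unrelated context.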

🛠️ Development

Project Structure

ai-agent-system/
├── main.py # FastAPI backend server
├── agent_worker.py # LiveKit agent worker
├── requirements.txt # Python dependencies
├── .env # Environment configuration
├── README.md # This file
└── docs/ # Additional documentation

Dependencies

Key dependencies include:

FastAPI: Web framework
LiveKit: Real-time communication
OpenAI: LLM and embeddings
Pinecone: Vector database
PyMongo: MongoDB client
PyPDF2: PDF processing
python-docx: Word document processing

🤝 Contributing

Fork the repository
Create a feature branch (git checkout -b feature/amazing-feature)
Commit your changes (git commit -m 'Add amazing feature')
Push to the branch (git push origin feature/amazing-feature)
Open a Pull Request

🔄 Changelog

v1.0.0

Initial release
AI agent creation and management
RAG integration with Pinecone
Outbound calling with LiveKit
User authentication system
Analytics dashboard

Built with ❤️ using FastAPI, LiveKit, OpenAI, Pinecone, and Plivo

