A topic modeling and content analysis system that combines BERTopic with Retrieval-Augmented Generation (RAG) to generate in-depth reports from data segments of Dembrane ECHO.

## Features

- Advanced Topic Modeling: Uses BERTopic with hierarchical clustering for intelligent topic extraction
- RAG Integration: Retrieves relevant context from external knowledge bases to enhance analysis
- GPU Acceleration: Leverages CUDA for faster processing when available
- Multi-language Support: Generates reports in multiple languages
- Professional Reporting: Creates structured, journalistic-style reports with proper citations
- Serverless Deployment: Built for RunPod serverless infrastructure
- Flexible Input: Accepts segment IDs and custom user prompts for tailored analysis

## Tech Stack

- Machine Learning: BERTopic, Sentence Transformers, UMAP, HDBSCAN
- AI/LLM: Azure OpenAI, LiteLLM
- Backend: PyTorch, Directus SDK
- Infrastructure: RunPod, Docker
- Data Processing: Pydantic for data validation and structured outputs

## Prerequisites

- Python 3.8+
- CUDA-compatible GPU (optional, falls back to CPU)
- Access to Azure OpenAI API
- Directus instance for data management
- External RAG server for context retrieval

## Installation

- Clone the repository
- Build the Docker image:

  ```bash
  docker build -t topic-modeler .
  ```

- Run the container:

  ```bash
  docker run --gpus all -p 8000:8000 topic-modeler
  ```

- Install dependencies:

  ```bash
  pip install -r requirements.txt
  ```

- Set up environment variables (see Configuration section)
- Run the handler:

  ```bash
  python handler.py
  ```
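
`handler.py` presumably follows the standard RunPod serverless worker pattern; a minimal sketch in which `process_segments` is a hypothetical stand-in for the real pipeline:

```python
import runpod


def process_segments(segment_ids, user_prompt, response_language):
    # Hypothetical stand-in for the actual pipeline
    # (Directus fetch -> RAG context -> BERTopic -> report generation)
    return {"title": "stub", "aspects": []}


def handler(event):
    # RunPod delivers the request payload under the "input" key
    params = event["input"]
    return process_segments(
        params["segment_ids"],
        params["user_prompt"],
        params.get("response_language", "en"),
    )


# Register the handler with the RunPod serverless worker
runpod.serverless.start({"handler": handler})
```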

## Configuration

Set the following environment variables:

```bash
# Azure OpenAI Configuration
AZURE_API_KEY=your_azure_api_key
AZURE_API_BASE=your_azure_endpoint
AZURE_API_VERSION=2024-02-01
AZURE_MODEL=gpt-4
# Directus Configuration
DIRECTUS_BASE_URL=https://your-directus-instance.com
DIRECTUS_TOKEN=your_directus_token
# RAG Server Configuration
RAG_SERVER_URL=https://your-rag-server.com
RAG_SERVER_AUTH_TOKEN=your_rag_auth_token
# Optional: Force CPU usage
RUN_CPU=False
```
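
The `AZURE_*` names above match the variables LiteLLM reads for Azure OpenAI, so a downstream completion call could be as simple as the following sketch (how `utils.py` actually invokes it is an assumption):

```python
import os

from litellm import completion

# LiteLLM picks up AZURE_API_KEY, AZURE_API_BASE, and AZURE_API_VERSION
# from the environment; the deployment name comes from AZURE_MODEL
response = completion(
    model=f"azure/{os.environ['AZURE_MODEL']}",
    messages=[{"role": "user", "content": "Summarize the main topics."}],
)
print(response.choices[0].message.content)
```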

## Usage

The system accepts JSON input with the following structure:

```json
{
  "input": {
    "segment_ids": [360, 361, 362],
    "user_prompt": "Please summarize all the topics related to AI development",
    "response_language": "en"
  }
}
```

- `segment_ids`: List of segment IDs to analyze
- `user_prompt`: Custom prompt describing the analysis you want
- `response_language`: Target language for the output (e.g., "en", "es", "fr")
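
Once deployed, a RunPod serverless endpoint is typically called through RunPod's HTTP API. A sketch using the synchronous `runsync` route (endpoint ID and API key are placeholders):

```python
import requests

ENDPOINT_ID = "your_endpoint_id"   # placeholder
API_KEY = "your_runpod_api_key"    # placeholder

payload = {
    "input": {
        "segment_ids": [360, 361, 362],
        "user_prompt": "Please summarize all the topics related to AI development",
        "response_language": "en",
    }
}

# runsync blocks until the job finishes and returns the result directly
resp = requests.post(
    f"https://api.runpod.ai/v2/{ENDPOINT_ID}/runsync",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json=payload,
    timeout=600,
)
print(resp.json())
```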

## Output Format

The system returns a structured analysis:

```json
{
  "title": "AI Development Trends Analysis",
  "description": "Comprehensive analysis of AI development topics...",
  "summary": "## Key Findings\n\n### Topic 1: Machine Learning\n...",
  "seed": "Initial prompt",
  "aspects": [
    {
      "title": "Machine Learning Advances",
      "description": "Recent developments in ML algorithms",
      "summary": "Detailed analysis...",
      "image_url": "",
      "segments": [
        {
          "segment_id": 360,
          "description": "Discusses neural network improvements"
        }
      ]
    }
  ]
}
```

## How It Works

- Input Processing: Receives segment IDs and user prompts
- Data Retrieval: Fetches relevant data segments from Directus
- RAG Enhancement: Retrieves additional context using external RAG server
- Topic Modeling: Applies BERTopic for intelligent topic extraction (see the sketch after this list)
- Content Generation: Uses Azure OpenAI to generate professional reports
- Output Structuring: Returns well-formatted, citable analysis
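
The topic modeling step likely wires BERTopic to the components listed under Tech Stack (Sentence Transformers, UMAP, HDBSCAN). A minimal sketch with illustrative parameter values, not the exact configuration used:

```python
from bertopic import BERTopic
from sentence_transformers import SentenceTransformer
from umap import UMAP

# Illustrative configuration; model name and parameter values are assumptions
embedding_model = SentenceTransformer("all-MiniLM-L6-v2")
umap_model = UMAP(n_neighbors=15, n_components=5, min_dist=0.0, metric="cosine")

topic_model = BERTopic(
    embedding_model=embedding_model,
    umap_model=umap_model,
    min_topic_size=10,  # the knob referenced under Troubleshooting
)

docs = load_segment_texts()  # hypothetical helper returning segment transcripts
topics, probs = topic_model.fit_transform(docs)

# Hierarchical clustering over the discovered topics
hierarchy = topic_model.hierarchical_topics(docs)
print(topic_model.get_topic_info())
```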

## Use Cases

- Content Analysis: Analyze large volumes of text for key themes
- Research Synthesis: Combine multiple data sources into coherent reports
- Journalistic Research: Generate professional reports from raw data
- Business Intelligence: Extract insights from corporate communications
- Academic Research: Analyze literature and research data

## Project Structure

- `handler.py`: Main entry point for the RunPod serverless function
- `utils.py`: Core functionality, including topic modeling and RAG integration
- `prompts.py`: Professional prompt templates for different analysis types
- `data_model.py`: Pydantic models for structured input/output validation
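
Given the output shown above, `data_model.py` plausibly defines Pydantic models along these lines (field names mirror the example; the exact definitions are an assumption):

```python
from typing import List

from pydantic import BaseModel


class SegmentRef(BaseModel):
    segment_id: int
    description: str


class Aspect(BaseModel):
    title: str
    description: str
    summary: str
    image_url: str = ""
    segments: List[SegmentRef]


class AnalysisReport(BaseModel):
    title: str
    description: str
    summary: str
    seed: str
    aspects: List[Aspect]
```

Validating LLM output against models like these is what keeps the reports reliably machine-readable.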

## Deployment

The system is designed for serverless deployment on RunPod:
- Build the Docker image
- Deploy to RunPod with GPU support
- Configure environment variables in RunPod dashboard
- Test with sample inputs
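
For long analyses that may exceed synchronous timeouts, RunPod's asynchronous `run` route with status polling is an option; again a sketch with placeholder credentials:

```python
import time

import requests

ENDPOINT_ID = "your_endpoint_id"   # placeholder
API_KEY = "your_runpod_api_key"    # placeholder
BASE = f"https://api.runpod.ai/v2/{ENDPOINT_ID}"
HEADERS = {"Authorization": f"Bearer {API_KEY}"}

# Queue the job without blocking
job = requests.post(f"{BASE}/run", headers=HEADERS, json={
    "input": {
        "segment_ids": [360, 361, 362],
        "user_prompt": "Summarize AI development topics",
        "response_language": "en",
    }
}).json()

# Poll until the worker reaches a terminal state
while True:
    status = requests.get(f"{BASE}/status/{job['id']}", headers=HEADERS).json()
    if status["status"] in ("COMPLETED", "FAILED", "CANCELLED", "TIMED_OUT"):
        break
    time.sleep(5)

print(status.get("output"))
```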

## Contributing

- Fork the repository
- Create a feature branch
- Make your changes
- Add tests if applicable
- Submit a pull request

## License

This project is licensed under the MIT License - see the LICENSE file for details.

## Troubleshooting

- CUDA Out of Memory: Set `RUN_CPU=True` to use CPU-only mode (see the sketch after this list)
- RAG Server Connection: Verify `RAG_SERVER_URL` and authentication tokens
- Topic Quality: Adjust the `min_topic_size` parameter in topic model initialization
- Language Issues: Ensure proper language codes in the `response_language` field
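
The `RUN_CPU` flag presumably drives device selection roughly as follows (a sketch; the actual logic in the codebase may differ):

```python
import os

import torch


def select_device() -> str:
    # Honor the RUN_CPU override before probing for CUDA
    if os.getenv("RUN_CPU", "False").lower() == "true":
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


# e.g. SentenceTransformer("all-MiniLM-L6-v2", device=select_device())
```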

## Performance Tips

- Use GPU acceleration for faster processing
- Adjust the `top_k` parameter in RAG queries based on your needs
- Monitor memory usage with large document sets
- Consider batch processing for multiple analyses
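
Batch processing can be as simple as chunking the segment IDs and submitting one job per chunk; a sketch reusing the placeholder payload format from above:

```python
from typing import Iterator, List


def chunked(ids: List[int], size: int) -> Iterator[List[int]]:
    # Yield fixed-size slices of the segment ID list
    for i in range(0, len(ids), size):
        yield ids[i:i + size]


all_segment_ids = list(range(360, 460))  # illustrative IDs
for batch in chunked(all_segment_ids, 25):
    payload = {"input": {"segment_ids": batch,
                         "user_prompt": "Summarize the key topics",
                         "response_language": "en"}}
    # submit payload via the /runsync or /run sketches shown earlier
```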

## Support

For issues and questions:
- Check the troubleshooting section
- Review the configuration requirements
- Ensure all environment variables are properly set

## Releases

To publish a new version, create an annotated git tag:

```bash
git tag -a v1.X.X -m "XYZ"
```