A complete end-to-end chatbot system that combines Retrieval-Augmented Generation (RAG), emotion analysis, and feedback collection for intelligent customer service automation.
This project implements a chatbot system that:
- Answers FAQ questions using semantic search and local LLM generation
- Analyzes user emotions in real-time using fine-tuned DistilBERT
- Collects user feedback and ratings for continuous improvement
- Provides analytics dashboards for monitoring and optimization
- Semantic Search: FAISS vector database for FAQ retrieval
- Local LLM: TinyLlama-1.1B-Chat for response generation
- Smart Context: Combines relevant FAQs for accurate answers
- Fast Performance: 3-6 second response times
- Real-time Detection: 28 different emotions using DistilBERT
- GoEmotions Dataset: Fine-tuned on a 10,000-sample subset by default (max_train_samples=10000)
- Emotion Categories: Positive, Negative, Neutral, Complex emotions
- Business Intelligence: Correlate emotions with satisfaction ratings
- Streamlit Web App: Clean, modern chat interface
- Smart Small Talk: Instant responses for greetings and casual conversation
- Session Management: Unique session tracking with timestamps
- MySQL Storage: Complete conversation logging
- Feedback Collection: Star ratings and text comments
- Emotion Dashboard: Visualize emotion trends and patterns
- Session Analysis: Detailed per-conversation emotion journeys
- Continuous Improvement: Data-driven optimization suggestions
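The four emotion categories listed above can be derived from the 28 GoEmotions labels with a simple lookup. The sketch below uses the sentiment grouping published with the GoEmotions dataset. Treating its "ambiguous" group as this project's "Complex" category is an assumption, and `EMOTION_GROUPS` and `categorize` are illustrative names, not part of the project's code.

```python
# Sentiment grouping published with the GoEmotions dataset.
# Assumption: its "ambiguous" group maps to this project's "Complex" category.
EMOTION_GROUPS = {
    "positive": ["admiration", "amusement", "approval", "caring", "desire",
                 "excitement", "gratitude", "joy", "love", "optimism",
                 "pride", "relief"],
    "negative": ["anger", "annoyance", "disappointment", "disapproval",
                 "disgust", "embarrassment", "fear", "grief", "nervousness",
                 "remorse", "sadness"],
    "complex": ["confusion", "curiosity", "realization", "surprise"],
    "neutral": ["neutral"],
}

# Invert the grouping into a label -> category lookup table.
LABEL_TO_CATEGORY = {
    label: category
    for category, labels in EMOTION_GROUPS.items()
    for label in labels
}


def categorize(label: str) -> str:
    """Return the coarse category for a GoEmotions label (hypothetical helper)."""
    return LABEL_TO_CATEGORY.get(label, "unknown")
```

A coarse category like this is usually easier to chart and to compare against star ratings than 28 separate labels.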
User Input → Small Talk Handler → RAG Pipeline → Response Generation
     ↓                ↓                 ↓                 ↓
Emotion Analysis → MySQL Storage → Analytics Dashboard → Insights
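The top row of the diagram can be read as: check for small talk first, and only run the slower RAG pipeline when no canned reply matches. A minimal sketch of that routing follows. `SMALL_TALK`, `route_message`, and the `rag_answer` callback are hypothetical names used for illustration, not the project's actual API.

```python
from typing import Callable, Tuple

# Hypothetical canned replies; the real project may use a different set or matcher.
SMALL_TALK = {
    "hi": "Hello! How can I help you today?",
    "hello": "Hello! How can I help you today?",
    "thanks": "You're welcome!",
    "bye": "Goodbye! Have a great day.",
}


def route_message(text: str, rag_answer: Callable[[str], str]) -> Tuple[str, str]:
    """Return (source, reply).

    Small talk is answered from the lookup table; anything else is passed
    to the RAG callback, which is where retrieval and generation would run.
    """
    # Normalize: trim whitespace, lowercase, drop trailing punctuation.
    key = text.strip().lower().rstrip("!.?")
    if key in SMALL_TALK:
        return "small_talk", SMALL_TALK[key]
    return "rag", rag_answer(text)
```

Keeping the small-talk check ahead of retrieval is what makes greetings feel instant, since the LLM is never invoked for them.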
- FAQ Embeddings: Sentence Transformers + FAISS
- Response Generation: TinyLlama-1.1B local inference
- Emotion Analysis: DistilBERT + GoEmotions dataset
- Data Storage: MySQL with session and message tracking
- User Interface: Streamlit with real-time chat
- Analytics: Plotly visualizations and trend analysis
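Session and message tracking might be laid out as two tables: one row per session and one row per message. The sketch below uses Python's built-in `sqlite3` as a stand-in so it runs without a MySQL server. The table names, column names, and helper functions are assumptions, not the project's actual schema.

```python
import sqlite3
import uuid
from datetime import datetime, timezone
from typing import Optional


def _now() -> str:
    """Current UTC time as an ISO-8601 string."""
    return datetime.now(timezone.utc).isoformat()


def init_db(conn: sqlite3.Connection) -> None:
    """Create the hypothetical sessions and messages tables."""
    conn.executescript("""
        CREATE TABLE IF NOT EXISTS sessions (
            session_id TEXT PRIMARY KEY,
            started_at TEXT NOT NULL,
            rating     INTEGER,   -- star rating, set when feedback arrives
            comment    TEXT
        );
        CREATE TABLE IF NOT EXISTS messages (
            id         INTEGER PRIMARY KEY AUTOINCREMENT,
            session_id TEXT NOT NULL REFERENCES sessions(session_id),
            role       TEXT NOT NULL,   -- 'user' or 'bot'
            content    TEXT NOT NULL,
            emotion    TEXT,            -- top predicted emotion, if any
            created_at TEXT NOT NULL
        );
    """)


def start_session(conn: sqlite3.Connection) -> str:
    """Insert a new session row and return its unique id."""
    session_id = str(uuid.uuid4())
    conn.execute(
        "INSERT INTO sessions (session_id, started_at) VALUES (?, ?)",
        (session_id, _now()),
    )
    return session_id


def log_message(conn: sqlite3.Connection, session_id: str, role: str,
                content: str, emotion: Optional[str] = None) -> None:
    """Append one chat message to the log."""
    conn.execute(
        "INSERT INTO messages (session_id, role, content, emotion, created_at) "
        "VALUES (?, ?, ?, ?, ?)",
        (session_id, role, content, emotion, _now()),
    )
```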
- Python 3.8+
- 8GB+ RAM (for local LLM)
- MySQL server
- 4GB+ disk space
git clone https://github.com/Aparnashree11/ChatBot_with_emotion_analysis.git
cd ChatBot_with_emotion_analysis
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
pip install -r requirements.txt
-- Create database
CREATE DATABASE chatbot_feedback CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
-- Create user (optional)
CREATE USER 'chatbot_user'@'localhost' IDENTIFIED BY 'your_password';
GRANT ALL PRIVILEGES ON chatbot_feedback.* TO 'chatbot_user'@'localhost';
FLUSH PRIVILEGES;
Create .env file:
MYSQL_HOST=localhost
MYSQL_USER=chatbot_user
MYSQL_PASSWORD=your_password
MYSQL_DB=chatbot_feedback
python embeddings_gen.py
- Processes your FAQ data (CSV format)
- Creates FAISS vector store for semantic search
- Output: faq_vector_store.faiss + metadata
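Once the vector store exists, answering a question comes down to ranking FAQ vectors by similarity to the query vector. The sketch below shows that ranking with plain Python lists and cosine similarity. It is not the FAISS API, and the toy 3-dimensional vectors stand in for real sentence embeddings. The `similarity_threshold` and `max_retrieved_faqs` parameter names are borrowed from the chatbot configuration shown later in this README; how the project applies them internally is an assumption.

```python
import math
from typing import List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(query_vec: Sequence[float],
             faq_vecs: Sequence[Sequence[float]],
             similarity_threshold: float = 0.5,
             max_retrieved_faqs: int = 2) -> List[Tuple[int, float]]:
    """Return up to max_retrieved_faqs (index, score) pairs, best first.

    FAQs scoring below similarity_threshold are dropped before ranking.
    """
    scored = [(i, cosine(query_vec, vec)) for i, vec in enumerate(faq_vecs)]
    kept = [(i, score) for i, score in scored if score >= similarity_threshold]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return kept[:max_retrieved_faqs]


# Toy example: three "FAQ embeddings" and a query pointing along the first axis.
faq_vectors = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [1.0, 1.0, 0.0]]
hits = retrieve([1.0, 0.0, 0.0], faq_vectors)
```

A vector index such as FAISS performs the same kind of nearest-neighbor ranking, but over many more vectors and without comparing the query to each one in Python.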
python emotion_classification_training.py
- Fine-tunes DistilBERT on GoEmotions dataset
- Training time: ~2-3 hours for 10,000 samples
- Output: distilbert-goemotions-10k/ model directory
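GoEmotions is a multi-label dataset, so a classifier trained on it typically scores each emotion independently. The sketch below shows one way top-k selection could work on raw model outputs, assuming a sigmoid per label. `top_k_emotions` is a hypothetical helper and may differ from what the project's `predict_top_k` does internally.

```python
import math
from typing import List, Sequence, Tuple


def sigmoid(x: float) -> float:
    """Map a raw logit to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def top_k_emotions(logits: Sequence[float],
                   labels: Sequence[str],
                   k: int = 3) -> List[Tuple[str, float]]:
    """Return the k highest-scoring (label, probability) pairs, best first."""
    if len(logits) != len(labels):
        raise ValueError("logits and labels must have the same length")
    scored = [(label, sigmoid(z)) for label, z in zip(labels, logits)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]
```

Because each label gets its own sigmoid, the returned probabilities do not need to sum to 1, which is why several emotions can score highly for the same message.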
streamlit run new_app.py
- Opens chat interface at http://localhost:8501
- Users can chat and provide feedback
- Features: Real-time chat, session management, feedback collection
streamlit run main.py
- Analytics dashboard at http://localhost:8501 (if the chat app is already running there, Streamlit picks the next free port, typically 8502)
- Features: Emotion trends, session analysis, rating correlations
from model_testing import create_balanced_chatbot
# Initialize chatbot
chatbot = create_balanced_chatbot("faq_vector_store")
# Ask questions
response = chatbot.ask("How do I return a product?")
print(f"Answer: {response.answer}")
print(f"Confidence: {response.confidence_score}")
from emotion_classification_training import EmotionPredictor
# Load emotion model
predictor = EmotionPredictor("./distilbert-goemotions-10k")
# Analyze emotions
emotions = predictor.predict_top_k("I'm so frustrated with this!", k=3)
print(emotions) # [('anger', 0.87), ('annoyance', 0.72), ('disappointment', 0.54)]
# Chat interface
streamlit run new_app.py
# Analytics dashboard
streamlit run main.py
- Small Talk: 0.1 seconds (instant)
- FAQ Questions: 5-10 seconds average
- Cold Start: 10-15 seconds (model loading)
- FAQ Retrieval: Confidence-scored semantic matching
- Emotion Detection: F1-macro ~0.45-0.50 (28 emotions)
- User Satisfaction: Tracked via ratings and feedback
- RAM: 2-3GB with model quantization
- Storage: 3-4GB total (models + embeddings)
- GPU: Optional but recommended for training
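The F1-macro figure above averages per-class F1 scores with equal weight, so rare emotions count as much as frequent ones. That is one reason scores over 28 imbalanced classes tend to look modest. A minimal single-label sketch of the metric follows; the project's actual evaluation may be multi-label, and `macro_f1` is an illustrative helper rather than the project's code.

```python
from typing import Sequence


def macro_f1(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Unweighted mean of per-class F1 over every class seen in either list."""
    labels = sorted(set(y_true) | set(y_pred))
    f1_scores = []
    for label in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class never appears.
        denom = 2 * tp + fp + fn
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return sum(f1_scores) / len(f1_scores) if f1_scores else 0.0
```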
chatbot = LocalRAGChatbot(
embeddings_path="faq_vector_store",
similarity_threshold=0.5, # Higher = more selective
max_retrieved_faqs=2, # Number of FAQs to use
device="cpu" # "cpu" or "cuda"
)
config = TrainingConfig(
max_train_samples=10000, # Training data size
num_epochs=2, # Training epochs
batch_size=32, # Batch size
learning_rate=5e-5 # Learning rate
)
Model Loading Errors
# Check if model files exist
ls -la distilbert-goemotions-10k/
# Should contain: config.json, pytorch_model.bin (or model.safetensors with newer transformers versions), tokenizer files
Memory Issues
# Reduce batch size or use CPU
config.batch_size = 8
config.device = "cpu"
MySQL Connection
# Test MySQL connection
mysql -u root -p chatbot_feedback
# Verify tables exist: SHOW TABLES;
Slow Performance
- Use GPU if available
- Reduce FAQ database size
- Raise the similarity threshold or lower max_retrieved_faqs (less context for the LLM to process)
- Use smaller models
- Overall Trends: Emotion distribution across all chats
- Session Analysis: Click any session for detailed emotion journey
- Rating Correlation: How emotions affect user satisfaction
- Time Patterns: Emotion trends by day/hour
- Customer Satisfaction: Average ratings and positive emotion rates
- Content Gaps: Identify FAQ areas needing improvement
- Performance Monitoring: Response times and confidence scores
- User Behavior: Conversation patterns and emotional journeys
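Rating correlation can start as simply as averaging star ratings per dominant emotion. The sketch below works on in-memory records. The `(emotion, rating)` tuple layout is an assumption about how the logged data might be shaped, and `mean_rating_by_emotion` is a hypothetical helper.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def mean_rating_by_emotion(records: Iterable[Tuple[str, int]]) -> Dict[str, float]:
    """Average star rating for each dominant emotion.

    Each record is (emotion, rating), e.g. ("anger", 2).
    """
    totals = defaultdict(lambda: [0, 0])  # emotion -> [rating sum, count]
    for emotion, rating in records:
        totals[emotion][0] += rating
        totals[emotion][1] += 1
    return {emotion: total / count for emotion, (total, count) in totals.items()}


# Toy data: sessions dominated by joy were rated higher than those dominated by anger.
sample = [("joy", 5), ("joy", 4), ("anger", 1), ("anger", 2)]
averages = mean_rating_by_emotion(sample)
```

The same grouping can be keyed by hour or weekday instead of emotion to look at time patterns.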
- Users chat → Provide ratings and feedback
- System analyzes → Emotion patterns and satisfaction
- Identifies gaps → FAQ content and response quality
- Generates insights → Optimization recommendations
- Updates system → Better responses and user experience
- FAQ Content: Add missing questions, improve answers
- Model Parameters: Adjust similarity thresholds and context size
- Response Quality: Fine-tune prompts and generation settings
- User Experience: Enhance based on emotional feedback patterns
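One way to surface FAQ content gaps is to flag logged interactions where retrieval confidence or the user's rating was low. The `Interaction` fields and the cutoff values below are illustrative assumptions, not values taken from the project.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass
class Interaction:
    """One logged question (hypothetical record layout)."""
    question: str
    confidence: float             # retrieval confidence for the answer
    rating: Optional[int] = None  # star rating, if the user left one


def find_content_gaps(log: Iterable[Interaction],
                      min_confidence: float = 0.5,
                      min_rating: int = 3) -> List[str]:
    """Return questions answered with low confidence or rated poorly."""
    gaps = []
    for item in log:
        low_confidence = item.confidence < min_confidence
        low_rating = item.rating is not None and item.rating < min_rating
        if low_confidence or low_rating:
            gaps.append(item.question)
    return gaps
```

Questions returned here are candidates for new or rewritten FAQ entries.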
- Fork the repository
- Create feature branch: git checkout -b feature/new-feature
- Make changes and test locally
- Commit changes: git commit -m "Add new feature"
- Push to branch: git push origin feature/new-feature
- Create Pull Request
- PEP 8: Python style guide
- Type Hints: Use typing annotations
- Documentation: Docstrings for all functions
- Testing: Add tests for new features
For questions, issues, or contributions:
- Create an Issue: Use GitHub Issues for bugs and feature requests
- Documentation: Check inline code documentation
- Community: Join discussions in project Issues