Reddit Sentiment Analysis Bot

A comprehensive Reddit sentiment analysis application that fetches posts and comments from Reddit, analyzes their sentiment using Hugging Face transformers, and presents results through a modern web dashboard.

🚀 Features

  • Reddit Integration: Fetch posts and comments from specific subreddits or search by keywords
  • Advanced Sentiment Analysis: Uses a DistilBERT model fine-tuned on SST-2 for sentiment classification
  • Real-time Dashboard: Interactive charts and tables showing sentiment trends
  • Background Processing: Asynchronous analysis of large datasets
  • Modern UI: Built with React and TailwindCSS for a beautiful user experience
  • Docker Support: Easy deployment with Docker Compose

🏗️ Architecture

Backend (FastAPI)

  • API: RESTful API with FastAPI
  • Database: SQLite with SQLAlchemy ORM
  • Reddit Integration: PRAW (Python Reddit API Wrapper)
  • Sentiment Analysis: Hugging Face Transformers (DistilBERT)
  • Background Tasks: FastAPI background tasks for async processing

Frontend (React)

  • Framework: React 18 with modern hooks
  • Styling: TailwindCSS for responsive design
  • Charts: Recharts for data visualization
  • HTTP Client: Axios for API communication
  • Notifications: React Hot Toast for user feedback

📋 Prerequisites

  • Docker and Docker Compose
  • Reddit API credentials (Client ID and Secret)
  • Python 3.11+ (for local development)
  • Node.js 18+ (for local development)

🔧 Setup Instructions

1. Clone the Repository

git clone <repository-url>
cd reddit_nlp

2. Reddit API Setup

  1. Go to Reddit App Preferences
  2. Click "Create App" or "Create Another App"
  3. Choose "script" as the app type
  4. Note down your Client ID and Secret

3. Environment Configuration

Copy the example environment file and configure your settings:

cp env.example .env

Edit .env with your Reddit API credentials:

# Reddit API Configuration
REDDIT_CLIENT_ID=your_reddit_client_id_here
REDDIT_CLIENT_SECRET=your_reddit_client_secret_here
REDDIT_USER_AGENT=RedditSentimentBot/1.0

# Database Configuration
DATABASE_URL=sqlite:///./reddit_sentiment.db

# Redis Configuration (for background tasks)
REDIS_URL=redis://localhost:6379

# Application Configuration
DEBUG=True
SECRET_KEY=your-secret-key-here
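
As a rough illustration of how the backend might read these values at startup, here is a minimal sketch using only the standard library. The `Settings` class and `load_settings` function are hypothetical names, not taken from the project; the defaults simply mirror the example file above.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    """Hypothetical settings container; field names mirror the .env keys above."""
    reddit_client_id: str
    reddit_client_secret: str
    reddit_user_agent: str
    database_url: str
    redis_url: str
    debug: bool


def load_settings(env=None):
    """Build Settings from a mapping (defaults to os.environ).

    Raises KeyError if a Reddit credential is missing, so a misconfigured
    deployment fails at startup rather than on the first API call.
    """
    env = os.environ if env is None else env
    return Settings(
        reddit_client_id=env["REDDIT_CLIENT_ID"],
        reddit_client_secret=env["REDDIT_CLIENT_SECRET"],
        reddit_user_agent=env.get("REDDIT_USER_AGENT", "RedditSentimentBot/1.0"),
        database_url=env.get("DATABASE_URL", "sqlite:///./reddit_sentiment.db"),
        redis_url=env.get("REDIS_URL", "redis://localhost:6379"),
        # Treat "true", "1", "yes" (any case) as enabled; everything else is off.
        debug=env.get("DEBUG", "False").strip().lower() in {"true", "1", "yes"},
    )
```

Passing a plain dictionary instead of relying on `os.environ` keeps the function easy to exercise in tests.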

4. Docker Deployment

Start the application with Docker Compose:

docker-compose up --build

This will start the backend, frontend, and redis services described under Docker Configuration.

5. Access the Application

🚀 Usage

Dashboard Features

  1. Overview Dashboard

    • View sentiment statistics
    • Interactive pie charts and trend graphs
    • Recent posts table with sentiment labels
  2. Analysis Page

    • Analyze posts from specific subreddits
    • Search posts by keywords
    • Quick text sentiment analysis
    • Background processing for large datasets

API Endpoints

Reddit Data

  • POST /api/reddit/fetch-subreddit/{subreddit_name} - Fetch posts from subreddit
  • POST /api/reddit/search-keyword - Search posts by keyword
  • GET /api/reddit/posts - Get analyzed posts
  • GET /api/reddit/comments - Get analyzed comments

Sentiment Analysis

  • GET /api/sentiment/stats - Get sentiment statistics
  • GET /api/sentiment/trend - Get sentiment trends over time
  • POST /api/sentiment/analyze-text - Analyze single text

Dashboard

  • GET /api/dashboard/data - Get comprehensive dashboard data
  • GET /api/dashboard/subreddits - Get list of analyzed subreddits
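
The endpoints above can be exercised from a small client. The sketch below uses only the standard library and assumes the backend is reachable at `http://localhost:8000` (uvicorn's default port); the helper names and the `text` field in the request body are guesses, so check the generated documentation at /docs for the actual schemas.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import Request, urlopen

BASE_URL = "http://localhost:8000"  # assumption: uvicorn's default port


def endpoint(path, **params):
    """Return the full URL for an API path, with optional query parameters."""
    url = BASE_URL + path
    return f"{url}?{urlencode(params)}" if params else url


def fetch_subreddit_url(subreddit_name):
    """URL for POST /api/reddit/fetch-subreddit/{subreddit_name}."""
    return endpoint("/api/reddit/fetch-subreddit/" + quote(subreddit_name, safe=""))


def call(method, url, payload=None):
    """Send a request and decode the JSON response (needs a running backend)."""
    data = json.dumps(payload).encode("utf-8") if payload is not None else None
    request = Request(url, data=data, method=method,
                      headers={"Content-Type": "application/json"})
    with urlopen(request) as response:
        return json.load(response)


# Example calls, to be run while the backend is up:
#   call("POST", fetch_subreddit_url("python"))
#   call("GET", endpoint("/api/sentiment/stats"))
#   call("POST", endpoint("/api/sentiment/analyze-text"), {"text": "I love this!"})
```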

🛠️ Development

Local Development Setup

Backend Development

cd backend
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
pip install -r requirements.txt
uvicorn main:app --reload

Frontend Development

cd frontend
npm install
npm start

Project Structure

reddit_nlp/
├── backend/
│   ├── app/
│   │   ├── models.py          # Database models
│   │   ├── schemas.py         # Pydantic schemas
│   │   ├── database.py        # Database configuration
│   │   ├── routers/           # API route handlers
│   │   ├── services/          # Business logic
│   │   └── utils/             # Utility functions
│   ├── main.py                # FastAPI application
│   ├── requirements.txt       # Python dependencies
│   └── Dockerfile
├── frontend/
│   ├── src/
│   │   ├── components/        # React components
│   │   ├── pages/            # Page components
│   │   ├── services/         # API services
│   │   └── utils/           # Utility functions
│   ├── package.json          # Node.js dependencies
│   └── Dockerfile
├── data/                     # Database storage
├── docker-compose.yml        # Docker orchestration
├── env.example              # Environment template
└── README.md

📊 Sentiment Analysis Model

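The application uses distilbert-base-uncased-finetuned-sst-2-english from Hugging Face:

  • Model: DistilBERT (a distilled version of BERT) fine-tuned on SST-2
  • Task: Sentiment classification. The checkpoint itself predicts only positive or negative; the neutral category has to be derived by the application (for example, from low-confidence predictions)
  • Accuracy: About 91% on the SST-2 validation set, according to the model card; accuracy on informal Reddit text may differ
  • Performance: Smaller and faster than BERT-base, at a small cost in accuracy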

Sentiment Categories

  • Positive: Optimistic, happy, or favorable sentiment
  • Negative: Pessimistic, sad, or unfavorable sentiment
  • Neutral: Objective or balanced sentiment
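
The SST-2 checkpoint emits only two labels, POSITIVE and NEGATIVE, each with a confidence score, so a neutral category needs an extra rule. One common approach, shown here as a sketch and not necessarily what this project does, is to treat low-confidence predictions as neutral. The function name and the 0.75 cutoff are assumptions.

```python
def to_three_way(label, score, neutral_below=0.75):
    """Map a binary prediction to positive / negative / neutral.

    label: "POSITIVE" or "NEGATIVE", as emitted by the SST-2 checkpoint.
    score: the model's confidence in that label, between 0 and 1.
    neutral_below: hypothetical cutoff; predictions less confident than
        this are reported as neutral.
    """
    if score < neutral_below:
        return "neutral"
    return "positive" if label.upper() == "POSITIVE" else "negative"


# With the transformers library installed, predictions could be fed in like this:
#   from transformers import pipeline
#   classifier = pipeline("sentiment-analysis",
#                         model="distilbert-base-uncased-finetuned-sst-2-english")
#   result = classifier("Great write-up, thanks!")[0]  # {"label": ..., "score": ...}
#   print(to_three_way(result["label"], result["score"]))
```

The cutoff trades coverage for precision: raising it moves more borderline posts into the neutral bucket.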

🔄 Background Processing

The application uses FastAPI background tasks for:

  • Analyzing large batches of Reddit posts
  • Processing comments from multiple posts
  • Updating analysis sessions
  • Handling long-running sentiment analysis jobs

🐳 Docker Configuration

Services

  • backend: FastAPI application
  • frontend: React development server
  • redis: Background task queue

Volumes

  • ./data: Persistent database storage
  • redis_data: Redis data persistence

🚨 Troubleshooting

Common Issues

  1. Reddit API Rate Limits

    • Reddit rate-limits API requests per OAuth client (60 requests per minute when this was written; the limit has changed over time)
    • The app handles this gracefully with retries
  2. Model Loading Issues

    • Ensure sufficient memory (2GB+ recommended)
    • Model downloads automatically on first run
  3. Database Connection

    • SQLite database is created automatically
    • Check file permissions in ./data directory
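
For the rate-limit case in item 1, retrying with exponential backoff is a typical strategy. The sketch below is a generic illustration, not the app's actual retry code; `with_retries` and its parameters are hypothetical.

```python
import random
import time


def with_retries(func, max_attempts=4, base_delay=1.0,
                 retry_on=(Exception,), sleep=time.sleep):
    """Call func(); on a retryable error, wait and try again.

    The delay doubles after each failed attempt (1s, 2s, 4s, ...) with up to
    10% random jitter. The last error is re-raised once max_attempts is used up.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except retry_on:
            if attempt == max_attempts:
                raise
            delay = base_delay * 2 ** (attempt - 1)
            sleep(delay + random.uniform(0, 0.1 * delay))
```

In practice `retry_on` would be narrowed to the exception the Reddit client raises for rate limiting, so that unrelated bugs are not silently retried.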

Logs

View application logs:

docker-compose logs -f backend
docker-compose logs -f frontend

📈 Performance Considerations

  • Batch Processing: Sentiment analysis is batched for efficiency
  • Task Queue: Redis backs background task management
  • Database Indexing: Optimized queries with proper indexes
  • Model Optimization: DistilBERT provides good speed/accuracy balance
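
To make the batch processing point concrete, here is a minimal sketch of splitting texts into fixed-size batches before classification. The helper names and the batch size of 32 are assumptions for illustration; the project's own batching logic may differ.

```python
def batched(items, batch_size=32):
    """Yield consecutive slices of items, each at most batch_size long."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def analyze_in_batches(texts, classify_batch, batch_size=32):
    """Run classify_batch on each slice and return results in input order.

    classify_batch is any callable that takes a list of strings and returns
    one result per string, for example a Hugging Face pipeline object.
    """
    results = []
    for chunk in batched(texts, batch_size):
        results.extend(classify_batch(chunk))
    return results
```

Passing the classifier in as a callable keeps the batching logic independent of the model, which also makes it testable without downloading any weights.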

🔒 Security

  • Environment variables for sensitive data
  • CORS configuration for frontend-backend communication
  • Input validation and sanitization
  • SQL injection protection via SQLAlchemy ORM

🤝 Contributing

  1. Fork the repository
  2. Create a feature branch
  3. Make your changes
  4. Add tests if applicable
  5. Submit a pull request

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

🙏 Acknowledgments

📞 Support

For issues and questions:

  • Create an issue in the repository
  • Check the troubleshooting section
  • Review the API documentation at /docs

Happy Analyzing! 🎉