rk-python5/Reddit_sentiment_analysis_with_nlp

Reddit Sentiment Analysis Bot

A comprehensive Reddit sentiment analysis application that fetches posts and comments from Reddit, analyzes their sentiment using Hugging Face transformers, and presents results through a modern web dashboard.

🚀 Features

  • Reddit Integration: Fetch posts and comments from specific subreddits or search by keywords
  • Advanced Sentiment Analysis: Uses DistilBERT model for accurate sentiment classification
  • Real-time Dashboard: Interactive charts and tables showing sentiment trends
  • Background Processing: Asynchronous analysis of large datasets
  • Modern UI: Built with React and TailwindCSS for a beautiful user experience
  • Docker Support: Easy deployment with Docker Compose

🏗️ Architecture

Backend (FastAPI)

  • API: RESTful API with FastAPI
  • Database: SQLite with SQLAlchemy ORM
  • Reddit Integration: PRAW (Python Reddit API Wrapper)
  • Sentiment Analysis: Hugging Face Transformers (DistilBERT)
  • Background Tasks: FastAPI background tasks for async processing

Frontend (React)

  • Framework: React 18 with modern hooks
  • Styling: TailwindCSS for responsive design
  • Charts: Recharts for data visualization
  • HTTP Client: Axios for API communication
  • Notifications: React Hot Toast for user feedback

📋 Prerequisites

  • Docker and Docker Compose
  • Reddit API credentials (Client ID and Secret)
  • Python 3.11+ (for local development)
  • Node.js 18+ (for local development)

🔧 Setup Instructions

1. Clone the Repository

git clone <repository-url>
cd reddit_nlp

2. Reddit API Setup

  1. Go to Reddit App Preferences (https://www.reddit.com/prefs/apps)
  2. Click "Create App" or "Create Another App"
  3. Choose "script" as the app type
  4. Note down your Client ID and Secret

3. Environment Configuration

Copy the example environment file and configure your settings:

cp env.example .env

Edit .env with your Reddit API credentials:

# Reddit API Configuration
REDDIT_CLIENT_ID=your_reddit_client_id_here
REDDIT_CLIENT_SECRET=your_reddit_client_secret_here
REDDIT_USER_AGENT=RedditSentimentBot/1.0

# Database Configuration
DATABASE_URL=sqlite:///./reddit_sentiment.db

# Redis Configuration (for background tasks)
REDIS_URL=redis://localhost:6379

# Application Configuration
DEBUG=True
SECRET_KEY=your-secret-key-here
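
As a rough illustration of how a backend could consume these settings, the sketch below parses `.env`-style text with only the Python standard library. The `parse_env` and `load_settings` helpers and the `Settings` dataclass are hypothetical names, not taken from the project's code; the real backend may use a library such as python-dotenv instead.

```python
from dataclasses import dataclass


def parse_env(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, skipping blank lines and '#' comments."""
    values: dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        # Split on the first '=' only, so values such as URLs stay intact.
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


@dataclass(frozen=True)
class Settings:
    # Hypothetical container; field names mirror the variables listed above.
    reddit_client_id: str
    reddit_client_secret: str
    reddit_user_agent: str
    database_url: str
    redis_url: str
    debug: bool


def load_settings(env: dict[str, str]) -> Settings:
    """Build Settings, failing early if a Reddit credential is missing."""
    required = ("REDDIT_CLIENT_ID", "REDDIT_CLIENT_SECRET")
    missing = [key for key in required if not env.get(key)]
    if missing:
        raise ValueError(f"Missing required settings: {', '.join(missing)}")
    return Settings(
        reddit_client_id=env["REDDIT_CLIENT_ID"],
        reddit_client_secret=env["REDDIT_CLIENT_SECRET"],
        reddit_user_agent=env.get("REDDIT_USER_AGENT", "RedditSentimentBot/1.0"),
        database_url=env.get("DATABASE_URL", "sqlite:///./reddit_sentiment.db"),
        redis_url=env.get("REDIS_URL", "redis://localhost:6379"),
        debug=env.get("DEBUG", "False").lower() == "true",
    )
```

Failing early on missing credentials gives a clearer error than a rejected Reddit API call later on.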

4. Docker Deployment

Start the application with Docker Compose:

docker-compose up --build

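This will start the following services:

  • backend: FastAPI application
  • frontend: React development server
  • redis: Background task queue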

5. Access the Application

🚀 Usage

Dashboard Features

  1. Overview Dashboard

    • View sentiment statistics
    • Interactive pie charts and trend graphs
    • Recent posts table with sentiment labels
  2. Analysis Page

    • Analyze posts from specific subreddits
    • Search posts by keywords
    • Quick text sentiment analysis
    • Background processing for large datasets

API Endpoints

Reddit Data

  • POST /api/reddit/fetch-subreddit/{subreddit_name} - Fetch posts from subreddit
  • POST /api/reddit/search-keyword - Search posts by keyword
  • GET /api/reddit/posts - Get analyzed posts
  • GET /api/reddit/comments - Get analyzed comments

Sentiment Analysis

  • GET /api/sentiment/stats - Get sentiment statistics
  • GET /api/sentiment/trend - Get sentiment trends over time
  • POST /api/sentiment/analyze-text - Analyze single text

Dashboard

  • GET /api/dashboard/data - Get comprehensive dashboard data
  • GET /api/dashboard/subreddits - Get list of analyzed subreddits
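
The endpoints above could be called from Python roughly as follows. This is a minimal sketch using only the standard library. The base URL `http://localhost:8000` and the `{"text": ...}` request body are assumptions, since neither is documented here; check the interactive API documentation at /docs for the actual schemas.

```python
import json
from typing import Any
from urllib.parse import quote
from urllib.request import Request, urlopen

BASE_URL = "http://localhost:8000"  # Assumption: the backend address is not stated above.


def fetch_subreddit_request(subreddit: str) -> Request:
    """Build POST /api/reddit/fetch-subreddit/{subreddit_name}."""
    url = f"{BASE_URL}/api/reddit/fetch-subreddit/{quote(subreddit, safe='')}"
    return Request(url, data=b"", method="POST")


def analyze_text_request(text: str) -> Request:
    """Build POST /api/sentiment/analyze-text with an assumed JSON body."""
    body = json.dumps({"text": text}).encode("utf-8")  # Body shape is a guess.
    return Request(
        f"{BASE_URL}/api/sentiment/analyze-text",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def stats_request() -> Request:
    """Build GET /api/sentiment/stats."""
    return Request(f"{BASE_URL}/api/sentiment/stats", method="GET")


def send(request: Request) -> Any:
    """Send a request and decode the JSON response (needs a running backend)."""
    with urlopen(request, timeout=30) as response:
        return json.load(response)
```

With the stack running, `send(analyze_text_request("I love this"))` would return whatever JSON the backend produces for that text.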

🛠️ Development

Local Development Setup

Backend Development

cd backend
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
pip install -r requirements.txt
uvicorn main:app --reload

Frontend Development

cd frontend
npm install
npm start

Project Structure

reddit_nlp/
├── backend/
│   ├── app/
│   │   ├── models.py          # Database models
│   │   ├── schemas.py         # Pydantic schemas
│   │   ├── database.py        # Database configuration
│   │   ├── routers/           # API route handlers
│   │   ├── services/          # Business logic
│   │   └── utils/             # Utility functions
│   ├── main.py                # FastAPI application
│   ├── requirements.txt       # Python dependencies
│   └── Dockerfile
├── frontend/
│   ├── src/
│   │   ├── components/        # React components
│   │   ├── pages/             # Page components
│   │   ├── services/          # API services
│   │   └── utils/             # Utility functions
│   ├── package.json           # Node.js dependencies
│   └── Dockerfile
├── data/                      # Database storage
├── docker-compose.yml         # Docker orchestration
├── env.example                # Environment template
└── README.md

📊 Sentiment Analysis Model

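The application uses distilbert-base-uncased-finetuned-sst-2-english from Hugging Face:

  • Model: DistilBERT (distilled version of BERT), fine-tuned on SST-2
  • Task: Binary sentiment classification. The checkpoint itself predicts only POSITIVE or NEGATIVE with a confidence score, so the neutral category has to be derived by the application (for example from low-confidence predictions)
  • Accuracy: About 91% on the SST-2 development set according to the model card; accuracy on informal Reddit text may be lower
  • Performance: Faster inference than full BERT with a small accuracy trade-off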

Sentiment Categories

  • Positive: Optimistic, happy, or favorable sentiment
  • Negative: Pessimistic, sad, or unfavorable sentiment
  • Neutral: Objective or balanced sentiment
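
Because the SST-2 checkpoint only emits POSITIVE and NEGATIVE labels, a three-way result needs an extra rule. One plausible approach, sketched below, treats low-confidence predictions as neutral. The `to_three_way` and `analyze` helpers and the 0.75 threshold are illustrative assumptions, not the project's confirmed logic.

```python
NEUTRAL_THRESHOLD = 0.75  # Hypothetical cut-off; tune it on your own data.


def to_three_way(prediction: dict, threshold: float = NEUTRAL_THRESHOLD) -> str:
    """Map a binary pipeline result to positive / negative / neutral.

    `prediction` has the shape the Hugging Face text-classification pipeline
    returns for one input: {"label": "POSITIVE" or "NEGATIVE", "score": float}.
    """
    if prediction["score"] < threshold:
        return "neutral"
    return "positive" if prediction["label"].upper() == "POSITIVE" else "negative"


def analyze(texts: list[str]) -> list[str]:
    """Classify texts with the SST-2 DistilBERT checkpoint (downloads the model)."""
    # Imported lazily so the mapping above works without transformers installed.
    from transformers import pipeline

    classifier = pipeline(
        "sentiment-analysis",
        model="distilbert-base-uncased-finetuned-sst-2-english",
    )
    return [to_three_way(p) for p in classifier(texts, truncation=True)]
```

A confidence threshold is only a heuristic: a binary classifier can be confidently wrong on genuinely neutral text, so a model trained with a neutral class may suit some use cases better.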

🔄 Background Processing

The application uses FastAPI background tasks for:

  • Analyzing large batches of Reddit posts
  • Processing comments from multiple posts
  • Updating analysis sessions
  • Handling long-running sentiment analysis jobs

🐳 Docker Configuration

Services

  • backend: FastAPI application
  • frontend: React development server
  • redis: Background task queue

Volumes

  • ./data: Persistent database storage
  • redis_data: Redis data persistence

🚨 Troubleshooting

Common Issues

  1. Reddit API Rate Limits

    • Reddit rate-limits API requests (about 60 per minute; the exact quota depends on Reddit's current API terms)
    • The app handles this gracefully with retries
  2. Model Loading Issues

    • Ensure sufficient memory (2GB+ recommended)
    • Model downloads automatically on first run
  3. Database Connection

    • SQLite database is created automatically
    • Check file permissions in ./data directory
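
For the rate-limit case in item 1, retrying with exponential backoff is a common approach. The sketch below shows the idea in isolation; `RateLimited` and `with_backoff` are hypothetical names, and the project's actual retry logic may differ.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimited(Exception):
    """Hypothetical marker for an HTTP 429 response from the Reddit API."""


def with_backoff(
    call: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Retry `call` on RateLimited, doubling the wait each time plus jitter."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return call()
        except RateLimited:
            if attempt == max_attempts - 1:
                raise  # Out of attempts: surface the error to the caller.
            # Waits of 1s, 2s, 4s, ... plus up to 0.5s of random jitter.
            delay = base_delay * (2 ** attempt) + random.uniform(0, 0.5)
            sleep(delay)
    raise RuntimeError("unreachable")  # Loop always returns or raises.
```

The injectable `sleep` parameter keeps the helper testable without real waiting, and the jitter helps avoid many clients retrying at the same instant.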

Logs

View application logs:

docker-compose logs -f backend
docker-compose logs -f frontend

📈 Performance Considerations

  • Batch Processing: Sentiment analysis is batched for efficiency
  • Task Queue: Redis is used for background task management
  • Database Indexing: Optimized queries with proper indexes
  • Model Optimization: DistilBERT provides good speed/accuracy balance
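
To illustrate the batching idea, the sketch below groups texts into fixed-size chunks before handing them to a classifier. The `batched` and `analyze_in_batches` helpers and the default batch size of 16 are assumptions for illustration, not values taken from the project.

```python
from itertools import islice
from typing import Callable, Iterable, Iterator


def batched(items: Iterable[str], batch_size: int) -> Iterator[list[str]]:
    """Yield consecutive lists of at most `batch_size` items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(items)
    # islice pulls the next chunk; an empty list ends the loop.
    while batch := list(islice(iterator, batch_size)):
        yield batch


def analyze_in_batches(
    texts: Iterable[str],
    classify_batch: Callable[[list[str]], list[str]],
    batch_size: int = 16,  # Hypothetical default; tune to available memory.
) -> list[str]:
    """Run `classify_batch` once per chunk and flatten the labels in order."""
    labels: list[str] = []
    for batch in batched(texts, batch_size):
        labels.extend(classify_batch(batch))
    return labels
```

Passing the classifier in as a callable keeps the chunking logic independent of the model, so it can be exercised with a stub.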

🔒 Security

  • Environment variables for sensitive data
  • CORS configuration for frontend-backend communication
  • Input validation and sanitization
  • SQL injection protection via SQLAlchemy ORM

🤝 Contributing

  1. Fork the repository
  2. Create a feature branch
  3. Make your changes
  4. Add tests if applicable
  5. Submit a pull request

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

🙏 Acknowledgments

📞 Support

For issues and questions:

  • Create an issue in the repository
  • Check the troubleshooting section
  • Review the API documentation at /docs

Happy Analyzing! 🎉
