YouTube Sentiment Analysis - MLOps Project

=============================================


Table of Contents

  1. Project Overview
  2. Architecture
  3. Installation
    1. Prerequisites
    2. Backend Installation
    3. Chrome Extension Installation
  4. Usage
    1. Using the Chrome Extension
    2. Testing the API Manually
  5. Project Structure
  6. Deployment on Hugging Face Spaces
  7. Configuration
  8. Notes
  9. Known Issues
  10. Author

Project Overview


Introduction

This project implements a complete MLOps pipeline to analyze the sentiment of YouTube comments. It includes:

  • A Machine Learning model (Logistic Regression with TF-IDF) trained on Reddit data
  • A FastAPI backend API to serve the model
  • A Chrome extension for the user interface

Goals

This project demonstrates how to put an ML model into production as part of an MLOps course.

Architecture


Components

The project is organized into several components:

  • Machine Learning Engine (src/models/): Sentiment classification model (Logistic Regression + TF-IDF)
  • Data Processing (src/data/): Scripts to download and process data
  • Backend API (src/api/): FastAPI with /predict_batch endpoint
  • Chrome Extension (chrome-extension/): Extension to extract and analyze YouTube comments
  • Deployment: Dockerized application ready for Hugging Face Spaces
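The classifier described above can be sketched with scikit-learn's Pipeline. This is an illustrative sketch only: the toy sentences, label names, and hyperparameters below are placeholders, not the project's actual Reddit training data or settings.

```python
# Illustrative sketch of a TF-IDF + Logistic Regression sentiment classifier.
# The toy training data and label names are placeholders, not the project's
# real Reddit dataset or label scheme.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline


def build_pipeline() -> Pipeline:
    # Vectorize text with TF-IDF, then classify with logistic regression
    return Pipeline([
        ("tfidf", TfidfVectorizer(lowercase=True, stop_words="english")),
        ("clf", LogisticRegression(max_iter=1000)),
    ])


texts = ["I love this", "This is terrible", "Great content", "Awful video"]
labels = ["positive", "negative", "positive", "negative"]

model = build_pipeline()
model.fit(texts, labels)
print(model.predict(["really great"]))
```

A pipeline like this can be saved with `joblib.dump` and reloaded by the API, which matches the `sentiment_model.joblib` artifact mentioned in the deployment section.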

Installation


Prerequisites

  • Python 3.10 or higher
  • Google Chrome
  • Git

Backend Installation

  1. Clone the repository:

    git clone https://github.com/TALEB7/YouTube-Sentiment-Analysis.git
    cd YouTube-Sentiment-Analysis
  2. Create a virtual environment:

    python -m venv venv
    # On Windows:
    venv\Scripts\activate
    # On Linux/Mac:
    source venv/bin/activate
  3. Install dependencies:

    pip install -r requirements.txt
  4. Initialize project structure (optional):

    python setup.py
  5. Download and prepare data:

    python src/data/download_data.py
    python src/data/process_data.py
  6. Train the model:

    python src/models/train_model.py

    ⚠️ Note: Training may take several minutes depending on your machine.

  7. Run the API:

    python -m uvicorn src.api.main:app --reload

    The API will be accessible at http://127.0.0.1:8000

Chrome Extension Installation

  1. Open Chrome and go to chrome://extensions/
  2. Enable "Developer mode" (top right)
  3. Click "Load unpacked"
  4. Select the chrome-extension folder from this project

Usage


Using the Chrome Extension

  1. Make sure the API is running (locally or on Hugging Face)
  2. Go to a YouTube video page
  3. Scroll to load comments
  4. Click the extension icon
  5. Click "Analyze Comments"
  6. View the sentiment distribution and analysis of each comment

Testing the API Manually

You can test the API with the provided script:

python test_api.py

Or use curl:

curl -X POST "http://127.0.0.1:8000/predict_batch" \
  -H "Content-Type: application/json" \
  -d '{"comments": [{"id": "1", "text": "This video is great!"}]}'

Project Structure


YouTube-Sentiment-Analysis/
├── src/
│   ├── api/           # FastAPI application
│   ├── data/          # Data processing scripts
│   └── models/        # Training scripts
├── chrome-extension/  # Chrome extension
├── data/              # Data (raw and processed)
├── models/            # Trained models
├── app.py             # Entry point for Hugging Face
├── Dockerfile         # Docker configuration
├── requirements.txt   # Python dependencies
└── README.md          # This file

Deployment on Hugging Face Spaces


To deploy on Hugging Face Spaces:

  1. Create a new Space (SDK: Docker)
  2. Upload the repository contents
    • ⚠️ Important: Include the models/sentiment_model.joblib file (or train it during build)
  3. The Space will build and launch the API automatically
  4. Update the URL in chrome-extension/popup.js:
    const apiUrl = "https://your-space.hf.space/predict_batch";
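The repository ships its own Dockerfile; purely for orientation, a minimal sketch of what a Docker-SDK Space setup typically looks like is shown below. The paths are assumptions based on the project structure above; the one firm detail is that Hugging Face Spaces routes traffic to port 7860 by default.

```dockerfile
# Illustrative sketch only; the repository's actual Dockerfile may differ.
FROM python:3.10-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
# Hugging Face Spaces expects the app to listen on port 7860 by default
EXPOSE 7860
CMD ["uvicorn", "src.api.main:app", "--host", "0.0.0.0", "--port", "7860"]
```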

Configuration


Main parameters are in config.py. You can modify:

  • File paths
  • Model parameters
  • API URLs
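A config module covering those three concerns might look like the sketch below. All identifier names here are hypothetical, not the project's actual `config.py` contents; only the `sentiment_model.joblib` filename and the local API address come from this README.

```python
# Hypothetical sketch of what config.py might centralize. All names below are
# illustrative, not the project's actual identifiers.
from pathlib import Path

# A real config would likely anchor on the file's own location
# (Path(__file__).parent); cwd keeps this sketch portable.
BASE_DIR = Path.cwd()

# File paths
RAW_DATA_DIR = BASE_DIR / "data" / "raw"
PROCESSED_DATA_DIR = BASE_DIR / "data" / "processed"
MODEL_PATH = BASE_DIR / "models" / "sentiment_model.joblib"

# API URL (local default from the installation steps above)
API_HOST = "127.0.0.1"
API_PORT = 8000
API_URL = f"http://{API_HOST}:{API_PORT}/predict_batch"
```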

Notes


  • The model is trained on Reddit data, so its accuracy on YouTube comments may be lower
  • The Chrome extension may need adjustments if YouTube changes its HTML structure
  • For better performance, a pre-trained transformer model (e.g., BERT) could be used instead

Known Issues


  • If the extension doesn't find comments, make sure you've scrolled to load them
  • The model may take a few seconds to load when starting the API
  • On Hugging Face, make sure the model is included in the build

Author


  • FARDAOUI Ilyas
