# YouTube Sentiment Analysis

## Table of Contents

- Project Overview
- Architecture
- Installation
- Usage
- Project Structure
- Deployment on Hugging Face Spaces
- Configuration
- Notes
- Known Issues
- Author
## Project Overview

This project implements a complete MLOps pipeline to analyze the sentiment of YouTube comments. It includes:

- A Machine Learning model (Logistic Regression with TF-IDF) trained on Reddit data
- A FastAPI backend to serve the model
- A Chrome extension for the user interface

This project demonstrates how to put an ML model into production as part of an MLOps course.
## Architecture

The project is organized into several components:

- **Machine Learning Engine** (`src/models/`): sentiment classification model (Logistic Regression + TF-IDF)
- **Data Processing** (`src/data/`): scripts to download and process data
- **Backend API** (`src/api/`): FastAPI with a `/predict_batch` endpoint
- **Chrome Extension** (`chrome-extension/`): extension to extract and analyze YouTube comments
- **Deployment**: Dockerized application ready for Hugging Face Spaces
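The classification core described above can be sketched with scikit-learn. The actual training code lives in `src/models/train_model.py`; the snippet below is only an illustrative sketch, and the toy data is an assumption:

```python
# Illustrative TF-IDF + Logistic Regression pipeline, mirroring the
# architecture above (toy data, not the project's real Reddit dataset).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline

texts = ["I love this video", "Great content, thanks", "This is terrible", "Worst video ever"]
labels = [1, 1, 0, 0]  # 1 = positive, 0 = negative

model = Pipeline([
    ("tfidf", TfidfVectorizer(lowercase=True, ngram_range=(1, 2))),
    ("clf", LogisticRegression(max_iter=1000)),
])
model.fit(texts, labels)

print(model.predict(["I love it", "terrible content"]))
```

Wrapping the vectorizer and classifier in a single `Pipeline` lets the whole object be serialized with joblib and served as one artifact.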
## Installation

### Prerequisites

- Python 3.10 or higher
- Google Chrome
- Git
### Steps

1. Clone the repository:

   ```bash
   git clone https://github.com/TALEB7/YouTube-Sentiment-Analysis.git
   cd YouTube-Sentiment-Analysis
   ```

2. Create a virtual environment:

   ```bash
   python -m venv venv
   # On Windows: venv\Scripts\activate
   # On Linux/Mac: source venv/bin/activate
   ```

3. Install dependencies:

   ```bash
   pip install -r requirements.txt
   ```

4. Initialize the project structure (optional):

   ```bash
   python setup.py
   ```

5. Download and prepare the data:

   ```bash
   python src/data/download_data.py
   python src/data/process_data.py
   ```

6. Train the model:

   ```bash
   python src/models/train_model.py
   ```

   ⚠️ Note: Training may take several minutes depending on your machine.

7. Run the API:

   ```bash
   python -m uvicorn src.api.main:app --reload
   ```

   The API will be accessible at `http://127.0.0.1:8000`.
## Usage

### Load the Chrome extension

1. Open Chrome and go to `chrome://extensions/`
2. Enable "Developer mode" (top right)
3. Click "Load unpacked"
4. Select the `chrome-extension/` folder from this project

### Analyze a video's comments

1. Make sure the API is running (locally or on Hugging Face)
2. Go to a YouTube video page
3. Scroll down to load comments
4. Click the extension icon
5. Click "Analyze Comments"
6. View the sentiment distribution and the analysis of each comment
### Test the API

You can test the API with the provided script:

```bash
python test_api.py
```

Or use curl:

```bash
curl -X POST "http://127.0.0.1:8000/predict_batch" \
  -H "Content-Type: application/json" \
  -d '{"comments": [{"id": "1", "text": "This video is great!"}]}'
```

## Project Structure

```
YouTube-Sentiment-Analysis/
├── src/
│   ├── api/              # FastAPI backend
│   ├── data/             # Data processing scripts
│   └── models/           # Training scripts
├── chrome-extension/     # Chrome extension
├── data/                 # Data (raw and processed)
├── models/               # Trained models
├── app.py                # Entry point for Hugging Face
├── Dockerfile            # Docker configuration
├── requirements.txt      # Python dependencies
└── README.md             # This file
```

## Deployment on Hugging Face Spaces

To deploy on Hugging Face Spaces:
1. Create a new Space (SDK: Docker)
2. Upload the repository contents

   ⚠️ Important: include the `models/sentiment_model.joblib` file (or train it during the build)

3. The Space will build and launch the API automatically
4. Update the URL in `chrome-extension/popup.js`:

   ```javascript
   const apiUrl = "https://your-space.hf.space/predict_batch";
   ```
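The repository already ships its own `Dockerfile`, so the fragment below is only an illustration of what a Docker-SDK Space typically needs, assuming `app.py` exposes the FastAPI instance as `app`:

```dockerfile
# Illustrative Dockerfile for a Docker-SDK Space (the project's real
# Dockerfile may differ).
FROM python:3.10-slim

WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .

# Hugging Face Spaces routes traffic to port 7860
EXPOSE 7860
CMD ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "7860"]
```

Listening on port 7860 matters: Spaces expects the containerized app on that port by default.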
## Configuration

The main parameters are in `config.py`. You can modify:

- File paths
- Model parameters
- API URLs
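As a rough illustration, a `config.py` of this kind typically centralizes values like the ones below. The variable names and file names here are hypothetical, not the project's actual identifiers:

```python
# Hypothetical sketch of config.py contents; the project's real
# variable names and paths may differ.
from pathlib import Path

BASE_DIR = Path(__file__).resolve().parent
MODEL_PATH = BASE_DIR / "models" / "sentiment_model.joblib"
PROCESSED_DATA_DIR = BASE_DIR / "data" / "processed"
API_URL = "http://127.0.0.1:8000/predict_batch"

# Model hyperparameters (illustrative defaults)
TFIDF_MAX_FEATURES = 10_000
LOGREG_MAX_ITER = 1000
```

Keeping paths and hyperparameters in one module means the training script, the API, and the tests all read the same values.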
## Notes

- The model is trained on Reddit data, so it may not transfer perfectly to YouTube comments
- The Chrome extension may need adjustments if YouTube changes its HTML structure
- For better performance, a pre-trained model (e.g., BERT) could be used
## Known Issues

- If the extension doesn't find comments, make sure you've scrolled down to load them
- The model may take a few seconds to load when the API starts
- On Hugging Face, make sure the model file is included in the build
## Author

- FARDAOUI Ilyas