- 🧾 Overview
- 🎯 System Purpose and Scope
- 🔍 Main Features
- 💡 Possible Improvements
- 📁 Folder Structure
- 🏗️ Architecture Diagram
- ⚙️ Installation and Usage
- 🚀 Deployment
- 🧑‍💻 Collaborators
This document provides a technical overview of the nlp-team2 repository, a machine learning system that classifies YouTube video comments as toxic or non-toxic using Natural Language Processing (NLP). The repository implements a complete pipeline from data exploration to model deployment, with a focus on high accuracy and recall in toxic comment detection.
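To make the two target metrics concrete, the following is a minimal sketch of how accuracy and recall can be computed from binary predictions. The function name `accuracy_and_recall` and the sample labels are illustrative assumptions and are not taken from the repository; label `1` stands for "toxic".

```python
from typing import Sequence, Tuple


def accuracy_and_recall(
    y_true: Sequence[int], y_pred: Sequence[int]
) -> Tuple[float, float]:
    """Return (accuracy, recall) for binary labels, where 1 means 'toxic'."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("at least one sample is required")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # toxic, flagged
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # clean, not flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # toxic, missed

    accuracy = (tp + tn) / len(pairs)
    # Recall: share of truly toxic comments that the model caught.
    recall = tp / (tp + fn) if (tp + fn) > 0 else 0.0
    return accuracy, recall


# Hypothetical labels, for illustration only.
y_true = [1, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 1, 0, 1]
accuracy, recall = accuracy_and_recall(y_true, y_pred)
print(f"accuracy={accuracy:.3f} recall={recall:.3f}")
```

Recall is tracked alongside accuracy because a missed toxic comment (a false negative) lowers recall even when overall accuracy stays high, which can happen when toxic comments are a minority of the data.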
The nlp-team2 repository serves as a comprehensive toxic comment classification system that:
- Analyzes relationships between YouTube comment characteristics and toxicity
- Trains and evaluates multiple classification models
- Deploys the best-performing model for inference
- Provides a structured input validation system to ensure reliable predictions
- Exposes a functional, easy-to-use API for analyzing comments
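As an illustration of the input validation mentioned in the list above, here is a minimal sketch that checks a comment before it would be passed to a model. The `CommentInput` class, the `validate_comment` helper, and the `MAX_COMMENT_LENGTH` limit are all hypothetical names and values chosen for this example; the repository's actual validation logic may differ.

```python
from dataclasses import dataclass

MAX_COMMENT_LENGTH = 5000  # assumed limit, not taken from the repository


@dataclass(frozen=True)
class CommentInput:
    """Hypothetical payload: a single comment to classify."""

    text: str

    def __post_init__(self) -> None:
        if not isinstance(self.text, str):
            raise TypeError("text must be a string")
        cleaned = self.text.strip()
        if not cleaned:
            raise ValueError("text must not be empty")
        if len(cleaned) > MAX_COMMENT_LENGTH:
            raise ValueError(
                f"text must be at most {MAX_COMMENT_LENGTH} characters"
            )
        # The dataclass is frozen, so store the normalized text explicitly.
        object.__setattr__(self, "text", cleaned)


def validate_comment(raw: str) -> CommentInput:
    """Return a validated comment, or raise TypeError/ValueError."""
    return CommentInput(text=raw)


comment = validate_comment("  this is a sample comment  ")
print(comment.text)
```

Rejecting empty or oversized input before inference keeps malformed requests from reaching the model, so errors surface as clear validation messages rather than as unreliable predictions.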
✅ Complete EDA with visualizations to understand variable relationships.
✅ Trained Transformer model to predict whether a comment is toxic or not
✅ Connection to the API
✅ Backend connected to the frontend
✅ Solution implemented using CSS, HTML, and vanilla JavaScript
✅ Database connection established
✅ Dockerized version of the program.
✅ Model in production
⏩ Analysis of other social networks
⏩ Higher speed in loading results
⏩ Experiments or deployments with neural network models.
nlp-team2/
├── .github/
│ └── pull_request_template.md
├── .gitignore
├── .qodo/
│ └── testConfig.toml
├── client/
│ ├── .gitignore
│ ├── Dockerfile
│ ├── README.md
│ ├── eslint.config.js
│ ├── index.html
│ ├── package-lock.json
│ ├── package.json
│ ├── postcss.config.js
│ ├── public/
│ │ ├── img/
│ │ │ ├── gifgodzilla.gif
│ │ │ ├── icongojira.png
│ │ │ ├── logoFull.png
│ │ │ ├── logoPet.png
│ │ │ └── logofont.png
│ │ ├── index.html
│ │ └── vite.svg
│ ├── src/
│ │ ├── App.css
│ │ ├── App.jsx
│ │ ├── Main.jsx
│ │ ├── assets/
│ │ │ └── react.svg
│ │ ├── components/
│ │ │ ├── AnalyzeTab.jsx
│ │ │ ├── AppMetadata.jsx
│ │ │ ├── Dashboard.jsx
│ │ │ ├── GuideTab.jsx
│ │ │ ├── Header.jsx
│ │ │ ├── HistoryTab.jsx
│ │ │ ├── ProgressLoader.jsx
│ │ │ ├── Sidebar.jsx
│ │ │ └── ToxicityBadge.jsx
│ │ ├── contexts/
│ │ │ └── ThemeContext.jsx
│ │ ├── hooks/
│ │ │ ├── useApiData.js
│ │ │ └── useAppInfo.js
│ │ ├── index.css
│ │ └── utils/
│ │   └── mockData.js
│ ├── tailwind.config.js
│ ├── tests/
│ │ ├── App.test.jsx
│ │ ├── README.md
│ │ ├── basic-component.test.jsx
│ │ ├── basic.test.js
│ │ ├── components/
│ │ │ ├── Dashboard.test.jsx
│ │ │ ├── Header.test.jsx
│ │ │ ├── Sidebar.test.jsx
│ │ │ └── ThemeContext.test.jsx
│ │ ├── setup/
│ │ │ ├── apiMocks.js
│ │ │ └── setupTests.js
│ │ └── utils/
│ │   └── mockData.test.js
│ ├── vite.config.js
│ └── vitest.config.js
├── docker-compose.yml
├── eda/
│ └── eda_nlp_(1).ipynb
├── mlFlow/
│ ├── NOTES.md
│ ├── README.md
│ ├── data/
│ │ └── raw/
│ │   ├── hatespeech.csv
│ │   ├── homophobia_human_like_1000.csv
│ │   └── youtoxic_enriched_full.csv
│ ├── enviroment.yml
│ ├── experiments/
│ │ ├── demo_feature_engineering.py
│ │ ├── mlflow_experiments.py
│ │ └── test_transformer_simple.py
│ ├── notebooks/
│ │ ├── eda-nlp.ipynb
│ │ ├── mlflow-experiments.ipynb
│ │ └── model-comparison.ipynb
│ ├── requirements.txt
│ ├── scripts/
│ │ ├── check_experiments.py
│ │ ├── save_current_model.py
│ │ ├── test_backup_model.py
│ │ └── test_model_predictions.py
│ └── src/
│   ├── __init__.py
│   ├── data_preprocessing.py
│   ├── feature_engineering.py
│   ├── model_utils.py
│   └── transformer_models_clean.py
├── requirements.txt
├── server/
│ ├── Dockerfile
│ ├── __main__.py
│ ├── core/
│ │ ├── LOGGING_GUIDE.md
│ │ ├── __init__.py
│ │ ├── config.py
│ │ └── print_dev.py
│ ├── database/
│ │ ├── __init__.py
│ │ ├── db_manager.py
│ │ └── models.py
│ ├── main.py
│ ├── ml/
│ │ ├── __init__.py
│ │ ├── api/
│ │ │ ├── __init__.py
│ │ │ └── toxicity_routes.py
│ │ ├── pipeline.py
│ │ └── predictor.py
│ ├── scraper/
│ │ ├── README.md
│ │ ├── __init__.py
│ │ ├── progress_manager.py
│ │ ├── scrp.py
│ │ └── scrp_socket.py
│ └── tests/
│   ├── README.md
│   ├── conftest.py
│   ├── pytest.ini
│   ├── run_coverage.ps1
│   ├── run_coverage.sh
│   ├── test_database.py
│   ├── test_main.py
│   ├── test_print_dev.py
│   └── test_scrp.py

git clone https://github.com/Bootcamp-IA-P4/nlp-team2
cd nlp-team2
python -m venv .venv
source .venv/bin/activate # On Linux/macOS
.venv\Scripts\activate # On Windows
pip install -r requirements.txt
uvicorn server.main:app --reload

This project was developed by the following contributors:

