Afzal632/Memes

Meme Harm Detection using YOLOv8

A computer vision project that classifies memes as harmful or safe using YOLOv8 object detection. Built and trained on Google Colab with GPU acceleration, supporting both English and Arabic meme datasets.


Overview

Social media platforms host millions of memes daily — some harmless, others potentially harmful to individuals or communities. This project trains YOLOv8 models to automatically detect and classify memes into distinct categories, enabling content moderation at scale.

Two separate models are trained:

  • English Memes Model — general meme classification
  • Arabic Memes Model — Arabic-language meme classification

Categories

Label                            Description
normal                           Neutral, harmless meme content
harmful-to-individual            Content that targets or harms individuals
community-society-organization   Safe community/social content

Results

Model           mAP50   mAP50-95   Precision   Recall
English Memes   0.685   0.661      0.456       0.851
Arabic Memes    0.992   0.992      0.957       0.973

The Arabic model was trained on a more balanced and curated dataset and scores far higher on every metric (for example, mAP50 of 0.992 versus 0.685 for the English model).
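
To compare the two models with a single number, the precision and recall values in the table can be combined into an F1 score. The snippet below is a small worked calculation and not part of the notebooks; the helper name `f1_score` is made up for illustration. It treats the reported figures as plain precision/recall pairs, which is an approximation if they are averaged over classes.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall) pairs copied from the Results table
REPORTED = {
    "English Memes": (0.456, 0.851),
    "Arabic Memes": (0.957, 0.973),
}

for name, (p, r) in REPORTED.items():
    # 1 - precision is the share of positive predictions that are wrong
    print(f"{name}: F1 = {f1_score(p, r):.3f}, false-positive share = {1 - p:.3f}")
```

With the reported values this gives an F1 of about 0.594 for the English model and about 0.965 for the Arabic model.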


Project Structure

Memes/
├── MemesData.ipynb          # Training pipeline for English memes
├── ArabicMemesData.ipynb    # Training pipeline for Arabic memes
└── README.md

Tech Stack

  • Model: YOLOv8 nano (yolov8n.pt) — Ultralytics
  • Training: Google Colab (Tesla T4 GPU, CUDA 12.2)
  • Optimizer: AdamW
  • Framework: Python, PyTorch

Training Configuration

Hyperparameter                     Value
Epochs                             300
Batch Size                         16
Image Size                         640 × 640
Optimizer                          AdamW
Early Stopping Patience            200
Confidence Threshold (inference)   0.25
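
A patience of 200 with a budget of 300 epochs means early stopping can only trigger when the best epoch falls within roughly the first 100 epochs. The sketch below illustrates the general patience rule under simplifying assumptions; it is not Ultralytics' own implementation, and `stopping_epoch` and the fitness values are hypothetical.

```python
from typing import Optional, Sequence


def stopping_epoch(fitness: Sequence[float], patience: int) -> Optional[int]:
    """Return the 0-based epoch at which training would stop early, or None.

    Simplified rule: stop once `patience` epochs have passed without a new
    best fitness value.
    """
    best = float("-inf")
    best_epoch = 0
    for epoch, value in enumerate(fitness):
        if value > best:
            best, best_epoch = value, epoch
        if epoch - best_epoch >= patience:
            return epoch
    return None


# Hypothetical run: fitness improves until epoch 10, then stays below the best
history = list(range(11)) + [5] * 289  # 300 epochs in total
print(stopping_epoch(history, patience=200))
```

Under this rule, a run whose best epoch comes after about epoch 100 always uses the full 300 epochs.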

How to Run

Prerequisites

pip install ultralytics

1. Clone the repo

git clone https://github.com/Afzal632/Memes.git
cd Memes

2. Open in Google Colab

Upload either notebook to Google Colab and mount your Google Drive:

from google.colab import drive
drive.mount('/content/drive')

3. Prepare your dataset

Place your dataset in Google Drive following this structure:

/content/drive/MyDrive/
└── your-dataset/
    ├── data.yaml
    ├── train/
    │   ├── images/
    │   └── labels/
    ├── valid/
    │   ├── images/
    │   └── labels/
    └── test/
        ├── images/
        └── labels/
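
Before training, it can help to confirm that the Drive folder matches the layout above. The following is a minimal sketch using only the standard library; the function name `missing_dataset_paths` is an assumption for illustration, and the root path is the placeholder from the tree above.

```python
from pathlib import Path

# Sub-folders expected under the dataset root, per the layout above
EXPECTED_SUBDIRS = [
    f"{split}/{kind}"
    for split in ("train", "valid", "test")
    for kind in ("images", "labels")
]


def missing_dataset_paths(root) -> list:
    """List the expected files/folders that are absent under `root`."""
    root = Path(root)
    missing = []
    if not (root / "data.yaml").is_file():
        missing.append("data.yaml")
    for sub in EXPECTED_SUBDIRS:
        if not (root / sub).is_dir():
            missing.append(sub)
    return missing


if __name__ == "__main__":
    # Placeholder path; replace with your own dataset folder
    problems = missing_dataset_paths("/content/drive/MyDrive/your-dataset")
    print("Layout OK" if not problems else f"Missing: {problems}")
```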

4. Train

Run all cells in the notebook. The pipeline will:

  1. Check GPU availability
  2. Install dependencies
  3. Train the model (resumes from checkpoint if last.pt exists)
  4. Validate on the test set
  5. Run inference and display predictions
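
The resume behaviour in step 3 could look roughly like the sketch below. This README does not show how the notebooks implement it, so this is an assumption: the helper `build_train_command` and the default run directory are illustrative, and the resume form follows the Ultralytics CLI pattern `yolo train resume model=path/to/last.pt`.

```python
from pathlib import Path


def build_train_command(run_dir="runs/detect/train", data="data.yaml") -> list:
    """Build the yolo CLI arguments: resume if last.pt exists, else start fresh."""
    last = Path(run_dir) / "weights" / "last.pt"
    if last.is_file():
        # Continue the interrupted run from its last checkpoint
        return ["yolo", "train", "resume", f"model={last}"]
    # No checkpoint yet: start from the pretrained nano weights
    return [
        "yolo", "task=detect", "mode=train", "model=yolov8n.pt",
        f"data={data}", "epochs=300", "batch=16", "imgsz=640",
        "optimizer=AdamW", "patience=200", "plots=True",
    ]


if __name__ == "__main__":
    # Print the command; in a notebook it could be passed to subprocess.run
    print(" ".join(build_train_command()))
```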

Training Pipeline

# Train
!yolo task=detect mode=train model=yolov8n.pt \
    data=data.yaml \
    epochs=300 \
    batch=16 \
    imgsz=640 \
    optimizer=AdamW \
    patience=200 \
    plots=True

# Validate (evaluates the validation split by default; add split=test to score the test set)
!yolo task=detect mode=val model=runs/detect/train/weights/best.pt data=data.yaml

# Predict
!yolo task=detect mode=predict model=runs/detect/train/weights/best.pt \
    conf=0.25 source=datasets/test/images
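
The same three steps can also be driven from the Ultralytics Python API instead of the CLI. The sketch below mirrors the commands above and is not taken from the notebooks; `run_pipeline` and `TRAIN_ARGS` are illustrative names, and `split="test"` is added on the assumption that the test split should be scored, as the pipeline steps describe.

```python
# Training settings copied from the CLI command above
TRAIN_ARGS = dict(
    data="data.yaml",
    epochs=300,
    batch=16,
    imgsz=640,
    optimizer="AdamW",
    patience=200,
    plots=True,
)


def run_pipeline(
    best_weights="runs/detect/train/weights/best.pt",
    source="datasets/test/images",
):
    """Train, evaluate on the test split, then run inference."""
    # Imported lazily so this file can be loaded without ultralytics installed
    from ultralytics import YOLO

    # Train from the pretrained nano checkpoint
    YOLO("yolov8n.pt").train(**TRAIN_ARGS)

    # Reload the best checkpoint written by the training run
    best = YOLO(best_weights)
    metrics = best.val(data=TRAIN_ARGS["data"], split="test")
    results = best.predict(source=source, conf=0.25)
    return metrics, results
```

Calling `run_pipeline()` in a Colab cell after `pip install ultralytics` would run all three stages; the default paths assume the first training run in a fresh `runs/` directory.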

Notes

  • The English model has high recall (0.851) but low precision (0.456): it catches most harmful memes, but roughly half of its detections are false positives.
  • The Arabic model achieves near-perfect scores, though results should be interpreted carefully given the smaller test set size.
  • Models are saved as best.pt and last.pt and can be reused for inference without retraining.

Use Cases

  • Social media content moderation
  • Automated flagging of harmful meme content
  • Research on Arabic and English online harmful content
  • Dataset labeling assistance

License

This project is intended for academic and research purposes.
