Creating a model to recognize hate speech in tweets
The goal of this project is to identify tweets containing sexism and racism, two prominent aspects of hate speech. The dataset is available on Hugging Face and includes 31,962 labeled tweets. Several classification models were implemented for this task:
- Decision Tree
- Logistic Regression
- Naive Bayes
- Support Vector Machine
- Neural Network
- K-Nearest Neighbor
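To illustrate the kind of classifier listed above, here is a minimal from-scratch sketch of a multinomial Naive Bayes text classifier using only the Python standard library. It is not the code from the project's notebooks. The function names, the toy sentences, and the label convention (1 = hate speech, 0 = other) are assumptions made for this example.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple tokenizer)."""
    return text.lower().split()


def train_naive_bayes(texts, labels):
    """Count documents per class and word occurrences per class."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in zip(texts, labels):
        tokens = tokenize(text)
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab


def predict(model, text):
    """Return the class with the highest log-posterior for the given text."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in class_counts.items():
        # Log prior: fraction of training documents in this class.
        score = math.log(n_docs / total_docs)
        total_words = sum(word_counts[label].values())
        for token in tokenize(text):
            if token not in vocab:
                continue  # ignore words never seen during training
            # Laplace (add-one) smoothing avoids zero probabilities.
            count = word_counts[label][token] + 1
            score += math.log(count / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy data; assumed labels: 1 = hate speech, 0 = other.
texts = [
    "what a lovely sunny day",
    "thank you for the kind words",
    "you people are worthless",
    "those people are worthless trash",
]
labels = [0, 0, 1, 1]

model = train_naive_bayes(texts, labels)
print(predict(model, "have a lovely day"))
print(predict(model, "worthless people"))
```

On real data, a library implementation with proper tokenization, feature weighting, and a held-out test split would be the more practical choice. The sketch only shows the counting and smoothing that the Naive Bayes approach relies on.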
Install dependencies
Use the notebooks for the individual models to assess their performance.
The best model's configuration and state files are saved in src/models.
This project started in April 2022 at the University of Mannheim. The team consists of: