
Kaggle-Toxic-Comment-Classification

Dataset

The dataset is available at https://www.kaggle.com/c/jigsaw-toxic-comment-classification-challenge
This is a multi-label classification problem: a single comment can belong to more than one class.
There are 6 classes in total:
Toxic
Severe_toxic
Obscene
Threat
Insult
Identity_hate
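
Because a comment can carry several of the six labels at once, each target is naturally a vector of six 0/1 flags rather than a single class index. The snippet below is a minimal sketch of that encoding. The lowercase names follow the label columns of the competition's CSV files, and the helper `to_multi_hot` is a hypothetical name introduced only for illustration.

```python
# Label columns, in a fixed order (lowercase as in the competition CSV files).
LABELS = ["toxic", "severe_toxic", "obscene", "threat", "insult", "identity_hate"]

def to_multi_hot(active_labels):
    """Turn a set of label names into a list of six 0/1 flags."""
    unknown = set(active_labels) - set(LABELS)
    if unknown:
        raise ValueError(f"unknown labels: {sorted(unknown)}")
    return [1 if name in active_labels else 0 for name in LABELS]

both = to_multi_hot({"toxic", "insult"})   # [1, 0, 0, 0, 1, 0]
clean = to_multi_hot(set())                # [0, 0, 0, 0, 0, 0]
```

A comment with no flags set is simply a clean comment, which is why this is not a 6-way single-choice problem.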

Idea of code

Averaging variant:
Word embeddings are built for every word using word2vec.
The embeddings are averaged over each sentence, which gives one fixed-length encoding per sentence.
Fuzzy c-means (soft clustering, so that a sentence can belong to more than one class) is then run on these encodings to get each sentence's membership in every cluster.
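
The averaging and soft-clustering steps can be sketched as follows. This is an illustration under assumptions, not the repository's code: the tiny `word_vectors` dictionary stands in for a trained word2vec model, and `sentence_vector` and `fuzzy_cmeans` are hypothetical helpers written directly in NumPy rather than calls into whichever clustering library the repository uses.

```python
import numpy as np

def sentence_vector(tokens, word_vectors, dim):
    """Average the vectors of the known tokens; return zeros if none are known."""
    known = [word_vectors[t] for t in tokens if t in word_vectors]
    if not known:
        return np.zeros(dim)
    return np.mean(known, axis=0)

def fuzzy_cmeans(X, n_clusters, m=2.0, n_iter=100, tol=1e-5, seed=0):
    """Plain fuzzy c-means. Returns (centers, U); U[i, k] is the membership
    of sample i in cluster k, and every row of U sums to 1."""
    rng = np.random.default_rng(seed)
    U = rng.random((X.shape[0], n_clusters))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        W = U ** m                                   # fuzzified weights
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        dist = np.fmax(dist, 1e-10)                  # avoid division by zero
        inv = dist ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return centers, U

# Toy 2-dimensional "embeddings" (assumed values, not real word2vec output).
word_vectors = {
    "you": np.array([0.1, 0.0]),
    "idiot": np.array([0.9, 0.8]),
    "stupid": np.array([0.8, 0.9]),
    "thanks": np.array([-0.8, -0.9]),
    "great": np.array([-0.9, -0.8]),
    "edit": np.array([-0.7, -0.7]),
}
sentences = [["you", "idiot"], ["stupid", "idiot"],
             ["thanks", "great", "edit"], ["great", "edit"]]
X = np.array([sentence_vector(s, word_vectors, dim=2) for s in sentences])
centers, U = fuzzy_cmeans(X, n_clusters=2)
```

Each row of `U` holds graded memberships instead of one hard assignment, which is what lets a sentence count toward more than one cluster. How clusters are mapped back to the six labels is not described in the source, so it is left out here.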

RNN variant:
Word embeddings are built for every word using word2vec.
The sequence of embeddings for each sentence is fed to an RNN, which produces the encoding for that sentence.
Fuzzy c-means (soft clustering, so that a sentence can belong to more than one class) is then run on these encodings to get each sentence's membership in every cluster.
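
To show how an RNN turns a variable-length sequence of word embeddings into one fixed-length sentence encoding, here is a minimal forward pass of a simple (Elman) RNN in NumPy. It is only a sketch: the repository description mentions Keras, whereas this example uses untrained random weights, made-up dimensions, and a hypothetical helper `rnn_encode`. How the RNN is trained is not stated in the source.

```python
import numpy as np

def rnn_encode(sequence, W_x, W_h, b):
    """Run a simple RNN over the sequence and return the final hidden state."""
    h = np.zeros(W_h.shape[0])
    for x_t in sequence:
        # New state mixes the current word vector with the previous state.
        h = np.tanh(W_x @ x_t + W_h @ h + b)
    return h

rng = np.random.default_rng(0)
embed_dim, hidden_dim = 4, 3          # assumed sizes for illustration
W_x = rng.normal(scale=0.5, size=(hidden_dim, embed_dim))
W_h = rng.normal(scale=0.5, size=(hidden_dim, hidden_dim))
b = np.zeros(hidden_dim)

# Random stand-ins for the word2vec vectors of a 2-word and a 7-word comment.
short_comment = rng.normal(size=(2, embed_dim))
long_comment = rng.normal(size=(7, embed_dim))

enc_short = rnn_encode(short_comment, W_x, W_h, b)
enc_long = rnn_encode(long_comment, W_x, W_h, b)
enc_reversed = rnn_encode(long_comment[::-1], W_x, W_h, b)
```

Both comments end up as vectors of length `hidden_dim`, so they can be clustered together. Unlike averaging, the encoding depends on word order: `enc_reversed` generally differs from `enc_long`.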

About

Kaggle toxic comment classification using Keras, an RNN, word2vec, sentence averaging, and fuzzy clustering
