The dataset is available at: https://www.kaggle.com/c/jigsaw-toxic-comment-classification-challenge
This is a multi-label classification problem: a single comment can belong to more than one class at the same time.
There are 6 classes in total:
Toxic
Severe_toxic
Obscene
Threat
Insult
Identity_hate
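Because a comment can carry several of these labels at once, each comment's target is naturally a 6-element 0/1 vector rather than a single class index. The snippet below is only a small sketch of that representation; the helper name `encode_labels` is hypothetical, and the lowercase label names are assumed to match the column names in the competition's training CSV.

```python
# Label names, assumed to match the columns of the competition's train.csv.
LABELS = ["toxic", "severe_toxic", "obscene", "threat", "insult", "identity_hate"]

def encode_labels(active):
    """Turn a collection of label names into a 6-element 0/1 vector.

    `encode_labels` is a hypothetical helper used only for illustration.
    """
    unknown = set(active) - set(LABELS)
    if unknown:
        raise ValueError(f"unknown labels: {sorted(unknown)}")
    return [1 if name in active else 0 for name in LABELS]

# A comment that is both toxic and an insult sets two positions to 1.
both = encode_labels({"toxic", "insult"})
# A clean comment has no active label at all.
clean = encode_labels(set())
```

A comment with no active label is a valid case here, which is one more way the task differs from ordinary multi-class classification.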
In the first approach, word2vec is used to build an embedding for every word.
The word embeddings are averaged over each sentence, so every sentence gets a single fixed-length encoding.
Fuzzy c-means (soft clustering, which lets a sentence belong to more than one class) is then applied to these encodings to get each sentence's degree of membership in every cluster.
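The averaging and soft-clustering steps can be sketched roughly as follows. This is a minimal NumPy illustration, not the project's actual code: the toy embedding table stands in for trained word2vec vectors, the names `sentence_vector` and `fuzzy_cmeans` are hypothetical, and the fuzzifier `m=2.0`, the iteration limit, and the two-cluster toy setup are assumptions chosen for readability.

```python
import numpy as np

def sentence_vector(tokens, embeddings, dim):
    """Average the vectors of the tokens that have an embedding."""
    vecs = [embeddings[t] for t in tokens if t in embeddings]
    if not vecs:
        # No known word in the sentence: fall back to a zero vector.
        return np.zeros(dim)
    return np.mean(vecs, axis=0)

def fuzzy_cmeans(X, n_clusters, m=2.0, n_iter=100, tol=1e-5, seed=0):
    """Plain fuzzy c-means; U[i, k] is the membership of sample i in cluster k."""
    rng = np.random.default_rng(seed)
    U = rng.random((X.shape[0], n_clusters))
    U /= U.sum(axis=1, keepdims=True)          # each row sums to 1
    for _ in range(n_iter):
        W = U ** m
        # Cluster centers are membership-weighted means of the samples.
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        dist = np.fmax(dist, 1e-10)            # avoid division by zero
        inv = dist ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return centers, U

# Toy stand-in for word2vec vectors (assumption: real vectors come from a trained model).
toy_embeddings = {
    "idiot": np.array([0.9, 0.8]),
    "stupid": np.array([0.8, 0.9]),
    "thanks": np.array([-0.7, -0.6]),
    "great": np.array([-0.8, -0.7]),
}
comments = [["stupid", "idiot"], ["you", "idiot"], ["thanks"], ["great", "thanks"]]
X = np.stack([sentence_vector(c, toy_embeddings, dim=2) for c in comments])
centers, U = fuzzy_cmeans(X, n_clusters=2)
```

Each row of `U` holds one sentence's memberships and sums to 1, so a sentence can have noticeable membership in several clusters. Note that the clusters are found without using the labels, so relating them to the 6 classes is a separate step that this sketch does not cover. Any mapping that supports `in` and `[]` lookups should work as `embeddings`.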
In the second approach, word2vec is again used to build an embedding for every word.
Instead of being averaged, the embeddings are passed through an RNN to get the encoding for every sentence.
Fuzzy c-means (soft clustering, which lets a sentence belong to more than one class) is then applied to these encodings to get each sentence's degree of membership in every cluster.
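To illustrate how an RNN turns a sequence of word vectors into one sentence encoding, here is a minimal NumPy sketch of a simple (Elman-style) recurrent layer whose final hidden state is taken as the encoding. The text above does not say which RNN variant is used or how it is trained, so everything here is an assumption: the function name `rnn_encode`, the dimensions, and the random, untrained weights are purely illustrative.

```python
import numpy as np

def rnn_encode(token_vectors, W_xh, W_hh, b_h):
    """Run a simple RNN over the sequence and return the last hidden state."""
    h = np.zeros(W_hh.shape[0])
    for x in token_vectors:
        # The new state mixes the current word vector with the previous state.
        h = np.tanh(W_xh @ x + W_hh @ h + b_h)
    return h

rng = np.random.default_rng(0)
embed_dim, hidden_dim = 4, 3   # illustrative sizes, not taken from the project

# Random, untrained weights (assumption: a real model would learn these).
W_xh = rng.normal(scale=0.5, size=(hidden_dim, embed_dim))
W_hh = rng.normal(scale=0.5, size=(hidden_dim, hidden_dim))
b_h = np.zeros(hidden_dim)

# Five random vectors standing in for the word2vec vectors of a 5-word sentence.
sentence = rng.normal(size=(5, embed_dim))
encoding = rnn_encode(sentence, W_xh, W_hh, b_h)

# Unlike averaging, the encoding depends on word order.
reversed_encoding = rnn_encode(sentence[::-1], W_xh, W_hh, b_h)
```

The resulting fixed-length `encoding` vectors can be stacked into a matrix and passed to fuzzy c-means in the same way as averaged embeddings.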