Kaggle-Toxic-Comment-Classification

Dataset

The dataset is available at https://www.kaggle.com/c/jigsaw-toxic-comment-classification-challenge
This is a multi-label classification problem: a single comment can belong to more than one class.
There are 6 classes in total:
Toxic
Severe_toxic
Obscene
Threat
Insult
Identity_hate
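
As a rough illustration of what "more than one class" means for the targets, the sketch below reads two made-up rows laid out like the competition's train.csv (an `id` column, a `comment_text` column, then one 0/1 column per class). The sample rows and the helper name `read_labels` are hypothetical and are not taken from classify.py.

```python
import csv
import io

# The six label columns, in the order used by the competition's train.csv
LABELS = ["toxic", "severe_toxic", "obscene", "threat", "insult", "identity_hate"]

# Made-up rows for illustration only; real data comes from the Kaggle download
SAMPLE_CSV = """id,comment_text,toxic,severe_toxic,obscene,threat,insult,identity_hate
a1,"thanks for the helpful edit",0,0,0,0,0,0
b2,"example of a rude and insulting comment",1,0,1,0,1,0
"""

def read_labels(handle):
    """Return (comment_text, multi-hot label list) pairs from a CSV file handle."""
    rows = []
    for row in csv.DictReader(handle):
        rows.append((row["comment_text"], [int(row[name]) for name in LABELS]))
    return rows

rows = read_labels(io.StringIO(SAMPLE_CSV))
for text, target in rows:
    # A comment may have zero, one, or several active labels
    active = [name for name, flag in zip(LABELS, target) if flag]
    print(text, target, active)
```

The second row has three labels switched on at once, which is why a soft assignment is used rather than a single hard class.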

Idea of code

First approach (averaged embeddings):
Word embeddings are learned for every word using word2vec.
The embeddings are averaged over each sentence, giving one fixed-length encoding per sentence.
Fuzzy c-means (soft clustering, so that a sentence can belong to more than one class) is then applied to these encodings to get each sentence's membership in every cluster.
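
The averaging and clustering steps above can be sketched as follows. This is a minimal NumPy illustration, not the code in classify.py: the embedding table is random (standing in for trained word2vec vectors), and the function names, the fuzzifier `m=2.0`, and the choice of 2 clusters are all assumptions made for the example.

```python
import numpy as np

def sentence_vector(tokens, embeddings, dim):
    """Average the vectors of the known tokens; zeros if none are known."""
    vecs = [embeddings[t] for t in tokens if t in embeddings]
    if not vecs:
        return np.zeros(dim)
    return np.mean(vecs, axis=0)

def fuzzy_c_means(X, n_clusters, m=2.0, n_iter=100, tol=1e-5, seed=0):
    """Plain fuzzy c-means. Returns (centers, memberships); each row of
    memberships sums to 1 and gives the degree of belonging to each cluster."""
    rng = np.random.default_rng(seed)
    U = rng.random((X.shape[0], n_clusters))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        W = U ** m
        # Cluster centers are membership-weighted means of the points
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        dist = np.fmax(dist, 1e-10)  # avoid division by zero
        inv = dist ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return centers, U

# Toy stand-in for word2vec output: random vectors for a tiny vocabulary
rng = np.random.default_rng(0)
dim = 8
vocab = ["you", "are", "great", "awful", "thanks", "idiot"]
embeddings = {w: rng.normal(size=dim) for w in vocab}

sentences = [["you", "are", "great"], ["thanks", "you", "are", "great"],
             ["you", "are", "awful"], ["you", "awful", "idiot"]]
X = np.stack([sentence_vector(s, embeddings, dim) for s in sentences])

centers, memberships = fuzzy_c_means(X, n_clusters=2)
print(memberships.round(3))
```

Because each sentence receives a membership degree for every cluster rather than a single hard assignment, one sentence can be associated with several clusters at once.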

Second approach (RNN encoding):
Word embeddings are learned for every word using word2vec.
The embeddings are passed through an RNN to get the encoding for each sentence.
Fuzzy c-means (soft clustering, so that a sentence can belong to more than one class) is then applied to these encodings to get each sentence's membership in every cluster.
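
The RNN encoding step can be sketched with a plain Elman-style recurrence whose final hidden state serves as the sentence encoding. The README does not say which RNN variant is used or how it is trained, so everything here is an assumption: the weights are random and untrained, the embeddings are random stand-ins for word2vec vectors, and `rnn_encode` is a hypothetical helper name.

```python
import numpy as np

def rnn_encode(word_vectors, W_xh, W_hh, b_h):
    """Run a simple tanh RNN over the word vectors and return the last hidden state."""
    h = np.zeros(W_hh.shape[0])
    for x in word_vectors:
        h = np.tanh(W_xh @ x + W_hh @ h + b_h)
    return h

rng = np.random.default_rng(0)
embed_dim, hidden_dim = 8, 4

# Toy stand-in for word2vec output
vocab = ["you", "are", "great", "awful", "thanks", "idiot"]
embeddings = {w: rng.normal(size=embed_dim) for w in vocab}

# Untrained parameters, for illustration only
W_xh = rng.normal(scale=0.1, size=(hidden_dim, embed_dim))
W_hh = rng.normal(scale=0.1, size=(hidden_dim, hidden_dim))
b_h = np.zeros(hidden_dim)

sentences = [["you", "are", "great"], ["you", "are", "awful", "idiot"]]
encodings = np.stack([
    rnn_encode([embeddings[w] for w in s], W_xh, W_hh, b_h) for s in sentences
])
print(encodings.shape)  # (2, 4): one fixed-length vector per sentence
```

Unlike averaging, the recurrence is sensitive to word order, and sentences of different lengths still map to vectors of the same size, so the resulting matrix can be passed to a fuzzy c-means routine unchanged.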

Repository: Varchita-Beena/Kaggle-Toxic-Comment-Classification