
Kaggle: Real or Not? NLP with Disaster Tweets

Twitter Sentiment Analysis Tool

Overview

Public Kaggle competition to predict which Tweets are about real disasters and which ones aren't, using Machine Learning models.

Summary

As part of my personal development and continuing education, I joined this Kaggle competition with a group of friends to improve our knowledge and gain more experience in the NLP field.

We decided to join this competition as a team to enrich each other's experience and obtain better results through collaboration.

Steps

  1. Data exploration
  2. Data cleaning with Python, Pandas and Regex
  3. Spell checking and word validation
  4. Tokenization
  5. Lemmatization
  6. Vectorization of the data and removal of stop words
  7. Exploration of different ML supervised/classification models with Sklearn
  8. Hyperparameter tuning with Grid Search and H2o to improve the accuracy of the models
  9. Preparation of the submission file
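
The cleaning, tokenization and stop-word steps above can be sketched roughly as follows. This is a minimal illustration using only Python's built-in `re` module, not the team's actual code: the function names `clean_tweet` and `tokenize`, the regex patterns, the tiny `STOP_WORDS` set and the example tweet are all assumptions made for the example.

```python
import re

# Hypothetical, deliberately tiny stop-word list for illustration;
# a real run would use a fuller list (for example, the one shipped with NLTK).
STOP_WORDS = {"a", "an", "and", "the", "is", "are", "in", "on", "of", "to", "at"}

# Assumed patterns: links, @mentions, and anything that is not a letter or whitespace.
URL_RE = re.compile(r"https?://\S+|www\.\S+")
MENTION_RE = re.compile(r"@\w+")
NON_LETTER_RE = re.compile(r"[^a-z\s]")


def clean_tweet(text: str) -> str:
    """Lowercase a tweet and strip URLs, @mentions and non-letter characters."""
    text = text.lower()
    text = URL_RE.sub(" ", text)
    text = MENTION_RE.sub(" ", text)
    text = NON_LETTER_RE.sub(" ", text)
    # Collapse the runs of spaces left behind by the substitutions.
    return re.sub(r"\s+", " ", text).strip()


def tokenize(text: str) -> list:
    """Clean a tweet, split it on whitespace and drop stop words."""
    return [tok for tok in clean_tweet(text).split() if tok not in STOP_WORDS]


# Made-up example tweet.
example = "Forest fire near La Ronge Sask. Canada!! http://t.co/abc @user #wildfire"
tokens = tokenize(example)
print(tokens)
```

With Pandas, a function like `clean_tweet` could be applied to a whole text column via `Series.apply`; spell checking and lemmatization would slot in between cleaning and vectorization.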

Results

After implementing different ML models, we achieved an accuracy of 0.80232 with a Support Vector Machine (SVC) model. This result can be improved with other methodologies and libraries.
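
As a rough sketch of how an SVC model can be tuned with a Grid Search in Sklearn, the example below chains a vectorizer and a classifier in a `Pipeline` and searches over a small parameter grid. The TF-IDF vectorizer, the grid values, the 3-fold cross-validation and the toy tweets are assumptions for illustration; they are not the settings or data behind the score reported above.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.svm import SVC

# Toy, made-up data for illustration only; 1 = real disaster, 0 = not a disaster.
texts = [
    "forest fire spreading near the town",
    "flood waters rising evacuate now",
    "earthquake destroys several buildings",
    "wildfire forces thousands to evacuate",
    "storm flood damage reported downtown",
    "explosion and fire at the factory",
    "i love this sunny day",
    "my new phone is on fire sale",
    "this song is the bomb",
    "watching a movie with friends tonight",
    "best pizza in town",
    "traffic was fine this morning",
]
labels = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0]

# Vectorization (with English stop words removed) followed by the classifier.
pipeline = Pipeline([
    ("tfidf", TfidfVectorizer(stop_words="english")),
    ("svc", SVC()),
])

# Assumed search space; the "svc__" prefix routes each parameter to the SVC step.
param_grid = {"svc__C": [0.1, 1, 10], "svc__kernel": ["linear", "rbf"]}

# Cross-validated grid search scored by accuracy.
search = GridSearchCV(pipeline, param_grid, cv=3, scoring="accuracy")
search.fit(texts, labels)

print(search.best_params_)
print(round(search.best_score_, 3))

# The fitted search object predicts with the best pipeline it found.
predictions = search.predict(["fire in the forest", "pizza with friends"])
print(predictions)
```

Keeping the vectorizer inside the pipeline means it is refitted on each training fold, so the cross-validated score is not inflated by vocabulary learned from the held-out fold.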

Next steps

Explore and implement libraries like spaCy, techniques like word embeddings, and methodologies like stemming. We also expect that Google's NLP libraries could improve the accuracy considerably.
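
For reference, stemming reduces inflected words to a common root by stripping suffixes, without consulting a dictionary as lemmatization does. A minimal sketch with NLTK's `PorterStemmer` could look like this; the choice of the Porter algorithm and the word list are assumptions for illustration only.

```python
from nltk.stem import PorterStemmer

# The Porter stemmer is rule-based and needs no extra NLTK data downloads.
stemmer = PorterStemmer()

# Made-up sample words in the spirit of the disaster-tweet vocabulary.
words = ["fires", "flooding", "running"]
stems = [stemmer.stem(word) for word in words]
print(stems)
```

Stems are not always dictionary words, which is the usual trade-off against lemmatization: stemming is faster and simpler but coarser.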

Tools

  • Python
  • Pandas
  • Regex
  • NLTK
  • Sklearn
  • H2o

Our Team

  • Esdras Campos: https://github.com/EsdrasGrau
  • Saúl Romero: https://github.com/sromero9485
  • César Campuzano: https://github.com/cesarcamp
