sashafilippova/nlp_fake_news_classification

Fake News Classifier

CAPP 30254 ML Final Project

Authors:

  • Anthony Hakim
  • Sasha Filippova
  • Yifu Hou

Project Description:

Research Question: Can we identify fake news articles based on article title alone?

In this project, our team built two Natural Language Processing (NLP) machine learning models that classify fake news articles using only their titles. Our baseline model, a logistic regression over TF-IDF features, classifies fake news articles with 94% accuracy. We also apply a pre-trained BERT model to the same task and find that the more complex model performs with lower accuracy.
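
The baseline idea (TF-IDF features fed into a logistic regression) can be illustrated with a small dependency-free sketch. This is not the code from baseline_model.ipynb: the four titles are invented for illustration, every function name is hypothetical, and the smoothed IDF formula and plain stochastic gradient descent are assumptions chosen to keep the example short.

```python
import math
from collections import Counter

def tokenize(title):
    # Assumption: lowercase whitespace tokenization; real preprocessing may differ.
    return title.lower().split()

def fit_idf(titles):
    # Smoothed inverse document frequency: log((1 + n) / (1 + df)) + 1.
    n = len(titles)
    doc_freq = Counter()
    for title in titles:
        doc_freq.update(set(tokenize(title)))
    return {w: math.log((1 + n) / (1 + c)) + 1.0 for w, c in doc_freq.items()}

def tfidf(title, idf):
    # Sparse, L2-normalized TF-IDF vector; words unseen in training are dropped.
    counts = Counter(w for w in tokenize(title) if w in idf)
    vec = {w: c * idf[w] for w, c in counts.items()}
    norm = math.sqrt(sum(v * v for v in vec.values())) or 1.0
    return {w: v / norm for w, v in vec.items()}

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def score(vec, weights, bias):
    # Linear score of a sparse vector under the current model.
    return bias + sum(weights.get(w, 0.0) * v for w, v in vec.items())

def train(vectors, labels, epochs=200, lr=0.5):
    # Logistic regression fitted with stochastic gradient descent on log loss.
    weights, bias = {}, 0.0
    for _ in range(epochs):
        for vec, y in zip(vectors, labels):
            err = sigmoid(score(vec, weights, bias)) - y
            bias -= lr * err
            for w, v in vec.items():
                weights[w] = weights.get(w, 0.0) - lr * err * v
    return weights, bias

def predict(title, idf, weights, bias):
    # Returns 1 for "fake" and 0 for "real".
    return 1 if sigmoid(score(tfidf(title, idf), weights, bias)) >= 0.5 else 0

# Invented toy titles (1 = fake, 0 = real); not taken from the project dataset.
titles = [
    "BREAKING: you won't believe what this senator just did",
    "WATCH: shocking video destroys mainstream media",
    "Senate passes budget bill after lengthy debate",
    "Central bank holds interest rates steady",
]
labels = [1, 1, 0, 0]

idf = fit_idf(titles)
weights, bias = train([tfidf(t, idf) for t in titles], labels)
predictions = [predict(t, idf, weights, bias) for t in titles]
print(predictions)
```

On real data the same steps would run over the title column of the dataset, with a held-out test split used to measure accuracy; a library implementation of TF-IDF and logistic regression would normally replace the hand-written functions.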

Directory:

  • baseline_model.ipynb: TF-IDF logistic regression training and testing.
  • classification.ipynb: Final BERT model hyperparameter tuning, training and testing.
  • original_bert.ipynb: Baseline BERT model training and testing.
  • util.py: helper functions for preprocessing data.
  • data/: directory containing data.
  • final_presentation: final presentation of results.

Data Visualization:

image

Data Source:

https://www.kaggle.com/datasets/clmentbisaillon/fake-and-real-news-dataset?select=True.csv
