fake-news-detection-machine-learning

Project Files

images: contains all the images used in the project.
files: contains all CSV data files.
news_scraping.ipynb: web scraping of news sites with Beautiful Soup and Selenium.
data_cleaning.ipynb: data cleaning using regular expressions, among other steps.
EDA.ipynb: exploratory data analysis and visualisations of the data.
sentiment_analysis.ipynb: NLP sentiment analysis with TextBlob and VADER.
fake_news_prediction.ipynb: using different ML algorithms to predict whether a text is fake or real.

Background

In recent years, media consumption habits have changed with the spread of the internet. More people than ever before get their news from digital sources such as social media and search engines. Unfortunately, news articles from such sources often have very little to do with the truth. The aim of this project is to use natural language processing techniques and machine learning algorithms to see whether such fake news articles can be detected.

See my presentation here for further details!

Workflow

Scraped articles from real news sites and combined them with a dataset of fake news from Kaggle.
Cleaned the text: removed URLs, HTML tags and punctuation.
Preprocessed the data: removed stopwords and lemmatised the text.
Converted the text to vectors using the TF-IDF vectoriser.
Applied different machine learning algorithms to the data, including SVM, Random Forest and logistic regression.
Created a Python function that takes a news article as user input, vectorises the text and predicts whether it is fake or real using a support vector machine.
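
The steps above can be sketched roughly as follows. This is a minimal illustration, not the code from the notebooks: it assumes scikit-learn, uses a handful of made-up training sentences, and leaves out lemmatisation (which would need an extra library such as NLTK or spaCy). The names clean_text, predict_article and the regular expressions are hypothetical.

```python
import re
import string

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Assumed patterns for HTML tags and URLs; the notebooks may use different ones.
TAG_RE = re.compile(r"<[^>]+>")
URL_RE = re.compile(r"https?://\S+|www\.\S+")
PUNCT_TABLE = str.maketrans("", "", string.punctuation)


def clean_text(text: str) -> str:
    """Remove HTML tags, URLs and punctuation, lowercase, collapse whitespace."""
    text = TAG_RE.sub(" ", text)   # tags first, so URLs inside attributes go too
    text = URL_RE.sub(" ", text)
    text = text.translate(PUNCT_TABLE)
    return " ".join(text.lower().split())


# Toy, invented examples purely for illustration (1 = fake, 0 = real).
texts = [
    "Shocking miracle cure that doctors are hiding from you!!!",
    "Aliens secretly control the world government, insider reveals",
    "You will not believe this one weird trick to get rich overnight",
    "The central bank raised interest rates by a quarter point on Tuesday.",
    "The city council approved the new transport budget after a long debate.",
    "Researchers published the results of a five-year climate study.",
]
labels = [1, 1, 1, 0, 0, 0]

# TF-IDF vectorisation (with English stopword removal) feeding a linear SVM.
model = make_pipeline(
    TfidfVectorizer(preprocessor=clean_text, stop_words="english"),
    LinearSVC(),
)
model.fit(texts, labels)


def predict_article(article: str) -> str:
    """Vectorise one article with the fitted pipeline and return 'fake' or 'real'."""
    label = model.predict([article])[0]
    return "fake" if label == 1 else "real"


print(predict_article("Insider reveals a miracle trick that the government is hiding"))
```

Wrapping the vectoriser and the classifier in one pipeline means the user-facing function applies exactly the same cleaning and vocabulary as training did. With only six training sentences the prediction itself is not meaningful; the sketch only shows how the pieces fit together.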

Results

SVM was the best model overall
Accuracy score: 0.96
F1 score: 0.96
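
For readers unfamiliar with the two metrics, here is a small sketch of how accuracy and F1 are computed from predictions. The labels below are invented and do not reproduce the 0.96 scores; treating "fake" (1) as the positive class is also an assumption, since the README does not say how the F1 score was averaged.

```python
def accuracy_and_f1(y_true, y_pred, positive=1):
    """Return (accuracy, F1) for binary labels, with `positive` as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    # Accuracy: share of all predictions that match the true label.
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)

    # F1: harmonic mean of precision and recall.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return accuracy, f1


# Made-up labels for illustration only (1 = fake, 0 = real).
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]

accuracy, f1 = accuracy_and_f1(y_true, y_pred)
print(f"Accuracy: {accuracy:.2f}, F1: {f1:.2f}")
```

On a roughly balanced dataset the two numbers tend to be close, which is consistent with the identical scores reported above.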

Some further findings

Most common words from the fake news articles.

Most common words from the real news articles.
