Prakashpsk/-Fake-News-Detection-Using-NLP-

Fake News Detection Using NLP 📰🤖

📹 Watch my video explanation here.

Introduction

📰🤖 Fake news is misinformation or disinformation spread through channels ranging from word of mouth to digital platforms such as WhatsApp messages and social media posts. It often spreads faster than real news and can cause real societal harm, including fear and misunderstanding. This project addresses the problem with classical NLP techniques that classify a given message or text as real or fake.

Skills/Concepts Developed

πŸ” Text preprocessing using NLP techniques πŸ“Š Implementing Bag of n-grams πŸ› οΈ Building classification models using different algorithms πŸ” Model evaluation and comparison πŸ“Š Visualization of model performance with confusion matrices

Problem Statement

🚨 The proliferation of fake news poses a significant challenge to society, leading to misinformation, confusion, and social unrest. This project seeks to develop a robust fake news detection system to help identify and combat the spread of false information.

Impact of This Work

🌐 By accurately classifying messages as real or fake, this project contributes to reducing the spread of misinformation and promoting a more informed society. It helps individuals and organizations make better decisions by distinguishing reliable information from falsehoods.

Modeling

Steps

  • Data Loading and Initial Exploration
  • Data Preparation
  • Model Building and Evaluation
    • Attempt 1: KNN with Euclidean Distance
    • Attempt 2: KNN with Cosine Distance
    • Attempt 3: Random Forest with Trigrams
    • Attempt 4: Multinomial Naive Bayes with Unigrams and Bigrams
  • Text Preprocessing with NLP
  • Train-Test Split with Preprocessed Text
  • Model Evaluation with Preprocessed Text
    • Random Forest with Trigrams
    • Random Forest with Unigrams, Bigrams, and Trigrams
  • Confusion Matrix Visualization
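The steps above can be sketched roughly as follows with scikit-learn. This is a hedged illustration, not the repository's actual code: the toy corpus, the tiny stop-word list, the `preprocess` helper, the n-gram range used for the KNN attempts, and all hyperparameters are assumptions. Preprocessing is also applied up front here for brevity, whereas the steps above introduce it after the first four attempts.

```python
import re

from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline

# Toy corpus invented for illustration only; 1 = fake, 0 = real.
texts = [
    "miracle pill cures every disease overnight doctors stunned",
    "secret memo proves the moon landing was staged",
    "celebrity reveals shocking trick banks hate this secret",
    "aliens endorse candidate in shocking secret broadcast",
    "forward this message or your account gets deleted tonight",
    "scientists stunned as miracle fruit melts fat overnight",
    "city council approves budget for new public library",
    "central bank holds interest rates steady this quarter",
    "local team wins regional championship after close final",
    "health ministry publishes annual vaccination statistics report",
    "university announces new scholarship program for engineering students",
    "city transport department extends metro hours this weekend",
]
labels = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0]

# Tiny hypothetical stop-word list; a real project would use a library's list.
STOP_WORDS = {"the", "a", "an", "as", "in", "for", "or", "this", "was", "after"}


def preprocess(text):
    """Lowercase, keep alphabetic tokens, and drop stop words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return " ".join(t for t in tokens if t not in STOP_WORDS)


clean = [preprocess(t) for t in texts]
X_train, X_test, y_train, y_test = train_test_split(
    clean, labels, test_size=0.33, random_state=42, stratify=labels
)

# One pipeline per attempt; each pairs an n-gram vectorizer with a classifier.
candidates = {
    "knn_euclidean": make_pipeline(
        CountVectorizer(ngram_range=(1, 3)),  # n-gram range is an assumption
        KNeighborsClassifier(n_neighbors=3, metric="euclidean"),
    ),
    "knn_cosine": make_pipeline(
        CountVectorizer(ngram_range=(1, 3)),  # n-gram range is an assumption
        KNeighborsClassifier(n_neighbors=3, metric="cosine"),
    ),
    "random_forest_trigrams": make_pipeline(
        CountVectorizer(ngram_range=(3, 3)),
        RandomForestClassifier(n_estimators=50, random_state=42),
    ),
    "naive_bayes_uni_bigrams": make_pipeline(
        CountVectorizer(ngram_range=(1, 2)),
        MultinomialNB(),
    ),
}

scores = {}
for name, model in candidates.items():
    model.fit(X_train, y_train)
    scores[name] = model.score(X_test, y_test)  # accuracy on the held-out split
    print(f"{name}: accuracy = {scores[name]:.2f}")
```

Wrapping the vectorizer and classifier in one pipeline means the vocabulary is learned from the training split only, so the test split does not leak into the features. Accuracies on a corpus this small are not meaningful; the sketch only shows how the attempts fit together.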

Visualization

📊 Confusion matrices are visualized to understand the performance of classification models in distinguishing between real and fake news.
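As a minimal sketch of what such a matrix contains, the snippet below computes one with scikit-learn and prints it as a labeled text table. The `y_true` and `y_pred` values are made up for illustration and are not results from this project.

```python
from sklearn.metrics import confusion_matrix

# Hypothetical labels for illustration: 1 = fake, 0 = real.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 0, 1, 1, 0]

# Rows are true classes, columns are predicted classes, in the order [real, fake].
cm = confusion_matrix(y_true, y_pred, labels=[0, 1])
tn, fp, fn, tp = cm.ravel()

print("            pred real  pred fake")
print(f"true real   {tn:9d}  {fp:9d}")
print(f"true fake   {fn:9d}  {tp:9d}")

# Precision and recall for the "fake" class follow directly from the cells.
precision_fake = tp / (tp + fp)
recall_fake = tp / (tp + fn)
print(f"precision (fake) = {precision_fake:.2f}, recall (fake) = {recall_fake:.2f}")
```

For a graphical version, the same matrix is typically drawn as a heatmap, for example with scikit-learn's `ConfusionMatrixDisplay.from_predictions`, which requires matplotlib.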

Conclusion

🎉 This project demonstrates how classical NLP techniques and several classification algorithms can be applied to fake news detection. By preprocessing the text and training classical machine learning models on n-gram features, messages can be classified as real or fake, which helps reduce the spread of misinformation.

Recommendations

  • 🚀 Further exploration of advanced NLP techniques such as word embeddings and deep learning models could enhance the performance of fake news detection.
  • 📡 Integration of real-time data sources and social media monitoring tools can improve the timeliness and accuracy of detecting fake news.

How to Use

πŸ› οΈ Users can leverage the provided code and guidelines to build their fake news detection system using similar datasets and methodologies. The code is well-documented and modular, allowing for easy customization and extension.

Credits

πŸ‘ Dataset: Code Basics

About the Author

πŸ‘¨β€πŸ’Ό [Prakash. P] is a [Data Scientist] with expertise in [Natural Language Processing] and [Machine Learning]. Feel free to reach out for any questions or collaborations [prakash2822001@gmail.com].

About

🚀 Developed an ML model to classify messages as real or fake. Used a Bag of n-grams representation built with CountVectorizer to vectorize the text. Implemented various classification algorithms to combat the spread of misinformation and its societal impact. 🌐📊
