NLP spam detection system using TF-IDF and Logistic Regression. Complete pipeline with training, evaluation, model saving, and a live Streamlit app for real-time SMS classification.

ArjunPramod/SMS-Spam-Classifier

📩 SMS Spam Classifier

An end-to-end NLP project to classify SMS messages as Spam or Ham (Not Spam) using traditional machine learning, TF–IDF features, and a Streamlit web interface.


🚀 Overview

This project demonstrates a full ML workflow:

  1. Data ingestion from Kaggle’s SMS Spam Collection Dataset
  2. Text preprocessing with NLTK (cleaning, stopword removal, stemming)
  3. Feature extraction using TF–IDF
  4. Model training & evaluation with Logistic Regression (and optional Naive Bayes)
  5. Model persistence with joblib
  6. Interactive web app built with Streamlit
  7. Deployment-ready for Streamlit Community Cloud
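
Step 2 of the workflow above (cleaning, stopword removal, stemming) can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the function name `preprocess` and the regex are assumptions, and the small hard-coded `STOPWORDS` set stands in for `nltk.corpus.stopwords.words("english")`, which needs a one-time `nltk.download("stopwords")` before use.

```python
import re

from nltk.stem import PorterStemmer

# Assumption: a tiny stand-in for NLTK's English stopword list, kept inline
# so the sketch runs without downloading any corpus.
STOPWORDS = {
    "a", "an", "the", "is", "are", "to", "you", "your",
    "have", "for", "of", "and", "in", "on",
}

_stemmer = PorterStemmer()


def preprocess(message: str) -> str:
    """Lowercase, strip non-letters, drop stopwords, and stem each token."""
    text = message.lower()
    # Replace digits and punctuation with spaces so only letters remain.
    text = re.sub(r"[^a-z\s]", " ", text)
    tokens = [tok for tok in text.split() if tok not in STOPWORDS]
    return " ".join(_stemmer.stem(tok) for tok in tokens)


print(preprocess("You are WINNING a free prize!!! Call 08001234 now"))
```

The cleaned string, rather than the raw message, would then be passed to the TF–IDF vectorizer in step 3.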

📊 Dataset

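This project uses the SMS Spam Collection Dataset from Kaggle, which you download and place at data/spam.csv. Please refer to the dataset page on Kaggle for licensing and citation details.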


🧰 Tech Stack

  • Language: Python
  • Libraries:
    • pandas, numpy
    • scikit-learn (TF–IDF, Logistic Regression, Naive Bayes, metrics)
    • nltk (stopwords, stemming)
    • joblib (model serialization)
    • streamlit (web app)
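
The scikit-learn and joblib pieces listed above fit together roughly as in the sketch below. It is an illustration under stated assumptions rather than the project's training code: the eight toy messages are made up (the real project trains on data/spam.csv), the label encoding `1 = spam, 0 = ham` is assumed, and the artifacts are written to a temporary directory instead of models/.

```python
import os
import tempfile

import joblib
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

# Made-up toy messages for illustration only.
messages = [
    "win a free prize now",
    "free cash claim your reward",
    "urgent winner call now to claim",
    "free entry win cash prize",
    "are we still meeting for lunch",
    "see you at home tonight",
    "can you call me after work",
    "thanks for the notes from class",
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]  # assumed encoding: 1 = spam, 0 = ham

# Turn text into TF-IDF features, then fit the classifier on them.
vectorizer = TfidfVectorizer()
X = vectorizer.fit_transform(messages)
model = LogisticRegression(max_iter=1000)
model.fit(X, labels)

# Training-set accuracy only; a real evaluation would use a held-out split.
train_accuracy = accuracy_score(labels, model.predict(X))

# Persist both artifacts: the vectorizer must be saved alongside the model,
# because new messages have to be transformed with the same vocabulary.
out_dir = tempfile.mkdtemp()
model_path = os.path.join(out_dir, "spam_model.pkl")
vectorizer_path = os.path.join(out_dir, "vectorizer.pkl")
joblib.dump(model, model_path)
joblib.dump(vectorizer, vectorizer_path)

# Reload and classify a new message to confirm the round trip works.
reloaded_model = joblib.load(model_path)
reloaded_vectorizer = joblib.load(vectorizer_path)
new_features = reloaded_vectorizer.transform(["claim your free prize"])
prediction = reloaded_model.predict(new_features)[0]
print(train_accuracy, prediction)
```

Swapping `LogisticRegression` for `sklearn.naive_bayes.MultinomialNB` would give the optional Naive Bayes variant with the same fit and predict calls.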

📁 Project Structure

sms-spam-classifier/
├── app.py                     # Streamlit app
├── requirements.txt           # Python dependencies
├── README.md                  # Project documentation
├── data/
│   └── spam.csv               # Kaggle dataset (placed here by you)
├── models/
│   ├── spam_model.pkl         # Trained Logistic Regression model
│   └── vectorizer.pkl         # TF–IDF vectorizer
└── notebooks/
    └── sms_spam_classifier.ipynb  # Training & evaluation notebook
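
Given this layout, loading the two saved artifacts and classifying a message might look like the sketch below. The helper names `load_artifacts` and `classify` are hypothetical, and the assumption that label `1` means spam may not match the notebook's encoding; app.py may be organised differently.

```python
from pathlib import Path

import joblib

# Paths taken from the project structure above.
MODEL_PATH = Path("models/spam_model.pkl")
VECTORIZER_PATH = Path("models/vectorizer.pkl")


def load_artifacts(model_path=MODEL_PATH, vectorizer_path=VECTORIZER_PATH):
    """Load the trained model and its matching TF-IDF vectorizer."""
    return joblib.load(model_path), joblib.load(vectorizer_path)


def classify(message, model, vectorizer):
    """Return "Spam" or "Ham", assuming label 1 means spam."""
    features = vectorizer.transform([message])
    return "Spam" if model.predict(features)[0] == 1 else "Ham"


if __name__ == "__main__":
    if MODEL_PATH.exists() and VECTORIZER_PATH.exists():
        model, vectorizer = load_artifacts()
        print(classify("Congratulations, you won a free prize!", model, vectorizer))
    else:
        print("Model files not found; run the training notebook first.")
```

In a Streamlit app, the same `classify` call would typically sit behind a text input and a button, with the artifacts loaded once at startup.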
