📩 SMS Spam Classifier

An end-to-end NLP project to classify SMS messages as Spam or Ham (Not Spam) using traditional machine learning, TF–IDF features, and a Streamlit web interface.

🚀 Overview

This project demonstrates a full ML workflow:

- Data ingestion from Kaggle’s SMS Spam Collection Dataset
- Text preprocessing with NLTK (cleaning, stopword removal, stemming)
- Feature extraction using TF–IDF
- Model training and evaluation with Logistic Regression (and, optionally, Naive Bayes)
- Model persistence with joblib
- Interactive web app built with Streamlit
- Deployment-ready for Streamlit Community Cloud
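
The feature extraction and training steps above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the sample messages are made up, and the hyperparameters (`ngram_range`, `max_iter`, `class_weight`, the 75/25 split) are assumptions chosen for the example.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split

# Hand-written toy messages for illustration only (not taken from the dataset).
texts = [
    "Are we still meeting for lunch today",
    "Call me when you get home",
    "Can you send me the notes from class",
    "I will be late, start without me",
    "Happy birthday, see you tonight",
    "Thanks for the lift yesterday",
    "WINNER! Claim your free prize now",
    "Free entry in a weekly cash draw, text WIN",
    "Urgent: your account has won a cash reward",
    "Congratulations, you have been selected for a free phone",
    "Claim your free ringtone now, reply YES",
    "You have won a guaranteed cash prize, call now",
]
labels = ["ham"] * 6 + ["spam"] * 6

# Hold out a stratified test split so both classes appear on each side.
X_train, X_test, y_train, y_test = train_test_split(
    texts, labels, test_size=0.25, stratify=labels, random_state=42
)

# Fit TF-IDF on the training text only, then reuse it for the test text.
vectorizer = TfidfVectorizer(lowercase=True, ngram_range=(1, 2))
X_train_vec = vectorizer.fit_transform(X_train)
X_test_vec = vectorizer.transform(X_test)

# class_weight="balanced" is an assumption; the real dataset has far more ham than spam.
model = LogisticRegression(max_iter=1000, class_weight="balanced")
model.fit(X_train_vec, y_train)

predictions = model.predict(X_test_vec)
print(classification_report(y_test, predictions, zero_division=0))
```

Fitting the vectorizer only on the training split keeps vocabulary and IDF statistics from the held-out messages out of the model, which is why `transform` rather than `fit_transform` is used for the test text.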

📊 Dataset

Name: SMS Spam Collection Dataset
Source (Kaggle): https://www.kaggle.com/datasets/uciml/sms-spam-collection-dataset
Instances: ~5.5k SMS messages labeled as ham or spam

Please refer to the dataset page for licensing and citation details.
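
Loading the file might look something like the sketch below. It assumes the Kaggle CSV keeps the label in a column named `v1` and the message in `v2`, and that the file needs `latin-1` decoding; check your copy of `spam.csv` if either assumption does not hold. The helper name `load_sms` and the renamed columns `label` and `text` are hypothetical.

```python
import pandas as pd


def load_sms(source):
    """Load the SMS dataset into a two-column frame: label, text."""
    # Assumption: label is in "v1", message text is in "v2", and the
    # file is not valid UTF-8, so latin-1 is used for decoding.
    df = pd.read_csv(source, encoding="latin-1", usecols=["v1", "v2"])
    df = df.rename(columns={"v1": "label", "v2": "text"})
    # Drop exact duplicates so the same message cannot land in both
    # the training and the test split later on.
    return df.drop_duplicates().reset_index(drop=True)
```

With the dataset in place, `load_sms("data/spam.csv")` returns the frame, and `df["label"].value_counts()` gives a quick view of the ham/spam balance.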

🧰 Tech Stack

Language: Python
Libraries:
- pandas, numpy
- scikit-learn (TF–IDF, Logistic Regression, Naive Bayes, metrics)
- nltk (stopwords, stemming)
- joblib (model serialization)
- streamlit (web app)
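
As a rough idea of how the NLTK preprocessing (cleaning, stopword removal, stemming) could fit together, here is a small sketch. The function name `preprocess` and the cleaning regex are assumptions, and the inline `STOPWORDS` set is only a stand-in so the example runs without downloading corpora; the project itself would more likely use `nltk.corpus.stopwords.words("english")` after `nltk.download("stopwords")`.

```python
import re

from nltk.stem import PorterStemmer

# Stand-in stopword list (an assumption for this sketch). NLTK's own
# English stopword corpus would normally be used in its place.
STOPWORDS = {
    "a", "an", "the", "is", "are", "to", "you", "your",
    "for", "and", "of", "in", "on", "have", "this",
}

stemmer = PorterStemmer()


def preprocess(text):
    """Lowercase, strip punctuation, drop stopwords, and stem each token."""
    text = text.lower()
    # Replace anything that is not a letter, digit, or whitespace with a space.
    text = re.sub(r"[^a-z0-9\s]", " ", text)
    tokens = [tok for tok in text.split() if tok not in STOPWORDS]
    return " ".join(stemmer.stem(tok) for tok in tokens)
```

The same function would need to be applied both at training time and to each message typed into the app, so that the vectorizer sees text in the form it was fitted on.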

📁 Project Structure

sms-spam-classifier/
├── app.py                     # Streamlit app
├── requirements.txt           # Python dependencies
├── README.md                  # Project documentation
├── data/
│   └── spam.csv               # Kaggle dataset (placed here by you)
├── models/
│   ├── spam_model.pkl         # Trained Logistic Regression model
│   └── vectorizer.pkl         # TF–IDF vectorizer
└── notebooks/
    └── sms_spam_classifier.ipynb  # Training & evaluation notebook
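
The two files under `models/` suggest a save-then-load round trip with joblib. The sketch below shows one way that could work; the helper names (`save_artifacts`, `load_artifacts`, `predict_message`) and the toy training data are hypothetical, and a temporary directory stands in for the real `models/` folder.

```python
import tempfile
from pathlib import Path

import joblib
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression


def save_artifacts(model, vectorizer, model_dir):
    """Write the model and vectorizer using the file names from the tree above."""
    model_dir = Path(model_dir)
    model_dir.mkdir(parents=True, exist_ok=True)
    joblib.dump(model, model_dir / "spam_model.pkl")
    joblib.dump(vectorizer, model_dir / "vectorizer.pkl")


def load_artifacts(model_dir):
    """Read both artifacts back; the vectorizer must match the model it was fitted with."""
    model_dir = Path(model_dir)
    model = joblib.load(model_dir / "spam_model.pkl")
    vectorizer = joblib.load(model_dir / "vectorizer.pkl")
    return model, vectorizer


def predict_message(text, model, vectorizer):
    """Vectorize a single message and return its predicted label."""
    features = vectorizer.transform([text])
    return model.predict(features)[0]


# Toy training data, for illustration only.
texts = [
    "See you at lunch",
    "Call me later tonight",
    "Free prize, claim now",
    "You have won free cash",
]
labels = ["ham", "ham", "spam", "spam"]

vectorizer = TfidfVectorizer()
model = LogisticRegression(max_iter=1000).fit(vectorizer.fit_transform(texts), labels)

# Round-trip through a temporary directory standing in for models/.
with tempfile.TemporaryDirectory() as tmp:
    save_artifacts(model, vectorizer, Path(tmp) / "models")
    loaded_model, loaded_vectorizer = load_artifacts(Path(tmp) / "models")

result = predict_message("Claim your free prize", loaded_model, loaded_vectorizer)
print(result)
```

Persisting the vectorizer alongside the model matters because the classifier's coefficients are tied to the vocabulary indices learned during fitting; loading one without the other would give meaningless predictions.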

ArjunPramod/SMS-Spam-Classifier

Folders and files

Latest commit

History

Repository files navigation

📩 SMS Spam Classifier

🚀 Overview

📊 Dataset

🧰 Tech Stack

📁 Project Structure

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages