A production-grade SMS spam detection system with high availability and disaster recovery capabilities. The system combines machine learning for spam detection with enterprise-level DevOps practices.
- Real-time SMS spam detection using machine learning
- Multi-language support with 100% precision
- Interactive UI with word cloud visualization
- Built with Python, Scikit-learn, NLTK, and Streamlit
The SMS Spam Collection dataset, obtained from Kaggle, contains over 5,500 SMS messages, each labeled as either spam or not spam. You can access the dataset from here.
The data was cleaned by handling null and duplicate values, and the "type" column was label-encoded. The text was then converted to lowercase and preprocessed: it was split into tokens, special characters, punctuation, and stop words were removed, and the remaining tokens were stemmed.
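The preprocessing steps described above can be sketched roughly as follows. This is a minimal standard-library illustration, not the project's actual code: the stop-word list is a tiny hypothetical sample and `crude_stem` is a simplified stand-in for a real stemmer. In the project itself, NLTK's English stop-word list and one of its stemmers would presumably fill these roles.

```python
import re

# Assumption: a tiny illustrative stop-word list, not the list the project uses.
STOP_WORDS = {"a", "an", "the", "is", "are", "to", "you", "your",
              "of", "and", "in", "for", "on", "have"}


def crude_stem(token):
    """Strip one common suffix; a simplified stand-in for a real stemmer."""
    for suffix in ("ing", "ed", "ly", "es", "s"):
        # Keep at least three characters so short words are left alone.
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(text):
    """Lowercase, tokenize, drop punctuation and stop words, then stem."""
    text = text.lower()
    # Keeping only alphanumeric runs both tokenizes the text and
    # discards punctuation and special characters.
    tokens = re.findall(r"[a-z0-9]+", text)
    return [crude_stem(t) for t in tokens if t not in STOP_WORDS]


print(preprocess("You have WON a FREE prize!!! Claiming is easy."))
```

The resulting token list would typically be joined back into a string and passed to a vectorizer before training or prediction.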
- Fault-tolerant Jenkins infrastructure with 2-node HA cluster using Kubernetes StatefulSets
- Automated multi-region backup strategy with 24-hour RPO using AWS S3
- Automated disaster recovery system with <5 minute RTO using AWS Route53
- Comprehensive monitoring using Prometheus/Grafana with Slack alerts
- RBAC-based access control system
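To make the 24-hour RPO concrete: the newest backup should never be more than 24 hours old. The sketch below shows one way such a check could look. The function name and the way the last backup timestamp is obtained are assumptions for illustration; the project may verify this differently, for example through its monitoring stack.

```python
from datetime import datetime, timedelta, timezone

# The 24-hour recovery point objective stated in the list above.
RPO = timedelta(hours=24)


def rpo_breached(last_backup, now, rpo=RPO):
    """Return True if the newest backup is older than the RPO allows."""
    return (now - last_backup) > rpo


# Hypothetical timestamps; in practice the backup time might come from
# the object metadata of the most recent backup in S3.
now = datetime(2024, 1, 2, 12, 0, tzinfo=timezone.utc)
recent = datetime(2024, 1, 1, 13, 0, tzinfo=timezone.utc)  # 23 hours old
stale = datetime(2024, 1, 1, 11, 0, tzinfo=timezone.utc)   # 25 hours old

print(rpo_breached(recent, now))
print(rpo_breached(stale, now))
```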
- ML/Data Science: Python, Scikit-learn, Pandas, NumPy, NLTK
- Web Framework: Streamlit
- DevOps Tools: Jenkins, Kubernetes, Docker
- Monitoring: Prometheus, Grafana, AlertManager
- Cloud Services: AWS (S3, Route53)
- Visualization: Matplotlib, Seaborn, WordCloud
- Real-time spam detection
- Automated CI/CD pipelines
- Cross-region disaster recovery
- Real-time system monitoring
- Secure access control
- Automated backup system
The model is deployed as a Streamlit web app. The interface has a single input box: the user enters a message, and the model predicts whether it is spam or not spam.
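A rough sketch of how such a Streamlit front end might be wired up is shown below. The file names `vectorizer.pkl` and `model.pkl`, the label convention (1 means spam), and the use of `str.lower` as the cleaning step are all assumptions made for illustration; the repository's `app.py` may be organized differently.

```python
import pickle


def classify(message, preprocess, vectorizer, model):
    """Return "Spam" or "Not spam" for a single message.

    Assumes the model predicts 1 for spam and 0 otherwise.
    """
    cleaned = preprocess(message)
    features = vectorizer.transform([cleaned])
    return "Spam" if model.predict(features)[0] == 1 else "Not spam"


def main():
    # Imported here so classify() can be used and tested without Streamlit.
    import streamlit as st

    # Hypothetical artifact names for the fitted vectorizer and classifier.
    with open("vectorizer.pkl", "rb") as f:
        vectorizer = pickle.load(f)
    with open("model.pkl", "rb") as f:
        model = pickle.load(f)

    st.title("SMS Spam Detection")
    message = st.text_area("Enter a message")
    if st.button("Predict") and message.strip():
        # str.lower is a placeholder for the project's real cleaning step.
        result = classify(message, str.lower, vectorizer, model)
        if result == "Spam":
            st.error(result)
        else:
            st.success(result)
```

To use this as an app, `main()` would be called at the bottom of the script before launching it with `streamlit run`.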
To try out the SMS Spam Detection model, visit here.
- Clone the repository
- Install dependencies: `pip install -r requirements.txt`
- Start the application: `streamlit run app.py`
- Deploy infrastructure: `kubectl apply -f k8s-manifests/`
- Configure Kubernetes cluster
- Apply Kubernetes manifests
- Setup monitoring stack
- Configure backup system
- Verify HA setup
- Role-based access control (RBAC)
- Secure credential management
- Automated backup system
- Real-time security monitoring
- System metrics visualization
- Real-time performance monitoring
- Automated Slack notifications
- Custom alert thresholds
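The idea behind threshold-based Slack notifications can be illustrated with a short sketch. In this project the alerting is presumably handled by Prometheus and AlertManager, so the code below is only a simplified picture of the logic, not the actual configuration. The metric name, threshold, and webhook URL are hypothetical; the payload shape (a JSON object with a `text` field) is what Slack incoming webhooks accept.

```python
import json
import urllib.request


def build_alert(metric, value, threshold):
    """Return a Slack payload if value exceeds threshold, else None."""
    if value <= threshold:
        return None
    return {"text": f"ALERT: {metric} is {value}, above threshold {threshold}"}


def send_to_slack(payload, webhook_url):
    """POST the payload to a Slack incoming webhook; returns the HTTP status."""
    request = urllib.request.Request(
        webhook_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status


# Hypothetical metric and threshold; nothing is sent here.
payload = build_alert("cpu_usage_percent", 93.5, 90)
print(payload)
```

Calling `send_to_slack(payload, webhook_url)` with a real webhook URL would deliver the message; it is left out of the example so the sketch runs without network access.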
Contributions are welcome! Please feel free to submit pull requests.
MoggerNet - 2024