Machine learning project for predicting and understanding customer churn in the telecom industry.
This project analyzes telecom customer data to identify factors that influence churn and builds machine-learning models to predict customers likely to leave. The goal is to help businesses improve retention and reduce revenue loss.
The dataset contains 7,043 customer records with:
- Demographics: gender, senior citizen, dependents
- Account details: tenure, contract, payment method, billing
- Services: phone, internet, backup, security, streaming, tech support
- Target variable: Churn (Yes/No)
Example columns:
customerID | gender | tenure | InternetService | Contract | MonthlyCharges | Churn
- Dropped irrelevant column: customerID
- Converted TotalCharges to numeric
- Handled missing values and tenure=0 cases
- Standardized categorical values (e.g., SeniorCitizen)
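The cleaning steps above can be sketched roughly as follows. This is a minimal illustration on a tiny made-up sample, not the notebook's actual code. The README does not say how missing values and tenure=0 rows were handled, so dropping those rows and filling any remaining gaps with the median are assumptions, as is mapping SeniorCitizen from 0/1 to No/Yes.

```python
import pandas as pd

# Tiny made-up sample that mimics the columns listed above (not real data).
raw = pd.DataFrame({
    "customerID": ["0001-A", "0002-B", "0003-C"],
    "SeniorCitizen": [0, 1, 0],
    "tenure": [0, 12, 24],
    "MonthlyCharges": [29.85, 56.95, 53.85],
    "TotalCharges": [" ", "683.40", "1292.40"],  # blank string where tenure is 0
    "Churn": ["No", "Yes", "No"],
})


def clean(df: pd.DataFrame) -> pd.DataFrame:
    """Apply the cleaning steps to a copy of the raw frame."""
    # Drop the identifier column, which carries no predictive signal.
    out = df.drop(columns=["customerID"])
    # Blank or malformed strings become NaN instead of raising an error.
    out["TotalCharges"] = pd.to_numeric(out["TotalCharges"], errors="coerce")
    # Assumption: customers with tenure == 0 have no billing history, so drop them.
    out = out[out["tenure"] > 0].copy()
    # Assumption: any remaining gaps are filled with the column median.
    out["TotalCharges"] = out["TotalCharges"].fillna(out["TotalCharges"].median())
    # Assumption: make SeniorCitizen consistent with the other Yes/No columns.
    out["SeniorCitizen"] = out["SeniorCitizen"].map({0: "No", 1: "Yes"})
    return out.reset_index(drop=True)


cleaned = clean(raw)
```

Imputing the tenure=0 rows instead of dropping them is an equally reasonable choice; only a handful of rows are usually affected either way.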
Key EDA insights (visuals recommended):
- Churn distribution
- Tenure vs churn patterns
- Contract type impact
- Monthly charges comparisons
- Service usage behavior
(Insert visualizations or screenshots here.)
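As a starting point for the contract-type comparison, the churn rate per group can be computed before plotting. The sketch below uses made-up rows, and the Contract values shown are assumed labels rather than values confirmed by this README.

```python
import pandas as pd

# Made-up rows for illustration; the Contract labels are assumed.
df = pd.DataFrame({
    "Contract": ["Month-to-month", "Month-to-month", "One year",
                 "Two year", "Month-to-month", "Two year"],
    "Churn": ["Yes", "No", "No", "No", "Yes", "No"],
})

# Boolean flag so that the mean of the flag equals the churn rate.
churn_flag = df["Churn"].eq("Yes")

# Share of churned customers overall and within each contract type.
overall_churn_rate = churn_flag.mean()
churn_rate_by_contract = (
    churn_flag.groupby(df["Contract"]).mean().sort_values(ascending=False)
)
```

The same pattern works for the other categorical columns, and the resulting Series can be passed to a bar plot.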
- Label encoding for categorical variables
- Standardizing numerical features: tenure, MonthlyCharges, TotalCharges
- Train-test split
- Scaling with StandardScaler
- Key practices: split before preprocessing/oversampling, fit transformers on training only, oversample training set only.
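The ordering described in the key practices might look like the sketch below, run on synthetic data. Two substitutions are worth naming: OrdinalEncoder stands in for label encoding because it is scikit-learn's feature-side encoder, and plain random oversampling with `sklearn.utils.resample` stands in for whatever oversampling method the notebook uses, which this README does not specify.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import OrdinalEncoder, StandardScaler
from sklearn.utils import resample

# Synthetic stand-in for the cleaned dataset (values are random, not real).
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "tenure": rng.integers(1, 72, n),
    "MonthlyCharges": rng.uniform(20, 120, n),
    "Contract": rng.choice(["Month-to-month", "One year", "Two year"], n),
    "Churn": rng.choice(["No", "Yes"], n, p=[0.75, 0.25]),
})
df["TotalCharges"] = df["tenure"] * df["MonthlyCharges"]

X = df.drop(columns=["Churn"])
y = df["Churn"].eq("Yes").astype(int)

# 1. Split first so the test set never influences preprocessing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# 2. Fit the transformers on the training set only, then reuse them on test.
num_cols = ["tenure", "MonthlyCharges", "TotalCharges"]
cat_cols = ["Contract"]
preprocess = ColumnTransformer([
    ("num", StandardScaler(), num_cols),
    ("cat", OrdinalEncoder(), cat_cols),
])
X_train_t = preprocess.fit_transform(X_train)
X_test_t = preprocess.transform(X_test)

# 3. Oversample the minority class (churners) in the training set only.
y_train_arr = y_train.to_numpy()
is_churn = y_train_arr == 1
X_min_up, y_min_up = resample(
    X_train_t[is_churn], y_train_arr[is_churn],
    replace=True, n_samples=int((~is_churn).sum()), random_state=42,
)
X_train_bal = np.vstack([X_train_t[~is_churn], X_min_up])
y_train_bal = np.concatenate([y_train_arr[~is_churn], y_min_up])
```

The test set keeps its original class balance, so evaluation reflects the churn rate a deployed model would actually face.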
Models used:
- Logistic Regression
- KNN
- SVC
- Decision Tree
- Random Forest
- Gradient Boosting
- AdaBoost
- XGBoost
- CatBoost
- Voting Classifier (ensemble)
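For the ensemble entry, a soft-voting classifier over a few of the listed models could be assembled as shown below. Which base models the notebook combines is not stated here, so the three members, their settings, and the synthetic data are all assumptions made for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    GradientBoostingClassifier,
    RandomForestClassifier,
    VotingClassifier,
)
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic, imbalanced binary problem standing in for the churn data.
X, y = make_classification(
    n_samples=500, n_features=10, weights=[0.73, 0.27], random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# Soft voting averages the predicted probabilities of the base models.
ensemble = VotingClassifier(
    estimators=[
        ("lr", LogisticRegression(max_iter=1000)),
        ("rf", RandomForestClassifier(n_estimators=100, random_state=42)),
        ("gb", GradientBoostingClassifier(random_state=42)),
    ],
    voting="soft",
)
ensemble.fit(X_train, y_train)

# Probability of the positive (churn) class for each test customer.
churn_proba = ensemble.predict_proba(X_test)[:, 1]
```

Soft voting requires every member to expose `predict_proba`; an SVC would need `probability=True` to take part.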
Evaluation metrics: Accuracy, Precision, Recall, F1-score, ROC-AUC, Confusion Matrix.
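To show how the threshold-based metrics relate to the confusion matrix, here is a small worked example in plain Python. The counts are invented for illustration and are not results from this project. ROC-AUC is left out because it needs predicted scores rather than counts.

```python
def classification_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Compute accuracy, precision, recall and F1 from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    # Precision: of the customers flagged as churners, how many really churned.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: of the customers who churned, how many the model caught.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1: harmonic mean of precision and recall.
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Invented counts: 100 actual churners among 400 customers.
metrics = classification_metrics(tp=60, fp=40, fn=40, tn=260)
```

With these counts accuracy is 0.80 while recall is only 0.60, which is why accuracy alone can flatter a model on an imbalanced churn dataset.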
(Insert model comparison tables and confusion matrices.)
- Identified key churn drivers
- Best-performing model: (Insert your final model and score here)
- Example: Gradient Boosting (Accuracy: 80%, ROC-AUC: 0.82)
- Python
- Pandas, NumPy
- Matplotlib, Seaborn, Plotly
- Scikit-learn
- XGBoost, CatBoost
git clone https://github.com/wonderakwei/Telecom-Customer-Churn-Analysis.git
cd Telecom-Customer-Churn-Analysis
pip install -r requirements.txt
jupyter notebook Telecom_Churn_Prediction.ipynb
- Streamlit web app deployment
- SHAP feature importance
- More hyperparameter tuning
- Interactive churn dashboard
Wonder Akwei
Data Analyst | Machine Learning Enthusiast | Fintech Operations
Email: akweiwonder@outlook.com
LinkedIn: https://www.linkedin.com/in/wonderakwei/