A machine learning project that analyzes telecom customer behavior and predicts churn using advanced models. It explores key drivers of churn, visualizes patterns, and evaluates multiple algorithms to help telecom companies improve retention and reduce revenue loss.

wonderakwei/Telecom-Customer-Churn-Analysis

📈 Telecom Customer Churn Prediction

Machine learning project for predicting and understanding customer churn in the telecom industry.


📌 Introduction

This project analyzes telecom customer data to identify factors that influence churn and builds machine-learning models to predict customers likely to leave. The goal is to help businesses improve retention and reduce revenue loss.


📂 Dataset Overview

The dataset contains 7,043 customer records with:

  • Demographics: gender, senior citizen, dependents
  • Account details: tenure, contract, payment method, billing
  • Services: phone, internet, backup, security, streaming, tech support
  • Target variable: Churn (Yes/No)

Example columns:

customerID | gender | tenure | InternetService | Contract | MonthlyCharges | Churn

🧹 Data Cleaning

  • Dropped irrelevant column: customerID
  • Converted TotalCharges to numeric
  • Handled missing values and tenure=0 cases
  • Standardized categorical values (e.g., SeniorCitizen)
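
The cleaning steps above could be sketched in pandas as follows. This is a minimal illustration, not the notebook's actual code: the sample rows, the helper name `clean_churn_data`, the choice to drop (rather than impute) rows with tenure = 0, and the 0/1 to No/Yes mapping for SeniorCitizen are all assumptions.

```python
import io

import pandas as pd

# Tiny stand-in for the raw data; these rows are made up for illustration.
RAW_CSV = """customerID,gender,SeniorCitizen,tenure,TotalCharges,Churn
0001-A,Female,0,1,29.85,No
0002-B,Male,1,34,1889.5,No
0003-C,Male,0,0, ,No
0004-D,Female,0,2,108.15,Yes
"""


def clean_churn_data(df: pd.DataFrame) -> pd.DataFrame:
    """Apply the listed cleaning steps to a raw churn DataFrame (hypothetical helper)."""
    # The identifier carries no predictive signal.
    df = df.drop(columns=["customerID"])
    # Blank strings become NaN instead of raising an error.
    df["TotalCharges"] = pd.to_numeric(df["TotalCharges"], errors="coerce")
    # Assumption: rows with tenure == 0 or a missing TotalCharges are dropped.
    df = df[(df["tenure"] > 0) & df["TotalCharges"].notna()].copy()
    # Assumption: SeniorCitizen is stored as 0/1 and is mapped to No/Yes
    # so it matches the other categorical flags.
    df["SeniorCitizen"] = df["SeniorCitizen"].map({0: "No", 1: "Yes"})
    return df.reset_index(drop=True)


raw = pd.read_csv(io.StringIO(RAW_CSV))
clean = clean_churn_data(raw)
print(clean)
```

Imputing the missing TotalCharges values instead of dropping those rows would be an equally reasonable choice; the README does not say which was used.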

📊 Exploratory Data Analysis (EDA)

Key areas explored in the EDA (visuals recommended):

  • Churn distribution
  • Tenure vs churn patterns
  • Contract type impact
  • Monthly charges comparisons
  • Service usage behavior

(Insert visualizations or screenshots here.)
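
Before plotting, the comparisons listed above can be computed as simple group summaries. The sketch below uses a handful of made-up rows, so the numbers it produces are illustrative only and are not findings from this project; the Contract category names are also assumed.

```python
import pandas as pd

# Hypothetical sample rows; values are illustrative, not project results.
df = pd.DataFrame({
    "Contract": ["Month-to-month", "Month-to-month", "One year",
                 "Two year", "Month-to-month", "Two year"],
    "tenure": [1, 5, 24, 60, 3, 48],
    "Churn": ["Yes", "No", "No", "No", "Yes", "No"],
})

# Overall churn distribution as proportions.
churn_share = df["Churn"].value_counts(normalize=True)

# Churn rate per contract type: mean of a True/False indicator within each group.
churn_flag = df["Churn"].eq("Yes")
churn_by_contract = churn_flag.groupby(df["Contract"]).mean()

# Median tenure for churned vs. retained customers.
tenure_by_churn = df.groupby("Churn")["tenure"].median()

print(churn_share, churn_by_contract, tenure_by_churn, sep="\n\n")
```

Each resulting Series can be passed directly to a bar plot in Matplotlib, Seaborn, or Plotly.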


🧽 Data Preprocessing

  • Train–test split, done before any preprocessing or oversampling
  • Label encoding for categorical variables
  • Standardization of numerical features (tenure, MonthlyCharges, TotalCharges) with StandardScaler
  • Key practices: fit transformers on the training set only and oversample only the training set, so no information from the test set leaks into training.
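
The ordering described above (split, then fit on training data, then oversample training data) might look like the sketch below. It runs on synthetic data, and random oversampling with `sklearn.utils.resample` is an assumption: the README does not name the oversampling method used.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.utils import resample

rng = np.random.default_rng(0)
# Synthetic stand-in: 3 numeric columns (think tenure, MonthlyCharges,
# TotalCharges) and roughly 25% positive (churn) labels.
X = rng.normal(size=(200, 3))
y = (rng.random(200) < 0.25).astype(int)

# 1. Split first so the test set never influences any fitted statistic.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# 2. Fit the scaler on the training data only, then apply it to both sets.
scaler = StandardScaler().fit(X_train)
X_train_s = scaler.transform(X_train)
X_test_s = scaler.transform(X_test)

# 3. Oversample the churn class in the training set only
#    (random oversampling with replacement is an assumed method).
churned = X_train_s[y_train == 1]
retained = X_train_s[y_train == 0]
churned_up = resample(churned, replace=True, n_samples=len(retained), random_state=42)
X_train_bal = np.vstack([retained, churned_up])
y_train_bal = np.concatenate([
    np.zeros(len(retained), dtype=int),
    np.ones(len(churned_up), dtype=int),
])
```

The test set keeps its original class balance, so evaluation metrics reflect the distribution the model would see in practice.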

🤖 Modeling

Models used:

  • Logistic Regression
  • KNN
  • SVC
  • Decision Tree
  • Random Forest
  • Gradient Boosting
  • AdaBoost
  • XGBoost
  • CatBoost
  • Voting Classifier (ensemble)
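
As an illustration of the ensemble entry above, a soft-voting classifier over three of the listed models could be set up as follows. The choice of base estimators, their hyperparameters, and the synthetic data are assumptions made for the sketch; the README does not state which models the project's Voting Classifier combines.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    GradientBoostingClassifier,
    RandomForestClassifier,
    VotingClassifier,
)
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic imbalanced data standing in for the prepared churn features.
X, y = make_classification(
    n_samples=500, n_features=10, weights=[0.73, 0.27], random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Hypothetical choice of base estimators and settings.
estimators = [
    ("lr", LogisticRegression(max_iter=1000)),
    ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
    ("gb", GradientBoostingClassifier(random_state=0)),
]

# Soft voting averages the predicted class probabilities of the base models.
ensemble = VotingClassifier(estimators=estimators, voting="soft")
ensemble.fit(X_train, y_train)

churn_proba = ensemble.predict_proba(X_test)[:, 1]  # probability of the positive class
test_accuracy = ensemble.score(X_test, y_test)
print(f"Test accuracy: {test_accuracy:.3f}")
```

Soft voting requires every base model to expose `predict_proba`; an SVC would need `probability=True` to take part.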

Evaluation metrics: Accuracy, Precision, Recall, F1-score, ROC-AUC, Confusion Matrix.
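
The metrics listed above are all available in `sklearn.metrics`. The sketch below computes them on eight hypothetical predictions, with 1 standing for churn; the labels, scores, and 0.5 threshold are made up for illustration and are not results from this project.

```python
from sklearn.metrics import (
    accuracy_score,
    confusion_matrix,
    f1_score,
    precision_score,
    recall_score,
    roc_auc_score,
)

# Hypothetical labels and predicted churn probabilities (1 = churn).
y_true = [1, 0, 1, 1, 0, 0, 0, 1]
y_score = [0.9, 0.2, 0.4, 0.8, 0.6, 0.1, 0.3, 0.7]
# Assumed decision threshold of 0.5.
y_pred = [int(s >= 0.5) for s in y_score]

metrics = {
    "accuracy": accuracy_score(y_true, y_pred),
    "precision": precision_score(y_true, y_pred),
    "recall": recall_score(y_true, y_pred),
    "f1": f1_score(y_true, y_pred),
    # ROC-AUC is computed from the scores, not the thresholded predictions.
    "roc_auc": roc_auc_score(y_true, y_score),
}
# Rows are true classes, columns are predicted classes: [[TN, FP], [FN, TP]].
cm = confusion_matrix(y_true, y_pred)

for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
print(cm)
```

On this toy input there are 3 true positives, 3 true negatives, 1 false positive, and 1 false negative, so accuracy, precision, recall, and F1 all equal 0.75, while ROC-AUC is 0.9375.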

(Insert model comparison tables and confusion matrices.)


πŸ† Results

  • Identified key churn drivers

  • Best-performing model: (Insert your final model and score here)

    • Example: Gradient Boosting (Accuracy: 80%, ROC-AUC: 0.82)

🛠 Technologies Used

  • Python
  • Pandas, NumPy
  • Matplotlib, Seaborn, Plotly
  • Scikit-learn
  • XGBoost, CatBoost

▶️ How to Run

git clone https://github.com/wonderakwei/Telecom-Customer-Churn-Analysis.git
cd Telecom-Customer-Churn-Analysis
pip install -r requirements.txt
jupyter notebook Telecom_Churn_Prediction.ipynb

🚀 Future Improvements

  • Streamlit web app deployment
  • SHAP feature importance
  • More hyperparameter tuning
  • Interactive churn dashboard

👤 Author

Wonder Akwei

Data Analyst | Machine Learning Enthusiast | Fintech Operations

Email: akweiwonder@outlook.com

LinkedIn: https://www.linkedin.com/in/wonderakwei/

