
Denis0242/Customer-Churn


📊 Customer Churn Prediction | Product Data Science Case Study

Product Data Science | Retention Analytics | Machine Learning | Experimentation Strategy | Revenue Optimization


🚀 Executive Summary

Customer churn directly impacts revenue, growth, and long-term product sustainability. This project approaches churn not just as a classification problem but as a product retention and revenue optimization challenge.


The goal is to:

  • Identify high-risk churn segments
  • Understand behavioral and contractual drivers
  • Quantify business impact
  • Recommend product-level interventions
  • Enable experimentation-driven retention strategies

This mirrors how churn is handled in SaaS, FinTech, HealthTech, and Telecom product organizations.


🎯 Product Problem Statement

Subscription-based products experience revenue leakage when customers cancel early.

Key Product Questions:

  • Which customers are most likely to churn?
  • What signals predict churn behavior?
  • When is the highest-risk churn window?
  • What retention experiment should be prioritized?

📈 North Star Metric (NSM)

  • Retention Rate / Active Subscription Rate

Supporting Metrics:

  • Monthly Recurring Revenue (MRR)
  • Customer Churn Rate
  • Customer Lifetime Value (CLV)
  • Average Revenue Per User (ARPU)
  • Tenure Distribution
  • Contract Conversion Rate

📊 Business Context Simulation

Assume:

  • 10,000 active customers
  • 26% churn rate
  • $70 average monthly revenue

Each retained customer is worth up to $840 per year ($70 × 12), so a 5% reduction in churn preserves significant annual revenue.

This project provides predictive modeling to enable proactive intervention before churn occurs.
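
To make the simulation concrete, here is a minimal sketch of the arithmetic under the assumptions listed above. The README does not say whether the 26% churn rate is annual or whether the 5% reduction is relative or in percentage points, so both readings are computed. The function name and the simplification that every retained customer pays for a full 12 months are assumptions made for illustration only.

```python
def annual_revenue_preserved(customers, churn_rate, monthly_revenue,
                             reduction, relative=True):
    """Revenue kept per year when churn drops by `reduction`.

    Assumes (hypothetically) that churn_rate is annual and that every
    retained customer keeps paying for a full 12 months.
    """
    if relative:
        new_rate = churn_rate * (1 - reduction)   # 26% -> 24.7%
    else:
        new_rate = churn_rate - reduction         # 26% -> 21%
    retained = customers * (churn_rate - new_rate)
    return retained * monthly_revenue * 12


preserved_relative = annual_revenue_preserved(10_000, 0.26, 70, 0.05, relative=True)
preserved_absolute = annual_revenue_preserved(10_000, 0.26, 70, 0.05, relative=False)
print(f"5% relative reduction:      ${preserved_relative:,.0f}")
print(f"5 percentage-point reduction: ${preserved_absolute:,.0f}")
```

Under these simplifications the two readings differ by roughly a factor of four, which is why it is worth stating the definition of "reduction" before quoting a revenue figure.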

🔍 Exploratory Product Insights

Key findings from EDA:

  • Month-to-month contracts have the highest churn probability
  • Short-tenure customers churn within the early lifecycle stage
  • Higher monthly charges correlate with increased churn
  • Electronic check payment method shows elevated churn behavior

These insights inform targeted retention strategies.
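
Findings like these typically come from comparing churn rates across segments. The sketch below shows one way to do that with pandas. The column names (`Contract`, `tenure`, `Churn`), the tenure bands, and the toy rows are all assumptions invented to show the mechanics; they are not the project's data or results.

```python
import pandas as pd

# Toy rows invented purely to show the mechanics; not the project's data.
df = pd.DataFrame({
    "Contract": ["Month-to-month", "Month-to-month", "One year",
                 "Two year", "Month-to-month", "One year"],
    "tenure":   [2, 5, 30, 60, 14, 8],   # months as a customer
    "Churn":    [1, 1, 0, 0, 0, 1],      # 1 = churned
})

# The mean of a 0/1 flag is the share of churners in each group.
churn_by_contract = df.groupby("Contract")["Churn"].mean()

# Bucket tenure into lifecycle stages, then compare churn across buckets.
df["tenure_band"] = pd.cut(df["tenure"], bins=[0, 12, 24, 72],
                           labels=["0-12", "13-24", "25-72"])
churn_by_tenure = df.groupby("tenure_band", observed=True)["Churn"].mean()

print(churn_by_contract)
print(churn_by_tenure)
```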


🤖 Machine Learning Approach

Models Evaluated:

  • Logistic Regression
  • Random Forest
  • Decision Tree
  • Gradient Boosting (if applicable)

Evaluation Metrics:

  • Accuracy
  • Precision
  • Recall
  • F1 Score
  • ROC-AUC

The selected model achieved strong ROC-AUC performance and improved churn risk identification.
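
For readers less familiar with the metrics listed above, the following sketch computes them from scratch on a handful of made-up labels and scores. The helper names and the toy values are assumptions for illustration and say nothing about this project's actual results. In practice `sklearn.metrics` provides equivalent functions.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 for binary labels (1 = churn)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)
    fp = sum(t == 0 and p == 1 for t, p in pairs)
    fn = sum(t == 1 and p == 0 for t, p in pairs)
    tn = sum(t == 0 and p == 0 for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": (tp + tn) / len(pairs),
            "precision": precision, "recall": recall, "f1": f1}


def roc_auc(y_true, scores):
    """Chance that a random churner is scored above a random non-churner
    (ties count as half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy labels and predicted churn probabilities, invented for illustration.
y_true = [1, 0, 1, 0, 0, 1, 0, 0]
scores = [0.9, 0.2, 0.4, 0.6, 0.1, 0.8, 0.3, 0.2]
y_pred = [int(s >= 0.5) for s in scores]   # 0.5 threshold is an assumption

metrics = classification_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(metrics, auc)
```

Precision, recall, and F1 depend on the chosen threshold, while ROC-AUC summarizes ranking quality across all thresholds. For churn, recall on the churner class often matters more than raw accuracy because the classes are imbalanced.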


🧠 Feature Importance & Driver Analysis

Top predictors of churn:

  • Contract Type
  • Tenure
  • Monthly Charges
  • Internet Service
  • Payment Method

Interpretation:

Churn behavior is influenced primarily by lifecycle stage and contract structure rather than demographics, indicating product and pricing optimization opportunities.


🧪 Experimentation Strategy (Product Lens)

Instead of stopping at prediction, this project proposes actionable experiments:

1๏ธโƒฃ Contract Conversion Incentive

  • Offer discounted annual plans to month-to-month customers.

Hypothesis:

  • Customers who move to long-term contracts will churn at a lower rate.

2๏ธโƒฃ Early Lifecycle Engagement Experiment

  • Target customers within the first 90 days with onboarding nudges.

Hypothesis:

  • Improved onboarding engagement increases long-term retention.

3๏ธโƒฃ High-Value Customer Retention Offer

  • Provide personalized loyalty offers to high-ARPU customers flagged as high churn risk.

Hypothesis:

  • Targeted retention incentives preserve revenue efficiently.

All experiments can be validated using A/B testing frameworks.
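
As one possible way to validate such an experiment, the sketch below runs a two-sided two-proportion z-test on churn counts from a control group and a treatment group. The README does not name a specific test, so this choice, the function name, and the counts are all assumptions for illustration.

```python
from math import erf, sqrt


def two_proportion_z_test(churned_a, n_a, churned_b, n_b):
    """Two-sided z-test for a difference in churn rate between two groups."""
    p_a, p_b = churned_a / n_a, churned_b / n_b
    pooled = (churned_a + churned_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    # Standard normal CDF expressed through the error function.
    p_value = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))
    return z, p_value


# Hypothetical counts: control (A) vs. customers offered an annual-plan
# discount (B). These numbers are made up, not experiment results.
z, p_value = two_proportion_z_test(churned_a=260, n_a=1000,
                                   churned_b=215, n_b=1000)
print(f"z = {z:.2f}, p = {p_value:.4f}")
```

A real experiment would also fix the sample size and significance level before launch, and would weigh the cost of the incentive against the revenue retained.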

๐Ÿ—๏ธ End-to-End Workflow

  • Data Cleaning & Preprocessing
  • Exploratory Data Analysis (EDA) with business interpretation
  • Feature Engineering (contract type, tenure segmentation, billing patterns)
  • Encoding & Scaling
  • Train-Test Split
  • Model Training (Logistic Regression, Random Forest, etc.)
  • Model Evaluation (ROC-AUC, Precision, Recall, F1)
  • Feature Importance Analysis
  • Product-Level Interpretation & Strategy Recommendation
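
The workflow above can be sketched with a scikit-learn pipeline. This is a minimal illustration on synthetic data, not the notebook's actual code: the column names, the data-generating rule, and the choice of logistic regression as the single model are assumptions.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Synthetic stand-in data; column names are assumptions, not the repo's schema.
rng = np.random.default_rng(0)
n = 1000
df = pd.DataFrame({
    "tenure": rng.integers(1, 73, n),
    "MonthlyCharges": rng.uniform(20, 120, n),
    "Contract": rng.choice(["Month-to-month", "One year", "Two year"], n),
})
# Make churn likelier for short-tenure, month-to-month customers.
logit = 0.5 - 0.05 * df["tenure"] + 1.5 * (df["Contract"] == "Month-to-month")
df["Churn"] = (rng.random(n) < 1 / (1 + np.exp(-logit))).astype(int)

X, y = df.drop(columns="Churn"), df["Churn"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

# Encoding and scaling live inside the pipeline so they are fit on the
# training split only, which avoids leaking test-set statistics.
preprocess = ColumnTransformer([
    ("num", StandardScaler(), ["tenure", "MonthlyCharges"]),
    ("cat", OneHotEncoder(handle_unknown="ignore"), ["Contract"]),
])
model = Pipeline([
    ("preprocess", preprocess),
    ("clf", LogisticRegression(max_iter=1000)),
])
model.fit(X_train, y_train)

test_auc = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])
print(f"ROC-AUC on held-out data: {test_auc:.3f}")
```

Swapping `LogisticRegression` for `RandomForestClassifier` or another estimator only changes the `"clf"` step, which makes it straightforward to compare the models listed earlier on the same split.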

🛠️ Tech Stack

  • Python
  • Pandas
  • NumPy
  • Scikit-learn
  • Matplotlib
  • Seaborn
  • Jupyter Notebook
  • Streamlit (optional deployment)

📂 Project Structure

Customer-Churn/
│
├── data/
├── notebooks/
├── models/
├── app.py
├── requirements.txt
└── README.md

📊 Business Impact Summary

If implemented in production, this system enables:

  • Proactive churn detection
  • Targeted retention campaigns
  • Revenue preservation
  • Experiment-driven product decisions
  • Lifecycle-based segmentation

Even a 3–5% reduction in churn materially increases annual recurring revenue and customer lifetime value.


🔮 Future Enhancements

  • SHAP-based model explainability
  • Survival analysis for churn timing
  • Uplift modeling for retention targeting
  • Real-time churn scoring API
  • Automated A/B testing simulator integration
  • Deployment with FastAPI

🎯 Skills Demonstrated

  • Product Data Science
  • Customer Retention Analytics
  • Churn Modeling
  • Predictive Modeling
  • Feature Engineering
  • Machine Learning
  • Experiment Design
  • Revenue Optimization
  • Business Analytics
  • Python

How to Run

Clone the repository:

  • git clone https://github.com/Denis0242/Customer-Churn.git
  • cd Customer-Churn

Install dependencies:

  • pip install -r requirements.txt

Run the notebook:

  • jupyter notebook

If using Streamlit (optional):

  • streamlit run app.py

👤 Author

Denis Agyapong

Product Data Scientist | Data Analyst

Oakland, CA

About

📊 End-to-end Customer Churn Prediction project using Python, SQL-style analysis, and Machine Learning to identify at-risk users, improve retention strategy, and support product decision-making.
