This project aims to predict customer churn for a telecom company using machine learning models. The dataset used is the Telco Customer Churn dataset from Kaggle. The models implemented include Random Forest and XGBoost, and performance is evaluated using accuracy, classification reports, and confusion matrices.
- Data Preprocessing:
  - Handling missing values
  - Encoding categorical variables
  - Standardizing numerical features
- Machine Learning Models:
  - Random Forest Classifier
  - XGBoost Classifier
- Evaluation Metrics:
  - Accuracy Score
  - Classification Report
  - Confusion Matrix
- Feature Importance Analysis
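
The preprocessing, modelling, and evaluation steps listed above can be sketched roughly as follows. This is a minimal illustration, not the code in `telco_churn_prediction.py`: it runs on a small synthetic table, and the column names (`tenure`, `MonthlyCharges`, `Contract`), the median imputation, and the 80/20 split are assumptions chosen for the example.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.impute import SimpleImputer
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Synthetic stand-in for the real CSV; column names are assumptions.
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "tenure": rng.integers(1, 72, size=n),
    "MonthlyCharges": rng.uniform(20, 120, size=n),
    "Contract": rng.choice(["Month-to-month", "One year", "Two year"], size=n),
})
# Toy rule so the example has a learnable signal: short tenure means churn.
df["Churn"] = np.where(df["tenure"] < 24, "Yes", "No")
# Inject a few missing values to exercise the imputation step.
df.loc[df.index[:5], "MonthlyCharges"] = np.nan

numeric_cols = ["tenure", "MonthlyCharges"]
categorical_cols = ["Contract"]

# Handle missing values, standardize numeric features, encode categoricals.
preprocess = ColumnTransformer([
    ("num", Pipeline([
        ("impute", SimpleImputer(strategy="median")),
        ("scale", StandardScaler()),
    ]), numeric_cols),
    ("cat", OneHotEncoder(handle_unknown="ignore"), categorical_cols),
])

model = Pipeline([
    ("preprocess", preprocess),
    ("clf", RandomForestClassifier(n_estimators=100, random_state=42)),
])

X = df[numeric_cols + categorical_cols]
y = (df["Churn"] == "Yes").astype(int)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

model.fit(X_train, y_train)
y_pred = model.predict(X_test)

# The three evaluation outputs named in the list above.
accuracy = accuracy_score(y_test, y_pred)
report = classification_report(y_test, y_pred, target_names=["No churn", "Churn"])
cm = confusion_matrix(y_test, y_pred)

print(f"Accuracy: {accuracy:.3f}")
print(report)
print(cm)
```

Because `xgboost.XGBClassifier` follows the scikit-learn estimator interface, it could be swapped in for the `clf` step to compare the two models on the same preprocessing.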
To run this project locally, follow these steps:
Ensure you have Python installed along with the following libraries:
    pip install pandas numpy matplotlib seaborn scikit-learn xgboost

    git clone https://github.com/Adithya-5369/CustomerChurnModel.git
    cd CustomerChurnModel

- Ensure the Telco Customer Churn dataset is placed in the data/ directory.
- Run the Python script to preprocess the data, train models, and evaluate results.

    python telco_churn_prediction.py

The dataset consists of customer details, contract information, and service usage features. The target variable is Churn, indicating whether a customer leaves the service.
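
As a rough sketch of how the dataset and its `Churn` target might be loaded, the helper below reads a CSV and converts the target to 0/1. The function name `load_churn_data` is hypothetical, and two details are assumptions about the file rather than facts stated here: that `Churn` holds `"Yes"`/`"No"` strings, and that a `TotalCharges` column may contain blank entries that need coercing to numbers.

```python
import io

import pandas as pd


def load_churn_data(source):
    """Read the churn CSV from a path or file-like object (hypothetical helper)."""
    df = pd.read_csv(source)
    # Assumption: TotalCharges may arrive as text with blank entries;
    # coerce it to numeric so blanks become NaN.
    if "TotalCharges" in df.columns:
        df["TotalCharges"] = pd.to_numeric(df["TotalCharges"], errors="coerce")
    # Assumption: the target is stored as "Yes"/"No" strings.
    df["Churn"] = df["Churn"].map({"Yes": 1, "No": 0})
    return df


# In-memory demo so the sketch runs without the real file.
demo_csv = io.StringIO("tenure,TotalCharges,Churn\n1,29.85,No\n0, ,No\n5,100.5,Yes\n")
demo = load_churn_data(demo_csv)
print(demo["Churn"].value_counts(normalize=True))
```

With the real file in place, the same helper could be called as `load_churn_data("data/Telco-Customer-Churn.csv")`; checking the class balance first is useful because accuracy alone can mislead when one class dominates.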
- Model performance is evaluated using confusion matrices and feature importance plots.
- The best-performing model can be fine-tuned further for better results.
The project provides visualizations such as:
- Confusion matrices for model evaluation
- Feature importance charts for interpretability
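
The two plot types above could be produced along these lines. This is a sketch on synthetic data from `make_classification`, with placeholder feature names and a hypothetical output file name; the project's own script may draw them differently (for example with seaborn).

```python
import tempfile
from pathlib import Path

import matplotlib
matplotlib.use("Agg")  # render off-screen so no display is required
import matplotlib.pyplot as plt
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import ConfusionMatrixDisplay, confusion_matrix
from sklearn.model_selection import train_test_split

# Synthetic data with placeholder feature names (assumptions for illustration).
X, y = make_classification(n_samples=300, n_features=6, n_informative=3, random_state=0)
feature_names = [f"feature_{i}" for i in range(X.shape[1])]
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)

clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_train, y_train)
cm = confusion_matrix(y_test, clf.predict(X_test))

fig, (ax_cm, ax_imp) = plt.subplots(1, 2, figsize=(10, 4))

# Left panel: confusion matrix on the held-out split.
ConfusionMatrixDisplay(confusion_matrix=cm, display_labels=["No churn", "Churn"]).plot(
    ax=ax_cm, colorbar=False
)
ax_cm.set_title("Confusion matrix")

# Right panel: impurity-based feature importances, sorted ascending.
order = np.argsort(clf.feature_importances_)
ax_imp.barh(np.array(feature_names)[order], clf.feature_importances_[order])
ax_imp.set_title("Feature importance")

fig.tight_layout()
out_path = Path(tempfile.gettempdir()) / "churn_evaluation.png"  # hypothetical file name
fig.savefig(out_path)
```

`XGBClassifier` also exposes a `feature_importances_` attribute after fitting, so the right-hand panel would work unchanged for that model.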
TelcoChurnPrediction/
├── data/                          # Directory containing dataset files
│   └── Telco-Customer-Churn.csv   # Dataset file
├── telco_churn_prediction.py      # Main script
└── README.md                      # Project documentation
- Implement additional models like Logistic Regression and Neural Networks.
- Tune hyperparameters to improve model accuracy.
- Deploy as a web application using Flask or Streamlit.
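
For the hyperparameter tuning item above, one possible starting point is a small cross-validated grid search. The grid values, the 3-fold setting, and the synthetic data below are illustrative assumptions, not tuned recommendations for this dataset.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

# Synthetic data standing in for the preprocessed churn features.
X, y = make_classification(n_samples=200, n_features=8, random_state=0)

# Deliberately tiny grid so the example runs quickly.
param_grid = {"n_estimators": [50, 100], "max_depth": [3, None]}

search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid,
    cv=3,
    scoring="accuracy",
)
search.fit(X, y)

print(search.best_params_, round(search.best_score_, 3))
```

If churners are a minority class in the real data, a scoring option such as `"f1"` or `"roc_auc"` may be a better tuning target than accuracy.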
This project is licensed under the MIT License.
You are free to use, modify, and distribute this code with attribution.