A retailer has hired me to create customer segments (clusters) using a data-driven approach. By analyzing past transaction-level purchase data, I aim to identify distinct customer groups based on both aggregate sales patterns and specific items purchased.
This project leverages Unsupervised Learning techniques, specifically K-Means Clustering, to uncover meaningful customer segments.
The input dataset is available in the Files folder. It contains historical transaction data, which is analyzed to extract customer insights.
The project follows these steps:
Exploratory Data Analysis (EDA): Understanding the data, handling missing values, and visualizing patterns.
Feature Engineering: Extracting relevant customer-level features from transactional data.
Scaling & Preprocessing: Standardizing features so that no single feature dominates the distance calculations K-Means relies on.
Applying K-Means Clustering: Identifying distinct customer segments.
Evaluation & Visualization: Analyzing segment characteristics using visualizations.
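As a rough illustration of how the feature engineering, scaling, and clustering steps fit together, here is a minimal sketch on synthetic data. The column names (`customer_id`, `invoice_id`, `item_id`, `sales`), the engineered features, and the choice of three clusters are assumptions made for this example. They are not taken from the actual dataset or notebook.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Synthetic transaction-level data (hypothetical columns, not the real dataset).
rng = np.random.default_rng(0)
n_rows = 600
transactions = pd.DataFrame({
    "customer_id": rng.integers(1, 61, size=n_rows),
    "invoice_id": rng.integers(1, 301, size=n_rows),
    "item_id": rng.integers(1, 21, size=n_rows),
    "sales": rng.gamma(shape=2.0, scale=25.0, size=n_rows).round(2),
})

# Feature engineering: roll transactions up to one row per customer.
customer_features = transactions.groupby("customer_id").agg(
    total_sales=("sales", "sum"),
    avg_sale=("sales", "mean"),
    n_invoices=("invoice_id", "nunique"),
    n_unique_items=("item_id", "nunique"),
)

# Scaling: give every feature zero mean and unit variance.
scaler = StandardScaler()
X = scaler.fit_transform(customer_features)

# Clustering: k=3 is an arbitrary choice for this sketch.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
customer_features["segment"] = kmeans.fit_predict(X)

# Number of customers assigned to each segment.
print(customer_features["segment"].value_counts().sort_index())
```

In practice the number of clusters would be chosen by comparing several values of `n_clusters`, for example with the inertia (elbow) curve or silhouette scores.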
The entire implementation, from data analysis to clustering, is available in the Jupyter Notebook.
Python
Pandas & NumPy
Matplotlib & Seaborn
Scikit-learn (for K-Means Clustering)
The final model successfully segments customers into different groups based on their purchasing behavior. These insights can help the retailer optimize marketing strategies, improve customer engagement, and enhance business decision-making.
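One simple way to turn cluster labels into insights like these is to profile each segment by its average behavior. The sketch below uses a tiny hand-made table. The customers, feature names, and `segment` labels are hypothetical and only stand in for the output of the clustering step.

```python
import pandas as pd

# Hypothetical customer-level table with segment labels already assigned.
customers = pd.DataFrame({
    "customer_id": [1, 2, 3, 4, 5, 6],
    "total_sales": [1200.0, 950.0, 80.0, 60.0, 400.0, 450.0],
    "n_invoices": [24, 18, 2, 1, 8, 9],
    "segment": [0, 0, 1, 1, 2, 2],
})

# Profile each segment: size and average behavior.
profile = customers.groupby("segment").agg(
    n_customers=("customer_id", "count"),
    avg_total_sales=("total_sales", "mean"),
    avg_invoices=("n_invoices", "mean"),
)

# Share of overall revenue contributed by each segment.
segment_sales = customers.groupby("segment")["total_sales"].sum()
profile["sales_share"] = segment_sales / customers["total_sales"].sum()

print(profile.round(2))
```

A table like this makes it easier to give each segment a descriptive label, such as high-value frequent buyers versus occasional low-spend customers, before deciding how to target them.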