This project is a beginner-friendly data analysis of the Superstore dataset from Kaggle.
The goal is to analyze sales trends, top-performing products, customer segments, and regional contributions to provide actionable business insights.
- Source: Kaggle Superstore dataset
- Columns included:
  - Order ID, Order Date, Ship Date, Ship Mode
  - Customer ID, Customer Name, Segment, Country, City, State, Region
  - Product ID, Category, Sub-Category, Product Name, Sales
Please note: The dataset used in this project is included as train.csv.
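A minimal sketch of loading the file with Pandas. The helper name `load_superstore` is hypothetical, and `dayfirst=True` is an assumption about how the dates in train.csv are written (day/month/year); adjust it if your copy differs.

```python
import pandas as pd


def load_superstore(path="train.csv"):
    """Read the CSV and parse the two date columns (hypothetical helper)."""
    df = pd.read_csv(path)
    for col in ("Order Date", "Ship Date"):
        # Assumption: dates are written day-first, e.g. 08/11/2017 = 8 Nov 2017.
        # Values that cannot be parsed become NaT instead of raising.
        df[col] = pd.to_datetime(df[col], dayfirst=True, errors="coerce")
    return df


# Usage, with train.csv in the working directory:
# df = load_superstore()
# df.info()
```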
- Python 3
- Jupyter Notebook
- Pandas (data manipulation)
- Matplotlib & Seaborn (visualization)
- Data Cleaning
  - Removed duplicate rows
  - Ensured numeric columns are correct
  - Checked for missing values
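The cleaning steps above could look roughly like this in Pandas. This is a sketch, not the notebook's exact code: `clean_sales` is a hypothetical helper, and the small `demo` frame holds made-up values rather than rows from the dataset.

```python
import pandas as pd


def clean_sales(df):
    """Drop exact duplicate rows and coerce Sales to a numeric dtype."""
    cleaned = df.drop_duplicates().copy()
    # Non-numeric entries become NaN so they show up in the missing-value check.
    cleaned["Sales"] = pd.to_numeric(cleaned["Sales"], errors="coerce")
    return cleaned


# Made-up rows for illustration only.
demo = pd.DataFrame(
    {
        "Order ID": ["A-1", "A-1", "A-2"],
        "Sales": ["10.5", "10.5", "oops"],
    }
)

cleaned = clean_sales(demo)
missing = cleaned.isna().sum()  # missing values per column
print(missing)
```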
- Sales Analysis
  - Total sales by Region
  - Total sales by Category and Sub-Category
  - Monthly sales trends over time
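One way to compute these aggregates is a `groupby` on the relevant columns. The helpers `sales_by` and `monthly_sales` are hypothetical names, the sketch assumes `Order Date` has already been parsed as a datetime column, and the `demo` values are made up.

```python
import pandas as pd


def sales_by(df, columns):
    """Total Sales per group, largest first. `columns` is a name or list of names."""
    return df.groupby(columns)["Sales"].sum().sort_values(ascending=False)


def monthly_sales(df):
    """Total Sales per calendar month, in chronological order."""
    month = df["Order Date"].dt.to_period("M")
    return df.groupby(month)["Sales"].sum()


# Made-up rows for illustration only.
demo = pd.DataFrame(
    {
        "Order Date": pd.to_datetime(["2017-01-05", "2017-01-20", "2017-02-03"]),
        "Region": ["West", "East", "West"],
        "Category": ["Technology", "Furniture", "Technology"],
        "Sub-Category": ["Copiers", "Chairs", "Phones"],
        "Sales": [100.0, 50.0, 25.0],
    }
)

by_region = sales_by(demo, "Region")
by_category = sales_by(demo, ["Category", "Sub-Category"])
monthly = monthly_sales(demo)
print(by_region, by_category, monthly, sep="\n\n")
```

Calling `monthly.plot()` in a notebook would then draw the trend line.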
- Top Performers
  - Top 10 products by sales
  - Top 10 customers by sales
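A possible way to rank products or customers is to sum Sales per name and keep the largest values. `top_n` is a hypothetical helper and the `demo` rows are made up.

```python
import pandas as pd


def top_n(df, column, n=10):
    """Return the n groups in `column` with the highest total Sales."""
    return df.groupby(column)["Sales"].sum().nlargest(n)


# Made-up rows for illustration only.
demo = pd.DataFrame(
    {
        "Product Name": ["P1", "P2", "P1", "P3"],
        "Customer Name": ["Ann", "Bob", "Bob", "Ann"],
        "Sales": [10.0, 30.0, 5.0, 1.0],
    }
)

top_products = top_n(demo, "Product Name", n=2)
top_customers = top_n(demo, "Customer Name", n=2)
print(top_products, top_customers, sep="\n\n")
```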
- Segment & Shipping Analysis
  - Sales contribution by customer segment
  - Sales contribution by shipping mode
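"Contribution" here can be read as each group's percentage of total sales. A sketch under that assumption, with a hypothetical helper `sales_share` and made-up `demo` values:

```python
import pandas as pd


def sales_share(df, column):
    """Percentage of total Sales contributed by each value in `column`."""
    totals = df.groupby(column)["Sales"].sum()
    return (totals / totals.sum() * 100).sort_values(ascending=False)


# Made-up rows for illustration only.
demo = pd.DataFrame(
    {
        "Segment": ["Consumer", "Corporate", "Home Office", "Consumer"],
        "Ship Mode": ["Standard Class", "First Class", "Standard Class", "Same Day"],
        "Sales": [40.0, 30.0, 10.0, 20.0],
    }
)

segment_share = sales_share(demo, "Segment")
ship_share = sales_share(demo, "Ship Mode")
print(segment_share.round(1), ship_share.round(1), sep="\n\n")
```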
- Optional Visualizations
  - Regional heatmap of sales by category
  - City-wise and state-wise top sales
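The regional heatmap can be built from a Region x Category pivot table passed to Seaborn. This is a sketch: both helper names are hypothetical, the colour map and title are arbitrary choices, and the `demo` values are made up.

```python
import pandas as pd


def region_category_matrix(df):
    """Pivot total Sales into a Region (rows) x Category (columns) table."""
    return df.pivot_table(
        index="Region",
        columns="Category",
        values="Sales",
        aggfunc="sum",
        fill_value=0,
    )


def plot_heatmap(matrix):
    """Draw the matrix as an annotated heatmap."""
    # Imported here so the pivot step works without the plotting libraries.
    import matplotlib.pyplot as plt
    import seaborn as sns

    ax = sns.heatmap(matrix, annot=True, fmt=".0f", cmap="Blues")
    ax.set_title("Sales by Region and Category")
    plt.tight_layout()
    plt.show()


# Made-up rows for illustration only.
demo = pd.DataFrame(
    {
        "Region": ["West", "West", "East", "East"],
        "Category": ["Technology", "Furniture", "Technology", "Technology"],
        "Sales": [100.0, 40.0, 30.0, 20.0],
    }
)

matrix = region_category_matrix(demo)
print(matrix)
# plot_heatmap(matrix)  # uncomment in a notebook to render the figure
```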
- Region with highest sales: West
- Category with highest sales: Technology
- Top-selling product: Canon imageCLASS 2200 Advanced Copier
- Top customer: Sean Miller
- Segment contributing most sales: Consumer
- Sales trend: Monthly sales show seasonal fluctuations with peaks during key periods.
- Clone this repository:
git clone https://github.com/shreyagh1/SuperstoreSales_DataAnalysis.git