
This case study focuses on analyzing retail transactions to uncover insights about customer behavior, product performance, and store-level trends. The goal is to help the business make data-driven decisions by understanding transaction patterns across cities, stores, product categories, and time periods.


Dipesh-Ydv/RetailTransaction-Data-Analysis-Python


🛒 Retail Store Transaction Analysis

📘 Business Problem

A retail store needs to analyze its daily transactions and track customer behavior across various locations. This includes analyzing purchases and returns across multiple product categories. The goal is to derive business insights that can help improve customer understanding, optimize product offerings, and enhance store operations.


🎯 Objective

To perform a comprehensive data analysis by:

  • Merging and transforming multiple datasets
  • Generating descriptive statistics and visualizations
  • Extracting key business metrics and actionable insights

📂 Datasets Used

  • Customers.csv: Contains customer demographics and city codes
  • Product_cat_info.csv: Contains information about product categories and subcategories
  • Transactions.csv: Contains detailed transaction data including date, store type, and transaction amount

🛠️ Tasks Performed

🔁 1. Merge Datasets

  • Merged Customers, Product_cat_info, and Transactions into a single dataset Customer_Final
  • Used inner join to include only customers with valid transactions
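The merge step could be sketched roughly as follows. The README does not list the column names, so every column here (`customer_Id`, `cust_id`, `prod_cat_code`, `prod_subcat_code`, `prod_sub_cat_code`, and so on) is an assumption, and the tiny in-memory frames stand in for the three CSV files.

```python
import pandas as pd

# Tiny stand-ins for the three CSV files; all column names are assumptions.
customers = pd.DataFrame({
    "customer_Id": [1, 2, 3],
    "Gender": ["M", "F", "M"],
    "city_code": [1, 2, 1],
})
prod_cat_info = pd.DataFrame({
    "prod_cat_code": [1, 1, 2],
    "prod_sub_cat_code": [10, 11, 10],
    "prod_cat": ["Clothing", "Clothing", "Electronics"],
    "prod_subcat": ["Women", "Mens", "Mobiles"],
})
transactions = pd.DataFrame({
    "transaction_id": [100, 101, 102],
    "cust_id": [1, 2, 9],  # customer 9 has no row in customers
    "prod_cat_code": [1, 2, 1],
    "prod_subcat_code": [10, 10, 11],
    "total_amt": [250.0, 900.0, -120.0],
})

# Attach category names, matching on the (category, subcategory) code pair.
tx_with_cat = transactions.merge(
    prod_cat_info,
    left_on=["prod_cat_code", "prod_subcat_code"],
    right_on=["prod_cat_code", "prod_sub_cat_code"],
    how="left",
)

# The inner join drops transactions without a matching customer
# and customers without any transaction.
customer_final = tx_with_cat.merge(
    customers, left_on="cust_id", right_on="customer_Id", how="inner"
)
print(customer_final[["transaction_id", "cust_id", "prod_cat", "Gender"]])
```

In this sample, the transaction for the unknown customer 9 and the customer 3 with no transactions both fall out of the result.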

📊 2. Summary Report

  • a. Columns Overview: Listed column names and data types
  • b. Data Preview: Displayed top 10 and bottom 10 rows
  • c. Five-number Summary: Calculated min, Q1, median, Q3, and max for all continuous variables
  • d. Frequency Tables: Generated frequency distributions for all categorical variables
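One possible way to produce these four report pieces with pandas is sketched below. The sample frame and its column names (`Qty`, `total_amt`, `Store_type`) are assumptions used only for illustration.

```python
import pandas as pd

# Illustrative data; the column names are assumptions.
df = pd.DataFrame({
    "Qty": [1, 2, 3, 4, 5],
    "total_amt": [100.0, 200.0, 300.0, 400.0, 500.0],
    "Store_type": ["e-Shop", "MBR", "e-Shop", "Flagship store", "e-Shop"],
})

# a. Column names and data types
column_types = df.dtypes

# b. Top 10 and bottom 10 rows (fewer if the frame is shorter)
top_rows, bottom_rows = df.head(10), df.tail(10)

# c. Five-number summary for the numeric (continuous) columns
continuous = df.select_dtypes(include="number")
five_number = continuous.quantile([0, 0.25, 0.5, 0.75, 1.0])
five_number.index = ["min", "Q1", "median", "Q3", "max"]

# d. One frequency table per non-numeric (categorical) column
categorical = df.select_dtypes(exclude="number")
frequency_tables = {
    col: categorical[col].value_counts() for col in categorical.columns
}

print(column_types)
print(five_number)
print(frequency_tables["Store_type"])
```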

📉 3. Visualizations

  • Histograms for continuous variables
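A minimal sketch of the histogram step with Matplotlib, assuming the continuous variables are columns named `Qty` and `total_amt` (hypothetical names) and using made-up values:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Illustrative data; the column names are assumptions.
df = pd.DataFrame({
    "Qty": [1, 2, 2, 3, 4, 5],
    "total_amt": [120.0, 250.0, 260.0, 400.0, 800.0, 950.0],
})
continuous_cols = ["Qty", "total_amt"]

# One histogram per continuous column, side by side.
fig, axes = plt.subplots(1, len(continuous_cols), figsize=(8, 3))
for ax, col in zip(axes, continuous_cols):
    ax.hist(df[col], bins=5)
    ax.set_title(col)
    ax.set_xlabel(col)
    ax.set_ylabel("Frequency")
fig.tight_layout()
```

In a Jupyter Notebook the `matplotlib.use("Agg")` line can be dropped so the figure renders inline.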

📆 4. Transaction Metrics

  • Time Period: Identified the start and end dates of available transaction data
  • Negative Transactions: Counted transactions with negative total amounts
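Both metrics reduce to a couple of pandas calls. The sketch below assumes a date column `tran_date` stored as day-month-year strings and an amount column `total_amt`; the names, the date format, and the values are all illustrative assumptions.

```python
import pandas as pd

# Illustrative data; column names and the date format are assumptions.
tx = pd.DataFrame({
    "tran_date": ["28-02-2014", "25-01-2011", "13-06-2012"],
    "total_amt": [-772.0, 1500.0, 320.5],
})

# Parse the dates explicitly so min/max compare dates rather than strings.
tx["tran_date"] = pd.to_datetime(tx["tran_date"], format="%d-%m-%Y")

start_date = tx["tran_date"].min()
end_date = tx["tran_date"].max()

# A negative total amount is treated here as a return.
negative_count = int((tx["total_amt"] < 0).sum())

print(start_date.date(), end_date.date(), negative_count)
```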

👨‍👩‍👧‍👦 5. Gender-Based Product Analysis

  • Compared product category popularity between female and male customers
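"Popularity" can be measured in more than one way. The sketch below assumes it means total quantity bought and pivots that by gender; the columns `Gender`, `prod_cat`, and `Qty` and the sample values are assumptions.

```python
import pandas as pd

# Illustrative data; the column names are assumptions.
df = pd.DataFrame({
    "Gender": ["M", "F", "F", "M", "F"],
    "prod_cat": ["Books", "Books", "Clothing", "Electronics", "Books"],
    "Qty": [2, 1, 3, 1, 4],
})

# Rows: product category, columns: gender, cells: total quantity bought.
qty_by_gender = df.pivot_table(
    index="prod_cat", columns="Gender", values="Qty",
    aggfunc="sum", fill_value=0,
)

# Categories where female customers bought more units than male customers.
more_popular_with_women = qty_by_gender[
    qty_by_gender["F"] > qty_by_gender["M"]
].index.tolist()

print(qty_by_gender)
print(more_popular_with_women)
```

Counting transactions instead of summing `Qty` would be an equally reasonable definition and may rank categories differently.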

🏙️ 6. City Code Dominance

  • Found the city code with the maximum number of customers
  • Calculated the percentage share of customers from that city
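As a rough illustration, the dominant city and its share can be computed from a customer table like this one. The column names `customer_Id` and `city_code` and the values are assumptions.

```python
import pandas as pd

# Illustrative data; the column names are assumptions.
customers = pd.DataFrame({
    "customer_Id": [1, 2, 3, 4, 5],
    "city_code": [3, 3, 5, 3, 7],
})

# Count distinct customers per city so repeated rows would not inflate the share.
customers_per_city = customers.groupby("city_code")["customer_Id"].nunique()

top_city = customers_per_city.idxmax()
top_city_share_pct = 100 * customers_per_city.max() / customers_per_city.sum()

print(top_city, top_city_share_pct)
```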

🏬 7. Store Type Performance

  • Determined the store type that sold the most products by:
    • Total transaction value
    • Total quantity sold
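The two rankings need not agree, which the following sketch shows on made-up data. The columns `Store_type`, `Qty`, and `total_amt` are assumed names.

```python
import pandas as pd

# Illustrative data; the column names are assumptions.
df = pd.DataFrame({
    "Store_type": ["e-Shop", "MBR", "e-Shop", "Flagship store", "MBR"],
    "Qty": [2, 5, 1, 3, 4],
    "total_amt": [1000.0, 400.0, 800.0, 900.0, 300.0],
})

# Total value and total quantity per store type.
by_store = df.groupby("Store_type")[["total_amt", "Qty"]].sum()

top_by_value = by_store["total_amt"].idxmax()
top_by_qty = by_store["Qty"].idxmax()

print(by_store)
print(top_by_value, top_by_qty)
```

Here one store type leads on value while another leads on units, so reporting both measures avoids hiding that difference.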

🧾 8. Revenue by Category & Store

  • Calculated total revenue from Electronics and Clothing categories in Flagship Stores
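A boolean mask is one straightforward way to express this filter. In the sketch below the column names and the store label `"Flagship store"` are assumptions, and returns (negative amounts) are left in so that the total is net revenue.

```python
import pandas as pd

# Illustrative data; column names and category/store labels are assumptions.
df = pd.DataFrame({
    "prod_cat": ["Electronics", "Clothing", "Books", "Electronics", "Clothing"],
    "Store_type": ["Flagship store", "Flagship store", "Flagship store",
                   "e-Shop", "Flagship store"],
    "total_amt": [1200.0, 300.0, 150.0, 700.0, -100.0],
})

# Keep rows that are in either category AND were sold in a flagship store.
mask = (
    df["prod_cat"].isin(["Electronics", "Clothing"])
    & (df["Store_type"] == "Flagship store")
)

flagship_revenue = df.loc[mask, "total_amt"].sum()
revenue_by_category = df.loc[mask].groupby("prod_cat")["total_amt"].sum()

print(flagship_revenue)
print(revenue_by_category)
```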

👨 9. Male Electronics Revenue

  • Computed total amount spent by male customers in the Electronics category
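Since several tasks are "total amount for rows matching some conditions", a small helper can keep the filters readable. `total_spent` is a hypothetical function written for this sketch, and the column names and values are assumptions.

```python
import pandas as pd

def total_spent(df: pd.DataFrame, **equals) -> float:
    """Sum total_amt over rows where every given column equals the given value."""
    mask = pd.Series(True, index=df.index)
    for column, value in equals.items():
        mask &= df[column] == value
    return float(df.loc[mask, "total_amt"].sum())

# Illustrative data; the column names are assumptions.
df = pd.DataFrame({
    "Gender": ["M", "F", "M", "M"],
    "prod_cat": ["Electronics", "Electronics", "Books", "Electronics"],
    "total_amt": [500.0, 900.0, 120.0, -50.0],
})

male_electronics_total = total_spent(df, Gender="M", prod_cat="Electronics")
print(male_electronics_total)
```

The negative row is a return, so it reduces the total rather than being ignored.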

🔢 10. Frequent Customers

  • Excluded all transactions with negative amounts
  • Identified customers with more than 10 unique transactions among the remaining records
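A minimal sketch of this count, assuming columns named `cust_id`, `transaction_id`, and `total_amt` and made-up data. Negative amounts are dropped first, and `nunique` guards against a transaction ID appearing on more than one row.

```python
import pandas as pd

# Illustrative data; the column names are assumptions.
tx = pd.DataFrame({
    "cust_id": [1] * 12 + [2] * 3,
    "transaction_id": list(range(100, 112)) + [200, 201, 201],
    "total_amt": [50.0] * 11 + [-50.0] + [20.0, 30.0, 30.0],
})

# Drop returns (negative amounts) before counting.
purchases = tx[tx["total_amt"] >= 0]

# Unique transaction IDs per customer.
tx_per_customer = purchases.groupby("cust_id")["transaction_id"].nunique()

# Customers with more than 10 unique transactions.
frequent = tx_per_customer[tx_per_customer > 10]
print(frequent)
```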

🧑‍🎓 11. Age-Based Spending (25–35 Years)

  • a. Category Spending: Total spent on Electronics and Books
  • b. Time-Range Spending: Total spent between 1st Jan 2014 and 1st Mar 2014
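The README does not say how age was derived, so the sketch below assumes a `DOB` column and computes age in whole years on the transaction date. It also treats both the age range and the date range as inclusive; the column names and values are illustrative.

```python
import pandas as pd

# Illustrative data; the column names are assumptions.
df = pd.DataFrame({
    "DOB": pd.to_datetime(["1985-06-15", "1970-01-01", "1988-03-10", "1984-12-01"]),
    "tran_date": pd.to_datetime(["2014-01-20", "2014-02-01", "2013-11-05", "2014-03-01"]),
    "prod_cat": ["Electronics", "Books", "Books", "Clothing"],
    "total_amt": [1000.0, 200.0, 300.0, 400.0],
})

# Age in whole years at the time of the transaction (an assumption):
# subtract one year if the birthday had not yet occurred that year.
tran_md = df["tran_date"].dt.month * 100 + df["tran_date"].dt.day
dob_md = df["DOB"].dt.month * 100 + df["DOB"].dt.day
df["age"] = (
    df["tran_date"].dt.year - df["DOB"].dt.year - (tran_md < dob_md).astype(int)
)

# Customers aged 25 to 35, bounds included.
young = df[df["age"].between(25, 35)]

# a. Total spent on Electronics and Books
category_total = young.loc[
    young["prod_cat"].isin(["Electronics", "Books"]), "total_amt"
].sum()

# b. Total spent between 1 Jan 2014 and 1 Mar 2014, bounds included
in_period = young["tran_date"].between("2014-01-01", "2014-03-01")
period_total = young.loc[in_period, "total_amt"].sum()

print(category_total, period_total)
```

Computing age against a fixed reference date instead of the transaction date would be another reasonable reading and could move customers in or out of the 25 to 35 band.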

💻 Technologies Used

  • Python
    • Pandas
    • NumPy
    • Matplotlib
    • Seaborn
  • Jupyter Notebook
  • CSV Files for input data

📈 Insights & Business Value

  • Identified high-performing store types and product categories
  • Derived customer preferences across gender and age groups
  • Located regions with high customer concentration
  • Highlighted opportunities to reduce return rates and increase transaction value

📁 Project Structure

Retail_Transaction_Analysis/
├── data/
│   ├── Customers.csv
│   ├── Product_cat_info.csv
│   └── Transactions.csv
├── notebooks/
│   └── Retail_Store_Analysis.ipynb
└── README.md

🔮 Future Scope

  • Apply machine learning for customer segmentation and lifetime value prediction
  • Build recommendation systems for cross-sell and up-sell strategies
  • Perform real-time analytics using streaming data solutions

🙌 Acknowledgements

Thanks to the data science and open-source communities for enabling powerful data analysis using Python and its libraries.

