Skip to content

This project analyzes over 1 million rows of Apple retail sales data using advanced SQL techniques. It covers key insights on sales trends, product performance, store analysis, and warranty claims across various Apple locations. The project demonstrates SQL proficiency in joins, window functions, aggregations, and query optimization for large-scale

Notifications You must be signed in to change notification settings

ManavLohia945/Apple-1M-Rows-SQL-Project

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

7 Commits
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Apple_Changsha_RetailTeamMembers_09012021_big jpg slideshow-xlarge_2x

Apple Retail Sales SQL Project - Analyzing Millions of Sales Rows

Project Overview

This project is designed to showcase my advanced SQL querying techniques through the analysis of over 1 million rows of Apple retail sales data. The dataset includes information about products, stores, sales transactions, and warranty claims across various Apple retail locations globally. By tackling a variety of questions, from basic to complex, I have demonstrated my ability to write sophisticated SQL queries that extract valuable insights from large datasets.

Entity Relationship Diagram (ERD)

ERD

Database Schema

The project uses five main tables:

  1. stores: Contains information about Apple retail stores.

    • store_id: Unique identifier for each store.
    • store_name: Name of the store.
    • city: City where the store is located.
    • country: Country of the store.
  2. category: Holds product category information.

    • category_id: Unique identifier for each product category.
    • category_name: Name of the category.
  3. products: Details about Apple products.

    • product_id: Unique identifier for each product.
    • product_name: Name of the product.
    • category_id: References the category table.
    • launch_date: Date when the product was launched.
    • price: Price of the product.
  4. sales: Stores sales transactions.

    • sale_id: Unique identifier for each sale.
    • sale_date: Date of the sale.
    • store_id: References the store table.
    • product_id: References the product table.
    • quantity: Number of units sold.
  5. warranty: Contains information about warranty claims.

    • claim_id: Unique identifier for each warranty claim.
    • claim_date: Date the claim was made.
    • sale_id: References the sales table.
    • repair_status: Status of the warranty claim (e.g., Paid Repaired, Warranty Void).

Objectives

The project is split into three tiers of questions to test SQL skills of increasing complexity:

Easy to Medium (10 Questions)

  1. Find the number of stores in each country.
  2. Calculate the total number of units sold by each store.
  3. Identify how many sales occurred in December 2023.
  4. Determine how many stores have never had a warranty claim filed.
  5. Calculate the percentage of warranty claims marked as "Warranty Void".
  6. Identify which store had the highest total units sold in the last year.
  7. Count the number of unique products sold in the last year.
  8. Find the average price of products in each category.
  9. How many warranty claims were filed in 2020?
  10. For each store, identify the best-selling day based on highest quantity sold.

Medium to Hard (5 Questions)

  1. Identify the least selling product in each country for each year based on total units sold.
  2. Calculate how many warranty claims were filed within 180 days of a product sale.
  3. Determine how many warranty claims were filed for products launched in the last two years.
  4. List the months in the last three years where sales exceeded 5,000 units in the USA.
  5. Identify the product category with the most warranty claims filed in the last two years.

Complex (5 Questions)

  1. Determine the percentage chance of receiving warranty claims after each purchase for each country.
  2. Analyze the year-by-year growth ratio for each store.
  3. Calculate the correlation between product price and warranty claims for products sold in the last five years, segmented by price range.
  4. Identify the store with the highest percentage of "Paid Repaired" claims relative to total claims filed.
  5. Write a query to calculate the monthly running total of sales for each store over the past four years and compare trends during this period.

Bonus Question

  • Analyze product sales trends over time, segmented into key periods: from launch to 6 months, 6-12 months, 12-18 months, and beyond 18 months.

Project Focus

This project primarily focuses on developing and showcasing the following SQL skills:

  • Complex Joins and Aggregations: Demonstrating the ability to perform complex SQL joins and aggregate data meaningfully.
  • Window Functions: Using advanced window functions for running totals, growth analysis, and time-based queries.
  • Data Segmentation: Analyzing data across different time frames to gain insights into product performance.
  • Correlation Analysis: Applying SQL functions to determine relationships between variables, such as product price and warranty claims.
  • Real-World Problem Solving: Answering business-related questions that reflect real-world scenarios faced by data analysts.

Dataset

  • Size: 1 million+ rows of sales data.
  • Period Covered: The data spans multiple years, allowing for long-term trend analysis.
  • Geographical Coverage: Sales data from Apple stores across various countries.

Conclusion

By completing this project, I have developed advanced SQL querying skills, improving my ability to handle large datasets, and gaining practical experience in solving complex data analysis problems that are crucial for business decision-making. This project is an excellent addition to my portfolio and will demonstrate my expertise in SQL to potential employers.


About

This project analyzes over 1 million rows of Apple retail sales data using advanced SQL techniques. It covers key insights on sales trends, product performance, store analysis, and warranty claims across various Apple locations. The project demonstrates SQL proficiency in joins, window functions, aggregations, and query optimization for large-scale

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published