SQL Superstore Business Analysis (PostgreSQL 16)

This project demonstrates senior-level SQL analytics, modular query architecture, and executive-ready business storytelling using PostgreSQL 16.

Project Overview

This project simulates a real-world retail business analytics engagement using the Superstore transactional dataset.

The objective is to transform raw order-level data into executive-ready business insights through structured data modeling, defensive SQL practices, and advanced analytical techniques.

All analysis was performed using PostgreSQL 16 and pgAdmin 4.

Dataset

Source: Superstore transactional dataset
~9,994 raw staging records
1,862 cleaned order-line records after transformation
Fact grain: Order Line Level

Business Objectives

Evaluate revenue growth and profitability trends (MoM, YoY, YTD)
Identify margin drivers and loss-leading categories
Analyze regional and shipping performance
Determine revenue and profit concentration (Pareto 80/20)
Segment customers using RFM methodology
Generate executive-level strategic recommendations

Key Business Questions

What are total revenue, profit, margin %, and order volume?
Which categories and sub-categories drive profit vs. losses?
How is revenue trending month-over-month and year-over-year?
Are profits concentrated in a small percentage of customers or products?
Which customers are high-value vs. at-risk?

Data Quality Handling

During transformation, several real-world data integrity issues were identified and resolved:

Detected inconsistent product_id → product_name mappings across 32 product IDs
Implemented deterministic deduplication using ROW_NUMBER() window functions
Preserved primary key integrity while retaining latest and most complete attribute values
Ensured rerunnable and idempotent ETL design
Applied defensive SQL (NULLIF, COALESCE) to prevent divide-by-zero errors

This reflects production-style analytical data handling rather than academic querying.

Technical Highlights

PostgreSQL 16
CTE-based modular query architecture
Advanced window functions (LAG, RANK, SUM OVER)
Rolling averages and YTD calculations
Defensive SQL (NULLIF, COALESCE)
Indexing and constraint optimization
Structured analytics layering (Core → Advanced → Executive)

Architectural & Interview Talking Points

This project was intentionally designed using production-style data engineering discipline:

Idempotent and rerunnable transformation scripts
Deterministic deduplication using ROW_NUMBER() window functions
Explicit fact table grain definition (Order Line Level)
Defensive data constraints (CHECK constraints for data integrity)
Strategic indexing for performance optimization
Modular query layering (Database Setup → Core Analysis → Advanced Analytics → Executive Insights)

This reflects analytical engineering thinking beyond ad-hoc querying and demonstrates foundational data warehouse and analytics architecture skills.

Repository Structure

01_database_setup/ → schema creation, staging, cleaning, constraints
02_core_analysis/ → executive KPIs and dimensional breakdowns
03_advanced_analytics/ → MoM, YoY, rolling averages, Pareto, RFM
04_executive_insights/ → executive summary and strategic recommendations
assets/ → ERD and sample query outputs

Execution Order

To reproduce this analysis:

Create database in PostgreSQL.
Run scripts in 01_database_setup/ in order.
Validate row counts and constraints.
Execute modules in 02_core_analysis/.
Run advanced analytics in 03_advanced_analytics/.
Review executive summary in 04_executive_insights/.

All modules are modular and can run independently once base tables are created.

Sample Executive Insights

(Replace bracketed numbers with your actual outputs before final push.)

Technology category drove [X]% of total revenue growth YoY, contributing $[X] incremental revenue.
Furniture margins averaged [X]%, significantly lower than overall margin of [X]%, driven by discounting.
Top 20% of customers generated approximately [X]% of total profit, confirming strong revenue concentration.
Several sub-categories operated below [X]% margin, indicating pricing optimization opportunities.
RFM segmentation identified [X] high-value repeat customers suitable for retention targeting.

Example Advanced SQL (Window Function Usage)

SELECT
    DATE_TRUNC('month', order_date) AS month,
    SUM(sales) AS monthly_revenue,
    LAG(SUM(sales)) OVER (ORDER BY DATE_TRUNC('month', order_date)) AS prior_month,
    SUM(sales) 
        - LAG(SUM(sales)) OVER (ORDER BY DATE_TRUNC('month', order_date)) AS mom_change
FROM sales
GROUP BY 1
ORDER BY 1;

This demonstrates layered CTE logic, window functions, and trend analysis.

Data Model Notes

Fact table grain: Order Line Level
Revenue = SUM(sales)
Profit = SUM(profit)
Margin % = profit / revenue
Time-based metrics use order_date
Advanced metrics built using CTE layering and window functions

Strategic Value

This project demonstrates:

Business-focused analytical storytelling
Advanced SQL proficiency
Data modeling discipline
Executive-ready communication
Senior-level analytical thinking

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

SQL Superstore Business Analysis (PostgreSQL 16)

Project Overview

Dataset

Business Objectives

Key Business Questions

Data Quality Handling

Technical Highlights

Architectural & Interview Talking Points

Repository Structure

Execution Order

Sample Executive Insights

Example Advanced SQL (Window Function Usage)

Data Model Notes

Strategic Value

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 7 Commits
01_database_setup		01_database_setup
02_core_analysis		02_core_analysis
03_advanced_analytics		03_advanced_analytics
04_executive_insights		04_executive_insights
README.md		README.md

Folders and files

Latest commit

History

Repository files navigation

SQL Superstore Business Analysis (PostgreSQL 16)

Project Overview

Dataset

Business Objectives

Key Business Questions

Data Quality Handling

Technical Highlights

Architectural & Interview Talking Points

Repository Structure

Execution Order

Sample Executive Insights

Example Advanced SQL (Window Function Usage)

Data Model Notes

Strategic Value

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages