sharon2719/Customer_ETL_Project

Customer Data ETL Pipeline (SQL Server + Docker)

This project simulates a real-world ETL pipeline for cleaning, merging, and auditing customer data from multiple sources.

🔧 Tech Stack

  • Microsoft SQL Server (Docker)
  • T-SQL (sqlcmd)
  • Bulk insert from CSV
  • Stored Procedures for automation

📁 Project Structure

  • data/: Sample input data (customer_data.csv)
  • sql/: SQL scripts for table creation, ETL logic, and testing
  • screenshots/: Optional screenshots or dashboard previews
  • README.md: Project overview

✅ ETL Process

  1. Load raw customer data into Staging_Customers
  2. Clean inconsistent city names (e.g. new york → New York)
  3. Merge data into Dim_Customers using UPSERT logic
  4. Log each ETL run in ETL_Job_Log
  5. Clear staging table
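
As a rough illustration of steps 2 to 5, the body of Run_Customer_ETL might look something like the sketch below. The table names come from this README, but the column names (CustomerID, FullName, City, IsActive, JobName, RowsProcessed, RunAt) are assumptions made for the example and may not match the actual scripts in sql/.

```sql
-- Sketch only: column names are hypothetical, not taken from the repo.
BEGIN TRANSACTION;

DECLARE @rows INT;

-- Step 2: normalize one known city variant (trim whitespace, fix casing).
UPDATE Staging_Customers
SET City = 'New York'
WHERE LOWER(LTRIM(RTRIM(City))) = 'new york';

-- Step 3: UPSERT into the dimension table.
-- MERGE raises an error if staging holds duplicate CustomerID values
-- that match the same target row, so staging should be de-duplicated first.
MERGE Dim_Customers AS tgt
USING Staging_Customers AS src
    ON tgt.CustomerID = src.CustomerID
WHEN MATCHED THEN
    UPDATE SET tgt.FullName = src.FullName,
               tgt.City     = src.City,
               tgt.IsActive = src.IsActive
WHEN NOT MATCHED BY TARGET THEN
    INSERT (CustomerID, FullName, City, IsActive)
    VALUES (src.CustomerID, src.FullName, src.City, src.IsActive);

-- Capture the number of rows the MERGE inserted or updated.
SET @rows = @@ROWCOUNT;

-- Step 4: record the run in the audit log.
INSERT INTO ETL_Job_Log (JobName, RowsProcessed, RunAt)
VALUES ('Run_Customer_ETL', @rows, SYSDATETIME());

-- Step 5: clear staging for the next load.
TRUNCATE TABLE Staging_Customers;

COMMIT TRANSACTION;
```

Wrapping the steps in one transaction is a design choice for the sketch: if the merge or the log insert fails, staging is not emptied and the run can be retried.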

🧪 Sample Queries

  • Active vs inactive customer count
  • Top cities by customer count
  • ETL run history
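
The three queries listed above could be written roughly as follows. This is a sketch, not the contents of sql/test_queries.sql: the columns IsActive, City, JobName, RowsProcessed, and RunAt are assumed for illustration.

```sql
-- Active vs inactive customer count (IsActive is a hypothetical BIT column)
SELECT IsActive, COUNT(*) AS CustomerCount
FROM Dim_Customers
GROUP BY IsActive;

-- Top 5 cities by customer count
SELECT TOP (5) City, COUNT(*) AS CustomerCount
FROM Dim_Customers
GROUP BY City
ORDER BY CustomerCount DESC;

-- ETL run history, newest first (log columns are hypothetical)
SELECT JobName, RowsProcessed, RunAt
FROM ETL_Job_Log
ORDER BY RunAt DESC;
```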

🚀 To Run the Project

# Connect to the database (run from your shell)
sqlcmd -S localhost -U SA -P 'YourStrongPassword' -d Company_Datawarehouse

-- The remaining lines are entered at the sqlcmd prompt

-- Create tables
:r sql/create_tables.sql
GO

-- Load your CSV via BULK INSERT or bcp

-- Run ETL
EXEC Run_Customer_ETL;
GO

-- Run test queries
:r sql/test_queries.sql
GO
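
For the CSV load step, a BULK INSERT statement along these lines may work. It is a sketch under several assumptions: the file path is hypothetical and must be a location the SQL Server process inside the container can read, the CSV is assumed to have a header row and LF line endings, and its column order is assumed to match Staging_Customers.

```sql
BULK INSERT Staging_Customers
FROM '/var/opt/mssql/data/customer_data.csv'  -- hypothetical path inside the container
WITH (
    FIRSTROW = 2,            -- skip the header row (assumes the file has one)
    FIELDTERMINATOR = ',',   -- comma-separated fields
    ROWTERMINATOR = '0x0a',  -- LF line endings; adjust for CRLF files
    TABLOCK                  -- take a table lock for a faster bulk load
);
```

Because the path is resolved on the server side, the file generally needs to be copied or mounted into the container before the statement is run.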
