🖥️ Laptop Data Modeling with dbt

📌 Project Idea

This project takes a dataset of laptop specifications and prices and transforms it into a star schema using dbt.
The goal is to show each step of the pipeline: raw data → preprocessing → database → dbt models → clean data warehouse design.

It’s a great starting project for anyone learning dbt and data modeling.

🔄 Workflow

Raw dataset
- laptops.csv (original dataset with messy formats)
Preprocessing (Python/Jupyter)
- Converted RAM, weight, memory, etc. into numeric columns
- Extracted CPU/GPU brands, display flags (IPS, Touchscreen, Retina), resolutions, PPI
- Normalized OS types
- Exported as laptop_cutted.csv
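The conversions listed above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the input formats ('8GB', '1.37kg', a free-text display string containing '2560x1600'), the 13.3 inch screen size, and all function names are assumptions made for the example.

```python
import math
import re


def parse_ram_gb(value):
    """Turn a string such as '8GB' into the integer 8 (assumed input format)."""
    return int(re.sub(r"[^\d]", "", value))


def parse_weight_kg(value):
    """Turn a string such as '1.37kg' into the float 1.37 (assumed input format)."""
    return float(value.lower().replace("kg", "").strip())


def parse_display(value):
    """Pull display flags and the pixel resolution out of a free-text field."""
    match = re.search(r"(\d+)x(\d+)", value)
    if match is None:
        raise ValueError(f"no resolution found in {value!r}")
    lowered = value.lower()
    return {
        "is_ips": "ips" in lowered,
        "is_touchscreen": "touchscreen" in lowered,
        "is_retina": "retina" in lowered,
        "res_width": int(match.group(1)),
        "res_height": int(match.group(2)),
    }


def pixels_per_inch(width, height, diagonal_inches):
    """PPI = length of the pixel diagonal divided by the screen diagonal in inches."""
    return math.hypot(width, height) / diagonal_inches


# Hypothetical raw values, chosen only to exercise the helpers.
display = parse_display("IPS Panel Retina Display 2560x1600")
ppi = pixels_per_inch(display["res_width"], display["res_height"], 13.3)
print(parse_ram_gb("8GB"), parse_weight_kg("1.37kg"), display, round(ppi, 1))
```

Keeping each conversion in its own small function makes it easy to apply them column by column (for example with pandas) and to spot rows that do not match the expected format.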
Database (Postgres / pgAdmin)
- Loaded preprocessed data into a staging table: stg_laptops
dbt source definition
- Declared stg_laptops in sources.yml under the source name base
- Referenced from models with the source() function:
```
{{ source('base', 'stg_laptops') }}
```
Staging model
- stg_laptops_clean.sql → cleans and standardizes raw staging data.
Dimension models
- dim_company.sql
- dim_product.sql
- dim_cpu.sql
- dim_gpu.sql
- dim_os.sql
- dim_display.sql
- dim_storage.sql
Each dimension creates a surrogate key (SK) using md5() and stores cleaned attributes.
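To show why an md5-based surrogate key is convenient, here is a small Python sketch of the same idea. In the project the hashing happens in SQL inside each dimension model; the normalization (trim + lowercase), the function names, and the sample company list below are assumptions for illustration only.

```python
import hashlib


def surrogate_key(*parts):
    """Hash the normalized natural-key parts into a 32-character hex string."""
    normalized = "|".join(part.strip().lower() for part in parts)
    return hashlib.md5(normalized.encode("utf-8")).hexdigest()


def build_dim_company(company_names):
    """Return one row per distinct company, keyed by its surrogate key."""
    rows = {}
    for name in company_names:
        sk = surrogate_key(name)
        # setdefault keeps the first spelling seen for each key
        rows.setdefault(sk, {"company_sk": sk, "company_name": name.strip()})
    return list(rows.values())


# Hypothetical input with a duplicate and a differently-cased variant.
dim_company = build_dim_company(["Apple", "Dell", "apple ", "HP", "Dell"])
for row in dim_company:
    print(row)
```

Because the key is derived from the attribute values, it is deterministic: rebuilding the dimension yields the same SKs, and the fact table can compute or join to them without a sequence.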
Fact table
- fact_laptop.sql
- Grain: one row per laptop
- Holds foreign keys to each dimension + measures (price, RAM, weight, etc.)
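As a rough picture of what the fact model produces, the sketch below assembles fact rows from staging rows in plain Python. It is not the project's SQL: the column names other than laptop_id_nat, the two lookup dictionaries, and the sample values are hypothetical, and only two of the seven dimensions are shown to keep it short.

```python
def build_fact_laptop(staging_rows, company_sk_by_name, os_sk_by_name):
    """Return one fact row per staging row: dimension keys plus numeric measures."""
    facts = []
    seen_ids = set()
    for row in staging_rows:
        laptop_id = row["laptop_id_nat"]
        if laptop_id in seen_ids:
            # Grain is one row per laptop, so a repeated id is an error.
            raise ValueError(f"grain violated: duplicate laptop_id_nat {laptop_id}")
        seen_ids.add(laptop_id)
        facts.append({
            "laptop_id_nat": laptop_id,
            # Foreign keys: a KeyError here means the dimension is missing a value.
            "company_sk": company_sk_by_name[row["company"]],
            "os_sk": os_sk_by_name[row["os"]],
            # Measures
            "price": row["price"],
            "ram_gb": row["ram_gb"],
            "weight_kg": row["weight_kg"],
        })
    return facts


# Made-up staging rows and dimension lookups, for illustration only.
staging = [
    {"laptop_id_nat": 1, "company": "Apple", "os": "macOS",
     "price": 1000.0, "ram_gb": 8, "weight_kg": 1.37},
    {"laptop_id_nat": 2, "company": "Dell", "os": "Windows",
     "price": 900.0, "ram_gb": 16, "weight_kg": 2.1},
]
company_sk = {"Apple": "c1", "Dell": "c2"}
os_sk = {"macOS": "o1", "Windows": "o2"}
fact_laptop = build_fact_laptop(staging, company_sk, os_sk)
print(fact_laptop)
```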
Tests
- Defined in schema.yml
- Ensures:
  - SKs are unique & not null
  - Fact table foreign keys correctly map to dimensions
- ✅ All tests passed
ERD
- Star schema diagram created with DBML
- File: docs/laptops_erd.dbml
- Rendered ERD (example below):

📊 Schema Overview

Fact Table
- fact_laptop → One row per laptop. Links to all dimensions and contains measures (price, RAM, weight, etc.).
Dimension Tables
- dim_company → Laptop brand/manufacturer (Apple, Dell, HP, etc.).
- dim_product → Product model and category (MacBook Pro, Ultrabook, Notebook, etc.).
- dim_cpu → CPU brand and generation/family (Intel i5, i7, Ryzen, etc.).
- dim_gpu → GPU brand and type (NVIDIA GeForce, Intel Iris, AMD Radeon, etc.).
- dim_os → Operating system (Windows, macOS, Linux, No OS).
- dim_display → Screen attributes (size, resolution, IPS, Retina, Touchscreen).
- dim_storage → Storage breakdown (HDD, SSD, Hybrid, Flash capacities).

Together they form a star schema for analyzing laptops by company, product, CPU/GPU, OS, storage, and display.
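A typical question this layout answers is "what is the average price per company?". The sketch below runs that query against an in-memory SQLite database so it can be tried without Postgres. Only the table names and laptop_id_nat come from this project; the other column names, the short keys ('c1', 'c2'), and the prices are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dim_company (company_sk TEXT PRIMARY KEY, company_name TEXT);
    CREATE TABLE fact_laptop (laptop_id_nat INTEGER PRIMARY KEY, company_sk TEXT, price REAL);
    INSERT INTO dim_company VALUES ('c1', 'Apple'), ('c2', 'Dell');
    INSERT INTO fact_laptop VALUES (1, 'c1', 1000.0), (2, 'c1', 2000.0), (3, 'c2', 900.0);
""")

# Join the fact table to one dimension and aggregate a measure.
rows = conn.execute("""
    SELECT d.company_name, COUNT(*) AS laptops, AVG(f.price) AS avg_price
    FROM fact_laptop AS f
    JOIN dim_company AS d ON d.company_sk = f.company_sk
    GROUP BY d.company_name
    ORDER BY d.company_name
""").fetchall()

for company_name, laptops, avg_price in rows:
    print(company_name, laptops, avg_price)
```

The same pattern extends to the other dimensions: add one join per dimension you want to slice by, and keep the aggregates on the fact table's measures.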

🧪 Data Quality

Tests included:

- Unique & not null constraints on all dimension SKs
- Fact → Dim relationships for referential integrity
- Fact grain check: laptop_id_nat unique & not null

All tests passed, ensuring the schema is clean and reliable.
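For readers new to dbt tests, the sketch below mimics the three kinds of checks in plain Python. Like a dbt test, each helper returns the failing records, so an empty result means the check passes. The helper names and sample rows are hypothetical; in the project these checks are declared in schema.yml rather than written by hand.

```python
from collections import Counter


def failing_unique(rows, column):
    """Values that appear more than once (nulls are ignored)."""
    counts = Counter(row[column] for row in rows if row[column] is not None)
    return [value for value, n in counts.items() if n > 1]


def failing_not_null(rows, column):
    """Rows where the column is missing a value."""
    return [row for row in rows if row[column] is None]


def failing_relationships(child_rows, child_column, parent_rows, parent_column):
    """Child rows whose non-null key has no match in the parent table."""
    parent_keys = {row[parent_column] for row in parent_rows}
    return [
        row for row in child_rows
        if row[child_column] is not None and row[child_column] not in parent_keys
    ]


# Made-up rows, for illustration only.
dim_company = [{"company_sk": "c1"}, {"company_sk": "c2"}]
fact_laptop = [
    {"laptop_id_nat": 1, "company_sk": "c1"},
    {"laptop_id_nat": 2, "company_sk": "c2"},
]

results = {
    "dim_company.company_sk unique": failing_unique(dim_company, "company_sk"),
    "dim_company.company_sk not null": failing_not_null(dim_company, "company_sk"),
    "fact_laptop.laptop_id_nat unique": failing_unique(fact_laptop, "laptop_id_nat"),
    "fact_laptop -> dim_company": failing_relationships(
        fact_laptop, "company_sk", dim_company, "company_sk"
    ),
}
for name, failures in results.items():
    print(name, "PASS" if not failures else f"FAIL {failures}")
```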

🚀 How to Run

1. Clone this repo

git clone https://github.com/sshossen/laptop_dbt.git
cd laptop_dbt

2. Create a virtual environment and install dependencies

python3 -m venv venv
source venv/bin/activate    # Mac/Linux
# or: venv\Scripts\activate  # Windows
pip install -r requirements.txt

3. Configure the dbt connection (edit your profiles.yml). Example for Postgres:

laptop_dbt:
  target: dev
  outputs:
    dev:
      type: postgres
      host: localhost
      user: your_user
      password: your_password
      port: 5432
      dbname: your_database
      schema: public

4. Run dbt commands

dbt run             # build models
dbt test            # run tests
dbt docs generate   # build docs locally
dbt docs serve      # serve docs locally
