A complete analytics platform that analyzes 100,000+ orders from the Brazilian e-commerce marketplace Olist (2016-2018). Built to demonstrate:
- Lakehouse Architecture - Databricks with Delta Lake storage
- Medallion Pattern - Bronze → Silver → Gold data layers
- SQL Expertise - Complex transformations, CTEs, JOINs
- Data Visualization - Interactive Streamlit dashboard
- CI/CD - GitHub Actions for linting and testing

| Component | Technology |
|---|---|
| Lakehouse Platform | Databricks |
| Storage Format | Delta Lake |
| Web Dashboard | Streamlit |
| Backend | |


                        ┌─────────────────────────────────────────────────────┐
                        │                Databricks Lakehouse                 │
                        │                                                     │
    CSV Files ─────────►│  ┌──────────┐   ┌──────────┐   ┌──────────────┐     │
                        │  │  Bronze  │──►│  Silver  │──►│     Gold     │     │
                        │  │  (raw)   │   │ (clean)  │   │ (analytics)  │     │
                        │  │ 9 tables │   │ 7 tables │   │   4 tables   │     │
                        │  └──────────┘   └──────────┘   └──────────────┘     │
                        │                                       │             │
                        │          Delta Lake Storage           │             │
                        └───────────────────────────────────────┼─────────────┘
                                                                │
                                                                ▼
                                                         ┌──────────────┐
                                                         │  Streamlit   │
                                                         │  Dashboard   │
                                                         └──────────────┘

| Layer | Tables | Description |
|---|---|---|
| Bronze | 9 tables | Raw data ingested from CSV files |
| Silver | 7 tables | Cleaned, typed, and validated data |
| Gold | 4 tables | Business-ready facts and dimensions |
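
The flow through the three layers can be illustrated with a small, self-contained sketch. It uses Python's built-in `sqlite3` as a stand-in for Databricks and Delta Lake, and the sample rows, the table names (`bronze_orders`, `silver_orders`, `gold_orders_by_status`) and the cleaning rules are all assumptions made for illustration. The project's actual logic lives in the SQL notebooks under `databricks/`.

```python
import sqlite3

# Hypothetical raw rows as they might arrive from a CSV file: every value is text.
RAW_ORDERS = [
    ("o1", "c1", "delivered", "2017-10-02 10:56:33"),
    ("o2", "c2", "DELIVERED ", "2017-10-03 09:12:00"),
    ("o2", "c2", "DELIVERED ", "2017-10-03 09:12:00"),  # exact duplicate
    ("o3", "c3", "shipped", ""),                        # missing timestamp
]


def build_layers(conn: sqlite3.Connection) -> None:
    """Create illustrative Bronze, Silver and Gold tables (not the project's schema)."""
    cur = conn.cursor()

    # Bronze: land the data unchanged, all columns stored as TEXT.
    cur.execute(
        "CREATE TABLE bronze_orders "
        "(order_id TEXT, customer_id TEXT, order_status TEXT, purchased_at TEXT)"
    )
    cur.executemany("INSERT INTO bronze_orders VALUES (?, ?, ?, ?)", RAW_ORDERS)

    # Silver: deduplicate, normalize text, derive a date, drop rows failing validation.
    cur.execute(
        """
        CREATE TABLE silver_orders AS
        SELECT DISTINCT
            order_id,
            customer_id,
            LOWER(TRIM(order_status)) AS order_status,
            DATE(purchased_at)        AS purchase_date
        FROM bronze_orders
        WHERE purchased_at <> ''
        """
    )

    # Gold: a business-facing aggregate computed from Silver.
    cur.execute(
        """
        CREATE TABLE gold_orders_by_status AS
        SELECT order_status, COUNT(*) AS order_count
        FROM silver_orders
        GROUP BY order_status
        """
    )
    conn.commit()


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    build_layers(connection)
    print(connection.execute("SELECT * FROM gold_orders_by_status").fetchall())
```

Keeping Bronze untouched means the Silver rules can be changed and replayed later without re-ingesting the CSV files.
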
| Model | Description |
|---|---|
| `fct_orders` | Order facts with revenue metrics |
| `dim_customers` | Customer dimension with segmentation |
| `dim_products` | Product dimension with sales tiers |
| `dim_sellers` | Seller dimension with performance ratings |
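
As a rough illustration of how a fact table such as `fct_orders` could be derived with a CTE and a JOIN, the sketch below uses `sqlite3` in place of Databricks SQL. The `price` and `freight_value` columns follow the public Olist CSVs, but the Silver table names, the `total_revenue` metric and the sample rows are assumptions, not the project's actual schema.

```python
import sqlite3

# Hypothetical query: aggregate items per order in a CTE, then join to the orders.
FCT_ORDERS_SQL = """
WITH item_totals AS (
    -- One row per order: item count, summed item prices and summed freight.
    SELECT
        order_id,
        COUNT(*)           AS item_count,
        SUM(price)         AS items_value,
        SUM(freight_value) AS freight_value
    FROM silver_order_items
    GROUP BY order_id
)
SELECT
    o.order_id,
    o.customer_id,
    o.order_status,
    t.item_count,
    ROUND(t.items_value + t.freight_value, 2) AS total_revenue
FROM silver_orders AS o
JOIN item_totals AS t ON t.order_id = o.order_id
"""


def load_sample_silver(conn: sqlite3.Connection) -> None:
    """Create two tiny Silver tables with made-up rows."""
    conn.executescript(
        """
        CREATE TABLE silver_orders (order_id TEXT, customer_id TEXT, order_status TEXT);
        CREATE TABLE silver_order_items
            (order_id TEXT, order_item_id INTEGER, price REAL, freight_value REAL);
        INSERT INTO silver_orders VALUES
            ('o1', 'c1', 'delivered'),
            ('o2', 'c2', 'shipped');
        INSERT INTO silver_order_items VALUES
            ('o1', 1, 50.0, 5.0),
            ('o1', 2, 30.0, 2.5),
            ('o2', 1, 20.0, 4.25);
        """
    )


def build_fct_orders(conn: sqlite3.Connection) -> None:
    """Materialize the query result as a Gold table."""
    conn.execute("CREATE TABLE fct_orders AS " + FCT_ORDERS_SQL)
    conn.commit()


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    load_sample_silver(connection)
    build_fct_orders(connection)
    for row in connection.execute("SELECT * FROM fct_orders ORDER BY order_id"):
        print(row)
```

Aggregating order items inside the CTE before joining keeps one row per order in the fact table, so revenue is not double counted when an order has several items.
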

    git clone https://github.com/Mohith-akash/olist-analytics-platform.git
    cd olist-analytics-platform
    python -m venv venv
    .\venv\Scripts\activate      # Windows
    source venv/bin/activate     # Mac/Linux
    pip install -r requirements.txt

Create `.streamlit/secrets.toml`:

    DATABRICKS_HOST = "your-workspace.cloud.databricks.com"
    DATABRICKS_HTTP_PATH = "/sql/1.0/warehouses/your-warehouse-id"
    DATABRICKS_TOKEN = "your-access-token"

Run the dashboard:

    streamlit run streamlit_app.py

Project structure:

    olist_analytics_platform/
    ├── 📊 streamlit_app.py          # Dashboard entry point
    ├── 📋 requirements.txt          # Python dependencies
    │
    ├── 📂 app/                      # Core modules
    │   ├── database.py              # Databricks SQL connection
    │   ├── styles.py                # CSS styling
    │   └── utils.py                 # Formatting utilities
    │
    ├── 📂 tabs/                     # Dashboard components
    │   ├── home.py                  # KPIs and overview
    │   ├── analytics.py             # Analysis charts
    │   ├── query.py                 # Data explorer
    │   └── about.py                 # Project info
    │
    ├── 📂 databricks/               # SQL notebooks (reference)
    │   ├── 01_bronze_layer.sql
    │   ├── 02_silver_layer.sql
    │   └── 03_gold_layer.sql
    │
    └── 📂 docs/images/              # Screenshots

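
The three values in `.streamlit/secrets.toml` are what a module like `app/database.py` would need to open a connection. The following is a minimal sketch of that step, assuming the `databricks-sql-connector` package is the client in use. `DatabricksSettings`, `load_settings` and `connect` are hypothetical names and not taken from the repository.

```python
from typing import Mapping, NamedTuple

REQUIRED_KEYS = ("DATABRICKS_HOST", "DATABRICKS_HTTP_PATH", "DATABRICKS_TOKEN")


class DatabricksSettings(NamedTuple):
    """Hypothetical container for the three connection values."""
    host: str
    http_path: str
    token: str


def load_settings(secrets: Mapping[str, str]) -> DatabricksSettings:
    """Validate that all three secrets are present and return them cleaned up."""
    missing = [key for key in REQUIRED_KEYS if not secrets.get(key)]
    if missing:
        raise ValueError("Missing secrets: " + ", ".join(missing))

    # The sample config uses a bare hostname, so strip a scheme or
    # trailing slash in case a full URL was pasted in.
    host = secrets["DATABRICKS_HOST"].strip()
    for scheme in ("https://", "http://"):
        if host.startswith(scheme):
            host = host[len(scheme):]
    host = host.rstrip("/")

    return DatabricksSettings(
        host=host,
        http_path=secrets["DATABRICKS_HTTP_PATH"].strip(),
        token=secrets["DATABRICKS_TOKEN"].strip(),
    )


def connect(settings: DatabricksSettings):
    """Open a SQL warehouse connection (assumes databricks-sql-connector is installed)."""
    from databricks import sql  # imported lazily so the rest runs without the package

    return sql.connect(
        server_hostname=settings.host,
        http_path=settings.http_path,
        access_token=settings.token,
    )


if __name__ == "__main__":
    example = {
        "DATABRICKS_HOST": "https://your-workspace.cloud.databricks.com/",
        "DATABRICKS_HTTP_PATH": "/sql/1.0/warehouses/your-warehouse-id",
        "DATABRICKS_TOKEN": "your-access-token",
    }
    print(load_settings(example).host)
```

Inside a Streamlit app the mapping would typically be `st.secrets`, for example `connect(load_settings(st.secrets))`. Validating the keys up front turns a missing secret into a clear error message instead of a failed connection attempt.
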
Dataset: Olist Brazilian E-commerce Dataset (Kaggle) · 100K+ orders · 9 tables · 2016-2018
Built by Mohith Akash
⭐ Star this repo if you found it helpful!

