Test from latest session #4
base: main
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,42 @@ | ||
| -- Revenue breakdown for products in the Excellent rating cohort (avg_rating >= 4.5) | ||
| -- Filters dim_products to rating_tier = 'Excellent' and surfaces per-product | ||
| -- revenue alongside cohort-level totals and each product's percentage share. | ||
| with excellent_products as ( | ||
| select | ||
| product_id, | ||
| product_name, | ||
| product_category, | ||
| unit_price, | ||
| avg_rating, | ||
| rating_tier, | ||
| review_count, | ||
| total_units_sold, | ||
| product_total_revenue, | ||
| product_gross_profit, | ||
| margin_pct | ||
| from {{ ref('dim_products') }} | ||
| where rating_tier = 'Excellent' | ||
| ), | ||
| | ||
| final as ( | ||
| select | ||
| product_id, | ||
| product_name, | ||
| product_category, | ||
| unit_price, | ||
| avg_rating, | ||
| rating_tier, | ||
| review_count, | ||
| total_units_sold, | ||
| product_total_revenue, | ||
| product_gross_profit, | ||
| margin_pct, | ||
| sum(product_total_revenue) over () as cohort_total_revenue, | ||
| round( | ||
| product_total_revenue / nullif(sum(product_total_revenue) over (), 0) * 100, | ||
| 2 | ||
| ) as pct_of_cohort_revenue | ||
| from excellent_products | ||
| ) | ||
| | ||
| select * from final |
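
To sanity-check the window logic in this model outside dbt, the following is a minimal sketch that runs the same `cohort_total_revenue` and `pct_of_cohort_revenue` expressions against an in-memory SQLite table. The sample rows, the trimmed column list, and the `cohort_shares` helper are hypothetical. SQLite (3.25 or later, for window functions) only stands in for the warehouse, whose dialect the PR does not state, so rounding behavior there may differ.

```python
import sqlite3

# Hypothetical stand-in rows for the excellent_products CTE:
# (product_id, product_total_revenue). Not taken from the PR.
ROWS = [("p1", 600.0), ("p2", 300.0), ("p3", 100.0)]

# Same expressions as the model's `final` CTE, reduced to the columns
# that the window logic actually touches.
QUERY = """
select
    product_id,
    product_total_revenue,
    sum(product_total_revenue) over () as cohort_total_revenue,
    round(
        product_total_revenue / nullif(sum(product_total_revenue) over (), 0) * 100,
        2
    ) as pct_of_cohort_revenue
from excellent_products
order by product_id
"""


def cohort_shares(rows):
    """Load rows into an in-memory table and return the model's output rows."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "create table excellent_products "
            "(product_id text, product_total_revenue real)"
        )
        conn.executemany("insert into excellent_products values (?, ?)", rows)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in cohort_shares(ROWS):
        print(row)
```

Every output row carries the same cohort total, and a cohort whose revenue sums to zero yields a NULL share rather than a division error, which is the purpose of the `nullif` guard.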
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -173,6 +173,126 @@ models: | ||
| sla_hours: 4 | ||
| tier: 2 | ||
| | ||
| # ──────────────────────────────────────────── | ||
| # marketing_excellent_rating_revenue | ||
| # ──────────────────────────────────────────── | ||
| - name: marketing_excellent_rating_revenue | ||
| description: > | ||
| Revenue analysis for products in the Excellent rating cohort (avg_rating >= 4.5). | ||
| Filters dim_products to rating_tier = 'Excellent' and surfaces per-product revenue | ||
| alongside cohort-level totals and each product's percentage share of cohort revenue. | ||
| Useful for understanding the contribution of top-rated products to overall sales. | ||
| group: marketing_analytics | ||
| access: public | ||
| | ||
| config: | ||
| tags: ["active", "marketing", "tier-2", "cohort"] | ||
| materialized: table | ||
| | ||
| meta: | ||
| maintainer_email: sofia.gutierrez@jaffle-shop.com | ||
| listeners: | ||
| - marketing-team@jaffle-shop.com | ||
| - jordan.blake@jaffle-shop.com | ||
| business_unit: "Marketing" | ||
| model_type: "SQL" | ||
| schedule: "Daily at 07:45 AM UTC" | ||
| schedule_cron: "45 7 * * *" | ||
| status: "active" | ||
| revised_state: "RENEWED" | ||
| expiry_date: "2027-06-30" | ||
| approved: true | ||
| approved_by: sofia.gutierrez@jaffle-shop.com | ||
| approved_date: "2026-02-23" | ||
| observe_in_airflow: true | ||
| sla_hours: 4 | ||
| tier: 2 | ||
| data_portal_tests: | ||
| - name: "All rows have Excellent rating tier" | ||
| type: "accepted_values" | ||
| sql: "SELECT * FROM {{ model }} WHERE rating_tier != 'Excellent'" | ||
| expected_result: "0 rows" | ||
| description: "Every row in this model must belong to the Excellent rating cohort (avg_rating >= 4.5)" | ||
| - name: "No negative product revenue" | ||
| type: "data_quality" | ||
| sql: "SELECT * FROM {{ model }} WHERE product_total_revenue < 0" | ||
| expected_result: "0 rows" | ||
| description: "Product total revenue must be non-negative for all Excellent cohort products" | ||
| - name: "Cohort total revenue is consistent across all rows" | ||
| type: "business_logic" | ||
| sql: "SELECT * FROM {{ model }} WHERE cohort_total_revenue != (SELECT SUM(product_total_revenue) FROM {{ model }})" | ||
| expected_result: "0 rows" | ||
| description: "The cohort_total_revenue window value must equal the sum of all product_total_revenue in the model" | ||
| - name: "Percentage of cohort revenue sums to 100" | ||
| type: "business_logic" | ||
| sql: "SELECT * FROM (SELECT ABS(SUM(pct_of_cohort_revenue) - 100) AS diff FROM {{ model }}) t WHERE diff > 0.01" | ||
| expected_result: "0 rows" | ||
| description: "The sum of pct_of_cohort_revenue across all Excellent products must equal 100 (within 0.01 tolerance)" | ||
| - name: "No products with zero units sold" | ||
| type: "data_quality" | ||
| sql: "SELECT * FROM {{ model }} WHERE total_units_sold IS NULL OR total_units_sold = 0" | ||
| expected_result: "0 rows" | ||
| description: "All products in the Excellent cohort must have sold at least one unit to be included in revenue analysis" | ||
| | ||
| columns: | ||
| - name: product_id | ||
| description: Unique product identifier — primary key, inherited from dim_products | ||
| data_type: varchar | ||
| tests: | ||
| - unique | ||
| - not_null | ||
| - name: product_name | ||
| description: Human-readable name of the product | ||
| data_type: varchar | ||
| tests: | ||
| - not_null | ||
| - name: product_category | ||
| description: Category the product belongs to (e.g. Electronics, Clothing) | ||
| data_type: varchar | ||
| - name: unit_price | ||
| description: Listed selling price per unit of the product | ||
| data_type: numeric | ||
| - name: avg_rating | ||
| description: Average customer review rating for the product; always >= 4.5 in this model | ||
| data_type: numeric | ||
| tests: | ||
| - not_null | ||
| - name: rating_tier | ||
| description: > | ||
| Product quality tier derived from avg_rating. Always 'Excellent' in this model | ||
| (avg_rating >= 4.5). The full tier scale in dim_products is: | ||
| Excellent (>=4.5), Good (>=3.5), Average (>=2.5), Poor (<2.5). | ||
| data_type: varchar | ||
| tests: | ||
| - not_null | ||
| - name: review_count | ||
| description: Total number of customer reviews submitted for the product | ||
| data_type: integer | ||
| - name: total_units_sold | ||
| description: Cumulative number of units sold across all orders | ||
| data_type: integer | ||
| - name: product_total_revenue | ||
| description: Total gross revenue generated by this product (units_sold × unit_price) | ||
| data_type: numeric | ||
| - name: product_gross_profit | ||
| description: Total gross profit for the product (revenue minus supply cost) | ||
| data_type: numeric | ||
| - name: margin_pct | ||
| description: Gross margin percentage — product_gross_profit / product_total_revenue × 100 | ||
| data_type: numeric | ||
| Comment on lines +280 to +282: The core schema defines […]. Align the description here with the […]. | ||
| - name: cohort_total_revenue | ||
| description: > | ||
| Sum of product_total_revenue across ALL products in the Excellent rating cohort. | ||
| Computed as a window function over the full result set; the value is identical | ||
| on every row and represents the cohort's aggregate revenue contribution. | ||
| data_type: numeric | ||
| - name: pct_of_cohort_revenue | ||
| description: > | ||
| This product's share of the Excellent cohort's total revenue, expressed as a | ||
| percentage (0–100). Calculated as product_total_revenue / cohort_total_revenue × 100, | ||
| rounded to 2 decimal places. All values sum to 100 across the cohort. | ||
| data_type: numeric | ||
| | ||
| # ──────────────────────────────────────────── | ||
| # marketing_channel_attribution | ||
| # ──────────────────────────────────────────── | ||
| | ||
The 0.01 tolerance for the sum-to-100 check will produce false failures for cohorts with more than a few products.

`pct_of_cohort_revenue` is `round(..., 2)`, so each row carries up to ±0.005 of rounding error. With N products, the cumulative deviation from 100 can reach N × 0.005. A cohort of 7 equal-share products already yields a diff of 0.03, which exceeds the 0.01 threshold and fails the test.

A simple fix is to widen the tolerance to accommodate realistic cohort sizes. If you expect up to ~50 products, a threshold of `0.25` (50 × 0.005) would be safe. Alternatively, round to more decimal places in the SQL model (e.g., `round(..., 4)`) and tighten the tolerance accordingly.
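
The rounding arithmetic in this comment can be checked with a short sketch. It uses Python's `decimal` module to avoid float noise and assumes half-up rounding, which may not match the warehouse's `round`. The helper names `rounded_share_sum` and `worst_case_drift` are hypothetical and exist only for illustration.

```python
from decimal import Decimal, ROUND_HALF_UP


def rounded_share_sum(revenues, places=2):
    """Sum of per-product percentage shares after each is rounded to `places` decimals."""
    total = sum(revenues)
    quantum = Decimal(1).scaleb(-places)  # Decimal('0.01') when places=2
    return sum(
        (r / total * 100).quantize(quantum, rounding=ROUND_HALF_UP)
        for r in revenues
    )


def worst_case_drift(n_products, places=2):
    """Upper bound on |sum - 100|: each row can be off by half a unit in the last place."""
    return n_products * Decimal(5).scaleb(-(places + 1))


if __name__ == "__main__":
    seven_equal = [Decimal(100)] * 7  # hypothetical cohort of 7 equal-revenue products
    print("drift at 2 places:", abs(rounded_share_sum(seven_equal) - 100))
    print("drift at 4 places:", abs(rounded_share_sum(seven_equal, places=4) - 100))
    print("worst case for 50 products:", worst_case_drift(50))
```

For seven equal-share products the drift at two decimal places is 0.03, matching the figure above, and the worst-case bound for 50 products is 0.25. Rounding to four places shrinks the same cohort's drift to 0.0001, which is why the tolerance can be tightened if the model keeps more decimals.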