Skip to content

Making og urm official#3298

Open
CGibson17 wants to merge 7 commits intomainfrom
making-og-urm-official
Open

Making og urm official#3298
CGibson17 wants to merge 7 commits intomainfrom
making-og-urm-official

Conversation

@CGibson17
Copy link
Copy Markdown
Contributor

Pull Request

Summary & Motivation

Self-review

General

  • If this is a same-day request, please flag that in the #data-team Slack
  • Update due date and assignee on the
    TEAMster Asana Project
  • Run Format on all modified files

dbt

  • Include a corresponding [model name].yml properties file for all models.
    These can be generated by running
    dbt run-operation generate_model_yaml --args '{"model_names": ["MODEL NAME"]}'
    and saving the console output to a YAML file.

    models:
      - name: [model name]
        config:
          contract:
            enforced: false  # required while creating a contracted model, delete before merging
        columns:  # required for contracted models
          - name: ...
            data_type: ...
            data_tests:  # column tests, optional
              - ...
        data_tests:  # model tests, optional
          - ...
          - dbt_utils.unique_combination_of_columns:  # recommended for intermediate models
              arguments:
                combination_of_columns:
                  - ...
              config:
                store_failures: true
    
  • Include (or update) an
    exposure for all
    models that will be consumed by a dashboard, analysis, or application:

    exposures:
      - name: [exposure name, snake_case]
        label: [exposure name, Title Case]
        type: dashboard | notebook | analysis | ml | application
        owner:
          name: Data Team
        depends_on:
          - ref("[model name]")
          - ...
        url: ...  # optional
        config:
          meta:
            dagster:
              kinds:
                - tableau | googlesheets | ...
                - ...
              asset:
                metadata:
                  id: [lsid]  # required for Tableau Server workbooks
                  cron_schedule:  # required for Dagster automation
                    - * * * * *
                    - ...
    

Dagster "kinds" Reference

SQL

  • Use the union_dataset_join_clause() macro for queries that employ models
    that use regional datasets

  • Do not use group by without any aggregations when you mean to use
    distinct

  • All distinct usage must be accompanied by an comment explaining it's
    necessity

  • Do not use order by for select statements. That should be done in the
    reporting layer.

  • If you are adding a new external source, before building, run:

    dbt run-operation stage_external_sources --vars "{'ext_full_refresh': 'true'}" --args select: [model name(s)]
    

Troubleshooting

@CGibson17 CGibson17 requested a review from a team February 20, 2026 16:48
@cbini cbini force-pushed the making-og-urm-official branch from ad62e7b to ed411ea Compare March 4, 2026 19:25
@cbini cbini requested review from a team as code owners March 4, 2026 19:25
@anthonygwalters
Copy link
Copy Markdown
Member

@claude please review this PR

@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Mar 16, 2026

Claude finished @anthonygwalters's task in 2m 1s —— View job


PR Review: Making og urm official

  • Gather context and read changed files
  • Review stg_kippadb__enrollment.sql
  • Review base_kippadb__application.sql
  • Review YAML properties files
  • Post review findings

Summary

This PR promotes the Overgrad URM graduation rate (og_urm) from a temporary field to an official column in two models: stg_kippadb__enrollment and base_kippadb__application. The approach looks reasonable overall, but there are two issues to address before merging.


Issues

🔴 Bug: Wrong join key in stg_kippadb__enrollment.sql

File: src/dbt/kipptaf/models/kippadb/staging/stg_kippadb__enrollment.sql, line 71

-- Current (likely wrong):
on e.id = n.account_id

-- Should probably be:
on e.school__c = n.account_id

e.id is the Salesforce enrollment record ID, but n.account_id in the NSC crosswalk is a Salesforce Account ID (i.e., the college/school). These will never match, so og_urm will always be NULL in stg_kippadb__enrollment.

Compare with how base_kippadb__application.sql joins the same crosswalk on line 118: acc.id = n.account_id — where acc is stg_kippadb__account (the college). The enrollment model has e.school__c as school (line 53) which is the account ID you want.

Fix this →


🔴 YAML mismatch in base_kippadb__application.yml

File: src/dbt/kipptaf/models/kippadb/base/properties/base_kippadb__application.yml, lines 258–259

The YAML adds a column named overgrad_urm_grad_rate, but the SQL does not output a column with that name. The SQL (lines 82–87) adds two new columns: og_urm and og_urm_preferred. Both are missing from the YAML, and overgrad_urm_grad_rate should be removed.

# Current (wrong):
- name: overgrad_urm_grad_rate
  data_type: string

# Should be:
- name: og_urm
  data_type: string
- name: og_urm_preferred
  data_type: numeric

Note: og_urm_preferred uses coalesce(cast(... as numeric), acc.adjusted_6_year_minority_graduation_rate) — its type should be numeric, not string. The stg_kippadb__enrollment.yml correctly documents og_urm as string.

Fix this →


Observations

  • Redundant og_urm in enrollment: Since base_kippadb__application.sql already joins the NSC crosswalk directly (on acc.id), the og_urm column in stg_kippadb__enrollment is not consumed by that model. It may be useful independently, but worth confirming there's a downstream consumer of enr.og_urm specifically.

  • select * in enr_clean passthrough (select ec.* ... on line 77) means the new og_urm column from enrollment is automatically promoted — no explicit column declaration needed there, which is fine for a staging-to-base passthrough.

  • The alias cleanup (adding e. prefix to all columns in enr_clean) is a clean improvement that makes the join unambiguous.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants