handle new zoning districts in GFT #2119

damonmcc · 2025-12-17T19:41:54Z

resolves #2032

see linked issue for data details and motivations for the logic changes here. worth looking at the commit messages for clarity.

dbt unit test docs: Unit tests, unit tests properties

new tests failing before relevant fix

top 5 most frequent new districts in the outputs

all lots in green_fast_track_bbls where zoning_district like '%R11%' or zoning_district like '%R12%' have a zoning_category of Other

| zoning_district | zoning_category | count |
| --- | --- | --- |
| M1-9A/R12 | Other | 281 |
| M1-8A/R11 | Other | 189 |
| M1-8A/R12 | Other | 93 |
| C5-2, M1-8A/R12 | Other | 9 |
| M1-8A/R11, C6-4X | Other | 7 |

codecov · 2025-12-17T19:49:02Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 74.68%. Comparing base (ab686f5) to head (7427a14).
⚠️ Report is 10 commits behind head on main.

damonmcc · 2025-12-17T19:57:57Z

products/green_fast_track/models/intermediate/flags/int__zoning_districts.sql

        CASE
            WHEN zd IS null THEN 'NONE'
            WHEN zd LIKE 'M%' OR zd LIKE 'C%' THEN LEFT(zd, 1)
            -- match the first group of characters that end with a number
            WHEN zd LIKE 'R%' THEN (REGEXP_MATCH(zd, '^(\w\d+)'))[1]
            ELSE zd


this clause is why all relevant new lots end up being Commercial or Manufacturing. they all have zonedist1 values like M1-8A/R12, so the R11/R12 parts never contribute to the lot's GFT zoning category
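
To make that concrete, here is a minimal Python mirror of the CASE expression above. It is a sketch for illustration only, not the model's code: the function name `district_type` is hypothetical, and Python's `re` stands in for Postgres `REGEXP_MATCH`.

```python
import re


def district_type(zd):
    """Python mirror of the CASE expression (hypothetical helper, not project code)."""
    if zd is None:
        return "NONE"
    # LIKE 'M%' / LIKE 'C%': only the first letter is kept
    if zd.startswith(("M", "C")):
        return zd[0]
    if zd.startswith("R"):
        # first group of characters that ends with a number, e.g. R7A -> R7
        match = re.match(r"^(\w\d+)", zd)
        # REGEXP_MATCH yields NULL when nothing matches
        return match.group(1) if match else None
    return zd


print(district_type("M1-8A/R12"))  # M
print(district_type("R12"))        # R12
```

The first call returns `M` because the `/R12` suffix is never inspected, which is why the R11/R12 parts of slash districts never reach the zoning category.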

damonmcc · 2025-12-17T19:59:42Z

products/green_fast_track/models/intermediate/flags/int_flags__zoning.sql

-            ELSE 'high_res'
+            WHEN has_high_res THEN 'high_res'


to not suppress actual NULLs or any other value we aren't handling
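
A small Python sketch of why the explicit branch matters. Only `has_high_res` and the `'high_res'` label come from the diff; the other flag and label (`has_low_res`, `'low_res'`) are hypothetical stand-ins for the branches not shown here.

```python
def category_with_else(has_low_res, has_high_res):
    """Old shape: anything not matched earlier falls through to 'high_res'."""
    if has_low_res:
        return "low_res"
    return "high_res"  # ELSE 'high_res'


def category_explicit(has_low_res, has_high_res):
    """New shape: each label needs its own condition; unmatched rows stay NULL."""
    if has_low_res:
        return "low_res"
    if has_high_res:
        return "high_res"
    return None  # a CASE without ELSE yields NULL


# a lot where neither flag is set, e.g. a district we aren't handling yet
print(category_with_else(False, False))  # high_res
print(category_explicit(False, False))   # None
```

With the catch-all, an unhandled lot is silently labeled `high_res`; with the explicit condition it surfaces as a NULL that downstream tests can catch.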

damonmcc · 2025-12-17T20:02:03Z

products/green_fast_track/models/intermediate/flags/int__zoning_districts.sql

-GROUP BY bbl, zoning_district_type
-ORDER BY bbl, zoning_district_type
+ORDER BY bbl, zd


it's really nice being able to see the results of parsing zonedist values to zoning_district_type values. these are a few rows from this model after grouping by zd, zoning_district_type in DBeaver:

| zd | zoning_district_type | count |
| --- | --- | --- |
| C8-3 | C | 321 |
| C8-4 | C | 111 |
| M1-1 | M | 12521 |
| M1-1/R5 | M | 65 |
| M1-1/R6A | M | 4 |

damonmcc · 2025-12-30T18:52:15Z

products/green_fast_track/models/intermediate/flags/_flags_models.yml

+        # dict format doesn't work because it tries to do cast(null as USER-DEFINED) as "geom"
+        # rows:
+        #   - {bbl: 123, zonedist1: M1, zonedist2: NULL, zonedist3: NULL, zonedist4: NULL}


the sql format below doesn't feel great but I didn't wanna get hung up on getting the dict or csv format to work for this first pass

Like the thoroughness of this. When does it get run? During a build? During PR tests?

during a build but since they're unit tests it'd be great to run in PR tests. but there's a weird requirement that the upstream models exist (values don't matter) so it might be a little convoluted

I'm guessing it's because the unit test gets column types from the upstream models

oh from the docs you can do dbt run --select "parent_model_name" --empty to "build an empty version of the models to save warehouse spend"

Also a nit on the sql - using VALUES will save a bunch of lines - don't need to do the unions, can just list the tuples

VALUES ('simple_m', 'M1', NULL, NULL, NULL), ('multiple_districts', 'M1', 'M2', NULL, NULL), ... ;

Though then you lose labeling the individual fields

I'd definitely be in favor of getting them working during CI and skipping during a build

https://docs.getdbt.com/docs/build/unit-tests#when-to-run-unit-tests

> Also a nit on the sql - using VALUES will save a bunch of lines - don't need to do the unions, can just list the tuples
>
> Though then you lose labeling the individual fields

losing the field names would be a shame but this seems worth it

You can keep them in the query at least like this

SELECT * FROM ( VALUES (1, 'one'), (2, 'two'), (3, 'three') ) AS t (num, letter);

damonmcc · 2025-12-30T18:54:51Z

products/green_fast_track/models/intermediate/flags/int__zoning_districts.sql

+-- to preserve lots with no zoning since STRING_TO_ARRAY returns an empty (zero-element) array
+-- when the result of UNNEST(ARRAY[ ... ]) is a string of zero length. this UNION ALL approach
+-- is simpler than using joins or complicated nesting of array functions


almost got away with just a one-line change to handle splitting and unnesting forward slash districts. but I like the clarity of using UNION ALL to just combine two very different types of lots, instead of twisting array logic for the "all null" edge case
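
The two-branch idea can be sketched in Python. This is an illustration under assumptions, not the model itself: `split_districts` and the dict-based lot rows are hypothetical, and only the four `zonedist` column names come from the unit test fixture in this PR.

```python
ZONEDIST_COLUMNS = ("zonedist1", "zonedist2", "zonedist3", "zonedist4")


def split_districts(lots):
    """Return (bbl, zd) rows, one per district part (hypothetical sketch)."""
    zoned, unzoned = [], []
    for lot in lots:
        values = [lot.get(col) for col in ZONEDIST_COLUMNS]
        # split every non-empty district value on forward slashes
        parts = [part for zd in values if zd for part in zd.split("/")]
        if parts:
            zoned.extend((lot["bbl"], part) for part in parts)
        else:
            # lots with no zoning would vanish after unnesting an empty
            # array, so they are kept as a single row with a NULL district
            unzoned.append((lot["bbl"], None))
    # combining the two kinds of lots plays the role of UNION ALL
    return zoned + unzoned


lots = [
    {"bbl": 1, "zonedist1": "M1-8A/R12"},
    {"bbl": 2},  # no zoning at all
]
print(split_districts(lots))  # [(1, 'M1-8A'), (1, 'R12'), (2, None)]
```

Keeping the unzoned lots in their own list avoids special-casing empty arrays inside the splitting logic.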

fvankrieken

One question to resolve but looks great

damonmcc · 2025-12-31T15:30:25Z

@fvankrieken

> I'd definitely be in favor of getting them working during CI and skipping during a build
>
> https://docs.getdbt.com/docs/build/unit-tests#when-to-run-unit-tests

looks like dbt run --empty needs sources to exist in the db and it seems messy to do something like dbt run --empty --select stg__pluto if ${{ matrix.project }} = 'green_fast_track'

maybe there's a simple way to only select models that are parents of models with unit tests? if not, let's punt on running these in CI

fvankrieken · 2025-12-31T16:28:20Z

> @fvankrieken
>
> > I'd definitely be in favor of getting them working during CI and skipping during a build
> > https://docs.getdbt.com/docs/build/unit-tests#when-to-run-unit-tests
>
> looks like dbt run --empty needs sources to exist in the db and it seems messy to do something like dbt run --empty --select stg__pluto if ${{ matrix.project }} = 'green_fast_track'
>
> maybe there's a simple way to only select models that are parents of models with unit tests? if not, let's punt on running these in CI

I don't think upstream models need to be defined - checked out this branch and ran this aimed at my schema in db-cscl: dbt test --resource-type unit_test, no dbt run --empty beforehand

fvankrieken · 2025-12-31T16:30:17Z

Oh weird though that the docs are so explicit that the upstream tables DO need to exist

Maybe specifically because you've used the input with sql format, upstream models aren't needed?

damonmcc · 2025-12-31T18:43:25Z

> Oh weird though that the docs are so explicit that the upstream tables DO need to exist
>
> Maybe specifically because you've used the input with sql format, upstream models aren't needed?

worked! just dropped the dbt run --empty part. shame the docs make it sound so necessary: "The direct parents of the model that you’re unit testing need to exist in the warehouse before you can execute the unit test."

damonmcc force-pushed the gft-new-districts branch from 4f6a3f0 to 618de90 on December 17, 2025 19:55

damonmcc force-pushed the gft-new-districts branch 3 times, most recently from 84218c1 to dc00003 on December 30, 2025 17:25

damonmcc added 7 commits December 30, 2025 12:27

fix dbt deprecated functionality

faf83f8

Arguments to generic tests should be nested under the `arguments` property.

move zoning district group by

581769a

make zoning flag logic more explicit

02477fe

add a slash district lot to the pilot projects

e96ce59

add unit test for zoning district parsing

1e69828

handle new R11 and R12 districts

59ee916

This maps the new districts to Other but the relevant lots have incorrect values because we ignore district values after forward slashes.

parse zondist values with slashes

619d7fe

damonmcc force-pushed the gft-new-districts branch from dc00003 to 619d7fe on December 30, 2025 17:34

damonmcc marked this pull request as ready for review December 30, 2025 18:31

damonmcc requested a review from a team December 30, 2025 18:35

fvankrieken approved these changes Dec 30, 2025

improve unit test fixture

5d0de54

damonmcc force-pushed the gft-new-districts branch from e40dc35 to bd6be1f on December 31, 2025 15:07

run dbt unit tests in CI

7427a14

damonmcc force-pushed the gft-new-districts branch from bd6be1f to 7427a14 on December 31, 2025 18:38

damonmcc merged commit ee1d76e into main Dec 31, 2025
23 checks passed

damonmcc deleted the gft-new-districts branch January 1, 2026 15:43
