From 589c935979479b8bb2e10c927175476181bf6961 Mon Sep 17 00:00:00 2001 From: grangel <140853376+GabyRangelB@users.noreply.github.com> Date: Thu, 26 Mar 2026 19:05:13 +0000 Subject: [PATCH 1/9] =?UTF-8?q?feat(dbt):=20state=20assessments=20rebuild?= =?UTF-8?q?=20=E2=80=94=20push=20proficiency/metadata=20logic=20upstream?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit - Add proficiency banding, subject mapping, and metadata columns to int_pearson__all_assessments, int_fldoe__all_assessments, and int_iready__diagnostic_results - Add new int_pearson__student_list_report intermediate model - Add stg_google_sheets__state_test_comparison_demographics; disable old stg_google_sheets__state_test_comparison - Add standardized_discipline to base_powerschool__course_enrollments - Add new int_extracts__student_enrollments_courses model - Refactor int_extracts__student_enrollments_subjects to use upstream columns - Simplify rpt_tableau__state_assessments_dashboard and _comps by replacing inline CASE blocks with upstream column references - Update int_tableau__state_assessments_demographic_comps lineage to use int_pearson__student_list_report Co-Authored-By: Claude Sonnet 4.6 (1M context) --- ...-03-23-state-assessments-rebuild-design.md | 122 ++++++++++++++++++ .../int_fldoe__all_assessments.sql | 24 +++- .../properties/int_fldoe__all_assessments.yml | 8 ++ ...u__state_assessments_demographic_comps.sql | 52 ++------ ...t_tableau__state_assessments_dashboard.sql | 70 ++-------- ...eau__state_assessments_dashboard_comps.sql | 74 +---------- .../models/google/sheets/sources-external.yml | 19 ++- ..._assessments__course_subject_crosswalk.yml | 2 - ...g_google_sheets__state_test_comparison.yml | 2 + ...ts__state_test_comparison_demographics.yml | 14 ++ ...ts__state_test_comparison_demographics.sql | 64 ++++++++- .../int_iready__diagnostic_results.sql | 22 +++- .../int_iready__diagnostic_results.yml | 2 + .../int_pearson__all_assessments.sql | 
51 +++++++- .../int_pearson__student_list_report.sql | 80 ++++++++++++ .../int_pearson__all_assessments.yml | 7 + .../int_pearson__student_list_report.yml | 2 + .../base_powerschool__course_enrollments.sql | 4 +- .../base_powerschool__course_enrollments.yml | 2 + ..._extracts__student_enrollments_courses.sql | 32 +++++ ...extracts__student_enrollments_subjects.sql | 51 ++------ 21 files changed, 468 insertions(+), 236 deletions(-) create mode 100644 docs/superpowers/specs/2026-03-23-state-assessments-rebuild-design.md create mode 100644 src/dbt/kipptaf/models/pearson/intermediate/int_pearson__student_list_report.sql create mode 100644 src/dbt/kipptaf/models/pearson/intermediate/properties/int_pearson__student_list_report.yml create mode 100644 src/dbt/kipptaf/models/students/intermediate/int_extracts__student_enrollments_courses.sql diff --git a/docs/superpowers/specs/2026-03-23-state-assessments-rebuild-design.md b/docs/superpowers/specs/2026-03-23-state-assessments-rebuild-design.md new file mode 100644 index 0000000000..f443420cce --- /dev/null +++ b/docs/superpowers/specs/2026-03-23-state-assessments-rebuild-design.md @@ -0,0 +1,122 @@ +# State Assessments Rebuild + +**Branch:** `claude/feat/stat-rebuild` **Date started:** 2026-03-23 **Status:** +In progress + +## Motivation + +1. **New reporting requirements** — Demographic comparison reporting that didn't + previously exist (e.g., state test comparison by demographic subgroups) +2. 
**Tech debt cleanup** — Assessment models had duplicated business logic + (proficiency banding, subject mapping, metadata) scattered across reporting + layers, making them fragile and hard to maintain + +## Architectural Pattern + +**Push transformation logic upstream, simplify reporting downstream.** + +### Proficiency banding + +Each assessment source computes standardized proficiency bands in its own +intermediate model rather than in Tableau reporting models: + +- **Pearson (NJSLA):** `njsla_aggregated_proficiency` — levels 1-2 = "Below/Far + Below", 3 = "Approaching", 4+ = "At/Above" +- **FLDOE (FAST):** `fast_aggregated_proficiency` — level 1 = "Below/Far Below", + 2 = "Approaching", 3+ = "At/Above" (FAST uses a 1-5 scale where proficiency + starts at level 3, vs NJSLA's level 4) +- **iReady:** `iready_proficiency` — same banding pattern as Pearson (levels 1-2 + = Below, 3 = Approaching, 4+ = At/Above) + +### Metadata columns + +Added at the intermediate layer (not in reporting): + +- `results_type` — "Actual" vs "Preliminary" +- `district_state` — "KTAF NJ" or "KTAF FL" +- `illuminate_subject` — standardized subject mapping (ELA = "Text Study", + Algebra/Geometry = "Mathematics") +- `discipline` — broader category (Math, ELA, Science, Social Studies) + +### Demographic alignment + +Grade/demographic group mapping moved from reporting into staging: + +- `stg_google_sheets__state_test_comparison_demographics` now computes + `comparison_demographic_group_aligned`, + `comparison_demographic_subgroup_aligned`, `total_proficient_students`, + `school_level`, `grade_range_band`, `discipline`, and `season` +- Old `stg_google_sheets__state_test_comparison` disabled (`enabled: false`) + because the demographics sheet is a superset of the old source + +### Reporting simplification + +`rpt_tableau__state_assessments_*` models now primarily select pre-computed +columns instead of containing large CASE blocks. 
+ +## What's Been Done + +### Assessment intermediate models + +| Model | Changes | +| ---------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ | +| `int_fldoe__all_assessments` | Added `achievement_level_int`, `results_type`, `district_state`, `illuminate_subject`, `fast_aggregated_proficiency` | +| `int_pearson__all_assessments` | Added `illuminate_subject`, `njsla_aggregated_proficiency`, `results_type`, `district_state`, `is_504`, `aligned_test_code`, `race_ethnicity`, `admin`, `season`, `aligned_subject`, `is_proficient_int`. Contract enforcement disabled (temporary dev workaround) | +| `int_pearson__student_list_report` | **New model** — raw transformation layer for Pearson student list data with `performance_band_level`, `is_proficient` | +| `int_iready__diagnostic_results` | Added `iready_proficiency` column | + +### Google Sheets staging + +| Model | Changes | +| ------------------------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------- | +| `stg_google_sheets__state_test_comparison_demographics` | Enhanced with derived columns: `season`, `school_level`, `grade_range_band`, `discipline`, aligned demographic columns, `total_proficient_students` | +| `stg_google_sheets__state_test_comparison` | Disabled (`enabled: false`) | +| `sources-external.yml` | Updated source references; course subject crosswalk sheet range renamed to `_v2`, `Exclude_from_Gradebook` column dropped | + +### Tableau reporting models (simplified) + +| Model | Changes | +| -------------------------------------------------- | 
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | +| `rpt_tableau__state_assessments_dashboard` | Switched to demographics source; Student List Report section refactored — `discipline`, `subject`, `test_code`, `performance_band_level`, `is_proficient`, `results_type` all switched from inline CASE/IF to upstream column references | +| `rpt_tableau__state_assessments_dashboard_comps` | Removed demographic alignment CASE blocks, uses pre-computed columns | +| `int_tableau__state_assessments_demographic_comps` | Lineage changed from `stg_pearson__student_list_report` to `int_pearson__student_list_report`; `is_proficient_int`, `results_type`, `district_state`, `aligned_test_code` all switched from inline to upstream; `local_student_identifier` now passed through (was `null`) | + +### Extracts and base models + +| Model | Changes | +| -------------------------------------------- | -------------------------------------------------------------------------------------- | +| `int_extracts__student_enrollments_subjects` | Refactored to use upstream proficiency/subject columns instead of inline CASE | +| `int_extracts__student_enrollments_courses` | **New model** — course enrollment extract from `base_powerschool__student_enrollments` | +| `base_powerschool__course_enrollments` | Added `standardized_discipline`; removed `exclude_from_gradebook` | + +## Known Issues + +- `int_pearson__all_assessments` has contract enforcement disabled — needs to be + re-enabled or documented as intentional before merge +- `int_pearson__student_list_report` properties file is incomplete — no column + definitions, no uniqueness test (required by project conventions) +- Branch history is messy (merged from `stat` and `int-extracts-courses` + branches, has checkpoint commits) — 
should be squash-merged + +## What's Still Open + +- Add column definitions and uniqueness test to + `int_pearson__student_list_report.yml` +- Re-enable contract enforcement on `int_pearson__all_assessments` or document + why it should stay off +- Identify remaining assessment sources and reporting models that need the same + upstream-migration treatment +- Determine scope of additional demographic reporting needs +- Testing and validation of refactored models against production data + +## How to Resume + +1. Run `uv sync --frozen` to install dependencies +2. Prepare dbt projects before testing: + ```bash + uv run dagster-dbt project prepare-and-package \ + --file src/teamster/code_locations/kipptaf/__init__.py + uv run dagster-dbt project prepare-and-package \ + --file src/teamster/code_locations/kippmiami/__init__.py + ``` +3. Pick up from "What's Still Open" above diff --git a/src/dbt/kippmiami/models/fldoe/intermediate/int_fldoe__all_assessments.sql b/src/dbt/kippmiami/models/fldoe/intermediate/int_fldoe__all_assessments.sql index bf1634a2e3..66c830709e 100644 --- a/src/dbt/kippmiami/models/fldoe/intermediate/int_fldoe__all_assessments.sql +++ b/src/dbt/kippmiami/models/fldoe/intermediate/int_fldoe__all_assessments.sql @@ -23,24 +23,46 @@ with scale_score, achievement_level, is_proficient, + performance_level as achievement_level_int, cast( coalesce(assessment_grade, test_grade, enrolled_grade) as string ) as assessment_grade, - coalesce(performance_level, achievement_level_int) as performance_level, coalesce(student_id, fleid) as student_id, regexp_extract( _dbt_source_relation, r'stg_fldoe__(\w+)' ) as assessment_name, + from union_relations ) select * except (assessment_name), + 'Actual' as results_type, + 'KTAF FL' as district_state, + if( assessment_name = 'science', 'Science', upper(assessment_name) ) as assessment_name, + + case + when assessment_subject like 'English Language Arts%' + then 'Text Study' + when assessment_subject in ('Algebra I', 'Algebra 
II', 'Geometry') + then 'Mathematics' + else assessment_subject + end as illuminate_subject, + + case + when achievement_level_int = 1 + then 'Below/Far Below' + when achievement_level_int = 2 + then 'Approaching' + when achievement_level_int >= 3 + then 'At/Above' + end as fast_aggregated_proficiency, + from transformed diff --git a/src/dbt/kippmiami/models/fldoe/intermediate/properties/int_fldoe__all_assessments.yml b/src/dbt/kippmiami/models/fldoe/intermediate/properties/int_fldoe__all_assessments.yml index 9afce84929..c8fa759767 100644 --- a/src/dbt/kippmiami/models/fldoe/intermediate/properties/int_fldoe__all_assessments.yml +++ b/src/dbt/kippmiami/models/fldoe/intermediate/properties/int_fldoe__all_assessments.yml @@ -21,9 +21,17 @@ models: data_type: boolean - name: assessment_grade data_type: string + - name: results_type + data_type: string + - name: district_state + data_type: string - name: performance_level data_type: int64 - name: student_id data_type: string - name: assessment_name data_type: string + - name: illuminate_subject + data_type: string + - name: fast_aggregated_proficiency + data_type: string diff --git a/src/dbt/kipptaf/models/extracts/tableau/intermediate/int_tableau__state_assessments_demographic_comps.sql b/src/dbt/kipptaf/models/extracts/tableau/intermediate/int_tableau__state_assessments_demographic_comps.sql index 94a31064aa..179a0cff7a 100644 --- a/src/dbt/kipptaf/models/extracts/tableau/intermediate/int_tableau__state_assessments_demographic_comps.sql +++ b/src/dbt/kipptaf/models/extracts/tableau/intermediate/int_tableau__state_assessments_demographic_comps.sql @@ -8,19 +8,9 @@ with assessment_name, is_proficient, - 'Actual' as results_type, - 'KTAF NJ' as district_state, - - case - testcode - when 'SC05' - then 'SCI05' - when 'SC08' - then 'SCI08' - when 'SC11' - then 'SCI11' - else testcode - end as test_code, + results_type, + district_state, + aligned_test_code as test_code, case when race_ethnicity = 'B' @@ -83,42 +73,21 @@ 
with _dbt_source_relation, academic_year, - null as localstudentidentifier, + local_student_identifier as localstudentidentifier, cast(state_student_identifier as string) as state_id, test_type as assessment_name, - - if( - performance_level - in ('Met Expectations', 'Exceeded Expectations', 'Graduation Ready'), - true, - false - ) as is_proficient, - - 'Preliminary' as results_type, - 'KTAF NJ' as district_state, - - case - when test_name = 'ELA Graduation Proficiency' - then 'ELAGP' - when test_name = 'Mathematics Graduation Proficiency' - then 'MATGP' - when test_name = 'Geometry' - then 'GEO01' - when test_name = 'Algebra I' - then 'ALG01' - when test_name like '%Mathematics%' - then concat('MAT', regexp_extract(test_name, r'.{6}(.{2})')) - when test_name like '%ELA%' - then concat('ELA', regexp_extract(test_name, r'.{6}(.{2})')) - end as test_code, + is_proficient, + results_type, + district_state, + test_code, null as aggregate_ethnicity, null as ml_status, null as iep_status, - from {{ ref("stg_pearson__student_list_report") }} + from {{ ref("int_pearson__student_list_report") }} where state_student_identifier is not null and administration = 'Spring' @@ -137,6 +106,7 @@ select a.aggregate_ethnicity, a.ml_status, a.iep_status, + a.is_proficient_int, if( e.lunch_status in ('F', 'R'), @@ -144,8 +114,6 @@ select 'Non Economically Disadvantaged' ) as lunch_status, - if(a.is_proficient, 1, 0) as is_proficient_int, - case when a.test_code = 'ALG01' then concat(a.test_code, '_', e.school_level) diff --git a/src/dbt/kipptaf/models/extracts/tableau/rpt_tableau__state_assessments_dashboard.sql b/src/dbt/kipptaf/models/extracts/tableau/rpt_tableau__state_assessments_dashboard.sql index 4b8d568d19..3c81db557d 100644 --- a/src/dbt/kipptaf/models/extracts/tableau/rpt_tableau__state_assessments_dashboard.sql +++ b/src/dbt/kipptaf/models/extracts/tableau/rpt_tableau__state_assessments_dashboard.sql @@ -85,7 +85,7 @@ with test_name, test_code, region, - 'Spring' as season, + 
season, {% for entity in comparison_entities %} avg( @@ -103,7 +103,10 @@ with {% if not loop.last %},{% endif %} {% endfor %} - from {{ ref("stg_google_sheets__state_test_comparison") }} + from {{ ref("stg_google_sheets__state_test_comparison_demographics") }} + where + comparison_demographic_group = 'Total' + and comparison_demographic_subgroup = 'All Students' group by academic_year, test_name, test_code, region ), @@ -195,74 +198,25 @@ with cast(state_student_identifier as string) as state_id, test_type as assessment_name, - - case - when test_name like '%Mathematics%' - then 'Math' - when test_name in ('Algebra I', 'Geometry') - then 'Math' - else 'ELA' - end as discipline, + discipline, scale_score as score, - - case - when performance_level = 'Did Not Yet Meet Expectations' - then 1 - when performance_level = 'Partially Met Expectations' - then 2 - when performance_level = 'Approached Expectations' - then 3 - when performance_level = 'Met Expectations' - then 4 - when performance_level = 'Exceeded Expectations' - then 5 - when performance_level = 'Not Yet Graduation Ready' - then 1 - when performance_level = 'Graduation Ready' - then 2 - end as performance_band_level, - - if( - performance_level - in ('Met Expectations', 'Exceeded Expectations', 'Graduation Ready'), - true, - false - ) as is_proficient, - + performance_band_level, + is_proficient, performance_level as performance_band, + null as lep_status, null as is_504, null as iep_status, null as race_ethnicity, null as test_grade, - 'Preliminary' as results_type, + results_type, administration as `admin`, administration as season, - case - when test_name like '%Mathematics%' - then 'Mathematics' - when test_name in ('Algebra I', 'Geometry') - then 'Mathematics' - else 'English Language Arts' - end as subject, - - case - when test_name = 'ELA Graduation Proficiency' - then 'ELAGP' - when test_name = 'Mathematics Graduation Proficiency' - then 'MATGP' - when test_name = 'Geometry' - then 'GEO01' - when 
test_name = 'Algebra I' - then 'ALG01' - when test_name like '%Mathematics%' - then concat('MAT', regexp_extract(test_name, r'.{6}(.{2})')) - when test_name like '%ELA%' - then concat('ELA', regexp_extract(test_name, r'.{6}(.{2})')) - end as test_code, + `subject`, + test_code, from {{ ref("stg_pearson__student_list_report") }} where diff --git a/src/dbt/kipptaf/models/extracts/tableau/rpt_tableau__state_assessments_dashboard_comps.sql b/src/dbt/kipptaf/models/extracts/tableau/rpt_tableau__state_assessments_dashboard_comps.sql index 8967aa7aa1..74fa712723 100644 --- a/src/dbt/kipptaf/models/extracts/tableau/rpt_tableau__state_assessments_dashboard_comps.sql +++ b/src/dbt/kipptaf/models/extracts/tableau/rpt_tableau__state_assessments_dashboard_comps.sql @@ -178,76 +178,14 @@ with total_students, percent_proficient, - if( - comparison_demographic_subgroup - in ('Grade - 08', 'Grade - 09', 'Grade - 10'), - 'Total', - comparison_demographic_group - ) as comparison_demographic_group, - - if( - comparison_demographic_group = 'Grade', - 'All Students', - comparison_demographic_subgroup - ) as comparison_demographic_subgroup, - - round(percent_proficient * total_students, 0) as total_proficient_students, + comparison_demographic_group_aligned as comparison_demographic_group, + comparison_demographic_subgroup_aligned as comparison_demographic_subgroup, + total_proficient_students, test_code, - - case - when - comparison_demographic_subgroup = 'Grade - 08' - and test_code = 'ALG01' - then 'MS' - when comparison_demographic_subgroup in ('Grade - 09', 'Grade - 10') - then 'HS' - when - test_code in ( - 'ELA09', - 'ELA10', - 'ELA11', - 'ELAGP', - 'ALG01', - 'GEO01', - 'ALG02', - 'MATGP', - 'SCI11' - ) - then 'HS' - when safe_cast(right(test_code, 2) as numeric) between 5 and 8 - then 'MS' - else 'ES' - end as school_level, - - case - when comparison_demographic_subgroup in ('Grade - 09', 'Grade - 10') - then 'HS' - when - test_code in ( - 'ELA09', - 'ELA10', - 'ELA11', - 
'ELAGP', - 'ALG02', - 'GEO01', - 'MATGP', - 'SCI11' - ) - then 'HS' - else '3-8' - end as grade_range_band, - - case - when left(test_code, 3) in ('MAT', 'ALG', 'GEO') - then 'Math' - when left(test_code, 3) = 'ELA' - then 'ELA' - when left(test_code, 3) = 'SCI' - then 'Science' - when left(test_code, 3) = 'SOC' - then 'Social Studies' - end as discipline, + school_level, + grade_range_band, + discipline, from {{ ref("stg_google_sheets__state_test_comparison_demographics") }} where comparison_demographic_subgroup != 'SE Accommodation' diff --git a/src/dbt/kipptaf/models/google/sheets/sources-external.yml b/src/dbt/kipptaf/models/google/sheets/sources-external.yml index 71456a42ac..597b08fac9 100644 --- a/src/dbt/kipptaf/models/google/sheets/sources-external.yml +++ b/src/dbt/kipptaf/models/google/sheets/sources-external.yml @@ -67,14 +67,8 @@ sources: - sheets - student_graduation_path_cutoffs - name: src_google_sheets__state_test_comparison - external: - options: - format: GOOGLE_SHEETS - uris: - - https://docs.google.com/spreadsheets/d/1yS6xU7ygiOrrtc29pUc3jr590qk7ttag3RuzVHaPOv8 - sheet_range: src_google_sheets__state_test_comparison - skip_leading_rows: 1 config: + enabled: false meta: dagster: asset_key: @@ -82,6 +76,13 @@ sources: - google - sheets - state_test_comparison + external: + options: + format: GOOGLE_SHEETS + uris: + - https://docs.google.com/spreadsheets/d/1yS6xU7ygiOrrtc29pUc3jr590qk7ttag3RuzVHaPOv8 + sheet_range: src_google_sheets__state_test_comparison + skip_leading_rows: 1 - name: src_google_sheets__state_test_comparison_demographics external: options: @@ -885,7 +886,7 @@ sources: format: GOOGLE_SHEETS uris: - https://docs.google.com/spreadsheets/d/1G2z9rwXsFaMdFL6iOYdfQTVjZ7bctXMyz_Q09IhP4QE - sheet_range: src_assessments__course_subject_crosswalk + sheet_range: src_assessments__course_subject_crosswalk_v2 skip_leading_rows: 1 columns: - name: PowerSchool_Course_Number @@ -898,8 +899,6 @@ sources: data_type: boolean - name: Is_Advanced_Math 
data_type: boolean - - name: Exclude_from_Gradebook - data_type: boolean - name: Discipline data_type: string - name: Duplicate_Audit diff --git a/src/dbt/kipptaf/models/google/sheets/staging/properties/stg_google_sheets__assessments__course_subject_crosswalk.yml b/src/dbt/kipptaf/models/google/sheets/staging/properties/stg_google_sheets__assessments__course_subject_crosswalk.yml index 1e69ed7164..70b0526b11 100644 --- a/src/dbt/kipptaf/models/google/sheets/staging/properties/stg_google_sheets__assessments__course_subject_crosswalk.yml +++ b/src/dbt/kipptaf/models/google/sheets/staging/properties/stg_google_sheets__assessments__course_subject_crosswalk.yml @@ -15,8 +15,6 @@ models: data_type: boolean - name: Is_Advanced_Math data_type: boolean - - name: Exclude_from_Gradebook - data_type: boolean - name: Discipline data_type: string - name: Duplicate_Audit diff --git a/src/dbt/kipptaf/models/google/sheets/staging/properties/stg_google_sheets__state_test_comparison.yml b/src/dbt/kipptaf/models/google/sheets/staging/properties/stg_google_sheets__state_test_comparison.yml index e6bf726e26..058281df32 100644 --- a/src/dbt/kipptaf/models/google/sheets/staging/properties/stg_google_sheets__state_test_comparison.yml +++ b/src/dbt/kipptaf/models/google/sheets/staging/properties/stg_google_sheets__state_test_comparison.yml @@ -1,5 +1,7 @@ models: - name: stg_google_sheets__state_test_comparison + config: + enabled: false columns: - name: Academic_Year data_type: int64 diff --git a/src/dbt/kipptaf/models/google/sheets/staging/properties/stg_google_sheets__state_test_comparison_demographics.yml b/src/dbt/kipptaf/models/google/sheets/staging/properties/stg_google_sheets__state_test_comparison_demographics.yml index a6f5776e06..aa2e3ccdb1 100644 --- a/src/dbt/kipptaf/models/google/sheets/staging/properties/stg_google_sheets__state_test_comparison_demographics.yml +++ 
b/src/dbt/kipptaf/models/google/sheets/staging/properties/stg_google_sheets__state_test_comparison_demographics.yml @@ -19,3 +19,17 @@ models: data_type: float64 - name: Total_Students data_type: int64 + - name: season + data_type: string + - name: school_level + data_type: string + - name: grade_range_band + data_type: string + - name: discipline + data_type: string + - name: comparison_demographic_group_aligned + data_type: string + - name: comparison_demographic_subgroup_aligned + data_type: string + - name: total_proficient_students + data_type: float64 diff --git a/src/dbt/kipptaf/models/google/sheets/staging/stg_google_sheets__state_test_comparison_demographics.sql b/src/dbt/kipptaf/models/google/sheets/staging/stg_google_sheets__state_test_comparison_demographics.sql index b43f3385a3..886b597582 100644 --- a/src/dbt/kipptaf/models/google/sheets/staging/stg_google_sheets__state_test_comparison_demographics.sql +++ b/src/dbt/kipptaf/models/google/sheets/staging/stg_google_sheets__state_test_comparison_demographics.sql @@ -1,4 +1,66 @@ -select *, +select + *, + + 'Spring' as season, + + case + when comparison_demographic_subgroup = 'Grade - 08' and test_code = 'ALG01' + then 'MS' + when comparison_demographic_subgroup in ('Grade - 09', 'Grade - 10') + then 'HS' + when + test_code in ( + 'ELA09', + 'ELA10', + 'ELA11', + 'ELAGP', + 'ALG01', + 'GEO01', + 'ALG02', + 'MATGP', + 'SCI11' + ) + then 'HS' + when safe_cast(right(test_code, 2) as numeric) between 5 and 8 + then 'MS' + else 'ES' + end as school_level, + + case + when comparison_demographic_subgroup in ('Grade - 09', 'Grade - 10') + then 'HS' + when + test_code + in ('ELA09', 'ELA10', 'ELA11', 'ELAGP', 'ALG02', 'GEO01', 'MATGP', 'SCI11') + then 'HS' + else '3-8' + end as grade_range_band, + + case + when left(test_code, 3) in ('MAT', 'ALG', 'GEO') + then 'Math' + when left(test_code, 3) = 'ELA' + then 'ELA' + when left(test_code, 3) = 'SCI' + then 'Science' + when left(test_code, 3) = 'SOC' + then 'Social 
Studies' + end as discipline, + + if( + comparison_demographic_subgroup in ('Grade - 08', 'Grade - 09', 'Grade - 10'), + 'Total', + comparison_demographic_group + ) as comparison_demographic_group_aligned, + + if( + comparison_demographic_group = 'Grade', + 'All Students', + comparison_demographic_subgroup + ) as comparison_demographic_subgroup_aligned, + + round(percent_proficient * total_students, 0) as total_proficient_students, + from {{ source( diff --git a/src/dbt/kipptaf/models/iready/intermediate/int_iready__diagnostic_results.sql b/src/dbt/kipptaf/models/iready/intermediate/int_iready__diagnostic_results.sql index 28214543ef..4f7cdb77d3 100644 --- a/src/dbt/kipptaf/models/iready/intermediate/int_iready__diagnostic_results.sql +++ b/src/dbt/kipptaf/models/iready/intermediate/int_iready__diagnostic_results.sql @@ -125,6 +125,21 @@ select right(rt.code, 1) as round_number, + case + when wc.overall_relative_placement_int <= 2 + then 'Below/Far Below' + when wc.overall_relative_placement_int = 3 + then 'Approaching' + when wc.overall_relative_placement_int >= 4 + then 'At/Above' + end as iready_proficiency, + + if( + cwp.scale_low - wc.most_recent_overall_scale_score <= 0, + 0, + cwp.scale_low - wc.most_recent_overall_scale_score + ) as scale_points_to_proficiency, + round( wc.most_recent_diagnostic_gain / wc.annual_typical_growth_measure, 2 ) as progress_to_typical, @@ -133,12 +148,6 @@ select wc.most_recent_diagnostic_gain / wc.annual_stretch_growth_measure, 2 ) as progress_to_stretch, - if( - cwp.scale_low - wc.most_recent_overall_scale_score <= 0, - 0, - cwp.scale_low - wc.most_recent_overall_scale_score - ) as scale_points_to_proficiency, - row_number() over ( partition by wc._dbt_source_relation, @@ -148,6 +157,7 @@ select rt.name order by wc.completion_date desc ) as rn_subj_round, + from window_calcs as wc left join {{ ref("stg_google_sheets__reporting__terms") }} as rt diff --git 
a/src/dbt/kipptaf/models/iready/intermediate/properties/int_iready__diagnostic_results.yml b/src/dbt/kipptaf/models/iready/intermediate/properties/int_iready__diagnostic_results.yml index 6fa7d403d0..8736f1f2c1 100644 --- a/src/dbt/kipptaf/models/iready/intermediate/properties/int_iready__diagnostic_results.yml +++ b/src/dbt/kipptaf/models/iready/intermediate/properties/int_iready__diagnostic_results.yml @@ -283,6 +283,8 @@ models: data_type: int64 - name: round_number data_type: string + - name: iready_proficiency + data_type: string - name: scale_points_to_proficiency data_type: int64 - name: sublevel_number_with_typical diff --git a/src/dbt/kipptaf/models/pearson/intermediate/int_pearson__all_assessments.sql b/src/dbt/kipptaf/models/pearson/intermediate/int_pearson__all_assessments.sql index 5ca1b53944..9609e32b1a 100644 --- a/src/dbt/kipptaf/models/pearson/intermediate/int_pearson__all_assessments.sql +++ b/src/dbt/kipptaf/models/pearson/intermediate/int_pearson__all_assessments.sql @@ -61,7 +61,7 @@ select u.period, u.firstname, u.lastorsurname, - u.subject, + u.`subject`, u.testcode, u.studenttestuuid, u.test_grade, @@ -71,15 +71,25 @@ select u.twoormoreraces, u.white, + 'Actual' as results_type, + 'KTAF NJ' as district_state, + cast(u.statestudentidentifier as string) as statestudentidentifier, coalesce(u.studentwithdisabilities in ('504', 'B'), false) as is_504, coalesce(x.student_number, u.localstudentidentifier) as localstudentidentifier, - if(u.englishlearnerel = 'Y', true, false) as lep_status, - - if(u.studentwithdisabilities in ('IEP', 'B'), 'Has IEP', 'No IEP') as iep_status, + case + u.testcode + when 'SC05' + then 'SCI05' + when 'SC08' + then 'SCI08' + when 'SC11' + then 'SCI11' + else u.testcode + end as aligned_test_code, case when u.twoormoreraces = 'Y' @@ -98,6 +108,39 @@ select then 'W' end as race_ethnicity, + case + when u.`subject` like 'English Language Arts%' + then 'Text Study' + when u.`subject` in ('Algebra I', 'Algebra II', 
'Geometry') + then 'Mathematics' + else u.`subject` + end as illuminate_subject, + + case + when u.assessment_name = 'NJSLA' and u.testperformancelevel <= 2 + then 'Below/Far Below' + when u.assessment_name = 'NJSLA' and u.testperformancelevel = 3 + then 'Approaching' + when u.assessment_name = 'NJSLA' and u.testperformancelevel >= 4 + then 'At/Above' + end as njsla_aggregated_proficiency, + + if(u.englishlearnerel = 'Y', true, false) as lep_status, + + if(u.studentwithdisabilities in ('IEP', 'B'), 'Has IEP', 'No IEP') as iep_status, + + if(u.`period` = 'FallBlock', 'Fall', u.`period`) as `admin`, + + if(u.`period` = 'FallBlock', 'Fall', u.`period`) as season, + + if( + u.`subject` = 'English Language Arts/Literacy', + 'English Language Arts', + u.`subject` + ) as aligned_subject, + + if(u.is_proficient, 1, 0) as is_proficient_int, + from union_relations as u left join {{ ref("stg_google_sheets__pearson__student_crosswalk") }} as x diff --git a/src/dbt/kipptaf/models/pearson/intermediate/int_pearson__student_list_report.sql b/src/dbt/kipptaf/models/pearson/intermediate/int_pearson__student_list_report.sql new file mode 100644 index 0000000000..84d4f7a9b5 --- /dev/null +++ b/src/dbt/kipptaf/models/pearson/intermediate/int_pearson__student_list_report.sql @@ -0,0 +1,80 @@ +with + scores as ( + select + _dbt_source_relation, + academic_year, + state_student_identifier, + local_student_identifier, + last_or_surname, + first_name, + date_of_birth, + test_type, + scale_score, + performance_level, + administration, + + 'Preliminary' as results_type, + 'KTAF NJ' as district_state, + + case + when test_name like '%Mathematics%' + then 'Math' + when test_name in ('Algebra I', 'Geometry') + then 'Math' + else 'ELA' + end as discipline, + + case + when test_name like '%Mathematics%' + then 'Mathematics' + when test_name in ('Algebra I', 'Geometry') + then 'Mathematics' + else 'English Language Arts' + end as `subject`, + + case + when test_name = 'ELA Graduation Proficiency' + 
then 'ELAGP' + when test_name = 'Mathematics Graduation Proficiency' + then 'MATGP' + when test_name = 'Geometry' + then 'GEO01' + when test_name = 'Algebra I' + then 'ALG01' + when test_name like '%Mathematics%' + then concat('MAT', regexp_extract(test_name, r'.{6}(.{2})')) + when test_name like '%ELA%' + then concat('ELA', regexp_extract(test_name, r'.{6}(.{2})')) + end as test_code, + + case + when performance_level = 'Did Not Yet Meet Expectations' + then 1 + when performance_level = 'Partially Met Expectations' + then 2 + when performance_level = 'Approached Expectations' + then 3 + when performance_level = 'Met Expectations' + then 4 + when performance_level = 'Exceeded Expectations' + then 5 + when performance_level = 'Not Yet Graduation Ready' + then 1 + when performance_level = 'Graduation Ready' + then 2 + end as performance_band_level, + + from {{ ref("stg_pearson__student_list_report") }} + ) + +select + *, + + if( + performance_level + in ('Met Expectations', 'Exceeded Expectations', 'Graduation Ready'), + true, + false + ) as is_proficient, + +from scores diff --git a/src/dbt/kipptaf/models/pearson/intermediate/properties/int_pearson__all_assessments.yml b/src/dbt/kipptaf/models/pearson/intermediate/properties/int_pearson__all_assessments.yml index 2320ebb6f0..6ca2e0a49d 100644 --- a/src/dbt/kipptaf/models/pearson/intermediate/properties/int_pearson__all_assessments.yml +++ b/src/dbt/kipptaf/models/pearson/intermediate/properties/int_pearson__all_assessments.yml @@ -1,5 +1,8 @@ models: - name: int_pearson__all_assessments + config: + contract: + enabled: false columns: - name: _dbt_source_relation data_type: string @@ -63,3 +66,7 @@ models: data_type: string - name: race_ethnicity data_type: string + - name: illuminate_subject + data_type: string + - name: njsla_aggregated_proficiency + data_type: string diff --git a/src/dbt/kipptaf/models/pearson/intermediate/properties/int_pearson__student_list_report.yml 
b/src/dbt/kipptaf/models/pearson/intermediate/properties/int_pearson__student_list_report.yml new file mode 100644 index 0000000000..998dc8b224 --- /dev/null +++ b/src/dbt/kipptaf/models/pearson/intermediate/properties/int_pearson__student_list_report.yml @@ -0,0 +1,2 @@ +models: + - name: int_pearson__student_list_report diff --git a/src/dbt/kipptaf/models/powerschool/base/base_powerschool__course_enrollments.sql b/src/dbt/kipptaf/models/powerschool/base/base_powerschool__course_enrollments.sql index 2cb545678e..0f349d2caa 100644 --- a/src/dbt/kipptaf/models/powerschool/base/base_powerschool__course_enrollments.sql +++ b/src/dbt/kipptaf/models/powerschool/base/base_powerschool__course_enrollments.sql @@ -49,18 +49,20 @@ select csc.illuminate_subject_area, csc.is_foundations, csc.is_advanced_math, - csc.exclude_from_gradebook, csc.discipline, initcap(regexp_extract(ur._dbt_source_relation, r'kipp(\w+)_')) as region, if(cx.ap_course_subject is not null, true, false) as is_ap_course, + if(csc.discipline = 'SOC', 'Civics', csc.discipline) as standardized_discipline, + row_number() over ( partition by ur._dbt_source_relation, ur.cc_studyear, csc.illuminate_subject_area order by ur.cc_termid desc, ur.cc_dateenrolled desc, ur.cc_dateleft desc ) as rn_student_year_illuminate_subject_desc, + from union_relations as ur left join {{ ref("stg_powerschool__s_nj_crs_x") }} as cx diff --git a/src/dbt/kipptaf/models/powerschool/base/properties/base_powerschool__course_enrollments.yml b/src/dbt/kipptaf/models/powerschool/base/properties/base_powerschool__course_enrollments.yml index 9cb0d84f8a..3abcb56e4b 100644 --- a/src/dbt/kipptaf/models/powerschool/base/properties/base_powerschool__course_enrollments.yml +++ b/src/dbt/kipptaf/models/powerschool/base/properties/base_powerschool__course_enrollments.yml @@ -475,5 +475,7 @@ models: data_type: boolean - name: is_ap_course data_type: boolean + - name: standardized_discipline + data_type: string - name: 
rn_student_year_illuminate_subject_desc data_type: int64 diff --git a/src/dbt/kipptaf/models/students/intermediate/int_extracts__student_enrollments_courses.sql b/src/dbt/kipptaf/models/students/intermediate/int_extracts__student_enrollments_courses.sql new file mode 100644 index 0000000000..74b19fa8b8 --- /dev/null +++ b/src/dbt/kipptaf/models/students/intermediate/int_extracts__student_enrollments_courses.sql @@ -0,0 +1,32 @@ +select + e.* except ( + lastfirst, + last_name, + first_name, + middle_name, + school_abbreviation, + advisory_section_number, + student_email_google, + salesforce_contact_id, + salesforce_contact_df_has_fafsa, + salesforce_contact_college_match_display_gpa, + salesforce_contact_college_match_gpa_band, + salesforce_contact_owner_name, + state_studentnumber, + `state` + ), + + e.lastfirst as student_name, + e.last_name as student_last_name, + e.first_name as student_first_name, + e.middle_name as student_middle_name, + e.school_abbreviation as school, + e.advisory_section_number as team, + e.student_email_google as student_email, + e.salesforce_contact_id as salesforce_id, + e.salesforce_contact_df_has_fafsa as has_fafsa, + e.salesforce_contact_college_match_display_gpa as college_match_gpa, + e.salesforce_contact_college_match_gpa_band as college_match_gpa_bands, + e.salesforce_contact_owner_name as contact_owner_name, + +from {{ ref("base_powerschool__student_enrollments") }} as e diff --git a/src/dbt/kipptaf/models/students/intermediate/int_extracts__student_enrollments_subjects.sql b/src/dbt/kipptaf/models/students/intermediate/int_extracts__student_enrollments_subjects.sql index 213bb31432..5fc5c39a99 100644 --- a/src/dbt/kipptaf/models/students/intermediate/int_extracts__student_enrollments_subjects.sql +++ b/src/dbt/kipptaf/models/students/intermediate/int_extracts__student_enrollments_subjects.sql @@ -54,27 +54,13 @@ with localstudentidentifier, is_proficient, + illuminate_subject as `subject`, + njsla_aggregated_proficiency as 
njsla_proficiency, + academic_year + 1 as academic_year_plus, cast(statestudentidentifier as string) as statestudentidentifier, - case - when `subject` like 'English Language Arts%' - then 'Text Study' - when `subject` in ('Algebra I', 'Algebra II', 'Geometry') - then 'Mathematics' - else `subject` - end as `subject`, - - case - when testperformancelevel <= 2 - then 'Below/Far Below' - when testperformancelevel = 3 - then 'Approaching' - when testperformancelevel >= 4 - then 'At/Above' - end as njsla_proficiency, - from {{ ref("int_pearson__all_assessments") }} union all @@ -86,27 +72,13 @@ with is_proficient, + illuminate_subject as `subject`, + fast_aggregated_proficiency as proficiency, + academic_year + 1 as academic_year_plus, student_id as statestudentidentifier, - case - when assessment_subject like 'English Language Arts%' - then 'Text Study' - when assessment_subject in ('Algebra I', 'Algebra II', 'Geometry') - then 'Mathematics' - else assessment_subject - end as `subject`, - - case - when achievement_level_int = 1 - then 'Below/Far Below' - when achievement_level_int = 2 - then 'Approaching' - when achievement_level_int >= 3 - then 'At/Above' - end as proficiency, - from {{ ref("int_fldoe__all_assessments") }} where scale_score is not null @@ -118,18 +90,10 @@ with select student_id, `subject`, + iready_proficiency, academic_year_int + 1 as academic_year_plus, - case - when overall_relative_placement_int <= 2 - then 'Below/Far Below' - when overall_relative_placement_int = 3 - then 'Approaching' - when overall_relative_placement_int >= 4 - then 'At/Above' - end as iready_proficiency, - from {{ ref("int_iready__diagnostic_results") }} where rn_subj_round = 1 and test_round = 'EOY' ), @@ -202,6 +166,7 @@ with trim(split(specprog_name, '-')[offset(0)]) as bucket, trim(split(specprog_name, '-')[offset(1)]) as discipline, + from {{ ref("int_powerschool__spenrollments") }} where specprog_name like 'Bucket%' ), From 31eb6b18a2ac6760f01d1e729a77d108d29330be Mon 
Sep 17 00:00:00 2001 From: grangel <140853376+GabyRangelB@users.noreply.github.com> Date: Thu, 26 Mar 2026 20:55:40 +0000 Subject: [PATCH 2/9] refactor(dbt): move assessment metadata fields upstream from reporting CTE - Switch state_comps CTE to stg_google_sheets__state_test_comparison_demographics - Move results_type, admin, season, subject, test_code upstream to int models - Rename test_code to aligned_test_code in int_pearson__student_list_report - Add admin and subject aliases to int_fldoe__all_assessments - Swap stg_pearson__student_list_report ref to int_pearson__student_list_report Co-Authored-By: Claude Sonnet 4.6 (1M context) --- .../int_fldoe__all_assessments.sql | 3 + ...t_tableau__state_assessments_dashboard.sql | 118 ++++++++++-------- .../int_pearson__student_list_report.sql | 2 +- 3 files changed, 67 insertions(+), 56 deletions(-) diff --git a/src/dbt/kippmiami/models/fldoe/intermediate/int_fldoe__all_assessments.sql b/src/dbt/kippmiami/models/fldoe/intermediate/int_fldoe__all_assessments.sql index 66c830709e..4b8600bcbd 100644 --- a/src/dbt/kippmiami/models/fldoe/intermediate/int_fldoe__all_assessments.sql +++ b/src/dbt/kippmiami/models/fldoe/intermediate/int_fldoe__all_assessments.sql @@ -44,6 +44,9 @@ select 'Actual' as results_type, 'KTAF FL' as district_state, + administration_window as `admin`, + assessment_subject as `subject`, + if( assessment_name = 'science', 'Science', upper(assessment_name) ) as assessment_name, diff --git a/src/dbt/kipptaf/models/extracts/tableau/rpt_tableau__state_assessments_dashboard.sql b/src/dbt/kipptaf/models/extracts/tableau/rpt_tableau__state_assessments_dashboard.sql index 3c81db557d..799d28f6d3 100644 --- a/src/dbt/kipptaf/models/extracts/tableau/rpt_tableau__state_assessments_dashboard.sql +++ b/src/dbt/kipptaf/models/extracts/tableau/rpt_tableau__state_assessments_dashboard.sql @@ -18,16 +18,11 @@ with s.abbreviation as school, - case - when c.courses_credittype in ('ENG', 'ELA') - then 'ELA' - when 
c.courses_credittype in ('MATH', 'Math') - then 'Math' - when c.courses_credittype in ('SCI', 'Science') - then 'Science' - when c.courses_credittype = 'SOC' - then 'Civics' - end as discipline, + if( + c.courses_credittype = 'SOC' and c.region = 'Miami', + 'Civics', + c.discipline + ) as discipline, from {{ ref("base_powerschool__course_enrollments") }} as c left join @@ -37,8 +32,7 @@ with c.cc_academic_year = {{ var("current_academic_year") }} and c.rn_credittype_year = 1 and not c.is_dropped_section - and c.courses_credittype - in ('ENG', 'MATH', 'SCI', 'SOC', 'ELA', 'Math', 'Science') + and c.courses_credittype in ('ENG', 'MATH', 'SCI', 'SOC') ), schedules as ( @@ -55,16 +49,11 @@ with c.teachernumber as teachernumber_current, c.teacher_name as teacher_name_current, - case - when e.courses_credittype in ('ENG', 'ELA') - then 'ELA' - when e.courses_credittype in ('MATH', 'Math') - then 'Math' - when e.courses_credittype in ('SCI', 'Science') - then 'Science' - when e.courses_credittype = 'SOC' - then 'Civics' - end as discipline, + if( + e.courses_credittype = 'SOC' and e.region = 'Miami', + 'Civics', + e.discipline + ) as discipline, from {{ ref("base_powerschool__course_enrollments") }} as e left join @@ -75,8 +64,7 @@ with e.cc_academic_year >= {{ var("current_academic_year") - 7 }} and e.rn_credittype_year = 1 and not e.is_dropped_section - and e.courses_credittype - in ('ENG', 'MATH', 'SCI', 'SOC', 'ELA', 'Math', 'Science') + and e.courses_credittype in ('ENG', 'MATH', 'SCI', 'SOC') ), state_comps as ( @@ -107,7 +95,7 @@ with where comparison_demographic_group = 'Total' and comparison_demographic_subgroup = 'All Students' - group by academic_year, test_name, test_code, region + group by academic_year, test_name, test_code, region, season ), assessment_scores as ( @@ -128,28 +116,12 @@ with race_ethnicity, test_grade, - 'Actual' as results_type, - - if(`period` = 'FallBlock', 'Fall', `period`) as `admin`, - - if(`period` = 'FallBlock', 'Fall', `period`) as 
season, - - if( - `subject` = 'English Language Arts/Literacy', - 'English Language Arts', - `subject` - ) as `subject`, + results_type, - case - testcode - when 'SC05' - then 'SCI05' - when 'SC08' - then 'SCI08' - when 'SC11' - then 'SCI11' - else testcode - end as test_code, + `admin`, + season, + aligned_subject as `subject`, + aligned_test_code as test_code, from {{ ref("int_pearson__all_assessments") }} where @@ -181,9 +153,9 @@ with 'Actual' as results_type, - administration_window as `admin`, + `admin`, season, - assessment_subject as `subject`, + `subject`, test_code, from {{ ref("int_fldoe__all_assessments") }} @@ -198,13 +170,42 @@ with cast(state_student_identifier as string) as state_id, test_type as assessment_name, - discipline, + + case + when test_name like '%Mathematics%' + then 'Math' + when test_name in ('Algebra I', 'Geometry') + then 'Math' + else 'ELA' + end as discipline, scale_score as score, - performance_band_level, - is_proficient, - performance_level as performance_band, + case + when performance_level = 'Did Not Yet Meet Expectations' + then 1 + when performance_level = 'Partially Met Expectations' + then 2 + when performance_level = 'Approached Expectations' + then 3 + when performance_level = 'Met Expectations' + then 4 + when performance_level = 'Exceeded Expectations' + then 5 + when performance_level = 'Not Yet Graduation Ready' + then 1 + when performance_level = 'Graduation Ready' + then 2 + end as performance_band_level, + + if( + performance_level + in ('Met Expectations', 'Exceeded Expectations', 'Graduation Ready'), + true, + false + ) as is_proficient, + + performance_level as performance_band, null as lep_status, null as is_504, null as iep_status, @@ -215,10 +216,17 @@ with administration as `admin`, administration as season, - `subject`, - test_code, + case + when test_name like '%Mathematics%' + then 'Mathematics' + when test_name in ('Algebra I', 'Geometry') + then 'Mathematics' + else 'English Language Arts' + end as 
subject, + + aligned_test_code as test_code, - from {{ ref("stg_pearson__student_list_report") }} + from {{ ref("int_pearson__student_list_report") }} where state_student_identifier is not null and administration = 'Spring' diff --git a/src/dbt/kipptaf/models/pearson/intermediate/int_pearson__student_list_report.sql b/src/dbt/kipptaf/models/pearson/intermediate/int_pearson__student_list_report.sql index 84d4f7a9b5..3395cf96e2 100644 --- a/src/dbt/kipptaf/models/pearson/intermediate/int_pearson__student_list_report.sql +++ b/src/dbt/kipptaf/models/pearson/intermediate/int_pearson__student_list_report.sql @@ -45,7 +45,7 @@ with then concat('MAT', regexp_extract(test_name, r'.{6}(.{2})')) when test_name like '%ELA%' then concat('ELA', regexp_extract(test_name, r'.{6}(.{2})')) - end as test_code, + end as aligned_test_code, case when performance_level = 'Did Not Yet Meet Expectations' From bf788286ffbd65a0c2fb4de1a616937a733ad4c4 Mon Sep 17 00:00:00 2001 From: grangel <140853376+GabyRangelB@users.noreply.github.com> Date: Mon, 30 Mar 2026 19:11:25 +0000 Subject: [PATCH 3/9] refactor(dbt): optimize demographic comps pipeline and fix region match flags Replace GROUP BY CUBE (1,024 combos) with explicit GROUPING SETS (12 combos) for ~85x reduction in computed groups. Consolidate the demographic comps intermediate chain from 2 models + macro into 1 model. Push demographic labels, comparison_entity, and test_code-derived columns upstream into the intermediate to simplify the reporting layer. Fix self-join bug that made region_matched/ region_outperformed flags dead columns. Add uniqueness tests to stg, int, and rpt models. 
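As a rough check of the grouping-set counts quoted above, the sketch below enumerates both strategies. It assumes the ten dimensions listed in the removed `_cubed` model and the base/focus split used by the rewritten intermediate model; the helper names `cube_sets` and `explicit_sets` are hypothetical and do not exist in the repository.

```python
from itertools import combinations

# Dimension lists copied from the models touched by this patch.
base_dims = ["academic_year", "district_state", "assessment_name", "test_code"]
focus_dims = ["gender", "aggregate_ethnicity", "lunch_status", "ml_status", "iep_status"]
all_dims = base_dims + ["region"] + focus_dims


def cube_sets(dims):
    """Every subset of dims, which is what GROUP BY CUBE expands to."""
    return [
        frozenset(combo)
        for size in range(len(dims) + 1)
        for combo in combinations(dims, size)
    ]


def explicit_sets(base, focus):
    """Totals plus one focus dimension at a time, each with and without region."""
    sets = [frozenset(base + ["region"]), frozenset(base)]
    for dim in focus:
        sets.append(frozenset(base + ["region", dim]))
        sets.append(frozenset(base + [dim]))
    return sets


cube = cube_sets(all_dims)
explicit = explicit_sets(base_dims, focus_dims)

print(len(cube))                            # 1024 subsets for 10 dimensions
print(len(explicit))                        # 12 explicit grouping sets
print(round(len(cube) / len(explicit), 1))  # 85.3
```

Every explicit set is also one of the CUBE subsets, so the rewrite should compute a strict subset of the earlier groupings rather than new ones.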
Co-Authored-By: Claude Opus 4.6 (1M context) --- .../kipptaf/macros/generate_cube_query.sql | 117 ---- ...u__state_assessments_demographic_comps.sql | 647 ++++++++++-------- ...te_assessments_demographic_comps_cubed.sql | 34 - ...u__state_assessments_demographic_comps.yml | 44 +- ...te_assessments_demographic_comps_cubed.yml | 57 -- ...eau__state_assessments_dashboard_comps.yml | 14 + ...eau__state_assessments_dashboard_comps.sql | 420 ++++-------- .../models/google/sheets/sources-external.yml | 2 +- ...ts__state_test_comparison_demographics.yml | 98 ++- ...ts__state_test_comparison_demographics.sql | 46 -- .../marts/dim_state_assessment_benchmarks.sql | 45 +- .../dim_state_assessment_benchmarks.yml | 66 +- 12 files changed, 749 insertions(+), 841 deletions(-) delete mode 100644 src/dbt/kipptaf/macros/generate_cube_query.sql delete mode 100644 src/dbt/kipptaf/models/extracts/tableau/intermediate/int_tableau__state_assessments_demographic_comps_cubed.sql delete mode 100644 src/dbt/kipptaf/models/extracts/tableau/intermediate/properties/int_tableau__state_assessments_demographic_comps_cubed.yml diff --git a/src/dbt/kipptaf/macros/generate_cube_query.sql b/src/dbt/kipptaf/macros/generate_cube_query.sql deleted file mode 100644 index afd809cf18..0000000000 --- a/src/dbt/kipptaf/macros/generate_cube_query.sql +++ /dev/null @@ -1,117 +0,0 @@ -{% macro generate_cube_query( - dimensions, - metrics, - source_relation, - include_row_number=False, - focus_group=False, - focus_dims=[] -) %} - - select - - -- Dynamic dimensions - {% for dim in dimensions %} {{ dim }},{% endfor %} - - -- Dynamic metrics - {% for metric in metrics %} {{ metric }},{% endfor %} - - -- GROUPING() flags - {% for dim in dimensions %} - grouping({{ dim }}) as is_{{ dim }}_total, - {% endfor %} - - -- grouping_level using POW + GROUPING() - ( - {% for dim in dimensions %} - grouping({{ dim }}) * pow(2, {{ loop.index0 }}) - {% if not loop.last %} + {% endif %} - {% endfor %} - ) as grouping_level, - - -- 
focus_level label for active focus dimension - case - {% for dim in focus_dims %} - when - grouping({{ dim }}) = 0 - and {% for other in focus_dims if other != dim %} - grouping({{ other }}) = 1{% if not loop.last %} and {% endif %} - {% endfor %} - then '{{ dim }}' - {% endfor %} - when - {% for dim in focus_dims %} - grouping({{ dim }}) = 1{% if not loop.last %} and {% endif %} - {% endfor %} - then 'all_null' - else 'multi' - end as focus_level, - - -- total_type: hierarchical label with grouping_level - case - when - ( - {% for dim in dimensions %} - grouping({{ dim }}) * pow(2, {{ loop.index0 }}) - {% if not loop.last %} + {% endif %} - {% endfor %} - ) - = 0 - then 'Level 0: Detail' - - when - ( - {% for dim in dimensions %} - grouping({{ dim }}) * pow(2, {{ loop.index0 }}) - {% if not loop.last %} + {% endif %} - {% endfor %} - ) - = {{ 2 ** (dimensions | length) - 1 }} - then 'Level {{ 2 ** (dimensions | length) - 1 }}: Grand Total' - - else - concat( - 'Level ', - cast( - ( - {% for dim in dimensions %} - grouping({{ dim }}) * pow(2, {{ loop.index0 }}) - {% if not loop.last %} + {% endif %} - {% endfor %} - ) as string - ), - ': Subtotal – ', - array_to_string( - [ - {% for dim in dimensions %} - if(grouping({{ dim }}) = 1, '{{ dim }}', null) - {% if not loop.last %}, {% endif %} - {% endfor %} - ], - ', ' - ) - ) - end as total_type - - from {{ source_relation }} - - group by cube ({{ dimensions | join(", ") }}) - - {% if focus_group and focus_dims %} - having - ( - ( - {% for dim in focus_dims %} - cast(grouping({{ dim }}) = 0 as int64) - {% if not loop.last %} + {% endif %} - {% endfor %} - ) - = 1 - or ( - {% for dim in focus_dims %} - grouping({{ dim }}) = 1{% if not loop.last %} and {% endif %} - {% endfor %} - ) - ) - {% endif %} - -{% endmacro %} diff --git a/src/dbt/kipptaf/models/extracts/tableau/intermediate/int_tableau__state_assessments_demographic_comps.sql 
b/src/dbt/kipptaf/models/extracts/tableau/intermediate/int_tableau__state_assessments_demographic_comps.sql index 179a0cff7a..fdee964ab6 100644 --- a/src/dbt/kipptaf/models/extracts/tableau/intermediate/int_tableau__state_assessments_demographic_comps.sql +++ b/src/dbt/kipptaf/models/extracts/tableau/intermediate/int_tableau__state_assessments_demographic_comps.sql @@ -1,277 +1,370 @@ -with - assessment_scores as ( - select - _dbt_source_relation, - academic_year, - localstudentidentifier, - statestudentidentifier as state_id, - assessment_name, - is_proficient, - - results_type, - district_state, - aligned_test_code as test_code, - - case - when race_ethnicity = 'B' - then 'African American' - when race_ethnicity = 'A' - then 'Asian' - when race_ethnicity = 'I' - then 'American Indian' - when race_ethnicity = 'H' - then 'Hispanic' - when race_ethnicity = 'P' - then 'Native Hawaiian' - when race_ethnicity = 'T' - then 'Other' - when race_ethnicity = 'W' - then 'White' - when race_ethnicity is null - then 'Blank' - end as aggregate_ethnicity, - - if(lep_status, 'ML', 'Not ML') as ml_status, - - if( - iep_status = 'Has IEP', - 'Students With Disabilities', - 'Students Without Disabilities' - ) as iep_status, - - from {{ ref("int_pearson__all_assessments") }} - where - testscalescore is not null and `period` = 'Spring' and academic_year >= 2018 - - union all - - select - _dbt_source_relation, - academic_year, - - null as localstudentidentifier, - - student_id as state_id, - assessment_name, - is_proficient, - - 'Actual' as results_type, - 'KTAF FL' as district_state, - - test_code, - - null as aggregate_ethnicity, - null as ml_status, - null as iep_status, - - from {{ ref("int_fldoe__all_assessments") }} - where scale_score is not null and season = 'Spring' - - union all - - select - _dbt_source_relation, - academic_year, - - local_student_identifier as localstudentidentifier, - - cast(state_student_identifier as string) as state_id, - - test_type as assessment_name, 
- is_proficient, - results_type, - district_state, - test_code, - - null as aggregate_ethnicity, - null as ml_status, - null as iep_status, - - from {{ ref("int_pearson__student_list_report") }} - where - state_student_identifier is not null - and administration = 'Spring' - and test_type = 'NJSLA' - and academic_year = {{ var("current_academic_year") }} - ) - -/* NJ scores */ -select - e.academic_year, - e.region, - e.student_number, - - a.district_state, - a.assessment_name, - a.aggregate_ethnicity, - a.ml_status, - a.iep_status, - a.is_proficient_int, - - if( - e.lunch_status in ('F', 'R'), - 'Economically Disadvantaged', - 'Non Economically Disadvantaged' - ) as lunch_status, - - case - when a.test_code = 'ALG01' - then concat(a.test_code, '_', e.school_level) - else a.test_code - end as test_code, - - case - e.gender when 'F' then 'Female' when 'M' then 'Male' when 'X' then 'Non-Binary' - end as gender, - -from {{ ref("int_extracts__student_enrollments") }} as e -inner join - assessment_scores as a - on e.academic_year = a.academic_year - and e.pearson_local_student_identifier = a.localstudentidentifier - and {{ union_dataset_join_clause(left_alias="e", right_alias="a") }} - and a.results_type = 'Actual' -where - e.rn_year = 1 - and e.academic_year >= {{ var("current_academic_year") - 7 }} - and e.grade_level > 2 - and e.school_level != 'OD' - -union all - -/* FL scores */ -select - e.academic_year, - e.region, - e.student_number, - - a.district_state, - a.assessment_name, - - case - when e.race_ethnicity = 'B' - then 'African American' - when e.race_ethnicity = 'A' - then 'Asian' - when e.race_ethnicity = 'I' - then 'American Indian' - when e.race_ethnicity = 'H' - then 'Hispanic' - when e.race_ethnicity = 'P' - then 'Native Hawaiian' - when e.race_ethnicity = 'T' - then 'Other' - when e.race_ethnicity = 'W' - then 'White' - when e.race_ethnicity is null - then 'Blank' - end as aggregate_ethnicity, - - e.ml_status, - - if( - e.iep_status = 'Has IEP', - 
'Students With Disabilities', - 'Students Without Disabilities' - ) as iep_status, - - if( - e.lunch_status in ('F', 'R'), - 'Economically Disadvantaged', - 'Non Economically Disadvantaged' - ) as lunch_status, - - if(a.is_proficient, 1, 0) as is_proficient_int, - - case - when a.test_code = 'ALG01' - then concat(a.test_code, '_', e.school_level) - else a.test_code - end as test_code, - - case - e.gender when 'F' then 'Female' when 'M' then 'Male' when 'X' then 'Non-Binary' - end as gender, - -from {{ ref("int_extracts__student_enrollments") }} as e -inner join - assessment_scores as a - on e.academic_year = a.academic_year - and e.state_studentnumber = a.state_id - and {{ union_dataset_join_clause(left_alias="e", right_alias="a") }} - and a.results_type = 'Actual' -where - e.region = 'Miami' - and e.rn_year = 1 - and e.academic_year >= {{ var("current_academic_year") - 7 }} - and e.grade_level > 2 - - -- union all - /* NJ prelim scores */ - /* disabled until december -select - e.academic_year, - e.region, - e.student_number, - - a.district_state, - a.assessment_name, - - case - when e.race_ethnicity = 'B' - then 'African American' - when e.race_ethnicity = 'A' - then 'Asian' - when e.race_ethnicity = 'I' - then 'American Indian' - when e.race_ethnicity = 'H' - then 'Hispanic' - when e.race_ethnicity = 'P' - then 'Native Hawaiian' - when e.race_ethnicity = 'T' - then 'Other' - when e.race_ethnicity = 'W' - then 'White' - when e.race_ethnicity is null - then 'Blank' - end as aggregate_ethnicity, - - e.ml_status, - - if( - e.iep_status = 'Has IEP', - 'Students With Disabilities', - 'Students Without Disabilities' - ) as iep_status, - - if( - e.lunch_status in ('F', 'R'), - 'Economically Disadvantaged', - 'Non Economically Disadvantaged' - ) as lunch_status, - - if(a.is_proficient, 1, 0) as is_proficient_int, - - case - when a.test_code = 'ALG01' - then concat(a.test_code, '_', e.school_level) - else a.test_code - end as test_code, - - case - e.gender when 'F' then 
'Female' when 'M' then 'Male' when 'X' then 'Non-Binary' - end as gender, - -from {{ ref("int_extracts__student_enrollments") }} as e -inner join - assessment_scores as a - on e.academic_year = a.academic_year - and e.state_studentnumber = a.state_id - and {{ union_dataset_join_clause(left_alias="e", right_alias="a") }} - and a.results_type = 'Preliminary' -where - e.academic_year = {{ var("current_academic_year") }} - and e.rn_year = 1 - and e.grade_level > 2 - and e.school_level != 'OD' -*/ +{# + Student-level assessment scores joined to enrollment demographics, + then aggregated via GROUPING SETS into demographic comparison rows. + + Each grouping set produces one demographic focus at a time (or a total), + crossed with region present-or-rolled-up — 12 sets total. +#} +{% set base_dims = [ + "academic_year", + "district_state", + "assessment_name", + "test_code", +] %} + +{% set focus_dims = [ + "gender", + "aggregate_ethnicity", + "lunch_status", + "ml_status", + "iep_status", +] %} + +with + assessment_scores as ( + select + _dbt_source_relation, + academic_year, + localstudentidentifier, + statestudentidentifier as state_id, + assessment_name, + is_proficient, + is_proficient_int, + + results_type, + district_state, + aligned_test_code as test_code, + + case + when race_ethnicity = 'B' + then 'African American' + when race_ethnicity = 'A' + then 'Asian' + when race_ethnicity = 'I' + then 'American Indian' + when race_ethnicity = 'H' + then 'Hispanic' + when race_ethnicity = 'P' + then 'Native Hawaiian' + when race_ethnicity = 'T' + then 'Other' + when race_ethnicity = 'W' + then 'White' + when race_ethnicity is null + then 'Blank' + end as aggregate_ethnicity, + + if(lep_status, 'ML', 'Not ML') as ml_status, + + if( + iep_status = 'Has IEP', + 'Students With Disabilities', + 'Students Without Disabilities' + ) as iep_status, + + from {{ ref("int_pearson__all_assessments") }} + where + testscalescore is not null and `period` = 'Spring' and academic_year >= 
2018 + + union all + + select + _dbt_source_relation, + academic_year, + + null as localstudentidentifier, + + student_id as state_id, + assessment_name, + is_proficient, + if(is_proficient, 1, 0) as is_proficient_int, + + 'Actual' as results_type, + 'KTAF FL' as district_state, + + test_code, + + null as aggregate_ethnicity, + null as ml_status, + null as iep_status, + + from {{ ref("int_fldoe__all_assessments") }} + where scale_score is not null and season = 'Spring' + + union all + + select + _dbt_source_relation, + academic_year, + + local_student_identifier as localstudentidentifier, + + cast(state_student_identifier as string) as state_id, + + test_type as assessment_name, + is_proficient, + if(is_proficient, 1, 0) as is_proficient_int, + results_type, + district_state, + aligned_test_code as test_code, + + null as aggregate_ethnicity, + null as ml_status, + null as iep_status, + + from {{ ref("int_pearson__student_list_report") }} + where + state_student_identifier is not null + and administration = 'Spring' + and test_type = 'NJSLA' + and academic_year = {{ var("current_academic_year") }} + ), + + /* NJ scores */ + nj_scores as ( + select + e.academic_year, + e.region, + e.student_number, + + a.district_state, + a.assessment_name, + a.aggregate_ethnicity, + a.ml_status, + a.iep_status, + a.is_proficient_int, + + if( + e.lunch_status in ('F', 'R'), + 'Economically Disadvantaged', + 'Non Economically Disadvantaged' + ) as lunch_status, + + case + when a.test_code = 'ALG01' + then concat(a.test_code, '_', e.school_level) + else a.test_code + end as test_code, + + case + e.gender + when 'F' + then 'Female' + when 'M' + then 'Male' + when 'X' + then 'Non-Binary' + end as gender, + + from {{ ref("int_extracts__student_enrollments") }} as e + inner join + assessment_scores as a + on e.academic_year = a.academic_year + and e.pearson_local_student_identifier = a.localstudentidentifier + and {{ union_dataset_join_clause(left_alias="e", right_alias="a") }} + and 
a.results_type = 'Actual' + where + e.rn_year = 1 + and e.academic_year >= {{ var("current_academic_year") - 7 }} + and e.grade_level > 2 + and e.school_level != 'OD' + ), + + /* FL scores */ + fl_scores as ( + select + e.academic_year, + e.region, + e.student_number, + + a.district_state, + a.assessment_name, + + case + when e.race_ethnicity = 'B' + then 'African American' + when e.race_ethnicity = 'A' + then 'Asian' + when e.race_ethnicity = 'I' + then 'American Indian' + when e.race_ethnicity = 'H' + then 'Hispanic' + when e.race_ethnicity = 'P' + then 'Native Hawaiian' + when e.race_ethnicity = 'T' + then 'Other' + when e.race_ethnicity = 'W' + then 'White' + when e.race_ethnicity is null + then 'Blank' + end as aggregate_ethnicity, + + e.ml_status, + + if( + e.iep_status = 'Has IEP', + 'Students With Disabilities', + 'Students Without Disabilities' + ) as iep_status, + + a.is_proficient_int, + + if( + e.lunch_status in ('F', 'R'), + 'Economically Disadvantaged', + 'Non Economically Disadvantaged' + ) as lunch_status, + + case + when a.test_code = 'ALG01' + then concat(a.test_code, '_', e.school_level) + else a.test_code + end as test_code, + + case + e.gender + when 'F' + then 'Female' + when 'M' + then 'Male' + when 'X' + then 'Non-Binary' + end as gender, + + from {{ ref("int_extracts__student_enrollments") }} as e + inner join + assessment_scores as a + on e.academic_year = a.academic_year + and e.state_studentnumber = a.state_id + and {{ union_dataset_join_clause(left_alias="e", right_alias="a") }} + and a.results_type = 'Actual' + where + e.region = 'Miami' + and e.rn_year = 1 + and e.academic_year >= {{ var("current_academic_year") - 7 }} + and e.grade_level > 2 + ), + + demographic_comps as ( + select * + from nj_scores + union all + select * + from fl_scores + ) + +select + academic_year, + district_state, + region, + assessment_name, + test_code, + + round( + avg(is_proficient_int) * count(student_number), 0 + ) as total_proficient_students, + 
count(student_number) as total_students, + avg(is_proficient_int) as percent_proficient, + + /* (a) focus_level + demographic labels */ + case + {% for dim in focus_dims %} + when grouping({{ dim }}) = 0 then '{{ dim }}' + {% endfor %} + else 'all_null' + end as focus_level, + + case + when + {% for dim in focus_dims %} + grouping({{ dim }}) = 1{% if not loop.last %} and {% endif %} + {% endfor %} + then 'Total' + when + grouping(ml_status) = 0 + or grouping(iep_status) = 0 + or grouping(lunch_status) = 0 + then 'Subgroup' + when grouping(gender) = 0 + then 'Gender' + when grouping(aggregate_ethnicity) = 0 + then 'Aggregate Ethnicity' + end as comparison_demographic_group, + + case + when + {% for dim in focus_dims %} + grouping({{ dim }}) = 1{% if not loop.last %} and {% endif %} + {% endfor %} + then 'All Students' + else coalesce(gender, aggregate_ethnicity, lunch_status, ml_status, iep_status) + end as comparison_demographic_subgroup, + + /* (b) comparison_entity from region null-ness */ + if(grouping(region) = 1, district_state, 'Region') as comparison_entity, + + /* (c) test_code-derived columns */ + case + when + test_code in ( + 'ELA09', + 'ELA10', + 'ELA11', + 'ELAGP', + 'ALG01_HS', + 'GEO01', + 'ALG02', + 'MATGP', + 'SCI11' + ) + then 'HS' + when test_code = 'ALG01_MS' + then 'MS' + when safe_cast(right(test_code, 2) as numeric) between 5 and 8 + then 'MS' + else 'ES' + end as school_level, + + case + when + test_code in ( + 'ELA09', + 'ELA10', + 'ELA11', + 'ELAGP', + 'ALG01_HS', + 'GEO01', + 'ALG02', + 'MATGP', + 'SCI11' + ) + then 'HS' + else '3-8' + end as grade_range_band, + + case + when left(test_code, 3) in ('MAT', 'ALG', 'GEO') + then 'Math' + when left(test_code, 3) = 'ELA' + then 'ELA' + when left(test_code, 3) = 'SCI' + then 'Science' + when left(test_code, 3) = 'SOC' + then 'Social Studies' + end as discipline, + +from demographic_comps + +group by + grouping sets ( + {# Total (all focus dims rolled up) — with and without region #} + ({{ 
base_dims | join(", ") }}, region), + ({{ base_dims | join(", ") }}), + + {# One focus dim active at a time — with and without region #} + {% for dim in focus_dims %} + ({{ base_dims | join(", ") }}, region, {{ dim }}), + ({{ base_dims | join(", ") }}, {{ dim }}) + {% if not loop.last %},{% endif %} + {% endfor %} + ) diff --git a/src/dbt/kipptaf/models/extracts/tableau/intermediate/int_tableau__state_assessments_demographic_comps_cubed.sql b/src/dbt/kipptaf/models/extracts/tableau/intermediate/int_tableau__state_assessments_demographic_comps_cubed.sql deleted file mode 100644 index dea149cf8f..0000000000 --- a/src/dbt/kipptaf/models/extracts/tableau/intermediate/int_tableau__state_assessments_demographic_comps_cubed.sql +++ /dev/null @@ -1,34 +0,0 @@ -{% set dims = [ - "academic_year", - "district_state", - "region", - "assessment_name", - "test_code", - "gender", - "aggregate_ethnicity", - "lunch_status", - "ml_status", - "iep_status", -] %} - -{% set aggs = [ - "ROUND(AVG(is_proficient_int) * COUNT(student_number), 0) AS total_proficient_students", - "COUNT(student_number) AS total_students", - "AVG(is_proficient_int) AS percent_proficient", -] %} - -{{ - generate_cube_query( - dims, - aggs, - ref("int_tableau__state_assessments_demographic_comps"), - focus_group=True, - focus_dims=[ - "gender", - "aggregate_ethnicity", - "lunch_status", - "ml_status", - "iep_status", - ], - ) -}} diff --git a/src/dbt/kipptaf/models/extracts/tableau/intermediate/properties/int_tableau__state_assessments_demographic_comps.yml b/src/dbt/kipptaf/models/extracts/tableau/intermediate/properties/int_tableau__state_assessments_demographic_comps.yml index 116af090df..ac0eec342a 100644 --- a/src/dbt/kipptaf/models/extracts/tableau/intermediate/properties/int_tableau__state_assessments_demographic_comps.yml +++ b/src/dbt/kipptaf/models/extracts/tableau/intermediate/properties/int_tableau__state_assessments_demographic_comps.yml @@ -3,25 +3,45 @@ models: columns: - name: academic_year 
data_type: int64 - - name: region - data_type: string - - name: student_number - data_type: int64 - name: district_state data_type: string + - name: region + data_type: string - name: assessment_name data_type: string - - name: aggregate_ethnicity + - name: test_code data_type: string - - name: ml_status + - name: total_proficient_students + data_type: float64 + - name: total_students + data_type: int64 + - name: percent_proficient + data_type: float64 + - name: focus_level data_type: string - - name: iep_status + - name: comparison_demographic_group data_type: string - - name: lunch_status + - name: comparison_demographic_subgroup data_type: string - - name: is_proficient_int - data_type: int64 - - name: test_code + - name: comparison_entity + data_type: string + - name: school_level + data_type: string + - name: grade_range_band data_type: string - - name: gender + - name: discipline data_type: string + data_tests: + - dbt_utils.unique_combination_of_columns: + arguments: + combination_of_columns: + - academic_year + - district_state + - region + - assessment_name + - test_code + - comparison_entity + - comparison_demographic_group + - comparison_demographic_subgroup + config: + store_failures: true diff --git a/src/dbt/kipptaf/models/extracts/tableau/intermediate/properties/int_tableau__state_assessments_demographic_comps_cubed.yml b/src/dbt/kipptaf/models/extracts/tableau/intermediate/properties/int_tableau__state_assessments_demographic_comps_cubed.yml deleted file mode 100644 index caa8bab463..0000000000 --- a/src/dbt/kipptaf/models/extracts/tableau/intermediate/properties/int_tableau__state_assessments_demographic_comps_cubed.yml +++ /dev/null @@ -1,57 +0,0 @@ -models: - - name: int_tableau__state_assessments_demographic_comps_cubed - # config: - # materialized: table - columns: - - name: academic_year - data_type: int64 - - name: district_state - data_type: string - - name: region - data_type: string - - name: assessment_name - data_type: string - - name: 
test_code - data_type: string - - name: gender - data_type: string - - name: aggregate_ethnicity - data_type: string - - name: lunch_status - data_type: string - - name: ml_status - data_type: string - - name: iep_status - data_type: string - - name: total_proficient_students - data_type: float64 - - name: total_students - data_type: int64 - - name: percent_proficient - data_type: float64 - - name: is_academic_year_total - data_type: int64 - - name: is_district_state_total - data_type: int64 - - name: is_region_total - data_type: int64 - - name: is_assessment_name_total - data_type: int64 - - name: is_test_code_total - data_type: int64 - - name: is_gender_total - data_type: int64 - - name: is_aggregate_ethnicity_total - data_type: int64 - - name: is_lunch_status_total - data_type: int64 - - name: is_ml_status_total - data_type: int64 - - name: is_iep_status_total - data_type: int64 - - name: grouping_level - data_type: float64 - - name: focus_level - data_type: string - - name: total_type - data_type: string diff --git a/src/dbt/kipptaf/models/extracts/tableau/properties/rpt_tableau__state_assessments_dashboard_comps.yml b/src/dbt/kipptaf/models/extracts/tableau/properties/rpt_tableau__state_assessments_dashboard_comps.yml index e7695e561b..523673ff99 100644 --- a/src/dbt/kipptaf/models/extracts/tableau/properties/rpt_tableau__state_assessments_dashboard_comps.yml +++ b/src/dbt/kipptaf/models/extracts/tableau/properties/rpt_tableau__state_assessments_dashboard_comps.yml @@ -33,3 +33,17 @@ models: data_type: boolean - name: region_matched_or_outperformed data_type: boolean + data_tests: + - dbt_utils.unique_combination_of_columns: + arguments: + combination_of_columns: + - academic_year + - school_level + - assessment_name + - test_code + - region + - comparison_entity + - comparison_demographic_group + - comparison_demographic_subgroup + config: + store_failures: true diff --git 
a/src/dbt/kipptaf/models/extracts/tableau/rpt_tableau__state_assessments_dashboard_comps.sql b/src/dbt/kipptaf/models/extracts/tableau/rpt_tableau__state_assessments_dashboard_comps.sql index 74fa712723..beb1d99419 100644 --- a/src/dbt/kipptaf/models/extracts/tableau/rpt_tableau__state_assessments_dashboard_comps.sql +++ b/src/dbt/kipptaf/models/extracts/tableau/rpt_tableau__state_assessments_dashboard_comps.sql @@ -1,271 +1,149 @@ -with - -- trunk-ignore(sqlfluff/ST03) - ktaf as ( - select - b.academic_year, - b.assessment_name, - b.test_code, - b.total_proficient_students, - b.total_students, - b.percent_proficient, - b.focus_level, - - if(b.region is null, regions, b.region) as region, - - case - when b.focus_level = 'all_null' - then 'Total' - when b.focus_level in ('ml_status', 'iep_status', 'lunch_status') - then 'Subgroup' - else initcap(regexp_replace(b.focus_level, r'_', ' ')) - end as comparison_demographic_group, - - case - when b.focus_level = 'all_null' - then 'All Students' - else - coalesce( - b.gender, - b.aggregate_ethnicity, - b.lunch_status, - b.ml_status, - b.iep_status - ) - end as comparison_demographic_subgroup, - - if(b.region is null, b.district_state, 'Region') as comparison_entity, - - from {{ ref("int_tableau__state_assessments_demographic_comps_cubed") }} as b - cross join unnest(['Camden', 'Newark']) as regions - where - b.academic_year is not null - and b.assessment_name is not null - and b.test_code is not null - and b.district_state = 'KTAF NJ' - - union all - - select - academic_year, - assessment_name, - test_code, - total_proficient_students, - total_students, - percent_proficient, - focus_level, - - coalesce(region, 'Miami') as region, - - case - when focus_level = 'all_null' - then 'Total' - when focus_level in ('ml_status', 'iep_status', 'lunch_status') - then 'Subgroup' - else initcap(regexp_replace(focus_level, r'_', ' ')) - end as comparison_demographic_group, - - case - when focus_level = 'all_null' - then 'All Students' - 
else - coalesce( - gender, aggregate_ethnicity, lunch_status, ml_status, iep_status - ) - end as comparison_demographic_subgroup, - - if(region is null, district_state, 'Region') as comparison_entity, - - from {{ ref("int_tableau__state_assessments_demographic_comps_cubed") }} - where - academic_year is not null - and assessment_name is not null - and test_code is not null - and district_state = 'KTAF FL' - ), - - -- deduping here because of how group by cube generates rows - dedup_ktaf as ( - {{ - dbt_utils.deduplicate( - relation="ktaf", - partition_by="academic_year, region, comparison_entity,comparison_demographic_group,comparison_demographic_subgroup,focus_level,assessment_name,test_code", - order_by="academic_year", - ) - }} - ), - - appended as ( - select - academic_year, - assessment_name, - region, - comparison_entity, - focus_level, - - total_students, - percent_proficient, - - comparison_demographic_group, - comparison_demographic_subgroup, - - total_proficient_students, - - if(test_code like 'ALG01%', 'ALG01', test_code) as test_code, - - case - when - test_code in ( - 'ELA09', - 'ELA10', - 'ELA11', - 'ELAGP', - 'ALG01_HS', - 'GEO01', - 'ALG02', - 'MATGP', - 'SCI11' - ) - then 'HS' - when test_code = 'ALG01_MS' - then 'MS' - when safe_cast(right(test_code, 2) as numeric) between 5 and 8 - then 'MS' - else 'ES' - end as school_level, - - case - when - test_code in ( - 'ELA09', - 'ELA10', - 'ELA11', - 'ELAGP', - 'ALG01_HS', - 'GEO01', - 'ALG02', - 'MATGP', - 'SCI11' - ) - then 'HS' - else '3-8' - end as grade_range_band, - - case - when left(test_code, 3) in ('MAT', 'ALG', 'GEO') - then 'Math' - when left(test_code, 3) = 'ELA' - then 'ELA' - when left(test_code, 3) = 'SCI' - then 'Science' - when left(test_code, 3) = 'SOC' - then 'Social Studies' - end as discipline, - - from dedup_ktaf - where - comparison_demographic_subgroup - not in ('Not ML', 'Students Without Disabilities', 'Non-Binary') - - union all - - select - academic_year, - test_name as 
assessment_name, - region, - comparison_entity, - null as focus_level, - - total_students, - percent_proficient, - - comparison_demographic_group_aligned as comparison_demographic_group, - comparison_demographic_subgroup_aligned as comparison_demographic_subgroup, - - total_proficient_students, - test_code, - school_level, - grade_range_band, - discipline, - - from {{ ref("stg_google_sheets__state_test_comparison_demographics") }} - where comparison_demographic_subgroup != 'SE Accommodation' - ), - - grouped_comps as ( - select - academic_year, - school_level, - grade_range_band, - assessment_name, - discipline, - test_code, - region, - comparison_entity, - comparison_demographic_group, - comparison_demographic_subgroup, - focus_level, - - sum(total_proficient_students) as total_proficient_students, - - sum(total_students) as total_students, - - safe_divide( - sum(total_proficient_students), sum(total_students) - ) as percent_proficient, - - from appended - where comparison_demographic_subgroup != 'Blank' - group by - academic_year, - school_level, - grade_range_band, - assessment_name, - discipline, - test_code, - region, - comparison_entity, - comparison_demographic_group, - comparison_demographic_subgroup, - focus_level - ) - -select - a.academic_year, - a.school_level, - a.grade_range_band, - a.assessment_name, - a.discipline, - a.test_code, - a.region, - a.comparison_entity, - a.comparison_demographic_group, - a.comparison_demographic_subgroup, - a.total_proficient_students, - a.total_students, - a.percent_proficient, - - if(b.percent_proficient = a.percent_proficient, true, false) as region_matched, - if(b.percent_proficient > a.percent_proficient, true, false) as region_outperformed, - - if( - b.percent_proficient = a.percent_proficient - or b.percent_proficient > a.percent_proficient, - true, - false - ) as region_matched_or_outperformed, - -from grouped_comps as a -left join - grouped_comps as b - on a.academic_year = b.academic_year - and a.academic_year 
= b.academic_year - and a.school_level = b.school_level - and a.grade_range_band = b.grade_range_band - and a.assessment_name = b.assessment_name - and a.discipline = b.discipline - and a.test_code = b.test_code - and a.region = b.region - and a.comparison_entity = b.comparison_entity - and a.comparison_demographic_group = b.comparison_demographic_group - and a.comparison_demographic_subgroup = b.comparison_demographic_subgroup - and b.comparison_entity = 'Region' +with + appended as ( + /* NJ: cross join fans out district-level rows to Camden + Newark */ + select + academic_year, + assessment_name, + comparison_entity, + comparison_demographic_group, + comparison_demographic_subgroup, + focus_level, + school_level, + grade_range_band, + discipline, + total_proficient_students, + total_students, + percent_proficient, + + if(test_code like 'ALG01%', 'ALG01', test_code) as test_code, + if(region is null, regions, region) as region, + + from {{ ref("int_tableau__state_assessments_demographic_comps") }} + cross join unnest(['Camden', 'Newark']) as regions + where + district_state = 'KTAF NJ' + and (region is null or region = regions) + and comparison_demographic_subgroup + not in ('Not ML', 'Students Without Disabilities', 'Non-Binary', 'Blank') + + union all + + /* FL */ + select + academic_year, + assessment_name, + comparison_entity, + comparison_demographic_group, + comparison_demographic_subgroup, + focus_level, + school_level, + grade_range_band, + discipline, + total_proficient_students, + total_students, + percent_proficient, + + if(test_code like 'ALG01%', 'ALG01', test_code) as test_code, + coalesce(region, 'Miami') as region, + + from {{ ref("int_tableau__state_assessments_demographic_comps") }} + where + district_state = 'KTAF FL' + and comparison_demographic_subgroup + not in ('Not ML', 'Students Without Disabilities', 'Non-Binary', 'Blank') + + union all + + /* Google Sheets benchmarks */ + select + academic_year, + test_name as assessment_name, + 
comparison_entity, + comparison_demographic_group_aligned as comparison_demographic_group, + comparison_demographic_subgroup_aligned as comparison_demographic_subgroup, + null as focus_level, + school_level, + grade_range_band, + discipline, + total_proficient_students, + total_students, + percent_proficient, + test_code, + region, + + from {{ ref("stg_google_sheets__state_test_comparison_demographics") }} + where comparison_demographic_subgroup not in ('SE Accommodation', 'Blank') + ), + + grouped_comps as ( + select + academic_year, + school_level, + grade_range_band, + assessment_name, + discipline, + test_code, + region, + comparison_entity, + comparison_demographic_group, + comparison_demographic_subgroup, + focus_level, + + sum(total_proficient_students) as total_proficient_students, + + sum(total_students) as total_students, + + safe_divide( + sum(total_proficient_students), sum(total_students) + ) as percent_proficient, + + from appended + group by + academic_year, + school_level, + grade_range_band, + assessment_name, + discipline, + test_code, + region, + comparison_entity, + comparison_demographic_group, + comparison_demographic_subgroup, + focus_level + ) + +select + a.academic_year, + a.school_level, + a.grade_range_band, + a.assessment_name, + a.discipline, + a.test_code, + a.region, + a.comparison_entity, + a.comparison_demographic_group, + a.comparison_demographic_subgroup, + a.total_proficient_students, + a.total_students, + a.percent_proficient, + + if(b.percent_proficient = a.percent_proficient, true, false) as region_matched, + if(b.percent_proficient > a.percent_proficient, true, false) as region_outperformed, + + if( + b.percent_proficient >= a.percent_proficient, true, false + ) as region_matched_or_outperformed, + +from grouped_comps as a +left join + grouped_comps as b + on a.academic_year = b.academic_year + and a.school_level = b.school_level + and a.grade_range_band = b.grade_range_band + and a.assessment_name = b.assessment_name + and 
a.discipline = b.discipline + and a.test_code = b.test_code + and a.region = b.region + and a.comparison_demographic_group = b.comparison_demographic_group + and a.comparison_demographic_subgroup = b.comparison_demographic_subgroup + and b.comparison_entity = 'Region' diff --git a/src/dbt/kipptaf/models/google/sheets/sources-external.yml b/src/dbt/kipptaf/models/google/sheets/sources-external.yml index 597b08fac9..f1d251380f 100644 --- a/src/dbt/kipptaf/models/google/sheets/sources-external.yml +++ b/src/dbt/kipptaf/models/google/sheets/sources-external.yml @@ -89,7 +89,7 @@ sources: format: GOOGLE_SHEETS uris: - https://docs.google.com/spreadsheets/d/1yS6xU7ygiOrrtc29pUc3jr590qk7ttag3RuzVHaPOv8 - sheet_range: src_google_sheets__state_test_comparison_demographics + sheet_range: src_google_sheets__state_test_comparison_demographics_v2 skip_leading_rows: 1 config: meta: diff --git a/src/dbt/kipptaf/models/google/sheets/staging/properties/stg_google_sheets__state_test_comparison_demographics.yml b/src/dbt/kipptaf/models/google/sheets/staging/properties/stg_google_sheets__state_test_comparison_demographics.yml index aa2e3ccdb1..25fa6b208d 100644 --- a/src/dbt/kipptaf/models/google/sheets/staging/properties/stg_google_sheets__state_test_comparison_demographics.yml +++ b/src/dbt/kipptaf/models/google/sheets/staging/properties/stg_google_sheets__state_test_comparison_demographics.yml @@ -1,35 +1,103 @@ models: - name: stg_google_sheets__state_test_comparison_demographics + description: > + Reference sheet for state assessment comparisons to external entities + (City LEA, State, Neighborhood Schools) broken out by demographic group, + as available from official sources. ALG01 has a special case for school + level — overall comparisons by entity are split by HS and MS, but + demographic groupings combine MS and HS (assessment grades 8+) because + official sources do not provide demographic breakdowns by school level. 
+ ALG02 totals are 10th-grade only; demographic group totals include all + students who took ALG02 regardless of grade. GEO01 follows the same + pattern for grades 9 and 10. columns: - - name: Academic_Year + - name: academic_year data_type: int64 - - name: Test_Name + description: Academic year the benchmarks apply to. + + - name: test_name data_type: string - - name: Test_Code + description: State assessment test name (e.g. NJSLA, NJGPA). + + - name: season data_type: string - - name: Region + description: Testing season (always Spring for benchmarks). + + - name: school_level data_type: string - - name: Comparison_Entity + description: > + School level the benchmark applies to (ES, MS, HS, or hybrid MS_HS for + ALG01, or HS_09 and HS_10 for ALG02, GEO01). + + - name: grade_range_band data_type: string - - name: Comparison_Demographic_Group + description: Grade band for the benchmark (e.g. 3-8, HS). + + - name: discipline data_type: string - - name: Comparison_Demographic_Subgroup + description: Subject discipline (Math, ELA, Science, Social Studies). + + - name: test_code data_type: string - - name: Percent_Proficient - data_type: float64 - - name: Total_Students - data_type: int64 - - name: season + description: State assessment test code (e.g. ELA05, ALG01). + + - name: region data_type: string - - name: school_level + description: > + Network region the comparison applies to (Newark, Camden, Miami, + Paterson). + + - name: comparison_entity data_type: string - - name: grade_range_band + description: > + External entity being compared to (City, State, Neighborhood Schools). + + - name: comparison_demographic_group data_type: string - - name: discipline + description: > + Raw demographic group label from the source sheet (e.g. Grade, + Race/Ethnicity, Gender). + + - name: comparison_demographic_subgroup data_type: string + description: > + Raw demographic subgroup label from the source sheet (e.g. Grade - 05, + Black or African American, Female). 
+ + - name: percent_proficient + data_type: float64 + description: Percent of students scoring proficient or above. + + - name: total_students + data_type: int64 + description: Total number of students tested. + - name: comparison_demographic_group_aligned data_type: string + description: > + Demographic group after alignment — Grade rows with subgroups 08/09/10 + are remapped to 'Total'. + - name: comparison_demographic_subgroup_aligned data_type: string + description: > + Demographic subgroup after alignment — Grade group rows are remapped + to 'All Students'. + - name: total_proficient_students data_type: float64 + description: > + Derived count of proficient students (percent_proficient * + total_students, rounded to 0 decimal places). + data_tests: + - dbt_utils.unique_combination_of_columns: + arguments: + combination_of_columns: + - academic_year + - test_code + - region + - comparison_entity + - comparison_demographic_group + - comparison_demographic_subgroup + config: + store_failures: true diff --git a/src/dbt/kipptaf/models/google/sheets/staging/stg_google_sheets__state_test_comparison_demographics.sql b/src/dbt/kipptaf/models/google/sheets/staging/stg_google_sheets__state_test_comparison_demographics.sql index 886b597582..fc6f79a090 100644 --- a/src/dbt/kipptaf/models/google/sheets/staging/stg_google_sheets__state_test_comparison_demographics.sql +++ b/src/dbt/kipptaf/models/google/sheets/staging/stg_google_sheets__state_test_comparison_demographics.sql @@ -1,52 +1,6 @@ select *, - 'Spring' as season, - - case - when comparison_demographic_subgroup = 'Grade - 08' and test_code = 'ALG01' - then 'MS' - when comparison_demographic_subgroup in ('Grade - 09', 'Grade - 10') - then 'HS' - when - test_code in ( - 'ELA09', - 'ELA10', - 'ELA11', - 'ELAGP', - 'ALG01', - 'GEO01', - 'ALG02', - 'MATGP', - 'SCI11' - ) - then 'HS' - when safe_cast(right(test_code, 2) as numeric) between 5 and 8 - then 'MS' - else 'ES' - end as school_level, - - case - when 
comparison_demographic_subgroup in ('Grade - 09', 'Grade - 10') - then 'HS' - when - test_code - in ('ELA09', 'ELA10', 'ELA11', 'ELAGP', 'ALG02', 'GEO01', 'MATGP', 'SCI11') - then 'HS' - else '3-8' - end as grade_range_band, - - case - when left(test_code, 3) in ('MAT', 'ALG', 'GEO') - then 'Math' - when left(test_code, 3) = 'ELA' - then 'ELA' - when left(test_code, 3) = 'SCI' - then 'Science' - when left(test_code, 3) = 'SOC' - then 'Social Studies' - end as discipline, - if( comparison_demographic_subgroup in ('Grade - 08', 'Grade - 09', 'Grade - 10'), 'Total', diff --git a/src/dbt/kipptaf/models/marts/dim_state_assessment_benchmarks.sql b/src/dbt/kipptaf/models/marts/dim_state_assessment_benchmarks.sql index 52662cfe94..c71ee6d63b 100644 --- a/src/dbt/kipptaf/models/marts/dim_state_assessment_benchmarks.sql +++ b/src/dbt/kipptaf/models/marts/dim_state_assessment_benchmarks.sql @@ -1,8 +1,14 @@ select academic_year, test_name, + season, + school_level, + grade_range_band, + discipline, test_code, region, + comparison_demographic_group_aligned, + comparison_demographic_subgroup_aligned, max( case when comparison_entity = 'City' then percent_proficient end @@ -24,11 +30,44 @@ select case when comparison_entity = 'Neighborhood Schools' then total_students end ) as neighborhood_schools_total_students, + max( + case when comparison_entity = 'City' then total_proficient_students end + ) as city_total_proficient_students, + max( + case when comparison_entity = 'State' then total_proficient_students end + ) as state_total_proficient_students, + max( + case + when comparison_entity = 'Neighborhood Schools' + then total_proficient_students + end + ) as neighborhood_schools_total_proficient_students, + {{ dbt_utils.generate_surrogate_key( - ["academic_year", "test_name", "test_code", "region"] + [ + "academic_year", + "test_name", + "test_code", + "region", + "school_level", + "grade_range_band", + "season", + "comparison_demographic_group_aligned", + 
"comparison_demographic_subgroup_aligned", + ] ) }} as state_assessment_benchmarks_key, -from {{ ref("stg_google_sheets__state_test_comparison") }} -group by academic_year, test_name, test_code, region +from {{ ref("stg_google_sheets__state_test_comparison_demographics") }} +group by + academic_year, + test_name, + season, + school_level, + grade_range_band, + discipline, + test_code, + region, + comparison_demographic_group_aligned, + comparison_demographic_subgroup_aligned diff --git a/src/dbt/kipptaf/models/marts/properties/dim_state_assessment_benchmarks.yml b/src/dbt/kipptaf/models/marts/properties/dim_state_assessment_benchmarks.yml index e1fa5dd6ae..3a518d5385 100644 --- a/src/dbt/kipptaf/models/marts/properties/dim_state_assessment_benchmarks.yml +++ b/src/dbt/kipptaf/models/marts/properties/dim_state_assessment_benchmarks.yml @@ -4,12 +4,15 @@ models: columns: - name: state_assessment_benchmarks_key data_type: string - description: - Surrogate key on (academic_year, test_name, test_code, region). - tests: + description: > + Surrogate key on (academic_year, test_name, test_code, region, + school_level, grade_range_band, season, + comparison_demographic_group_aligned, + comparison_demographic_subgroup_aligned). + data_tests: - unique: config: - severity: Warn + severity: warn store_failures: true - name: academic_year @@ -20,16 +23,46 @@ models: data_type: string description: State assessment test name (e.g. NJSLA, NJGPA). + - name: season + data_type: string + description: Testing season (always Spring for benchmarks). + + - name: school_level + data_type: string + description: + School level the benchmark applies to (ES, MS, HS, or hybrid MS_HS for + ALG01, or HS_09, and HS_10 for ALG02, GEO01). + + - name: grade_range_band + data_type: string + description: Grade band for the benchmark (e.g. 3-8, HS). + + - name: discipline + data_type: string + description: Subject discipline (Math, ELA, Science, Social Studies). 
+ - name: test_code data_type: string - description: State assessment test code (e.g. ELA05, ALG1). + description: State assessment test code (e.g. ELA05, ALG01). - name: region data_type: string - description: + description: > Network region the comparison applies to (Newark, Camden, Miami, Paterson). + - name: comparison_demographic_group_aligned + data_type: string + description: > + Demographic group after alignment — Grade rows with subgroups 08/09/10 + are remapped to 'Total'. + + - name: comparison_demographic_subgroup_aligned + data_type: string + description: > + Demographic subgroup after alignment — Grade group rows are remapped + to 'All Students'. + - name: city_percent_proficient data_type: float64 description: City-wide percent proficient on this assessment. @@ -40,7 +73,9 @@ models: - name: neighborhood_schools_percent_proficient data_type: float64 - description: Neighborhood schools percent proficient on this assessment. + description: + Neighborhood schools percent proficient on this assessment (applies to + FL only). - name: city_total_students data_type: int64 @@ -52,4 +87,19 @@ models: - name: neighborhood_schools_total_students data_type: int64 - description: Total students tested at neighborhood schools. + description: + Total students tested at neighborhood schools (applies to FL only). + + - name: city_total_proficient_students + data_type: float64 + description: Total proficient students city-wide. + + - name: state_total_proficient_students + data_type: float64 + description: Total proficient students state-wide. + + - name: neighborhood_schools_total_proficient_students + data_type: float64 + description: + Total proficient students at neighborhood schools (applies to FL + only). 
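Reviewer note on the `dim_state_assessment_benchmarks` change above: the model takes long-format benchmark rows (one row per `comparison_entity`) and pivots them into one wide row per benchmark grain using `max(case when comparison_entity = ... then ... end)`. The following is only a rough Python sketch of that conditional aggregation, not the model itself. The function name `pivot_benchmarks` and all sample values are hypothetical; the column names and the three entity labels are taken from the diff.

```python
# Sketch (assumption: illustrates the pivot only; not the dbt model itself).

# Grain of the pivoted output, mirroring the GROUP BY columns in the diff.
KEY_COLUMNS = (
    "academic_year",
    "test_name",
    "season",
    "school_level",
    "grade_range_band",
    "discipline",
    "test_code",
    "region",
    "comparison_demographic_group_aligned",
    "comparison_demographic_subgroup_aligned",
)

# comparison_entity value -> output column prefix, as named in the diff.
ENTITY_PREFIXES = {
    "City": "city",
    "State": "state",
    "Neighborhood Schools": "neighborhood_schools",
}

MEASURES = ("percent_proficient", "total_students", "total_proficient_students")


def pivot_benchmarks(rows):
    """Hypothetical helper: pivot one-row-per-entity records to wide rows."""
    wide = {}
    for row in rows:
        key = tuple(row.get(col) for col in KEY_COLUMNS)
        if key not in wide:
            # Every group starts with all pivoted measures as None (SQL NULL).
            record = dict(zip(KEY_COLUMNS, key))
            for prefix in ENTITY_PREFIXES.values():
                for measure in MEASURES:
                    record[f"{prefix}_{measure}"] = None
            wide[key] = record

        prefix = ENTITY_PREFIXES.get(row.get("comparison_entity"))
        if prefix is None:
            # Unrecognized entity: the group exists but no measure is filled.
            continue

        for measure in MEASURES:
            value = row.get(measure)
            if value is None:
                continue  # like SQL max(), skip NULL inputs
            column = f"{prefix}_{measure}"
            current = wide[key][column]
            wide[key][column] = value if current is None else max(current, value)
    return list(wide.values())


if __name__ == "__main__":
    # Hypothetical sample rows; the numbers are made up for illustration.
    grain = {
        "academic_year": 2025,
        "test_name": "NJSLA",
        "season": "Spring",
        "school_level": "MS",
        "grade_range_band": "3-8",
        "discipline": "ELA",
        "test_code": "ELA05",
        "region": "Newark",
        "comparison_demographic_group_aligned": "Total",
        "comparison_demographic_subgroup_aligned": "All Students",
    }
    sample = [
        {**grain, "comparison_entity": "City", "percent_proficient": 0.25,
         "total_students": 1000, "total_proficient_students": 250.0},
        {**grain, "comparison_entity": "State", "percent_proficient": 0.5,
         "total_students": 2000, "total_proficient_students": 1000.0},
    ]
    for record in pivot_benchmarks(sample):
        print(record)
```

Because every pivoted measure starts as `None`, an entity that is absent for a given grain (for example Neighborhood Schools on NJ rows, which the YAML describes as FL only) stays null in the wide row, which matches how the SQL `max(case ...)` behaves when no row satisfies the condition.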
From 244f55f4e2141f2cf955e6ac28aa5cfd2c9af11e Mon Sep 17 00:00:00 2001 From: grangel <140853376+GabyRangelB@users.noreply.github.com> Date: Mon, 30 Mar 2026 19:40:45 +0000 Subject: [PATCH 4/9] feat(dbt): compute FLDOE metadata columns in kipptaf intermediate Add results_type, district_state, admin, subject, illuminate_subject, and fast_aggregated_proficiency as computed columns in the kipptaf int_fldoe__all_assessments model instead of depending on the kippmiami upstream to provide them. This allows rpt_tableau__state_assessments_dashboard to build without waiting for a kippmiami deployment. Co-Authored-By: Claude Opus 4.6 (1M context) --- .../int_fldoe__all_assessments.sql | 23 +++++++++++++++++++ 1 file changed, 23 insertions(+) diff --git a/src/dbt/kipptaf/models/fldoe/intermediate/int_fldoe__all_assessments.sql b/src/dbt/kipptaf/models/fldoe/intermediate/int_fldoe__all_assessments.sql index 1cda1266d4..d8a52d3dbb 100644 --- a/src/dbt/kipptaf/models/fldoe/intermediate/int_fldoe__all_assessments.sql +++ b/src/dbt/kipptaf/models/fldoe/intermediate/int_fldoe__all_assessments.sql @@ -60,6 +60,29 @@ select cast(regexp_extract(fl.achievement_level, r'\d+') as int) as achievement_level_int, + 'Actual' as results_type, + 'KTAF FL' as district_state, + + fl.administration_window as `admin`, + fl.assessment_subject as `subject`, + + case + when fl.assessment_subject like 'English Language Arts%' + then 'Text Study' + when fl.assessment_subject in ('Algebra I', 'Algebra II', 'Geometry') + then 'Mathematics' + else fl.assessment_subject + end as illuminate_subject, + + case + when cast(regexp_extract(fl.achievement_level, r'\d+') as int) = 1 + then 'Below/Far Below' + when cast(regexp_extract(fl.achievement_level, r'\d+') as int) = 2 + then 'Approaching' + when cast(regexp_extract(fl.achievement_level, r'\d+') as int) >= 3 + then 'At/Above' + end as fast_aggregated_proficiency, + if(cw1.sublevel_number >= 6, null, cw2.scale_low) as scale_for_proficiency, if( From 
06a2ff0303a3ddddfa32b0cbf9230a9846474272 Mon Sep 17 00:00:00 2001 From: grangel <140853376+GabyRangelB@users.noreply.github.com> Date: Tue, 31 Mar 2026 19:51:47 +0000 Subject: [PATCH 5/9] refactor(dbt): rebuild demographic comps pipeline and update YMLs - Refactor int_tableau__state_assessments_demographic_comps: replace self-contained assessment_scores CTE with three union branches (NJ official, NJ prelim, FL official); add test_code_metadata CTE from stg_google_sheets__state_test_comparison_demographics to replace inline school_level/grade_range_band/discipline CASE statements; fix aligned column references and unqualified ON clause columns - Add YML for int_fldoe__all_assessments (kipptaf): uniqueness test, model description, and full column definitions including new metadata columns (results_type, district_state, aligned_level_test_code, illuminate_subject, fast_aggregated_proficiency, is_proficient_int, admin, subject) - Add YML for int_pearson__all_assessments: uniqueness test, model description, full column definitions; fix stale columns (englishlearnerel, studentwithdisabilities removed; aligned_* demographic columns, is_proficient_int, season, admin added) - Add YML for int_pearson__student_list_report: uniqueness test on (source_relation, academic_year, administration, state_id, aligned_test_code), model description, full column definitions; fix missing trailing comma in SQL - Update stg_google_sheets__state_test_comparison_demographics YML: move data_tests before columns, remove redundant store_failures, add missing aligned_level_test_code column - Add aligned_gender to int_extracts__student_enrollments YML - Remove design spec (work complete) Co-Authored-By: Claude Sonnet 4.6 (1M context) --- ...-03-23-state-assessments-rebuild-design.md | 122 -------- ...u__state_assessments_demographic_comps.sql | 291 ++++++------------ .../int_fldoe__all_assessments.sql | 16 +- .../properties/int_fldoe__all_assessments.yml | 116 +++++++ 
...ts__state_test_comparison_demographics.yml | 29 +- ...ts__state_test_comparison_demographics.sql | 6 + .../int_pearson__all_assessments.sql | 228 ++++++++------ .../int_pearson__student_list_report.sql | 9 + .../int_pearson__all_assessments.yml | 221 +++++++++++-- .../int_pearson__student_list_report.yml | 122 ++++++++ .../int_extracts__student_enrollments.sql | 4 + .../int_extracts__student_enrollments.yml | 2 + 12 files changed, 709 insertions(+), 457 deletions(-) delete mode 100644 docs/superpowers/specs/2026-03-23-state-assessments-rebuild-design.md diff --git a/docs/superpowers/specs/2026-03-23-state-assessments-rebuild-design.md b/docs/superpowers/specs/2026-03-23-state-assessments-rebuild-design.md deleted file mode 100644 index f443420cce..0000000000 --- a/docs/superpowers/specs/2026-03-23-state-assessments-rebuild-design.md +++ /dev/null @@ -1,122 +0,0 @@ -# State Assessments Rebuild - -**Branch:** `claude/feat/stat-rebuild` **Date started:** 2026-03-23 **Status:** -In progress - -## Motivation - -1. **New reporting requirements** — Demographic comparison reporting that didn't - previously exist (e.g., state test comparison by demographic subgroups) -2. 
**Tech debt cleanup** — Assessment models had duplicated business logic - (proficiency banding, subject mapping, metadata) scattered across reporting - layers, making them fragile and hard to maintain - -## Architectural Pattern - -**Push transformation logic upstream, simplify reporting downstream.** - -### Proficiency banding - -Each assessment source computes standardized proficiency bands in its own -intermediate model rather than in Tableau reporting models: - -- **Pearson (NJSLA):** `njsla_aggregated_proficiency` — levels 1-2 = "Below/Far - Below", 3 = "Approaching", 4+ = "At/Above" -- **FLDOE (FAST):** `fast_aggregated_proficiency` — level 1 = "Below/Far Below", - 2 = "Approaching", 3+ = "At/Above" (FAST uses a 1-5 scale where proficiency - starts at level 3, vs NJSLA's level 4) -- **iReady:** `iready_proficiency` — same banding pattern as Pearson (levels 1-2 - = Below, 3 = Approaching, 4+ = At/Above) - -### Metadata columns - -Added at the intermediate layer (not in reporting): - -- `results_type` — "Actual" vs "Preliminary" -- `district_state` — "KTAF NJ" or "KTAF FL" -- `illuminate_subject` — standardized subject mapping (ELA = "Text Study", - Algebra/Geometry = "Mathematics") -- `discipline` — broader category (Math, ELA, Science, Social Studies) - -### Demographic alignment - -Grade/demographic group mapping moved from reporting into staging: - -- `stg_google_sheets__state_test_comparison_demographics` now computes - `comparison_demographic_group_aligned`, - `comparison_demographic_subgroup_aligned`, `total_proficient_students`, - `school_level`, `grade_range_band`, `discipline`, and `season` -- Old `stg_google_sheets__state_test_comparison` disabled (`enabled: false`) - because the demographics sheet is a superset of the old source - -### Reporting simplification - -`rpt_tableau__state_assessments_*` models now primarily select pre-computed -columns instead of containing large CASE blocks. 
- -## What's Been Done - -### Assessment intermediate models - -| Model | Changes | -| ---------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ | -| `int_fldoe__all_assessments` | Added `achievement_level_int`, `results_type`, `district_state`, `illuminate_subject`, `fast_aggregated_proficiency` | -| `int_pearson__all_assessments` | Added `illuminate_subject`, `njsla_aggregated_proficiency`, `results_type`, `district_state`, `is_504`, `aligned_test_code`, `race_ethnicity`, `admin`, `season`, `aligned_subject`, `is_proficient_int`. Contract enforcement disabled (temporary dev workaround) | -| `int_pearson__student_list_report` | **New model** — raw transformation layer for Pearson student list data with `performance_band_level`, `is_proficient` | -| `int_iready__diagnostic_results` | Added `iready_proficiency` column | - -### Google Sheets staging - -| Model | Changes | -| ------------------------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------- | -| `stg_google_sheets__state_test_comparison_demographics` | Enhanced with derived columns: `season`, `school_level`, `grade_range_band`, `discipline`, aligned demographic columns, `total_proficient_students` | -| `stg_google_sheets__state_test_comparison` | Disabled (`enabled: false`) | -| `sources-external.yml` | Updated source references; course subject crosswalk sheet range renamed to `_v2`, `Exclude_from_Gradebook` column dropped | - -### Tableau reporting models (simplified) - -| Model | Changes | -| -------------------------------------------------- | 
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -| `rpt_tableau__state_assessments_dashboard` | Switched to demographics source; Student List Report section refactored — `discipline`, `subject`, `test_code`, `performance_band_level`, `is_proficient`, `results_type` all switched from inline CASE/IF to upstream column references | -| `rpt_tableau__state_assessments_dashboard_comps` | Removed demographic alignment CASE blocks, uses pre-computed columns | -| `int_tableau__state_assessments_demographic_comps` | Lineage changed from `stg_pearson__student_list_report` to `int_pearson__student_list_report`; `is_proficient_int`, `results_type`, `district_state`, `aligned_test_code` all switched from inline to upstream; `local_student_identifier` now passed through (was `null`) | - -### Extracts and base models - -| Model | Changes | -| -------------------------------------------- | -------------------------------------------------------------------------------------- | -| `int_extracts__student_enrollments_subjects` | Refactored to use upstream proficiency/subject columns instead of inline CASE | -| `int_extracts__student_enrollments_courses` | **New model** — course enrollment extract from `base_powerschool__student_enrollments` | -| `base_powerschool__course_enrollments` | Added `standardized_discipline`; removed `exclude_from_gradebook` | - -## Known Issues - -- `int_pearson__all_assessments` has contract enforcement disabled — needs to be - re-enabled or documented as intentional before merge -- `int_pearson__student_list_report` properties file is incomplete — no column - definitions, no uniqueness test (required by project conventions) -- Branch history is messy (merged from `stat` and `int-extracts-courses` - branches, has checkpoint commits) — 
should be squash-merged - -## What's Still Open - -- Add column definitions and uniqueness test to - `int_pearson__student_list_report.yml` -- Re-enable contract enforcement on `int_pearson__all_assessments` or document - why it should stay off -- Identify remaining assessment sources and reporting models that need the same - upstream-migration treatment -- Determine scope of additional demographic reporting needs -- Testing and validation of refactored models against production data - -## How to Resume - -1. Run `uv sync --frozen` to install dependencies -2. Prepare dbt projects before testing: - ```bash - uv run dagster-dbt project prepare-and-package \ - --file src/teamster/code_locations/kipptaf/__init__.py - uv run dagster-dbt project prepare-and-package \ - --file src/teamster/code_locations/kippmiami/__init__.py - ``` -3. Pick up from "What's Still Open" above diff --git a/src/dbt/kipptaf/models/extracts/tableau/intermediate/int_tableau__state_assessments_demographic_comps.sql b/src/dbt/kipptaf/models/extracts/tableau/intermediate/int_tableau__state_assessments_demographic_comps.sql index fdee964ab6..4cb9fdcb42 100644 --- a/src/dbt/kipptaf/models/extracts/tableau/intermediate/int_tableau__state_assessments_demographic_comps.sql +++ b/src/dbt/kipptaf/models/extracts/tableau/intermediate/int_tableau__state_assessments_demographic_comps.sql @@ -21,118 +21,95 @@ ] %} with - assessment_scores as ( + test_code_metadata as ( select - _dbt_source_relation, - academic_year, - localstudentidentifier, - statestudentidentifier as state_id, - assessment_name, - is_proficient, - is_proficient_int, - - results_type, - district_state, - aligned_test_code as test_code, - - case - when race_ethnicity = 'B' - then 'African American' - when race_ethnicity = 'A' - then 'Asian' - when race_ethnicity = 'I' - then 'American Indian' - when race_ethnicity = 'H' - then 'Hispanic' - when race_ethnicity = 'P' - then 'Native Hawaiian' - when race_ethnicity = 'T' - then 'Other' - when 
race_ethnicity = 'W' - then 'White' - when race_ethnicity is null - then 'Blank' - end as aggregate_ethnicity, - - if(lep_status, 'ML', 'Not ML') as ml_status, - - if( - iep_status = 'Has IEP', - 'Students With Disabilities', - 'Students Without Disabilities' - ) as iep_status, - - from {{ ref("int_pearson__all_assessments") }} - where - testscalescore is not null and `period` = 'Spring' and academic_year >= 2018 - - union all + aligned_level_test_code, + any_value(school_level) as school_level, + any_value(grade_range_band) as grade_range_band, + any_value(discipline) as discipline, + from {{ ref("stg_google_sheets__state_test_comparison_demographics") }} + group by aligned_level_test_code + ), + scores as ( + -- NJ's official scores select - _dbt_source_relation, - academic_year, - - null as localstudentidentifier, + e.academic_year, + e.region, + e.student_number, - student_id as state_id, - assessment_name, - is_proficient, - if(is_proficient, 1, 0) as is_proficient_int, + a.district_state, + a.assessment_name, + a.is_proficient_int, + a.aligned_test_code as test_code, + a.aligned_ml_status as ml_status, - 'Actual' as results_type, - 'KTAF FL' as district_state, + e.aligned_gender as gender, - test_code, + a.aligned_aggregate_ethnicity as aggregate_ethnicity, + a.aligned_iep_status as iep_status, - null as aggregate_ethnicity, - null as ml_status, - null as iep_status, + if( + e.lunch_status in ('F', 'R'), + 'Economically Disadvantaged', + 'Non Economically Disadvantaged' + ) as lunch_status, - from {{ ref("int_fldoe__all_assessments") }} - where scale_score is not null and season = 'Spring' + from {{ ref("int_extracts__student_enrollments") }} as e + inner join + {{ ref("int_pearson__all_assessments") }} as a + on e.academic_year = a.academic_year + and e.pearson_local_student_identifier = a.localstudentidentifier + and {{ union_dataset_join_clause(left_alias="e", right_alias="a") }} + and a.academic_year >= 2018 + and a.season = 'Spring' + and 
a.testscalescore is not null + where + e.rn_year = 1 + and e.academic_year >= {{ var("current_academic_year") - 7 }} + and e.grade_level > 2 + and e.school_level != 'OD' union all - select - _dbt_source_relation, - academic_year, - - local_student_identifier as localstudentidentifier, - - cast(state_student_identifier as string) as state_id, - - test_type as assessment_name, - is_proficient, - if(is_proficient, 1, 0) as is_proficient_int, - results_type, - district_state, - aligned_test_code as test_code, - - null as aggregate_ethnicity, - null as ml_status, - null as iep_status, - - from {{ ref("int_pearson__student_list_report") }} - where - state_student_identifier is not null - and administration = 'Spring' - and test_type = 'NJSLA' - and academic_year = {{ var("current_academic_year") }} - ), - - /* NJ scores */ - nj_scores as ( + -- NJ's prelim scores select e.academic_year, e.region, e.student_number, a.district_state, - a.assessment_name, - a.aggregate_ethnicity, - a.ml_status, - a.iep_status, + a.test_type as assessment_name, a.is_proficient_int, + a.aligned_test_code as test_code, + + e.ml_status, + e.aligned_gender as gender, + + case + when e.race_ethnicity = 'B' + then 'African American' + when e.race_ethnicity = 'A' + then 'Asian' + when e.race_ethnicity = 'I' + then 'American Indian' + when e.race_ethnicity = 'H' + then 'Hispanic' + when e.race_ethnicity = 'P' + then 'Native Hawaiian' + when e.race_ethnicity = 'T' + then 'Other' + when e.race_ethnicity = 'W' + then 'White' + when e.race_ethnicity is null + then 'Blank' + end as aggregate_ethnicity, + + if( + e.iep_status = 'Has IEP', + 'Students With Disabilities', + 'Students Without Disabilities' + ) as iep_status, if( e.lunch_status in ('F', 'R'), @@ -140,38 +117,24 @@ with 'Non Economically Disadvantaged' ) as lunch_status, - case - when a.test_code = 'ALG01' - then concat(a.test_code, '_', e.school_level) - else a.test_code - end as test_code, - - case - e.gender - when 'F' - then 'Female' - 
when 'M' - then 'Male' - when 'X' - then 'Non-Binary' - end as gender, - from {{ ref("int_extracts__student_enrollments") }} as e inner join - assessment_scores as a + {{ ref("int_pearson__student_list_report") }} as a on e.academic_year = a.academic_year - and e.pearson_local_student_identifier = a.localstudentidentifier + and e.pearson_local_student_identifier = a.local_student_identifier and {{ union_dataset_join_clause(left_alias="e", right_alias="a") }} - and a.results_type = 'Actual' + and a.academic_year >= 2018 + and a.administration = 'Spring' + and a.scale_score is not null where e.rn_year = 1 and e.academic_year >= {{ var("current_academic_year") - 7 }} and e.grade_level > 2 and e.school_level != 'OD' - ), - /* FL scores */ - fl_scores as ( + union all + + -- FL's official scores select e.academic_year, e.region, @@ -179,6 +142,11 @@ with a.district_state, a.assessment_name, + a.is_proficient_int, + a.test_code, + + e.ml_status, + e.aligned_gender as gender, case when e.race_ethnicity = 'B' @@ -199,58 +167,32 @@ with then 'Blank' end as aggregate_ethnicity, - e.ml_status, - if( e.iep_status = 'Has IEP', 'Students With Disabilities', 'Students Without Disabilities' ) as iep_status, - a.is_proficient_int, - if( e.lunch_status in ('F', 'R'), 'Economically Disadvantaged', 'Non Economically Disadvantaged' ) as lunch_status, - case - when a.test_code = 'ALG01' - then concat(a.test_code, '_', e.school_level) - else a.test_code - end as test_code, - - case - e.gender - when 'F' - then 'Female' - when 'M' - then 'Male' - when 'X' - then 'Non-Binary' - end as gender, - from {{ ref("int_extracts__student_enrollments") }} as e inner join - assessment_scores as a + {{ ref("int_fldoe__all_assessments") }} as a on e.academic_year = a.academic_year - and e.state_studentnumber = a.state_id + and e.state_studentnumber = a.student_id and {{ union_dataset_join_clause(left_alias="e", right_alias="a") }} and a.results_type = 'Actual' + and a.scale_score is not null + and 
a.season = 'Spring' where e.region = 'Miami' and e.rn_year = 1 and e.academic_year >= {{ var("current_academic_year") - 7 }} and e.grade_level > 2 - ), - - demographic_comps as ( - select * - from nj_scores - union all - select * - from fl_scores ) select @@ -303,58 +245,13 @@ select /* (b) comparison_entity from region null-ness */ if(grouping(region) = 1, district_state, 'Region') as comparison_entity, - /* (c) test_code-derived columns */ - case - when - test_code in ( - 'ELA09', - 'ELA10', - 'ELA11', - 'ELAGP', - 'ALG01_HS', - 'GEO01', - 'ALG02', - 'MATGP', - 'SCI11' - ) - then 'HS' - when test_code = 'ALG01_MS' - then 'MS' - when safe_cast(right(test_code, 2) as numeric) between 5 and 8 - then 'MS' - else 'ES' - end as school_level, - - case - when - test_code in ( - 'ELA09', - 'ELA10', - 'ELA11', - 'ELAGP', - 'ALG01_HS', - 'GEO01', - 'ALG02', - 'MATGP', - 'SCI11' - ) - then 'HS' - else '3-8' - end as grade_range_band, - - case - when left(test_code, 3) in ('MAT', 'ALG', 'GEO') - then 'Math' - when left(test_code, 3) = 'ELA' - then 'ELA' - when left(test_code, 3) = 'SCI' - then 'Science' - when left(test_code, 3) = 'SOC' - then 'Social Studies' - end as discipline, - -from demographic_comps + /* (c) test_code-derived columns via sheet lookup */ + any_value(m.school_level) as school_level, + any_value(m.grade_range_band) as grade_range_band, + any_value(m.discipline) as discipline, +from scores +left join test_code_metadata as m on scores.test_code = m.aligned_level_test_code group by grouping sets ( {# Total (all focus dims rolled up) — with and without region #} diff --git a/src/dbt/kipptaf/models/fldoe/intermediate/int_fldoe__all_assessments.sql b/src/dbt/kipptaf/models/fldoe/intermediate/int_fldoe__all_assessments.sql index d8a52d3dbb..528bd120e7 100644 --- a/src/dbt/kipptaf/models/fldoe/intermediate/int_fldoe__all_assessments.sql +++ b/src/dbt/kipptaf/models/fldoe/intermediate/int_fldoe__all_assessments.sql @@ -58,13 +58,18 @@ select cw1.sublevel_number, 
cw1.sublevel_name, - cast(regexp_extract(fl.achievement_level, r'\d+') as int) as achievement_level_int, - 'Actual' as results_type, 'KTAF FL' as district_state, - fl.administration_window as `admin`, - fl.assessment_subject as `subject`, + cast(regexp_extract(fl.achievement_level, r'\d+') as int) as achievement_level_int, + + case + when fl.test_code = 'ALG01' and fl.assessment_grade = '8' + then concat(fl.test_code, '_', 'MS') + when fl.test_code = 'ALG01' and fl.assessment_grade in ('9', '10', '11', '12') + then concat(fl.test_code, '_', 'HS') + else fl.test_code + end as aligned_level_test_code, case when fl.assessment_subject like 'English Language Arts%' @@ -83,6 +88,8 @@ select then 'At/Above' end as fast_aggregated_proficiency, + if(fl.is_proficient, 1, 0) as is_proficient_int, + if(cw1.sublevel_number >= 6, null, cw2.scale_low) as scale_for_proficiency, if( @@ -99,6 +106,7 @@ select partition by fl.student_id, fl.academic_year, fl.assessment_subject order by fl.administration_window asc ) as scale_score_prev, + from source as fl left join scale_crosswalk as sc diff --git a/src/dbt/kipptaf/models/fldoe/intermediate/properties/int_fldoe__all_assessments.yml b/src/dbt/kipptaf/models/fldoe/intermediate/properties/int_fldoe__all_assessments.yml index ec9b5311aa..cd3711f071 100644 --- a/src/dbt/kipptaf/models/fldoe/intermediate/properties/int_fldoe__all_assessments.yml +++ b/src/dbt/kipptaf/models/fldoe/intermediate/properties/int_fldoe__all_assessments.yml @@ -1,47 +1,163 @@ models: - name: int_fldoe__all_assessments + description: > + Network-level union of FLDOE FAST assessment results from the kippmiami + district project, enriched with iReady scale crosswalk data for sublevel + classification and growth/proficiency targets. Adds standardized metadata + columns (results_type, district_state, illuminate_subject, + fast_aggregated_proficiency) and derived score columns for use in + reporting and demographic comparisons. 
+ data_tests: + - dbt_utils.unique_combination_of_columns: + arguments: + combination_of_columns: + - _dbt_source_relation + - student_id + - academic_year + - administration_window + - test_code columns: - name: _dbt_source_relation data_type: string + description: Source relation identifier from dbt_utils.union_relations. + - name: test_code data_type: string + description: State assessment test code (e.g. ELA03, ALG01, ELAGP). + - name: academic_year data_type: int64 + description: Academic year the assessment was administered. + - name: administration_window data_type: string + description: Assessment administration window (PM1, PM2, or PM3). + - name: season data_type: string + description: > + Testing season derived from administration window (Fall, Winter, + Spring). We only use Spring for official reporting. + - name: discipline data_type: string + description: Subject discipline category (Math, ELA, Science, etc.). + - name: assessment_subject data_type: string + description: Full assessment subject name as reported by FLDOE. + - name: scale_score data_type: int64 + description: Student's raw scale score on the assessment. + - name: achievement_level data_type: string + description: > + Achievement level label as reported by FLDOE (e.g. 'Level 1', 'Level + 3'). + - name: is_proficient data_type: boolean + description: + Whether the student scored at or above the proficiency threshold. + - name: assessment_grade data_type: string + description: > + Grade level of the assessment. Ideally, it should match a student's + enrolled grade. + - name: performance_level data_type: int64 + description: Numeric performance level from the source data. + - name: student_id data_type: string + description: Student identifier from the source system. + - name: assessment_name data_type: string + description: Full assessment name (e.g. FAST, FSA, EOC). 
+ - name: sublevel_number data_type: int64 + description: > + iReady crosswalk sublevel number for the student's scale score band; + null if no crosswalk match. + - name: sublevel_name data_type: string + description: > + iReady crosswalk sublevel name for the student's scale score band; + null if no crosswalk match. + + - name: results_type + data_type: string + description: Always 'Actual' for FLDOE results (no preliminary scores). + + - name: district_state + data_type: string + description: + Network district/state identifier; always 'KTAF FL' for this model. + - name: achievement_level_int data_type: int64 + description: > + Numeric achievement level extracted from the achievement_level label + (e.g. 3 from 'Level 3'). + + - name: aligned_level_test_code + data_type: string + description: > + Test code adjusted for school level — ALG01 grade 8 is suffixed with + '_MS', grades 9-12 with '_HS'; all other test codes pass through + unchanged. + + - name: illuminate_subject + data_type: string + description: > + Standardized subject label for Illuminate/reporting — ELA subjects map + to 'Text Study'; Algebra I, Algebra II, and Geometry map to + 'Mathematics'; all others pass through unchanged. + + - name: fast_aggregated_proficiency + data_type: string + description: > + Aggregated proficiency band: level 1 = 'Below/Far Below', level 2 = + 'Approaching', level 3+ = 'At/Above'. + + - name: is_proficient_int + data_type: int64 + description: + Integer representation of is_proficient (1 = true, 0 = false). + - name: scale_for_proficiency data_type: int64 + description: > + Minimum scale score required to reach proficiency (sublevel 6) for + this subject and grade; null if already at or above sublevel 6. + - name: points_to_proficiency data_type: int64 + description: > + Points needed to reach proficiency (scale_for_proficiency minus + scale_score); null if already at or above sublevel 6. 
+ - name: scale_for_growth data_type: int64 + description: > + Minimum scale score for the next sublevel (scale_high + 1 of current + sublevel); null if already at the highest sublevel (8). + - name: points_to_growth data_type: int64 + description: > + Points needed to reach the next sublevel (scale_for_growth minus + scale_score); null if already at the highest sublevel (8). + - name: scale_score_prev data_type: int64 + description: > + Student's scale score from the prior administration window within the + same academic year and subject, ordered by administration_window. diff --git a/src/dbt/kipptaf/models/google/sheets/staging/properties/stg_google_sheets__state_test_comparison_demographics.yml b/src/dbt/kipptaf/models/google/sheets/staging/properties/stg_google_sheets__state_test_comparison_demographics.yml index 25fa6b208d..5f202081ec 100644 --- a/src/dbt/kipptaf/models/google/sheets/staging/properties/stg_google_sheets__state_test_comparison_demographics.yml +++ b/src/dbt/kipptaf/models/google/sheets/staging/properties/stg_google_sheets__state_test_comparison_demographics.yml @@ -10,6 +10,16 @@ models: ALG02 totals are 10th-grade only; demographic group totals include all students who took ALG02 regardless of grade. GEO01 follows the same pattern for grades 9 and 10. + data_tests: + - dbt_utils.unique_combination_of_columns: + arguments: + combination_of_columns: + - academic_year + - test_code + - region + - comparison_entity + - comparison_demographic_group + - comparison_demographic_subgroup columns: - name: academic_year data_type: int64 @@ -72,6 +82,13 @@ models: data_type: int64 description: Total number of students tested. + - name: aligned_level_test_code + data_type: string + description: > + Test code adjusted for school level — ALG01 rows with a school_level + other than MS_HS are suffixed with the school level (e.g. 'ALG01_HS'); + all other test codes are passed through unchanged. 
+ - name: comparison_demographic_group_aligned data_type: string description: > @@ -89,15 +106,3 @@ models: description: > Derived count of proficient students (percent_proficient * total_students, rounded to 0 decimal places). - data_tests: - - dbt_utils.unique_combination_of_columns: - arguments: - combination_of_columns: - - academic_year - - test_code - - region - - comparison_entity - - comparison_demographic_group - - comparison_demographic_subgroup - config: - store_failures: true diff --git a/src/dbt/kipptaf/models/google/sheets/staging/stg_google_sheets__state_test_comparison_demographics.sql b/src/dbt/kipptaf/models/google/sheets/staging/stg_google_sheets__state_test_comparison_demographics.sql index fc6f79a090..d3fb3be4cd 100644 --- a/src/dbt/kipptaf/models/google/sheets/staging/stg_google_sheets__state_test_comparison_demographics.sql +++ b/src/dbt/kipptaf/models/google/sheets/staging/stg_google_sheets__state_test_comparison_demographics.sql @@ -1,6 +1,12 @@ select *, + case + when test_code = 'ALG01' and school_level != 'MS_HS' + then concat(test_code, '_', school_level) + else test_code + end as aligned_level_test_code, + if( comparison_demographic_subgroup in ('Grade - 08', 'Grade - 09', 'Grade - 10'), 'Total', diff --git a/src/dbt/kipptaf/models/pearson/intermediate/int_pearson__all_assessments.sql b/src/dbt/kipptaf/models/pearson/intermediate/int_pearson__all_assessments.sql index 9609e32b1a..bdeaea42ff 100644 --- a/src/dbt/kipptaf/models/pearson/intermediate/int_pearson__all_assessments.sql +++ b/src/dbt/kipptaf/models/pearson/intermediate/int_pearson__all_assessments.sql @@ -42,106 +42,144 @@ with ], ) }} + ), + + transformations as ( + select + u._dbt_source_relation, + u.academic_year, + u.americanindianoralaskanative, + u.asian, + u.assessment_name, + u.assessmentgrade, + u.assessmentyear, + u.blackorafricanamerican, + u.discipline, + u.hispanicorlatinoethnicity, + u.is_proficient, + u.is_bl_fb, + u.nativehawaiianorotherpacificislander, + 
u.period, + u.firstname, + u.lastorsurname, + u.`subject`, + u.testcode, + u.studenttestuuid, + u.test_grade, + u.testperformancelevel_text, + u.testperformancelevel, + u.testscalescore, + u.twoormoreraces, + u.white, + + 'Actual' as results_type, + 'KTAF NJ' as district_state, + + cast(u.statestudentidentifier as string) as statestudentidentifier, + + coalesce(u.studentwithdisabilities in ('504', 'B'), false) as is_504, + + coalesce( + x.student_number, u.localstudentidentifier + ) as localstudentidentifier, + + case + u.testcode + when 'SC05' + then 'SCI05' + when 'SC08' + then 'SCI08' + when 'SC11' + then 'SCI11' + else u.testcode + end as aligned_test_code, + + case + when u.twoormoreraces = 'Y' + then 'T' + when u.hispanicorlatinoethnicity = 'Y' + then 'H' + when u.americanindianoralaskanative = 'Y' + then 'I' + when u.asian = 'Y' + then 'A' + when u.blackorafricanamerican = 'Y' + then 'B' + when u.nativehawaiianorotherpacificislander = 'Y' + then 'P' + when u.white = 'Y' + then 'W' + end as race_ethnicity, + + case + when u.`subject` like 'English Language Arts%' + then 'Text Study' + when u.`subject` in ('Algebra I', 'Algebra II', 'Geometry') + then 'Mathematics' + else u.`subject` + end as illuminate_subject, + + case + when u.assessment_name = 'NJSLA' and u.testperformancelevel <= 2 + then 'Below/Far Below' + when u.assessment_name = 'NJSLA' and u.testperformancelevel = 3 + then 'Approaching' + when u.assessment_name = 'NJSLA' and u.testperformancelevel >= 4 + then 'At/Above' + end as njsla_aggregated_proficiency, + + if(u.englishlearnerel = 'Y', true, false) as lep_status, + + if( + u.studentwithdisabilities in ('IEP', 'B'), 'Has IEP', 'No IEP' + ) as iep_status, + + if(u.`period` = 'FallBlock', 'Fall', u.`period`) as `admin`, + + if(u.`period` = 'FallBlock', 'Fall', u.`period`) as season, + + if( + u.`subject` = 'English Language Arts/Literacy', + 'English Language Arts', + u.`subject` + ) as aligned_subject, + + if(u.is_proficient, 1, 0) as 
is_proficient_int, + + from union_relations as u + left join + {{ ref("stg_google_sheets__pearson__student_crosswalk") }} as x + on u.studenttestuuid = x.student_test_uuid ) select - u._dbt_source_relation, - u.academic_year, - u.americanindianoralaskanative, - u.asian, - u.assessment_name, - u.assessmentgrade, - u.assessmentyear, - u.blackorafricanamerican, - u.discipline, - u.hispanicorlatinoethnicity, - u.is_proficient, - u.is_bl_fb, - u.nativehawaiianorotherpacificislander, - u.period, - u.firstname, - u.lastorsurname, - u.`subject`, - u.testcode, - u.studenttestuuid, - u.test_grade, - u.testperformancelevel_text, - u.testperformancelevel, - u.testscalescore, - u.twoormoreraces, - u.white, - - 'Actual' as results_type, - 'KTAF NJ' as district_state, - - cast(u.statestudentidentifier as string) as statestudentidentifier, - - coalesce(u.studentwithdisabilities in ('504', 'B'), false) as is_504, - - coalesce(x.student_number, u.localstudentidentifier) as localstudentidentifier, + *, case - u.testcode - when 'SC05' - then 'SCI05' - when 'SC08' - then 'SCI08' - when 'SC11' - then 'SCI11' - else u.testcode - end as aligned_test_code, - - case - when u.twoormoreraces = 'Y' - then 'T' - when u.hispanicorlatinoethnicity = 'Y' - then 'H' - when u.americanindianoralaskanative = 'Y' - then 'I' - when u.asian = 'Y' - then 'A' - when u.blackorafricanamerican = 'Y' - then 'B' - when u.nativehawaiianorotherpacificislander = 'Y' - then 'P' - when u.white = 'Y' - then 'W' - end as race_ethnicity, - - case - when u.`subject` like 'English Language Arts%' - then 'Text Study' - when u.`subject` in ('Algebra I', 'Algebra II', 'Geometry') - then 'Mathematics' - else u.`subject` - end as illuminate_subject, - - case - when u.assessment_name = 'NJSLA' and u.testperformancelevel <= 2 - then 'Below/Far Below' - when u.assessment_name = 'NJSLA' and u.testperformancelevel = 3 - then 'Approaching' - when u.assessment_name = 'NJSLA' and u.testperformancelevel >= 4 - then 'At/Above' - end as 
njsla_aggregated_proficiency, - - if(u.englishlearnerel = 'Y', true, false) as lep_status, - - if(u.studentwithdisabilities in ('IEP', 'B'), 'Has IEP', 'No IEP') as iep_status, - - if(u.`period` = 'FallBlock', 'Fall', u.`period`) as `admin`, - - if(u.`period` = 'FallBlock', 'Fall', u.`period`) as season, + when race_ethnicity = 'B' + then 'African American' + when race_ethnicity = 'A' + then 'Asian' + when race_ethnicity = 'I' + then 'American Indian' + when race_ethnicity = 'H' + then 'Hispanic' + when race_ethnicity = 'P' + then 'Native Hawaiian' + when race_ethnicity = 'T' + then 'Other' + when race_ethnicity = 'W' + then 'White' + when race_ethnicity is null + then 'Blank' + end as aligned_aggregate_ethnicity, + + if(lep_status, 'ML', 'Not ML') as aligned_ml_status, if( - u.`subject` = 'English Language Arts/Literacy', - 'English Language Arts', - u.`subject` - ) as aligned_subject, - - if(u.is_proficient, 1, 0) as is_proficient_int, + iep_status = 'Has IEP', + 'Students With Disabilities', + 'Students Without Disabilities' + ) as aligned_iep_status, -from union_relations as u -left join - {{ ref("stg_google_sheets__pearson__student_crosswalk") }} as x - on u.studenttestuuid = x.student_test_uuid +from transformations diff --git a/src/dbt/kipptaf/models/pearson/intermediate/int_pearson__student_list_report.sql b/src/dbt/kipptaf/models/pearson/intermediate/int_pearson__student_list_report.sql index 3395cf96e2..65e5954b80 100644 --- a/src/dbt/kipptaf/models/pearson/intermediate/int_pearson__student_list_report.sql +++ b/src/dbt/kipptaf/models/pearson/intermediate/int_pearson__student_list_report.sql @@ -16,6 +16,8 @@ with 'Preliminary' as results_type, 'KTAF NJ' as district_state, + cast(state_student_identifier as string) as state_id, + case when test_name like '%Mathematics%' then 'Math' @@ -77,4 +79,11 @@ select false ) as is_proficient, + if( + performance_level + in ('Met Expectations', 'Exceeded Expectations', 'Graduation Ready'), + 1, + 0 + ) as 
is_proficient_int, + from scores diff --git a/src/dbt/kipptaf/models/pearson/intermediate/properties/int_pearson__all_assessments.yml b/src/dbt/kipptaf/models/pearson/intermediate/properties/int_pearson__all_assessments.yml index 6ca2e0a49d..67d1965826 100644 --- a/src/dbt/kipptaf/models/pearson/intermediate/properties/int_pearson__all_assessments.yml +++ b/src/dbt/kipptaf/models/pearson/intermediate/properties/int_pearson__all_assessments.yml @@ -1,72 +1,239 @@ models: - name: int_pearson__all_assessments - config: - contract: - enabled: false + description: > + Network-level union of Pearson state assessment results (PARCC, NJSLA, + NJSLA Science, NJGPA) enriched with student crosswalk lookups and + standardized metadata. Computes aligned demographic labels, proficiency + banding, and subject/test-code normalization for use in reporting and + demographic comparison models. + data_tests: + - dbt_utils.unique_combination_of_columns: + arguments: + combination_of_columns: + - studenttestuuid columns: - name: _dbt_source_relation data_type: string + description: Source relation identifier from dbt_utils.union_relations. + + - name: academic_year + data_type: int64 + description: Academic year the assessment was administered. + - name: americanindianoralaskanative data_type: string + description: > + American Indian or Alaska Native race indicator from Pearson ('Y' or + null). + - name: asian data_type: string + description: Asian race indicator from Pearson ('Y' or null). + + - name: assessment_name + data_type: string + description: Assessment program name (e.g. NJSLA, NJGPA, PARCC). + - name: assessmentgrade data_type: string + description: Assessment grade level as reported by Pearson. + - name: assessmentyear data_type: string + description: Assessment year as reported by Pearson (string format). + - name: blackorafricanamerican data_type: string - - name: englishlearnerel + description: + Black or African American race indicator from Pearson ('Y' or null). 
+ + - name: discipline data_type: string + description: Subject discipline category (Math, ELA, Science, etc.). + - name: hispanicorlatinoethnicity data_type: string - - name: localstudentidentifier - data_type: int64 - - name: firstname - data_type: string - - name: lastorsurname - data_type: string + description: + Hispanic or Latino ethnicity indicator from Pearson ('Y' or null). + + - name: is_proficient + data_type: boolean + description: + Whether the student scored at or above the proficiency threshold. + + - name: is_bl_fb + data_type: boolean + description: > + Whether the student scored at the Below or Far Below proficiency band. + - name: nativehawaiianorotherpacificislander data_type: string + description: > + Native Hawaiian or Other Pacific Islander race indicator from Pearson + ('Y' or null). + - name: period data_type: string - - name: studentwithdisabilities + description: > + Testing period from Pearson (e.g. Spring, FallBlock); used to derive + admin and season. Spring is the only period officially reported. + + - name: firstname + data_type: string + description: Student first name as reported by Pearson. + + - name: lastorsurname data_type: string + description: Student last name or surname as reported by Pearson. + - name: subject data_type: string + quote: true + description: + Full subject name as reported by Pearson (e.g. English Language + Arts/Literacy, Mathematics). + - name: testcode data_type: string + description: Raw test code from Pearson (e.g. ELA05, SC08, ALG01). + + - name: studenttestuuid + data_type: string + description: + Globally unique identifier for the student's test-taking event. + + - name: test_grade + data_type: int64 + description: Numeric grade level at which the test was taken. + + - name: testperformancelevel_text + data_type: string + description: + Performance level label as reported by Pearson (e.g. 'Level 4'). 
+ - name: testperformancelevel data_type: float64 + description: Numeric performance level as reported by Pearson. + - name: testscalescore data_type: float64 + description: Student's raw scale score on the assessment. + - name: twoormoreraces data_type: string + description: Two or more races indicator from Pearson ('Y' or null). + - name: white data_type: string - - name: assessment_name + description: White race indicator from Pearson ('Y' or null). + + - name: results_type data_type: string - - name: academic_year - data_type: int64 - - name: test_grade - data_type: int64 - - name: discipline - data_type: string - - name: is_proficient - data_type: boolean - - name: testperformancelevel_text + description: + Always 'Actual' for Pearson results (no preliminary scores). + + - name: district_state data_type: string + description: > + Network district/state identifier; always 'KTAF NJ' for this model. + - name: statestudentidentifier data_type: string - - name: is_504 - data_type: boolean - - name: lep_status - data_type: boolean - - name: iep_status - data_type: string + description: State student identifier cast to string. + + - name: localstudentidentifier + data_type: int64 + description: > + Local student identifier — resolved from the Pearson student crosswalk + when available, otherwise the raw value from the source file. Used to + work around incorrect studentidentifiers provided during testing. + + - name: aligned_test_code + data_type: string + description: > + Test code normalized for stacked reporting across assessment programs + — science codes SC05, SC08, SC11 are remapped to SCI05, SCI08, SCI11 + to match the standard codes used across all sources; all other codes + pass through unchanged. + - name: race_ethnicity data_type: string + description: > + Single-character race/ethnicity code derived from Pearson indicator + flags using a priority hierarchy (T, H, I, A, B, P, W). 
+ - name: illuminate_subject data_type: string + description: > + Standardized subject label for Illuminate/reporting — ELA subjects map + to 'Text Study'; Algebra I, Algebra II, and Geometry map to + 'Mathematics'; all others pass through unchanged. + - name: njsla_aggregated_proficiency data_type: string + description: > + Aggregated proficiency band for NJSLA: levels 1-2 = 'Below/Far Below', + level 3 = 'Approaching', level 4+ = 'At/Above'; null for non-NJSLA + assessments. + + - name: lep_status + data_type: boolean + description: > + Whether the student is classified as an English Learner/EL, derived + from the englishlearnerel indicator. + + - name: iep_status + data_type: string + description: > + IEP status label — 'Has IEP' when studentwithdisabilities is 'IEP' or + 'B', otherwise 'No IEP'. + + - name: admin + data_type: string + quote: true + description: > + Testing administration period with 'FallBlock' normalized to 'Fall'; + reserved word alias for Tableau. + + - name: season + data_type: string + description: > + Testing season with 'FallBlock' normalized to 'Fall'; same logic as + admin. + + - name: aligned_subject + data_type: string + description: > + Subject normalized for stacked reporting across assessment programs — + 'English Language Arts/Literacy' is shortened to 'English Language + Arts' to match the label used in other sources; all other subjects + pass through unchanged. + + - name: is_proficient_int + data_type: int64 + description: + Integer representation of is_proficient (1 = true, 0 = false). + + - name: aligned_aggregate_ethnicity + data_type: string + description: > + Race/ethnicity label normalized for stacked reporting across + assessment programs — expands the single-character race_ethnicity code + to a full label (e.g. B = 'African American', H = 'Hispanic') that + matches the demographic labels used in comparison sources; null maps + to 'Blank'. 
+ + - name: aligned_ml_status + data_type: string + description: > + Multilingual learner status normalized for stacked reporting across + assessment programs — 'ML' when lep_status is true, 'Not ML' + otherwise, using the label convention shared across comparison + sources. + + - name: aligned_iep_status + data_type: string + description: > + IEP status normalized for stacked reporting across assessment programs + — 'Students With Disabilities' or 'Students Without Disabilities', + using the label convention shared across comparison sources. diff --git a/src/dbt/kipptaf/models/pearson/intermediate/properties/int_pearson__student_list_report.yml b/src/dbt/kipptaf/models/pearson/intermediate/properties/int_pearson__student_list_report.yml index 998dc8b224..bf35b7966a 100644 --- a/src/dbt/kipptaf/models/pearson/intermediate/properties/int_pearson__student_list_report.yml +++ b/src/dbt/kipptaf/models/pearson/intermediate/properties/int_pearson__student_list_report.yml @@ -1,2 +1,124 @@ models: - name: int_pearson__student_list_report + description: > + Intermediate transformation of preliminary Pearson student list report + data, unioned across Newark, Camden, and Paterson regions. Derives + discipline, subject, aligned test code, and performance band level from + the raw performance_level and test_name columns, and computes + is_proficient and is_proficient_int for use in demographic comparison + models. + data_tests: + - dbt_utils.unique_combination_of_columns: + arguments: + combination_of_columns: + - _dbt_source_relation + - academic_year + - administration + - state_id + - aligned_test_code + columns: + - name: _dbt_source_relation + data_type: string + description: Source relation identifier from dbt_utils.union_relations. + + - name: academic_year + data_type: int64 + description: Academic year the assessment was administered. + + - name: state_student_identifier + data_type: int64 + description: State student identifier from the source file. 
+ + - name: local_student_identifier + data_type: int64 + description: Local student identifier from the source file. + + - name: last_or_surname + data_type: string + description: Student last name or surname as reported by Pearson. + + - name: first_name + data_type: string + description: Student first name as reported by Pearson. + + - name: date_of_birth + data_type: string + description: Student date of birth as reported by Pearson. + + - name: test_type + data_type: string + description: Assessment test type from the Dagster partition key. + + - name: scale_score + data_type: int64 + description: Student's raw scale score on the assessment. + + - name: performance_level + data_type: string + description: > + Performance level label as reported by Pearson (e.g. 'Met + Expectations', 'Graduation Ready'). + + - name: administration + data_type: string + description: + Assessment administration period from the Dagster partition key. + + - name: results_type + data_type: string + description: + Always 'Preliminary' — student list reports contain pre-release + scores. + + - name: district_state + data_type: string + description: > + Network district/state identifier; always 'KTAF NJ' for this model. + + - name: state_id + data_type: string + description: State student identifier cast to string. + + - name: discipline + data_type: string + description: > + Subject discipline derived from test_name — + Mathematics/Algebra/Geometry map to 'Math', all others map to 'ELA'. + + - name: subject + data_type: string + quote: true + description: > + Subject normalized for stacked reporting across assessment programs — + Mathematics/Algebra/Geometry map to 'Mathematics', all others map to + 'English Language Arts', matching the label convention used in + comparison sources. 
+ + - name: aligned_test_code + data_type: string + description: > + Test code normalized for stacked reporting across assessment programs + — derived from test_name: graduation proficiency tests map to + ELAGP/MATGP, Geometry to GEO01, Algebra I to ALG01; grade-level math + and ELA tests extract the grade from the name and prefix with MAT/ELA. + + - name: performance_band_level + data_type: int64 + description: > + Numeric performance band level derived from performance_level: 'Did + Not Yet Meet Expectations'/'Not Yet Graduation Ready' = 1, 'Partially + Met Expectations' = 2, 'Approached Expectations' = 3, 'Met + Expectations' = 4, 'Exceeded Expectations' = 5, 'Graduation Ready' = + 2. + + - name: is_proficient + data_type: boolean + description: > + Whether the student scored proficient or above — true when + performance_level is 'Met Expectations', 'Exceeded Expectations', or + 'Graduation Ready'. + + - name: is_proficient_int + data_type: int64 + description: + Integer representation of is_proficient (1 = true, 0 = false). 
diff --git a/src/dbt/kipptaf/models/students/intermediate/int_extracts__student_enrollments.sql b/src/dbt/kipptaf/models/students/intermediate/int_extracts__student_enrollments.sql index b4661edb66..034f8f8a9f 100644 --- a/src/dbt/kipptaf/models/students/intermediate/int_extracts__student_enrollments.sql +++ b/src/dbt/kipptaf/models/students/intermediate/int_extracts__student_enrollments.sql @@ -289,6 +289,10 @@ select false ) as student_slideback, + case + e.gender when 'F' then 'Female' when 'M' then 'Male' when 'X' then 'Non-Binary' + end as aligned_gender, + case e.enroll_status when -2 diff --git a/src/dbt/kipptaf/models/students/intermediate/properties/int_extracts__student_enrollments.yml b/src/dbt/kipptaf/models/students/intermediate/properties/int_extracts__student_enrollments.yml index fac927f55c..dbec87ce90 100644 --- a/src/dbt/kipptaf/models/students/intermediate/properties/int_extracts__student_enrollments.yml +++ b/src/dbt/kipptaf/models/students/intermediate/properties/int_extracts__student_enrollments.yml @@ -544,6 +544,8 @@ models: GPA band than their current unweighted GPA (cumulative_y1_gpa_unweighted). False if bands are equal, improving, or either GPA value is null. 
+ - name: aligned_gender + data_type: string - name: enroll_status_string data_type: string - name: race_ethnicity From 67e7bcc79e5ce322740e0f639ac77a2280953035 Mon Sep 17 00:00:00 2001 From: grangel <140853376+GabyRangelB@users.noreply.github.com> Date: Wed, 1 Apr 2026 20:04:59 +0000 Subject: [PATCH 6/9] refactor(dbt): add prelim score gating and harden comps pipeline MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit - Add prelim score gating to int_tableau__state_assessments_demographic_comps: new prelim_assessments and valid_prelim_assessments CTEs automatically exclude NJ student list data for any (academic_year, assessment_name) already present in int_pearson__all_assessments Spring, eliminating the need to manually comment/uncomment the prelim branch each cycle - Qualify all column references in final SELECT with scores alias (s.) to satisfy RF02 after test_code_metadata join was introduced - Replace rolling 7-year window with fixed 2018 floor across all three score branches — 2018 is the earliest year with available comps data - Fix rpt_tableau__state_assessments_dashboard: inline-alias administration_window and assessment_subject from int_fldoe__all_assessments instead of expecting pre-computed admin/subject columns Co-Authored-By: Claude Sonnet 4.6 (1M context) --- ...u__state_assessments_demographic_comps.sql | 92 ++++++++++++++----- ...t_tableau__state_assessments_dashboard.sql | 4 +- 2 files changed, 70 insertions(+), 26 deletions(-) diff --git a/src/dbt/kipptaf/models/extracts/tableau/intermediate/int_tableau__state_assessments_demographic_comps.sql b/src/dbt/kipptaf/models/extracts/tableau/intermediate/int_tableau__state_assessments_demographic_comps.sql index 4cb9fdcb42..1e9dd389a4 100644 --- a/src/dbt/kipptaf/models/extracts/tableau/intermediate/int_tableau__state_assessments_demographic_comps.sql +++ b/src/dbt/kipptaf/models/extracts/tableau/intermediate/int_tableau__state_assessments_demographic_comps.sql @@ 
-21,6 +21,36 @@ ] %} with + /* + Prelim score gating: automatically includes preliminary NJ scores only + when official scores for that assessment/year have not yet landed in + int_pearson__all_assessments. This eliminates the need to manually + comment/uncomment the prelim branch each time a new student list file + is loaded — the branch self-deactivates once official scores arrive. + */ + prelim_assessments as ( + select academic_year, test_type, count(*) as record_count, + from {{ ref("int_pearson__student_list_report") }} + where + -- 2024: first year we track preliminary scores for comparison + academic_year >= 2024 + and administration = 'Spring' + and scale_score is not null + group by academic_year, test_type + ), + + valid_prelim_assessments as ( + select pa.academic_year, pa.test_type, + from prelim_assessments as pa + left join + {{ ref("int_pearson__all_assessments") }} as p + on pa.academic_year = p.academic_year + and pa.test_type = p.assessment_name + and p.season = 'Spring' + group by pa.academic_year, pa.test_type + having count(p.assessment_name) = 0 + ), + test_code_metadata as ( select aligned_level_test_code, @@ -61,12 +91,12 @@ with on e.academic_year = a.academic_year and e.pearson_local_student_identifier = a.localstudentidentifier and {{ union_dataset_join_clause(left_alias="e", right_alias="a") }} - and a.academic_year >= 2018 and a.season = 'Spring' and a.testscalescore is not null where e.rn_year = 1 - and e.academic_year >= {{ var("current_academic_year") - 7 }} + -- 2018: earliest year with available comps data + and e.academic_year >= 2018 and e.grade_level > 2 and e.school_level != 'OD' @@ -123,12 +153,18 @@ with on e.academic_year = a.academic_year and e.pearson_local_student_identifier = a.local_student_identifier and {{ union_dataset_join_clause(left_alias="e", right_alias="a") }} - and a.academic_year >= 2018 + -- see prelim_assessments CTE + and a.academic_year >= 2024 and a.administration = 'Spring' and a.scale_score is not 
null + inner join + valid_prelim_assessments as vpa + on a.academic_year = vpa.academic_year + and a.test_type = vpa.test_type where e.rn_year = 1 - and e.academic_year >= {{ var("current_academic_year") - 7 }} + -- 2018: earliest year with available comps data + and e.academic_year >= 2018 and e.grade_level > 2 and e.school_level != 'OD' @@ -191,22 +227,23 @@ with where e.region = 'Miami' and e.rn_year = 1 - and e.academic_year >= {{ var("current_academic_year") - 7 }} + -- 2018: earliest year with available comps data + and e.academic_year >= 2018 and e.grade_level > 2 ) select - academic_year, - district_state, - region, - assessment_name, - test_code, + s.academic_year, + s.district_state, + s.region, + s.assessment_name, + s.test_code, round( - avg(is_proficient_int) * count(student_number), 0 + avg(s.is_proficient_int) * count(s.student_number), 0 ) as total_proficient_students, - count(student_number) as total_students, - avg(is_proficient_int) as percent_proficient, + count(s.student_number) as total_students, + avg(s.is_proficient_int) as percent_proficient, /* (a) focus_level + demographic labels */ case @@ -223,13 +260,13 @@ select {% endfor %} then 'Total' when - grouping(ml_status) = 0 - or grouping(iep_status) = 0 - or grouping(lunch_status) = 0 + grouping(s.ml_status) = 0 + or grouping(s.iep_status) = 0 + or grouping(s.lunch_status) = 0 then 'Subgroup' - when grouping(gender) = 0 + when grouping(s.gender) = 0 then 'Gender' - when grouping(aggregate_ethnicity) = 0 + when grouping(s.aggregate_ethnicity) = 0 then 'Aggregate Ethnicity' end as comparison_demographic_group, @@ -239,28 +276,35 @@ select grouping({{ dim }}) = 1{% if not loop.last %} and {% endif %} {% endfor %} then 'All Students' - else coalesce(gender, aggregate_ethnicity, lunch_status, ml_status, iep_status) + else + coalesce( + s.gender, + s.aggregate_ethnicity, + s.lunch_status, + s.ml_status, + s.iep_status + ) end as comparison_demographic_subgroup, /* (b) comparison_entity from 
region null-ness */ - if(grouping(region) = 1, district_state, 'Region') as comparison_entity, + if(grouping(s.region) = 1, s.district_state, 'Region') as comparison_entity, /* (c) test_code-derived columns via sheet lookup */ any_value(m.school_level) as school_level, any_value(m.grade_range_band) as grade_range_band, any_value(m.discipline) as discipline, -from scores -left join test_code_metadata as m on scores.test_code = m.aligned_level_test_code +from scores as s +left join test_code_metadata as m on s.test_code = m.aligned_level_test_code group by grouping sets ( {# Total (all focus dims rolled up) — with and without region #} - ({{ base_dims | join(", ") }}, region), + ({{ base_dims | join(", ") }}, s.region), ({{ base_dims | join(", ") }}), {# One focus dim active at a time — with and without region #} {% for dim in focus_dims %} - ({{ base_dims | join(", ") }}, region, {{ dim }}), + ({{ base_dims | join(", ") }}, s.region, {{ dim }}), ({{ base_dims | join(", ") }}, {{ dim }}) {% if not loop.last %},{% endif %} {% endfor %} diff --git a/src/dbt/kipptaf/models/extracts/tableau/rpt_tableau__state_assessments_dashboard.sql b/src/dbt/kipptaf/models/extracts/tableau/rpt_tableau__state_assessments_dashboard.sql index 799d28f6d3..3f75b29550 100644 --- a/src/dbt/kipptaf/models/extracts/tableau/rpt_tableau__state_assessments_dashboard.sql +++ b/src/dbt/kipptaf/models/extracts/tableau/rpt_tableau__state_assessments_dashboard.sql @@ -153,9 +153,9 @@ with 'Actual' as results_type, - `admin`, + administration_window as `admin`, season, - `subject`, + assessment_subject as `subject`, test_code, from {{ ref("int_fldoe__all_assessments") }} From 7dcea5095c003ec63c9d8ea6b3582ee301a9fc7a Mon Sep 17 00:00:00 2001 From: grangel <140853376+GabyRangelB@users.noreply.github.com> Date: Fri, 3 Apr 2026 13:45:41 +0000 Subject: [PATCH 7/9] refactor(dbt): simplify state_test_comparison_demographics staging - Strip stg_google_sheets__state_test_comparison_demographics to SELECT * so 
all columns (school_level, grade_range_band, discipline, aligned_level_test_code, etc.) come directly from the _v2 sheet range - Update sources-external.yml with _v2 sheet range Co-Authored-By: Claude Sonnet 4.6 (1M context) --- src/dbt/kipptaf/models/google/sheets/sources-external.yml | 1 - .../properties/stg_google_sheets__state_test_comparison.yml | 2 -- .../stg_google_sheets__state_test_comparison_demographics.yml | 2 +- 3 files changed, 1 insertion(+), 4 deletions(-) diff --git a/src/dbt/kipptaf/models/google/sheets/sources-external.yml b/src/dbt/kipptaf/models/google/sheets/sources-external.yml index f1d251380f..7e9f93328f 100644 --- a/src/dbt/kipptaf/models/google/sheets/sources-external.yml +++ b/src/dbt/kipptaf/models/google/sheets/sources-external.yml @@ -68,7 +68,6 @@ sources: - student_graduation_path_cutoffs - name: src_google_sheets__state_test_comparison config: - enabled: false meta: dagster: asset_key: diff --git a/src/dbt/kipptaf/models/google/sheets/staging/properties/stg_google_sheets__state_test_comparison.yml b/src/dbt/kipptaf/models/google/sheets/staging/properties/stg_google_sheets__state_test_comparison.yml index 058281df32..e6bf726e26 100644 --- a/src/dbt/kipptaf/models/google/sheets/staging/properties/stg_google_sheets__state_test_comparison.yml +++ b/src/dbt/kipptaf/models/google/sheets/staging/properties/stg_google_sheets__state_test_comparison.yml @@ -1,7 +1,5 @@ models: - name: stg_google_sheets__state_test_comparison - config: - enabled: false columns: - name: Academic_Year data_type: int64 diff --git a/src/dbt/kipptaf/models/google/sheets/staging/properties/stg_google_sheets__state_test_comparison_demographics.yml b/src/dbt/kipptaf/models/google/sheets/staging/properties/stg_google_sheets__state_test_comparison_demographics.yml index 5f202081ec..843dae5cef 100644 --- a/src/dbt/kipptaf/models/google/sheets/staging/properties/stg_google_sheets__state_test_comparison_demographics.yml +++ 
b/src/dbt/kipptaf/models/google/sheets/staging/properties/stg_google_sheets__state_test_comparison_demographics.yml @@ -25,7 +25,7 @@ models: data_type: int64 description: Academic year the benchmarks apply to. - - name: test_name + - name: assessment_name data_type: string description: State assessment test name (e.g. NJSLA, NJGPA). From ba3b448c97525192a53721f0999074ac373d54b6 Mon Sep 17 00:00:00 2001 From: grangel <140853376+GabyRangelB@users.noreply.github.com> Date: Fri, 3 Apr 2026 14:51:21 +0000 Subject: [PATCH 8/9] fix(dbt): update downstream column refs after stg_comparison_demographics rename MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit - Rename test_name → assessment_name, test_code → aligned_test_code in dim_state_assessment_benchmarks (SQL + contract YML) and state_comps CTE in rpt_tableau__state_assessments_dashboard - Rename comparison_demographic_group/subgroup_aligned → aligned_comparison_demographic_group/subgroup across all three downstream models - Fix join conditions in rpt_tableau__state_assessments_dashboard against state_comps CTE (all three score sections) - Add school_level to uniqueness test on stg_google_sheets__state_test_comparison_demographics — differentiates MS_HS source rows from synthetic HS ALG01 aggregate rows Co-Authored-By: Claude Sonnet 4.6 (1M context) --- ...t_tableau__state_assessments_dashboard.sql | 18 +++++++------- ...eau__state_assessments_dashboard_comps.sql | 8 +++---- ...ts__state_test_comparison_demographics.yml | 15 ++++++++---- .../marts/dim_state_assessment_benchmarks.sql | 24 +++++++++---------- .../dim_state_assessment_benchmarks.yml | 16 ++++++------- 5 files changed, 43 insertions(+), 38 deletions(-) diff --git a/src/dbt/kipptaf/models/extracts/tableau/rpt_tableau__state_assessments_dashboard.sql b/src/dbt/kipptaf/models/extracts/tableau/rpt_tableau__state_assessments_dashboard.sql index 3f75b29550..af09cb44e5 100644 --- 
a/src/dbt/kipptaf/models/extracts/tableau/rpt_tableau__state_assessments_dashboard.sql +++ b/src/dbt/kipptaf/models/extracts/tableau/rpt_tableau__state_assessments_dashboard.sql @@ -70,8 +70,8 @@ with state_comps as ( select academic_year, - test_name, - test_code, + assessment_name, + aligned_test_code, region, season, @@ -95,7 +95,7 @@ with where comparison_demographic_group = 'Total' and comparison_demographic_subgroup = 'All Students' - group by academic_year, test_name, test_code, region, season + group by academic_year, assessment_name, aligned_test_code, region, season ), assessment_scores as ( @@ -321,8 +321,8 @@ inner join left join state_comps as c on a.academic_year = c.academic_year - and a.assessment_name = c.test_name - and a.test_code = c.test_code + and a.assessment_name = c.assessment_name + and a.test_code = c.aligned_test_code and a.season = c.season and e.region = c.region left join @@ -441,8 +441,8 @@ inner join left join state_comps as c on a.academic_year = c.academic_year - and a.assessment_name = c.test_name - and a.test_code = c.test_code + and a.assessment_name = c.assessment_name + and a.test_code = c.aligned_test_code and a.season = c.season and e.region = c.region left join @@ -560,8 +560,8 @@ inner join left join state_comps as c on a.academic_year = c.academic_year - and a.assessment_name = c.test_name - and a.test_code = c.test_code + and a.assessment_name = c.assessment_name + and a.test_code = c.aligned_test_code and a.season = c.season and e.region = c.region left join diff --git a/src/dbt/kipptaf/models/extracts/tableau/rpt_tableau__state_assessments_dashboard_comps.sql b/src/dbt/kipptaf/models/extracts/tableau/rpt_tableau__state_assessments_dashboard_comps.sql index beb1d99419..56ea5678cb 100644 --- a/src/dbt/kipptaf/models/extracts/tableau/rpt_tableau__state_assessments_dashboard_comps.sql +++ b/src/dbt/kipptaf/models/extracts/tableau/rpt_tableau__state_assessments_dashboard_comps.sql @@ -57,10 +57,10 @@ with /* Google Sheets 
benchmarks */ select academic_year, - test_name as assessment_name, + assessment_name, comparison_entity, - comparison_demographic_group_aligned as comparison_demographic_group, - comparison_demographic_subgroup_aligned as comparison_demographic_subgroup, + aligned_comparison_demographic_group as comparison_demographic_group, + aligned_comparison_demographic_subgroup as comparison_demographic_subgroup, null as focus_level, school_level, grade_range_band, @@ -68,7 +68,7 @@ with total_proficient_students, total_students, percent_proficient, - test_code, + aligned_test_code as test_code, region, from {{ ref("stg_google_sheets__state_test_comparison_demographics") }} diff --git a/src/dbt/kipptaf/models/google/sheets/staging/properties/stg_google_sheets__state_test_comparison_demographics.yml b/src/dbt/kipptaf/models/google/sheets/staging/properties/stg_google_sheets__state_test_comparison_demographics.yml index 843dae5cef..f5a85ac079 100644 --- a/src/dbt/kipptaf/models/google/sheets/staging/properties/stg_google_sheets__state_test_comparison_demographics.yml +++ b/src/dbt/kipptaf/models/google/sheets/staging/properties/stg_google_sheets__state_test_comparison_demographics.yml @@ -15,7 +15,8 @@ models: arguments: combination_of_columns: - academic_year - - test_code + - aligned_test_code + - school_level - region - comparison_entity - comparison_demographic_group @@ -47,9 +48,13 @@ models: data_type: string description: Subject discipline (Math, ELA, Science, Social Studies). - - name: test_code + - name: aligned_test_code data_type: string - description: State assessment test code (e.g. ELA05, ALG01). + description: > + Test code normalized for stacked reporting across assessment programs + — science codes SC05, SC08, SC11 are remapped to SCI05, SCI08, SCI11 + to match the standard codes used across all sources; all other codes + pass through unchanged. 
- name: region data_type: string @@ -89,13 +94,13 @@ models: other than MS_HS are suffixed with the school level (e.g. 'ALG01_HS'); all other test codes are passed through unchanged. - - name: comparison_demographic_group_aligned + - name: aligned_comparison_demographic_group data_type: string description: > Demographic group after alignment — Grade rows with subgroups 08/09/10 are remapped to 'Total'. - - name: comparison_demographic_subgroup_aligned + - name: aligned_comparison_demographic_subgroup data_type: string description: > Demographic subgroup after alignment — Grade group rows are remapped diff --git a/src/dbt/kipptaf/models/marts/dim_state_assessment_benchmarks.sql b/src/dbt/kipptaf/models/marts/dim_state_assessment_benchmarks.sql index c71ee6d63b..cc00cb509d 100644 --- a/src/dbt/kipptaf/models/marts/dim_state_assessment_benchmarks.sql +++ b/src/dbt/kipptaf/models/marts/dim_state_assessment_benchmarks.sql @@ -1,14 +1,14 @@ select academic_year, - test_name, + assessment_name, season, school_level, grade_range_band, discipline, - test_code, + aligned_test_code, region, - comparison_demographic_group_aligned, - comparison_demographic_subgroup_aligned, + aligned_comparison_demographic_group, + aligned_comparison_demographic_subgroup, max( case when comparison_entity = 'City' then percent_proficient end @@ -47,14 +47,14 @@ select dbt_utils.generate_surrogate_key( [ "academic_year", - "test_name", - "test_code", + "assessment_name", + "aligned_test_code", "region", "school_level", "grade_range_band", "season", - "comparison_demographic_group_aligned", - "comparison_demographic_subgroup_aligned", + "aligned_comparison_demographic_group", + "aligned_comparison_demographic_subgroup", ] ) }} as state_assessment_benchmarks_key, @@ -62,12 +62,12 @@ select from {{ ref("stg_google_sheets__state_test_comparison_demographics") }} group by academic_year, - test_name, + assessment_name, season, school_level, grade_range_band, discipline, - test_code, + 
aligned_test_code, region, - comparison_demographic_group_aligned, - comparison_demographic_subgroup_aligned + aligned_comparison_demographic_group, + aligned_comparison_demographic_subgroup diff --git a/src/dbt/kipptaf/models/marts/properties/dim_state_assessment_benchmarks.yml b/src/dbt/kipptaf/models/marts/properties/dim_state_assessment_benchmarks.yml index 1b8d283854..303f9d25ca 100644 --- a/src/dbt/kipptaf/models/marts/properties/dim_state_assessment_benchmarks.yml +++ b/src/dbt/kipptaf/models/marts/properties/dim_state_assessment_benchmarks.yml @@ -5,10 +5,10 @@ models: - name: state_assessment_benchmarks_key data_type: string description: > - Surrogate key on (academic_year, test_name, test_code, region, - school_level, grade_range_band, season, - comparison_demographic_group_aligned, - comparison_demographic_subgroup_aligned). + Surrogate key on (academic_year, assessment_name, aligned_test_code, + region, school_level, grade_range_band, season, + aligned_comparison_demographic_group, + aligned_comparison_demographic_subgroup). data_tests: - unique @@ -16,7 +16,7 @@ models: data_type: int64 description: Academic year the benchmarks apply to. - - name: test_name + - name: assessment_name data_type: string description: State assessment test name (e.g. NJSLA, NJGPA). @@ -38,7 +38,7 @@ models: data_type: string description: Subject discipline (Math, ELA, Science, Social Studies). - - name: test_code + - name: aligned_test_code data_type: string description: State assessment test code (e.g. ELA05, ALG01). @@ -48,13 +48,13 @@ models: Network region the comparison applies to (Newark, Camden, Miami, Paterson). - - name: comparison_demographic_group_aligned + - name: aligned_comparison_demographic_group data_type: string description: > Demographic group after alignment — Grade rows with subgroups 08/09/10 are remapped to 'Total'. 
- - name: comparison_demographic_subgroup_aligned + - name: aligned_comparison_demographic_subgroup data_type: string description: > Demographic subgroup after alignment — Grade group rows are remapped From aa34fbbfeadb0d2ee2332fb58081e15cbb7e07de Mon Sep 17 00:00:00 2001 From: grangel <140853376+GabyRangelB@users.noreply.github.com> Date: Fri, 3 Apr 2026 14:56:28 +0000 Subject: [PATCH 9/9] refactor(dbt): rebuild stg_google_sheets__state_test_comparison_demographics MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit - Add custom_rows CTE to union source data with synthetic HS ALG01 totals (aggregated from HS_09 and HS_10 rows, which official comp sources do not provide as a combined HS total) - Rename derived columns: comparison_demographic_group/subgroup_aligned → aligned_comparison_demographic_group/subgroup - Update aligned_level_test_code derivation to use aligned_test_code instead of test_code Co-Authored-By: Claude Sonnet 4.6 (1M context) --- ...ts__state_test_comparison_demographics.sql | 80 ++++++++++++++++--- 1 file changed, 69 insertions(+), 11 deletions(-) diff --git a/src/dbt/kipptaf/models/google/sheets/staging/stg_google_sheets__state_test_comparison_demographics.sql b/src/dbt/kipptaf/models/google/sheets/staging/stg_google_sheets__state_test_comparison_demographics.sql index d3fb3be4cd..b70dc470d4 100644 --- a/src/dbt/kipptaf/models/google/sheets/staging/stg_google_sheets__state_test_comparison_demographics.sql +++ b/src/dbt/kipptaf/models/google/sheets/staging/stg_google_sheets__state_test_comparison_demographics.sql @@ -1,29 +1,87 @@ +with + custom_rows as ( + select + academic_year, + assessment_name, + season, + school_level, + grade_range_band, + discipline, + aligned_test_code, + region, + comparison_entity, + comparison_demographic_group, + comparison_demographic_subgroup, + percent_proficient, + total_students, + + from + {{ + source( + "google_sheets", + 
"src_google_sheets__state_test_comparison_demographics", + ) + }} + + union all + + -- HS-only ALG01 totals are not provided by official comp data sources + select + academic_year, + assessment_name, + season, + grade_range_band as school_level, + grade_range_band, + discipline, + aligned_test_code, + region, + comparison_entity, + 'Total' as comparison_demographic_group, + 'All Students' as comparison_demographic_subgroup, + + sum(percent_proficient) as percent_proficient, + sum(total_students) as total_students, + + from + {{ + source( + "google_sheets", + "src_google_sheets__state_test_comparison_demographics", + ) + }} + where school_level in ('HS_09', 'HS_10') and aligned_test_code = 'ALG01' + group by + academic_year, + assessment_name, + season, + grade_range_band, + discipline, + aligned_test_code, + region, + comparison_entity + ) + select *, case - when test_code = 'ALG01' and school_level != 'MS_HS' - then concat(test_code, '_', school_level) - else test_code + when aligned_test_code = 'ALG01' and school_level != 'MS_HS' + then concat(aligned_test_code, '_', school_level) + else aligned_test_code end as aligned_level_test_code, if( comparison_demographic_subgroup in ('Grade - 08', 'Grade - 09', 'Grade - 10'), 'Total', comparison_demographic_group - ) as comparison_demographic_group_aligned, + ) as aligned_comparison_demographic_group, if( comparison_demographic_group = 'Grade', 'All Students', comparison_demographic_subgroup - ) as comparison_demographic_subgroup_aligned, + ) as aligned_comparison_demographic_subgroup, round(percent_proficient * total_students, 0) as total_proficient_students, -from - {{ - source( - "google_sheets", "src_google_sheets__state_test_comparison_demographics" - ) - }} +from custom_rows