
Refactor int_topline__student_metrics to use CTEs #3328

Open
anthonygwalters wants to merge 4 commits into main from claude/optimize-topline-cascade-m5ra7


@anthonygwalters
Member

Pull Request

Summary & Motivation

When merged, this pull request will refactor int_topline__student_metrics.sql to use Common Table Expressions (CTEs) for all referenced source models. This improves code readability and maintainability by centralizing all source model references at the top of the query, making dependencies explicit and reducing repetition throughout the file.

The following models are now referenced via CTEs:

  • int_topline__iready_diagnostic_weekly
  • int_topline__dibels_pm_weekly
  • int_topline__college_matriculation_weekly
  • int_topline__state_assessments_weekly
  • int_extracts__student_enrollments_weeks
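As a rough illustration of the pattern described above, the top of the model might look like the sketch below. The CTE alias names and the final `select` are assumptions made for illustration; they are not taken from the actual `int_topline__student_metrics.sql`, and only two of the five models are shown.

```sql
with
    -- "Import" CTEs: each upstream model is referenced through ref() exactly once.
    -- Alias names here are hypothetical.
    iready_diagnostic_weekly as (
        select * from {{ ref("int_topline__iready_diagnostic_weekly") }}
    ),

    dibels_pm_weekly as (
        select * from {{ ref("int_topline__dibels_pm_weekly") }}
    )

-- Placeholder body: the real model selects metric columns in several
-- union all branches. Row counts are used here only so the sketch does not
-- depend on any column names.
select 'iready_diagnostic_weekly' as source_model, count(*) as row_count,
from iready_diagnostic_weekly

union all

select 'dibels_pm_weekly' as source_model, count(*) as row_count,
from dibels_pm_weekly
```

With this layout, every dependency of the model is visible in the first few lines of the file, and the body refers only to the short aliases.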

Self-review

General

  • If this is a same-day request, please flag that in the #data-team Slack
  • Update due date and assignee on the TEAMster Asana Project
  • Run Format on all modified files

dbt

  • Include a corresponding [model name].yml properties file for all models.

SQL

  • Use the union_dataset_join_clause() macro for queries that employ models that use regional datasets
  • Do not use group by without any aggregations when you mean to use distinct
  • All distinct usage must be accompanied by a comment explaining its necessity
  • Do not use order by for select statements. That should be done in the reporting layer.

Troubleshooting

https://claude.ai/code/session_01VwB9BpfUrxTnCVYPa16md9

… avoid repeated scans

Five upstream tables were referenced multiple times across separate UNION ALL branches,
causing BigQuery to scan each table more than once per model run:

- int_topline__iready_diagnostic_weekly (2x)
- int_topline__dibels_pm_weekly (2x)
- int_topline__college_matriculation_weekly (4x)
- int_topline__state_assessments_weekly (3x)
- int_extracts__student_enrollments_weeks (3x)

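Adding a named CTE for each of these at the top of the with block and referencing
the CTE alias in the body is intended to let BigQuery read each table once per run.
BigQuery does not necessarily materialize non-recursive CTEs, so the actual number of
scans should be confirmed in the query execution plan.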
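A minimal, self-contained sketch of the multi-reference shape follows. The inline rows and the column names (`student_number`, `metric`, `metric_value`) are hypothetical stand-ins for an upstream model, used only so the query runs without any tables. Whether the engine reads the underlying table once or once per reference depends on the query planner, so this shows the structure of the change rather than a guaranteed cost saving.

```sql
with
    -- Hypothetical stand-in for int_topline__college_matriculation_weekly.
    -- In the real model this CTE would select from the upstream table.
    college_matriculation_weekly as (
        select 1 as student_number, 'enrolled' as metric, 1.0 as metric_value
        union all
        select 2, 'persisted', 0.0
    )

-- Each union all branch reads from the CTE alias instead of repeating
-- the reference to the upstream table.
select student_number, 'College Enrollment' as indicator, metric_value
from college_matriculation_weekly
where metric = 'enrolled'

union all

select student_number, 'College Persistence' as indicator, metric_value
from college_matriculation_weekly
where metric = 'persisted'
```

Comparing bytes processed and the execution graph for the model before and after the change is one way to check whether the refactor reduced the number of scans.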

https://claude.ai/code/session_01VwB9BpfUrxTnCVYPa16md9
anthonygwalters requested a review from a team on February 26, 2026 22:04
Anthony Walters added 2 commits on February 26, 2026 22:19
cbini force-pushed the claude/optimize-topline-cascade-m5ra7 branch from c0802a6 to 2209a4d on March 4, 2026 19:26
cbini requested review from a team as code owners on March 4, 2026 19:26


3 participants