Add geo-experiment design workflow and sensitivity check plotting for Synthetic Control #819
Draft
drbenvincent wants to merge 13 commits into main
Add prospective design capabilities so practitioners can assess whether a geo-experiment will work before committing budget:

- `SyntheticControl.from_pre_period()`: classmethod that fits SC on pre-period data only, enabling prospective design assessment without requiring post-period observations
- `validate_design()`: dress rehearsal that injects a known effect and checks if the model recovers it
- `power_analysis()`: simulation-based Bayesian power curve across candidate effect sizes
- `donor_pool_quality()`: composite quality score aggregating donor correlations, convex hull coverage, and weight concentration
- `DressRehearsalCheck`: pipeline-compatible Check wrapper for sensitivity analysis integration
- Result classes with `plot()` and `summary()` methods
- 27 integration tests covering both prospective and retrospective workflows
- Demo sections in sc_pymc.ipynb showing the real workflow: design assessment before analysis

Made-with: Cursor
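To make the dress-rehearsal idea concrete, here is a minimal standalone sketch. It does not use CausalPy: the function name `dress_rehearsal`, its parameters, and the toy data are all hypothetical, and the Bayesian synthetic control is replaced by a much simpler stand-in (a no-intercept least-squares fit to the equal-weight donor average). It only illustrates the logic of holding out the end of the pre-period, injecting a known effect, and checking how closely the estimate recovers it.

```python
from statistics import fmean


def dress_rehearsal(treated, donors, holdout, injected_effect):
    """Inject a known effect into the last `holdout` pre-periods and estimate it.

    Stand-in model (NOT the Bayesian synthetic control from this PR): the
    treated series is approximated as beta * (equal-weight donor average),
    with beta fitted by least squares on the training window only.
    """
    n = len(treated)
    split = n - holdout
    donor_avg = [fmean(d[t] for d in donors) for t in range(n)]

    # Fit beta on the training window (no-intercept least squares).
    num = sum(treated[t] * donor_avg[t] for t in range(split))
    den = sum(donor_avg[t] ** 2 for t in range(split))
    beta = num / den

    # Pseudo-treated outcomes: real holdout values plus the known effect.
    observed = [treated[t] + injected_effect for t in range(split, n)]
    counterfactual = [beta * donor_avg[t] for t in range(split, n)]

    # Average gap between observed and counterfactual is the recovered effect.
    return fmean(o - c for o, c in zip(observed, counterfactual))


# Toy pre-period data (hypothetical): the treated unit tracks the donors closely.
donors = [
    [10.0, 11.0, 12.0, 13.0, 14.0, 15.0, 16.0, 17.0],
    [20.0, 21.0, 23.0, 24.0, 26.0, 27.0, 29.0, 30.0],
]
treated = [15.2, 15.9, 17.6, 18.4, 20.1, 20.9, 22.6, 23.4]

injected = -2.0
recovered = dress_rehearsal(treated, donors, holdout=3, injected_effect=injected)
print(f"injected={injected:+.2f} recovered={recovered:+.2f} bias={recovered - injected:+.2f}")
```

A small bias suggests the design can detect an effect of that size. Repeating the same exercise over a grid of candidate effect sizes is the intuition behind a simulation-based power curve.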
Codecov Report
❌ Patch coverage is
@@            Coverage Diff             @@
##             main     #819      +/-   ##
==========================================
- Coverage   93.77%   93.73%   -0.05%
==========================================
  Files          77       80       +3
  Lines       11881    12333     +452
  Branches      696      732      +36
==========================================
+ Hits        11142    11560     +418
- Misses        546      566      +20
- Partials      193      207      +14
Reorder cells and rewrite headings so the notebook mirrors a practitioner's actual workflow: design assessment before the experiment, causal analysis after. Key changes:

- Move df.head() to the data-loading section
- Move convex hull explanation before the design section
- Rename headings to question-driven titles (educational-narrative)
- Add clear "Before / After the experiment" phase headings
- Add transition prose between design and analysis phases
- Add power curve interpretation cell with go/no-go guidance
- Link donor pool selection forward to donor_pool_quality()
- Demote Effect Summary to subsection of analysis phase

Made-with: Cursor
Donor pool selection and convex hull condition are pre-experiment checks — they now sit as subsections of "Before the experiment" rather than floating between Load data and the design section. Also adds a reminder in the "After" section that the convex hull check runs automatically when constructing the full SyntheticControl. Fixes missing nbformat properties across all output cells. Made-with: Cursor
Summarise the full before/after workflow under the title so readers can see the notebook's scope at a glance. Each step gets 2-3 sentences explaining what it does and why it matters. Made-with: Cursor
…pymc notebook Expand the Synthetic Control notebook with academic references (Abadie 2010/2015/2021, Athey & Imbens 2017, etc.) and add post-estimation robustness sections: placebo-in-space, placebo-in-time, leave-one-out, and prior sensitivity — each with result visualisations and interpretation guidance. Add 13 new BibTeX entries to references.bib. Made-with: Cursor
Replace the synthetic toy dataset with the canonical Abadie, Diamond & Hainmueller (2010) Proposition 99 dataset: per-capita cigarette sales across 39 US states, 1970-2000. This grounds the notebook in real data from the SC literature, improves connections to cited references, and gives robustness checks a realistic "good case" to demonstrate.

- Add california_prop99.csv (wide format, 7 KB) and register as "prop99"
- Update all narrative to California/tobacco policy context
- Update all code cells: control_units, treated_unit, treatment_time
- Adjust holdout_periods for the 19-year pre-period

Made-with: Cursor
Enlarge the correlation heatmap for readability with 39 states, add an explicit donor pool selection step that removes states with negative pre-treatment correlation (threshold=0.0), and explain the threshold choice. Excludes Alabama, Arkansas, Georgia, Tennessee — leaving 34 well-correlated donors. Made-with: Cursor
Notebook fully executed with California Proposition 99 data: correlation heatmap, donor pool curation, design assessment, model fit, effect summaries, and all four robustness checks (placebo-in-space, placebo-in-time, leave-one-out, prior sensitivity) with visualisations. Made-with: Cursor
Extract shared plotting helpers (_plot_helpers.py) and add plot() staticmethods to PlaceboInSpace, PlaceboInTime, LeaveOneOut, and PriorSensitivity. Each check now auto-populates CheckResult.figures in run(). GenerateReport renders check figures in the HTML report. Replace ~80 lines of custom matplotlib in sc_pymc.ipynb with single-line library calls. Made-with: Cursor
- Add raw data time-series visualization after data loading
- Add circle tile map showing per-state correlation with California
- Add interpretation text after dress rehearsal plot
- Document power curve Type I error issue as TODO; remove effect_size=0
- Reduce forest plot per-row height (0.45 -> 0.3) in _plot_helpers.py
- Fix pre-existing nbformat validation issues in cell outputs

Made-with: Cursor
Made-with: Cursor
Agents cannot detect unsaved IDE state, so prompt the user to confirm all files (especially notebooks with expensive outputs) are saved before staging and committing. Made-with: Cursor
Summary
Adds prospective experiment-design capabilities and first-class sensitivity check plotting to `SyntheticControl`.

Design-phase methods, so practitioners can assess whether a geo-experiment will work before committing budget:

- `SyntheticControl.from_pre_period()` creates a design-phase instance from pre-period data only
- `validate_design()`: dress rehearsal that injects a known effect and checks recovery
- `power_analysis()`: simulation-based Bayesian power curve
- `donor_pool_quality()`: composite quality score (correlation, convex hull, weight concentration)
- `DressRehearsalCheck` wraps dress rehearsal as a `Check` for pipeline integration
- Result classes (`DressRehearsalResult`, `PowerCurveResult`, `DonorPoolQualityResult`) with `plot()` and `summary()` methods

Sensitivity check plotting: previously, check visualizations lived as ~80 lines of custom matplotlib in the notebook. Now they are part of the library:

- `causalpy/checks/_plot_helpers.py` with shared `forest_plot()` and `null_distribution_plot()` helpers
- `plot()` staticmethods on `PlaceboInSpace`, `PlaceboInTime`, `LeaveOneOut`, and `PriorSensitivity`
- `run()` auto-populates `CheckResult.figures` with matplotlib figures
- `GenerateReport` now renders check figures in the HTML report (base64-encoded PNGs)
- Notebook plots become single-line library calls (`PlaceboInSpace.plot(result, baseline_stats=stats)`)

Notebook overhaul (`sc_pymc.ipynb`):

Test plan

- Tests for the design workflow (`test_sc_design.py`)
- Tests for check plotting (`test_check_plots.py`)
- `interrogate` failure is pre-existing (84% vs 85%)
- `make html`
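For readers unfamiliar with the base64-encoded PNG approach mentioned in the summary, the following is a generic sketch of embedding image bytes directly in HTML via a data URI. It is not the `GenerateReport` implementation: `png_to_img_tag` is a hypothetical helper, and the payload is just the 8-byte PNG signature standing in for a rendered figure.

```python
import base64
from html import escape


def png_to_img_tag(png_bytes, alt="check figure"):
    """Wrap raw PNG bytes in an <img> tag that uses a base64 data URI."""
    encoded = base64.b64encode(png_bytes).decode("ascii")
    return f'<img alt="{escape(alt, quote=True)}" src="data:image/png;base64,{encoded}">'


# Stand-in payload (assumption): only the PNG signature, not a full image.
# With matplotlib, the bytes would typically come from writing a figure to an
# io.BytesIO buffer with fig.savefig(buffer, format="png").
png_bytes = b"\x89PNG\r\n\x1a\n"

tag = png_to_img_tag(png_bytes, alt="placebo-in-space")
print(tag)
```

Embedding figures this way keeps the HTML report a single self-contained file, at the cost of a roughly one-third size increase from base64 encoding.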