Add geo-experiment design workflow and sensitivity check plotting for Synthetic Control #819

Draft
drbenvincent wants to merge 13 commits into main from feature/sc-design-workflow

drbenvincent commented Apr 2, 2026

Summary

Adds prospective experiment-design capabilities and first-class sensitivity check plotting to SyntheticControl.

Design-phase methods — so practitioners can assess whether a geo-experiment will work before committing budget:

  • SyntheticControl.from_pre_period() creates a design-phase instance from pre-period data only
  • validate_design() — dress rehearsal: injects a known effect and checks recovery
  • power_analysis() — simulation-based Bayesian power curve
  • donor_pool_quality() — composite quality score (correlation, convex hull, weight concentration)
  • DressRehearsalCheck wraps dress rehearsal as a Check for pipeline integration
  • Result classes (DressRehearsalResult, PowerCurveResult, DonorPoolQualityResult) with plot() and summary() methods
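
The dress-rehearsal idea can be illustrated without CausalPy. The sketch below is a minimal NumPy stand-in, not the library's implementation: it uses ordinary least squares in place of the Bayesian weight model, and all names and data (`donors`, `injected_effect`, `holdout`) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical pre-period panel: 40 time points, 3 donor units (random walks).
n_pre, holdout = 40, 8
donors = rng.normal(size=(n_pre, 3)).cumsum(axis=0)

# Treated unit is a convex combination of the donors plus small noise.
true_weights = np.array([0.5, 0.3, 0.2])
treated = donors @ true_weights + rng.normal(scale=0.05, size=n_pre)

# Inject a known additive effect into the last `holdout` pre-period points,
# which play the role of a pseudo post-period.
injected_effect = 2.0
treated_injected = treated.copy()
treated_injected[-holdout:] += injected_effect

# Fit donor weights on the remaining pre-period only.
# Assumption: plain least squares stands in for the Bayesian weight model.
train = slice(0, n_pre - holdout)
weights, *_ = np.linalg.lstsq(donors[train], treated_injected[train], rcond=None)

# Predict the counterfactual in the pseudo post-period and compare.
counterfactual = donors[-holdout:] @ weights
recovered_effect = float(np.mean(treated_injected[-holdout:] - counterfactual))

print(f"injected={injected_effect:.2f}, recovered={recovered_effect:.2f}")
```

If the recovered effect lands close to the injected one, the design can plausibly detect an effect of that size; a large gap suggests the donor pool or pre-period is too weak.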

Sensitivity check plotting — previously, check visualizations lived as ~80 lines of custom matplotlib in the notebook. Now they are part of the library:

  • New causalpy/checks/_plot_helpers.py with shared forest_plot() and null_distribution_plot() helpers
  • plot() staticmethods on PlaceboInSpace, PlaceboInTime, LeaveOneOut, and PriorSensitivity
  • Each check's run() auto-populates CheckResult.figures with matplotlib figures
  • GenerateReport now renders check figures in the HTML report (base64-encoded PNGs)
  • Notebook custom plot cells replaced with single-line library calls (e.g. PlaceboInSpace.plot(result, baseline_stats=stats))
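
For readers unfamiliar with forest plots, here is a rough matplotlib sketch of what such a helper might look like. It is not the code in `_plot_helpers.py`: the function name `forest_plot_sketch`, its signature, and the example numbers are all hypothetical.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt


def forest_plot_sketch(labels, means, lowers, uppers, row_height=0.3):
    """Draw one point estimate with an interval per row (hypothetical helper)."""
    n_rows = len(labels)
    # Figure height grows with the number of rows so labels stay legible.
    fig, ax = plt.subplots(figsize=(6, 1.0 + row_height * n_rows))
    y = list(range(n_rows))
    # Asymmetric horizontal error bars: distance from the mean to each bound.
    xerr = [
        [m - lo for m, lo in zip(means, lowers)],
        [hi - m for m, hi in zip(means, uppers)],
    ]
    ax.errorbar(means, y, xerr=xerr, fmt="o", capsize=3)
    ax.axvline(0.0, linestyle="--", linewidth=1)  # zero-effect reference line
    ax.set_yticks(y)
    ax.set_yticklabels(labels)
    ax.invert_yaxis()  # first label at the top
    ax.set_xlabel("Estimated effect")
    return fig


# Made-up estimates, purely for illustration.
fig = forest_plot_sketch(
    labels=["baseline", "drop donor A", "drop donor B"],
    means=[-1.0, -0.8, -1.2],
    lowers=[-1.5, -1.4, -1.9],
    uppers=[-0.5, -0.2, -0.5],
)
```

Scaling the figure height by row count keeps the plot readable whether a check produces three rows or thirty.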

Notebook overhaul (sc_pymc.ipynb):

  • Switched to the California Proposition 99 dataset — the canonical SC example
  • Restructured as a full workflow: design assessment before analysis, robustness checks after
  • Literature-grounded narrative with citations to Abadie (2003, 2010, 2015, 2021), Athey (2017), Brodersen (2015)

Test plan

  • 27 integration tests pass (test_sc_design.py)
  • 13 unit tests for check plotting (test_check_plots.py)
  • All prek checks pass (ruff, mypy, codespell, notebook schema); the interrogate failure is pre-existing (84% docstring coverage vs. the 85% threshold)
  • Verify notebook renders correctly via make html
  • Run full test suite to check for regressions

Add prospective design capabilities so practitioners can assess whether
a geo-experiment will work before committing budget:

- `SyntheticControl.from_pre_period()`: classmethod that fits SC on
  pre-period data only, enabling prospective design assessment without
  requiring post-period observations
- `validate_design()`: dress rehearsal that injects a known effect and
  checks if the model recovers it
- `power_analysis()`: simulation-based Bayesian power curve across
  candidate effect sizes
- `donor_pool_quality()`: composite quality score aggregating donor
  correlations, convex hull coverage, and weight concentration
- `DressRehearsalCheck`: pipeline-compatible Check wrapper for
  sensitivity analysis integration
- Result classes with `plot()` and `summary()` methods
- 27 integration tests covering both prospective and retrospective
  workflows
- Demo sections in sc_pymc.ipynb showing the real workflow: design
  assessment before analysis

Made-with: Cursor
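
A simulation-based power curve can be sketched as follows. This is an assumption-laden toy, not `power_analysis()` itself: it swaps the Bayesian model for least squares and uses a hypothetical detection rule (mean post-period gap more than about two standard errors from zero). The function name, parameters, and simulated data are illustrative only.

```python
import numpy as np


def simulate_power(effect_sizes, n_sims=200, n_pre=40, holdout=8, noise=1.0, seed=0):
    """Fraction of simulations in which an injected effect is detected."""
    rng = np.random.default_rng(seed)
    power = {}
    for effect in effect_sizes:
        detections = 0
        for _ in range(n_sims):
            # Simulate donors (random walks) and a treated unit built from them.
            donors = rng.normal(size=(n_pre, 3)).cumsum(axis=0)
            treated = donors @ np.array([0.5, 0.3, 0.2])
            treated = treated + rng.normal(scale=noise, size=n_pre)
            treated[-holdout:] += effect  # inject the candidate effect

            # Fit weights on the training window only.
            train = slice(0, n_pre - holdout)
            w, *_ = np.linalg.lstsq(donors[train], treated[train], rcond=None)

            # Residual spread on the training window sets the noise scale.
            resid_sd = np.std(treated[train] - donors[train] @ w, ddof=3)
            gap = treated[-holdout:] - donors[-holdout:] @ w

            # Detection rule (assumption): |mean gap| > 1.96 standard errors.
            se = resid_sd / np.sqrt(holdout)
            detections += int(abs(gap.mean()) > 1.96 * se)
        power[effect] = detections / n_sims
    return power


curve = simulate_power([0.5, 1.0, 2.0, 4.0])
for effect, p in curve.items():
    print(f"effect={effect}: power={p:.2f}")
```

Note that this detection rule ignores uncertainty in the fitted weights, so its false-positive rate at zero effect can exceed the nominal level; any real power analysis should check that separately.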

drbenvincent added the OSS_PRODUCT label (OSS_PRODUCT project priorities. Labs members should get approval before logging hours.) Apr 2, 2026
drbenvincent marked this pull request as draft April 2, 2026 21:06

codecov bot commented Apr 2, 2026

Codecov Report

❌ Patch coverage is 92.49448% with 34 lines in your changes missing coverage. Please review.
✅ Project coverage is 93.73%. Comparing base (1ee7322) to head (c6a9b21).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| causalpy/experiments/synthetic_control.py | 83.67% | 14 Missing and 10 partials ⚠️ |
| causalpy/experiments/sc_results.py | 92.59% | 3 Missing and 3 partials ⚠️ |
| causalpy/checks/dress_rehearsal.py | 82.60% | 3 Missing and 1 partial ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #819      +/-   ##
==========================================
- Coverage   93.77%   93.73%   -0.05%     
==========================================
  Files          77       80       +3     
  Lines       11881    12333     +452     
  Branches      696      732      +36     
==========================================
+ Hits        11142    11560     +418     
- Misses        546      566      +20     
- Partials      193      207      +14     

read-the-docs-community bot commented Apr 2, 2026

Documentation build overview

📚 causalpy | 🛠️ Build #32108742 | 📁 Comparing ce959f6 against latest (1ee7322)

Show files changed (50 files in total): 📝 22 modified | ➕ 28 added | ➖ 0 deleted
File Status
404.html 📝 modified
genindex.html 📝 modified
py-modindex.html 📝 modified
_modules/index.html 📝 modified
notebooks/index.html 📝 modified
notebooks/sc_pymc.html 📝 modified
api/generated/causalpy.checks.dress_rehearsal.DressRehearsalCheck.__init__.html ➕ added
api/generated/causalpy.checks.dress_rehearsal.DressRehearsalCheck.html ➕ added
api/generated/causalpy.checks.dress_rehearsal.DressRehearsalCheck.run.html ➕ added
api/generated/causalpy.checks.dress_rehearsal.DressRehearsalCheck.validate.html ➕ added
api/generated/causalpy.checks.dress_rehearsal.html ➕ added
api/generated/causalpy.checks.html 📝 modified
api/generated/causalpy.checks.leave_one_out.LeaveOneOut.html 📝 modified
api/generated/causalpy.checks.leave_one_out.LeaveOneOut.plot.html ➕ added
api/generated/causalpy.checks.placebo_in_space.PlaceboInSpace.html 📝 modified
api/generated/causalpy.checks.placebo_in_space.PlaceboInSpace.plot.html ➕ added
api/generated/causalpy.checks.placebo_in_time.PlaceboInTime.html 📝 modified
api/generated/causalpy.checks.placebo_in_time.PlaceboInTime.plot.html ➕ added
api/generated/causalpy.checks.prior_sensitivity.PriorSensitivity.html 📝 modified
api/generated/causalpy.checks.prior_sensitivity.PriorSensitivity.plot.html ➕ added
api/generated/causalpy.data.datasets.load_data.html 📝 modified
api/generated/causalpy.experiments.html 📝 modified
api/generated/causalpy.experiments.sc_results.DonorPoolQualityResult.__init__.html ➕ added
api/generated/causalpy.experiments.sc_results.DonorPoolQualityResult.html ➕ added
api/generated/causalpy.experiments.sc_results.DonorPoolQualityResult.summary.html ➕ added
api/generated/causalpy.experiments.sc_results.DressRehearsalResult.__init__.html ➕ added
api/generated/causalpy.experiments.sc_results.DressRehearsalResult.html ➕ added
api/generated/causalpy.experiments.sc_results.DressRehearsalResult.plot.html ➕ added
api/generated/causalpy.experiments.sc_results.DressRehearsalResult.summary.html ➕ added
api/generated/causalpy.experiments.sc_results.DressRehearsalResult.to_check_result.html ➕ added
api/generated/causalpy.experiments.sc_results.PowerCurveResult.__init__.html ➕ added
api/generated/causalpy.experiments.sc_results.PowerCurveResult.html ➕ added
api/generated/causalpy.experiments.sc_results.PowerCurveResult.plot.html ➕ added
api/generated/causalpy.experiments.sc_results.PowerCurveResult.summary.html ➕ added
api/generated/causalpy.experiments.sc_results.html ➕ added
api/generated/causalpy.experiments.synthetic_control.SyntheticControl.__init__.html 📝 modified
api/generated/causalpy.experiments.synthetic_control.SyntheticControl.donor_pool_quality.html ➕ added
api/generated/causalpy.experiments.synthetic_control.SyntheticControl.from_pre_period.html ➕ added
api/generated/causalpy.experiments.synthetic_control.SyntheticControl.html 📝 modified
api/generated/causalpy.experiments.synthetic_control.SyntheticControl.power_analysis.html ➕ added
api/generated/causalpy.experiments.synthetic_control.SyntheticControl.validate_design.html ➕ added
_modules/causalpy/checks/dress_rehearsal.html ➕ added
_modules/causalpy/checks/leave_one_out.html 📝 modified
_modules/causalpy/checks/placebo_in_space.html 📝 modified
_modules/causalpy/checks/placebo_in_time.html 📝 modified
_modules/causalpy/checks/prior_sensitivity.html 📝 modified
_modules/causalpy/data/datasets.html 📝 modified
_modules/causalpy/experiments/sc_results.html ➕ added
_modules/causalpy/experiments/synthetic_control.html 📝 modified
_modules/causalpy/steps/report.html 📝 modified

Reorder cells and rewrite headings so the notebook mirrors a
practitioner's actual workflow: design assessment before the
experiment, causal analysis after. Key changes:

- Move df.head() to the data-loading section
- Move convex hull explanation before the design section
- Rename headings to question-driven titles (educational-narrative)
- Add clear "Before / After the experiment" phase headings
- Add transition prose between design and analysis phases
- Add power curve interpretation cell with go/no-go guidance
- Link donor pool selection forward to donor_pool_quality()
- Demote Effect Summary to subsection of analysis phase

Made-with: Cursor
Donor pool selection and convex hull condition are pre-experiment
checks — they now sit as subsections of "Before the experiment"
rather than floating between Load data and the design section.

Also adds a reminder in the "After" section that the convex hull
check runs automatically when constructing the full SyntheticControl.
Fixes missing nbformat properties across all output cells.

Made-with: Cursor
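
As a rough illustration of two donor-pool diagnostics mentioned in this PR, the sketch below computes a per-time-point range check (a simple proxy for the convex hull condition) and an effective number of donors from a weight vector. Both functions are hypothetical and are not the metrics used by `donor_pool_quality()`; the data are made up.

```python
import numpy as np


def hull_coverage(treated, donors):
    """Share of time points where the treated value lies within the donor range."""
    treated = np.asarray(treated, dtype=float)
    donors = np.asarray(donors, dtype=float)
    inside = (treated >= donors.min(axis=1)) & (treated <= donors.max(axis=1))
    return float(inside.mean())


def effective_donors(weights):
    """Inverse Herfindahl index: 1 / sum(w_i^2) for weights normalised to sum to 1."""
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    return float(1.0 / np.sum(w**2))


# Hypothetical data: 4 time points (rows), 3 donors (columns).
donors = np.array(
    [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [3.0, 4.0, 5.0], [4.0, 5.0, 6.0]]
)
treated = np.array([2.5, 3.5, 5.5, 5.0])

coverage = hull_coverage(treated, donors)            # third point exceeds the donor max
n_eff_equal = effective_donors([1 / 3, 1 / 3, 1 / 3])  # evenly spread weights
n_eff_single = effective_donors([0.98, 0.01, 0.01])    # one donor dominates

print(coverage, n_eff_equal, n_eff_single)
```

The range check is necessary but not sufficient for the treated unit to lie in the donors' convex hull, and an effective donor count near 1 signals that the synthetic control leans almost entirely on a single unit.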
Summarise the full before/after workflow under the title so readers
can see the notebook's scope at a glance. Each step gets 2-3
sentences explaining what it does and why it matters.

Made-with: Cursor
…pymc notebook

Expand the Synthetic Control notebook with academic references (Abadie 2010/2015/2021,
Athey & Imbens 2017, etc.) and add post-estimation robustness sections: placebo-in-space,
placebo-in-time, leave-one-out, and prior sensitivity — each with result visualisations
and interpretation guidance. Add 13 new BibTeX entries to references.bib.

Made-with: Cursor
Replace the synthetic toy dataset with the canonical Abadie, Diamond &
Hainmueller (2010) Proposition 99 dataset — per-capita cigarette sales
across 39 US states, 1970-2000. This grounds the notebook in real data
from the SC literature, improves connections to cited references, and
gives robustness checks a realistic "good case" to demonstrate.

- Add california_prop99.csv (wide format, 7 KB) and register as "prop99"
- Update all narrative to California/tobacco policy context
- Update all code cells: control_units, treated_unit, treatment_time
- Adjust holdout_periods for the 19-year pre-period

Made-with: Cursor
Enlarge the correlation heatmap for readability with 39 states, add an
explicit donor pool selection step that removes states with negative
pre-treatment correlation (threshold=0.0), and explain the threshold
choice. Excludes Alabama, Arkansas, Georgia, Tennessee — leaving 34
well-correlated donors.

Made-with: Cursor
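
The correlation-threshold selection step can be sketched with pandas as below. The column names and values are hypothetical and unrelated to the Proposition 99 data; only the rule (keep donors whose pre-treatment correlation with the treated unit exceeds `threshold`) follows the commit message.

```python
import pandas as pd

# Hypothetical wide-format pre-period data: one column per unit.
pre = pd.DataFrame(
    {
        "treated": [10.0, 11.0, 12.0, 13.0, 14.0],
        "donor_a": [5.0, 5.5, 6.1, 6.4, 7.0],  # moves with the treated unit
        "donor_b": [9.0, 8.0, 7.5, 7.0, 6.0],  # moves against it
        "donor_c": [3.0, 3.2, 3.1, 3.6, 3.9],  # moves with it, more noisily
    }
)

threshold = 0.0

# Pearson correlation of each candidate donor with the treated unit.
corr = pre.drop(columns="treated").corrwith(pre["treated"])

kept = corr[corr > threshold].index.tolist()
dropped = corr[corr <= threshold].index.tolist()

print("kept:", kept)
print("dropped:", dropped)
```

A threshold of 0.0 is deliberately permissive: it removes only donors that move against the treated unit and leaves the weighting model to down-weight weakly correlated ones.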
Notebook fully executed with California Proposition 99 data: correlation
heatmap, donor pool curation, design assessment, model fit, effect
summaries, and all four robustness checks (placebo-in-space,
placebo-in-time, leave-one-out, prior sensitivity) with visualisations.

Made-with: Cursor
Extract shared plotting helpers (_plot_helpers.py) and add plot()
staticmethods to PlaceboInSpace, PlaceboInTime, LeaveOneOut, and
PriorSensitivity. Each check now auto-populates CheckResult.figures
in run(). GenerateReport renders check figures in the HTML report.
Replace ~80 lines of custom matplotlib in sc_pymc.ipynb with
single-line library calls.

Made-with: Cursor
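
Embedding a matplotlib figure in HTML as a base64-encoded PNG generally looks like the sketch below. The helper name `figure_to_img_tag` is hypothetical and this is not the `GenerateReport` code; it only shows the standard `savefig` plus `base64` pattern.

```python
import base64
import io

import matplotlib
matplotlib.use("Agg")  # headless backend so no display is required
import matplotlib.pyplot as plt


def figure_to_img_tag(fig, dpi=100):
    """Serialise a matplotlib figure to an inline <img> tag (hypothetical helper)."""
    buffer = io.BytesIO()
    fig.savefig(buffer, format="png", dpi=dpi, bbox_inches="tight")
    encoded = base64.b64encode(buffer.getvalue()).decode("ascii")
    return f'<img src="data:image/png;base64,{encoded}" alt="check figure">'


fig, ax = plt.subplots()
ax.plot([0, 1, 2], [0, 1, 0])
html = figure_to_img_tag(fig)
plt.close(fig)  # release the figure once it has been serialised
```

Inlining the image keeps the report a single self-contained HTML file, at the cost of a larger file size than linking to external PNGs.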
drbenvincent changed the title from "Add geo-experiment design workflow for Synthetic Control" to "Add geo-experiment design workflow and sensitivity check plotting for Synthetic Control" Apr 3, 2026
- Add raw data time-series visualization after data loading
- Add circle tile map showing per-state correlation with California
- Add interpretation text after dress rehearsal plot
- Document power curve Type I error issue as TODO; remove effect_size=0
- Reduce forest plot per-row height (0.45 -> 0.3) in _plot_helpers.py
- Fix pre-existing nbformat validation issues in cell outputs

Made-with: Cursor
Agents cannot detect unsaved IDE state, so prompt the user to confirm
all files (especially notebooks with expensive outputs) are saved
before staging and committing.

Made-with: Cursor