Add Bayesian Synthetic Difference-in-Differences #823
thomaspinder wants to merge 6 commits into main from feat/bayesian-sdid
Conversation
👋 Welcome to CausalPy, @thomaspinder! Thank you for opening your first pull request! We're excited to have you contribute to the project. 🎉 Here are a few tips to help your PR get merged smoothly:
A maintainer will review your changes soon. Thanks for helping make CausalPy better! 🚀 💼 LinkedIn Shoutout: Once your PR is merged, we'd love to give you a shoutout on LinkedIn to thank you for your contribution! If you're interested, just drop your LinkedIn profile URL in a comment below.
Very cool @thomaspinder!
Force-pushed from c7befc5 to c4283d9
So cool! 💪 Is this one ready for review?
It is! |
Implements the cut-posterior SDiD formulation (Pinder, 2026) as a new experiment class. Unit and time weights are estimated jointly in a single SyntheticDifferenceInDifferencesWeightFitter model, and the ATT is computed analytically via the double-difference formula. Includes the California Proposition 99 dataset, a Jupyter notebook demo, and full test coverage. Depends on feat/softmax-weighted-sum-fitter (PR #822) for the _softmax_simplex_weights helper and SoftmaxWeightedSumFitter base.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
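The double-difference formula mentioned above can be sketched in plain NumPy. This is an illustrative sketch, not the PR's implementation: the function name `sdid_att` and its argument shapes are assumptions here; only the idea (treated post-vs-pre contrast minus the same contrast for the ω-weighted synthetic control) comes from the description.

```python
import numpy as np

def sdid_att(y_treated, Y_control, omega, lam, T_pre):
    """Double-difference ATT (illustrative sketch only).

    y_treated : (T,) treated-unit outcomes (averaged over treated units)
    Y_control : (J, T) control-unit outcomes
    omega     : (J,) unit weights on the simplex
    lam       : (T_pre,) time weights on the simplex
    """
    pre, post = slice(0, T_pre), slice(T_pre, None)
    # Treated trajectory: post-period mean minus lambda-weighted pre-period.
    treated_diff = y_treated[post].mean() - lam @ y_treated[pre]
    # Synthetic-control trajectory: the same contrast under the unit weights omega.
    control_diff = (omega @ Y_control[:, post]).mean() - omega @ Y_control[:, pre] @ lam
    # The ATT is the difference of the two differences.
    return treated_diff - control_diff
```

With purely additive unit and time effects (Y_jt = α_j + β_t), the two contrasts cancel for any simplex weights, so this estimator recovers an injected post-period effect exactly.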
Force-pushed from c083fb5 to 6a4cb2b
pre-commit.ci run
juanitorduz
left a comment
This looks great! just left some initial comments :)
    "obs_ind_raw": list(range(1, T_pre)),
}

self.model.fit(X=X, y=y, coords=COORDS)
Shall we allow the user to pass kwargs for the fit method? Say I wanna sample with Nutpie and add a random seed?
ahh ok! I now understand the API from the notebook
result = cp.SyntheticDifferenceInDifferences(
df,
treatment_time,
control_units=control_units,
treated_units=treated_units,
model=cp.pymc_models.SyntheticDifferenceInDifferencesWeightFitter(
sample_kwargs={
"target_accept": 0.95,
"random_seed": seed,
"tune": 3000,
"draws": 2000,
},
priors={
"sigma_omega": Prior("HalfNormal", sigma=y_sd),
"sigma_lambda": Prior("HalfNormal", sigma=y_sd),
"omega0": Prior("Normal", mu=0, sigma=y_sd * 2),
"lambda0": Prior("Normal", mu=0, sigma=y_sd * 2),
},
),
)
Precisely :) Are you ok with this API, or does your original comment still stand and you're in favour of a **kwargs style argument in the SDiD model? Generally, my principle was minimising the number of places to inject this type of information, and keeping the number of **kwargs to a minimum.
This makes sense (and it is documented in the example as well :) ), maybe @drbenvincent has an opinion here.
@thomaspinder this looks great! left some minor suggestions, let me know what you think about it :)
The arkhangelsky2021synthetic entry was missing its closing brace following the merge of main into feat/bayesian-sdid, which broke pybtex parsing and the Sphinx/readthedocs docs build.
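For context, a balanced-brace version of that entry would look roughly as follows. Only the citation key comes from the thread; the field values are illustrative (this is the Arkhangelsky et al. 2021 AER paper, but check the actual file for the exact fields):

```bibtex
@article{arkhangelsky2021synthetic,
  title   = {Synthetic Difference-in-Differences},
  author  = {Arkhangelsky, Dmitry and Athey, Susan and Hirshberg, David A.
             and Imbens, Guido W. and Wager, Stefan},
  journal = {American Economic Review},
  year    = {2021}
}
```

pybtex fails hard on an unbalanced brace because it cannot find the end of the entry, which is why the whole Sphinx/readthedocs build broke rather than just one citation.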
juanitorduz
left a comment
great! thank you @thomaspinder! It seems the test failures are due to other reasons, right? cc @drbenvincent
Implements the cut-posterior formulation of SDiD as a new experiment class. Unit and time weights are estimated jointly in a single SyntheticDifferenceInDifferencesWeightFitter model, and the ATT is computed analytically via the double-difference formula. Includes the California Proposition 99 dataset, a Jupyter notebook demo, and full test coverage.
Depends on feat/softmax-weighted-sum-fitter (PR #822) for the _softmax_simplex_weights helper and SoftmaxWeightedSumFitter base.
Relevant Issue: #47
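The `_softmax_simplex_weights` helper that PR #822 provides presumably maps unconstrained parameters onto the probability simplex, which is what lets the unit and time weights be sampled freely while staying non-negative and summing to one. A minimal NumPy sketch of that idea (the function body here is an assumption, not the PR's code, which operates on PyMC tensors):

```python
import numpy as np

def softmax_simplex_weights(raw):
    """Map unconstrained reals to non-negative weights that sum to one."""
    z = raw - np.max(raw)  # subtract the max for numerical stability
    w = np.exp(z)
    return w / w.sum()
```

Because softmax is shift-invariant, the `raw` parameters are only identified up to an additive constant, which is presumably why the model pins priors like `omega0`/`lambda0` on the raw scale.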