Background
PR #794 deleted 8 synthetic CSV files (~172KB) from the repo, but they remain in git history.
Using git-filter-repo would permanently remove them and reduce clone size.
Files to remove from history
causalpy/data/did.csv
causalpy/data/regression_discontinuity.csv
causalpy/data/synthetic_control.csv
causalpy/data/its.csv
causalpy/data/its_simple.csv
causalpy/data/ancova_generated.csv
causalpy/data/geolift1.csv
causalpy/data/geolift_multi_cell.csv
causalpy/data/gt_social_media_data.csv
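A minimal sketch of how the removal might be run, assuming git-filter-repo is installed and the rewrite is done in a fresh clone. The file name paths-to-remove.txt and the helper rewrite_history are illustrative assumptions, not part of the repo. `--invert-paths` and `--paths-from-file` are git-filter-repo options.

```shell
#!/bin/sh
# Illustrative file name (an assumption): one path per line, as listed above.
# It is kept outside the clone so the clone itself stays clean.
PATHS_FILE="$(pwd)/paths-to-remove.txt"
cat > "$PATHS_FILE" <<'EOF'
causalpy/data/did.csv
causalpy/data/regression_discontinuity.csv
causalpy/data/synthetic_control.csv
causalpy/data/its.csv
causalpy/data/its_simple.csv
causalpy/data/ancova_generated.csv
causalpy/data/geolift1.csv
causalpy/data/geolift_multi_cell.csv
causalpy/data/gt_social_media_data.csv
EOF

# Hypothetical helper: rewrite history in the clone at $1.
# --invert-paths keeps everything EXCEPT the listed paths.
rewrite_history() {
    ( cd "$1" && git filter-repo --invert-paths --paths-from-file "$PATHS_FILE" )
}
```

Defining the helper does not touch any repository. Calling `rewrite_history /path/to/fresh-clone` would perform the rewrite locally, and the result would still have to be force-pushed to take effect for everyone else.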
Trade-offs
Savings: ~172KB reduction in clone size
Cost: Every commit from the one that first added these files onward gets a new hash, so all open PRs need rebasing, all contributors need to re-clone, and external links to commit SHAs break.
Given the small savings vs. high disruption, this is low priority and should only be considered at a natural breakpoint (e.g. before a major release).
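The ~172KB figure could be checked against the actual history before deciding. The sketch below sums the uncompressed size of every blob version stored under the given paths. The helper name history_blob_bytes and its argument (a file with one path per line) are assumptions for illustration. Packed, compressed size on disk will be smaller than the number it prints.

```shell
#!/bin/sh
# Hypothetical helper: print total uncompressed bytes of all historical blobs
# whose path appears in the file given as $1 (one path per line).
# Run from inside the repository.
history_blob_bytes() {
    git rev-list --objects --all |
        git cat-file --batch-check='%(objecttype) %(objectsize) %(rest)' |
        awk 'NR == FNR { want[$0] = 1; next }
             $1 == "blob" && ($3 in want) { total += $2 }
             END { print total + 0 }' "$1" -
}
```

For example, `history_blob_bytes /tmp/paths-to-remove.txt` run inside a clone would print a single byte count covering every version of the listed files.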