Remove synthetic CSVs from git history with git-filter-repo #824

@louismagowan

Background

PR #794 deleted 8 synthetic CSV files (~172KB) from the repo, but they remain in git history.
Using git-filter-repo would permanently remove them and reduce clone size.

Files to remove from history

  • causalpy/data/did.csv
  • causalpy/data/regression_discontinuity.csv
  • causalpy/data/synthetic_control.csv
  • causalpy/data/its.csv
  • causalpy/data/its_simple.csv
  • causalpy/data/ancova_generated.csv
  • causalpy/data/geolift1.csv
  • causalpy/data/geolift_multi_cell.csv
  • causalpy/data/gt_social_media_data.csv
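
A minimal sketch of how the rewrite command could be assembled from the list above, assuming git-filter-repo is installed and the command is run inside a fresh clone. The helper name `build_filter_repo_command` and the constant `PATHS_TO_PURGE` are hypothetical, introduced only for illustration. The script prints the command and does not execute it, so it can be reviewed before any history is rewritten.

```python
import shlex

# Paths copied from the list above.
PATHS_TO_PURGE = [
    "causalpy/data/did.csv",
    "causalpy/data/regression_discontinuity.csv",
    "causalpy/data/synthetic_control.csv",
    "causalpy/data/its.csv",
    "causalpy/data/its_simple.csv",
    "causalpy/data/ancova_generated.csv",
    "causalpy/data/geolift1.csv",
    "causalpy/data/geolift_multi_cell.csv",
    "causalpy/data/gt_social_media_data.csv",
]


def build_filter_repo_command(paths):
    """Return the argv for a git-filter-repo call that drops `paths` from history.

    --invert-paths flips the selection: instead of keeping only the listed
    paths, git-filter-repo keeps everything except them.
    """
    cmd = ["git", "filter-repo", "--invert-paths"]
    for path in paths:
        cmd += ["--path", path]
    return cmd


if __name__ == "__main__":
    # Print only; running it is a deliberate, separate step.
    print(shlex.join(build_filter_repo_command(PATHS_TO_PURGE)))
```

Running the printed command rewrites every affected commit, so it would be followed by a force-push, with the costs listed under Trade-offs.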

Trade-offs

Savings: ~172KB reduction in clone size

Cost: All commit hashes rewritten, all open PRs need rebasing, all contributors need to re-clone, external SHA links break.

Given the small savings vs. high disruption, this is low priority and should only be considered at a natural breakpoint (e.g. before a major release).
