Conversation

@jennan
Collaborator

jennan commented Aug 28, 2025

This PR adds a mechanism to the WeatherBench2 data accessors that detects partial downloads and reruns them, so that corrupted data is not kept.

@jennan jennan self-assigned this Aug 28, 2025
@jennan jennan requested a review from tennlee August 28, 2025 12:57
@coveralls

Pull Request Test Coverage Report for Build 17296422059

Details

  • 1 of 16 (6.25%) changed or added relevant lines in 3 files are covered.
  • 1 unchanged line in 1 file lost coverage.
  • Overall coverage increased (+0.004%) to 60.989%

| Changes Missing Coverage | Covered Lines | Changed/Added Lines | % |
|---|---|---|---|
| packages/pipeline/src/pyearthtools/pipeline/operations/xarray/normalisation.py | 0 | 3 | 0.0% |
| packages/pipeline/src/pyearthtools/pipeline/operations/xarray/join.py | 0 | 5 | 0.0% |
| packages/data/src/pyearthtools/data/download/weatherbench.py | 1 | 8 | 12.5% |

| Files with Coverage Reduction | New Missed Lines | % |
|---|---|---|
| packages/data/src/pyearthtools/data/download/weatherbench.py | 1 | 27.43% |

| Totals | Coverage Status |
|---|---|
| Change from base Build 17286816387: | 0.004% |
| Covered Lines: | 9481 |
| Relevant Lines: | 15115 |

💛 - Coveralls

@tennlee
Collaborator

tennlee commented Sep 13, 2025

A useful increment to improve downloading resilience. Thanks!

@tennlee tennlee merged commit e04b816 into ACCESS-Community-Hub:develop Sep 13, 2025
6 checks passed
gemmaellen pushed a commit to gemmaellen/PyEarthTools that referenced this pull request Oct 1, 2025
* use a canary file to detect unfinished downloads
* clean up unfinished downloads if detected
* rerun CNN notebook, remove unused variable add note about download time
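
For readers unfamiliar with the canary-file pattern named in the commit messages above, here is a minimal sketch of the general idea. It is not the code merged in this PR: the function name `download_with_canary`, the marker name `.download_in_progress`, and the `fetch` callback are all hypothetical, chosen only for illustration. The marker is created before the download starts and removed only after it finishes, so a leftover marker signals an interrupted run.

```python
import shutil
from pathlib import Path
from typing import Callable

# Hypothetical marker file name; the real accessor may use a different one.
CANARY_NAME = ".download_in_progress"


def download_with_canary(target_dir: Path, fetch: Callable[[Path], None]) -> bool:
    """Run fetch(target_dir) unless an earlier run already completed.

    Returns True if a download was performed, False if existing data was reused.
    """
    canary = target_dir / CANARY_NAME

    if target_dir.exists():
        if not canary.exists():
            # Directory present and no canary: the earlier download finished.
            return False
        # Canary still present: the earlier download was interrupted,
        # so discard the partial data before trying again.
        shutil.rmtree(target_dir)

    target_dir.mkdir(parents=True)
    canary.touch()     # mark the download as in progress
    fetch(target_dir)  # may raise; the canary is then left behind
    canary.unlink()    # removed only once fetch has returned normally
    return True
```

One limitation of this sketch is that it treats any existing directory without a marker as complete, so a directory created by some other means would be reused without any check.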
3 participants