-
Notifications
You must be signed in to change notification settings - Fork 1
Calculate real value of monthly total precipitation #89
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR implements the calculation of real monthly total precipitation values from ERA5-Land monthly data by multiplying downloaded values by the number of days in each month. The change addresses issue #57 and ensures precipitation data accurately reflects monthly totals rather than daily averages.
- Added
calculate_monthly_precipitationfunction to compute monthly totals - Integrated the calculation into the preprocessing pipeline with new configuration settings
- Updated schema and default settings to support the new feature
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
| heiplanet_data/preprocess.py | Adds _check_month_start_data and calculate_monthly_precipitation functions, integrates monthly precipitation calculation into _apply_preprocessing, and renames parameter from agg_funcs to downsample_agg_funcs for clarity |
| heiplanet_data/setting_schema.json | Adds schema definitions for cal_monthly_tp, cal_monthly_tp_vname, cal_monthly_tp_tcoord, and cal_monthly_tp_fname settings, updates descriptions for downsampling parameters, and renames resample_agg_funcs to downsample_agg_funcs |
| heiplanet_data/era5_settings.json | Enables monthly precipitation calculation by default with cal_monthly_tp: true and associated configuration |
| heiplanet_data/test/test_preprocess.py | Adds tests for _check_month_start_data, calculate_monthly_precipitation, and integration test for the preprocessing pipeline, updates existing test to use renamed parameter |
| heiplanet_data/test/test_utils.py | Adds validation tests for new settings related to monthly precipitation calculation |
| docs/source/notebooks/tutorial_B_preprocess_data.ipynb | Documents new monthly precipitation calculation settings and additional downsampling parameters in the preprocessing settings table |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| def calculate_monthly_precipitation( | ||
| dataset: xr.Dataset, var_name: str = "tp", time_coord: str = "time" | ||
| ) -> xr.Dataset: | ||
| """Calculate monthly total precipitation from data downloaded from ERA5-Land monthly data. | ||
| The real precipitation of the month = downloaded value * number of days in the month. | ||
| Args: | ||
| dataset (xr.Dataset): Dataset with total precipitation data. | ||
| var_name (str): Name of the precipitation variable in the dataset. Default is "tp". | ||
| time_coord (str): Name of the time coordinate in the dataset. Default is "time". | ||
| Returns: | ||
| xr.Dataset: Dataset with monthly total precipitation values. | ||
| """ | ||
| # check inputs | ||
| if time_coord not in dataset.coords: | ||
| raise ValueError(f"Time coordinate '{time_coord}' not found in dataset.") | ||
|
|
||
| if var_name not in dataset.data_vars: | ||
| raise ValueError(f"Variable '{var_name}' not found in dataset.") | ||
|
|
||
| times = dataset[time_coord] | ||
|
|
||
| if not _check_month_start_data(times): | ||
| raise ValueError("The dataset does not have month start data.") | ||
|
|
||
| # calculate number of days in each month | ||
| days_in_month = times.dt.days_in_month | ||
|
|
||
| # calculate monthly total precipitation | ||
| org_attrs = dataset[var_name].attrs.copy() | ||
| dataset[var_name] = dataset[var_name] * days_in_month | ||
| dataset[var_name].attrs = org_attrs | ||
|
|
||
| return dataset | ||
|
|
Copilot
AI
Dec 30, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The function mutates the input dataset in place without an option to preserve the original dataset. This is inconsistent with similar functions in the codebase like 'convert_to_celsius_with_attributes' and 'convert_m_to_mm_with_attributes', which have an 'inplace' parameter that defaults to False. Consider adding an 'inplace' parameter and creating a copy of the dataset when it's False to maintain consistency with the existing API design pattern.
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #89 +/- ##
==========================================
+ Coverage 98.61% 98.65% +0.04%
==========================================
Files 9 9
Lines 2159 2226 +67
==========================================
+ Hits 2129 2196 +67
Misses 30 30 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
|



This PR resolves issue #57
monthly tp = downloaded value * number of days in the month