Skip to content

Conversation

@kimlee87
Copy link
Collaborator

This PR resolves issue #57

monthly tp = downloaded value * number of days in the month

  • update python script and related tests
  • update relevant tutorials
  • update default settings and setting schema

@kimlee87 kimlee87 changed the title Calculate real value of montly total precipitation Calculate real value of monthly total precipitation Dec 30, 2025
@kimlee87 kimlee87 requested a review from Copilot December 30, 2025 18:15
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR implements the calculation of real monthly total precipitation values from ERA5-Land monthly data by multiplying downloaded values by the number of days in each month. The change addresses issue #57 and ensures precipitation data accurately reflects monthly totals rather than daily averages.

  • Added calculate_monthly_precipitation function to compute monthly totals
  • Integrated the calculation into the preprocessing pipeline with new configuration settings
  • Updated schema and default settings to support the new feature

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
heiplanet_data/preprocess.py Adds _check_month_start_data and calculate_monthly_precipitation functions, integrates monthly precipitation calculation into _apply_preprocessing, and renames parameter from agg_funcs to downsample_agg_funcs for clarity
heiplanet_data/setting_schema.json Adds schema definitions for cal_monthly_tp, cal_monthly_tp_vname, cal_monthly_tp_tcoord, and cal_monthly_tp_fname settings, updates descriptions for downsampling parameters, and renames resample_agg_funcs to downsample_agg_funcs
heiplanet_data/era5_settings.json Enables monthly precipitation calculation by default with cal_monthly_tp: true and associated configuration
heiplanet_data/test/test_preprocess.py Adds tests for _check_month_start_data, calculate_monthly_precipitation, and integration test for the preprocessing pipeline, updates existing test to use renamed parameter
heiplanet_data/test/test_utils.py Adds validation tests for new settings related to monthly precipitation calculation
docs/source/notebooks/tutorial_B_preprocess_data.ipynb Documents new monthly precipitation calculation settings and additional downsampling parameters in the preprocessing settings table

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +1079 to +1114
def calculate_monthly_precipitation(
dataset: xr.Dataset, var_name: str = "tp", time_coord: str = "time"
) -> xr.Dataset:
"""Calculate monthly total precipitation from data downloaded from ERA5-Land monthly data.
The real precipitation of the month = downloaded value * number of days in the month.
Args:
dataset (xr.Dataset): Dataset with total precipitation data.
var_name (str): Name of the precipitation variable in the dataset. Default is "tp".
time_coord (str): Name of the time coordinate in the dataset. Default is "time".
Returns:
xr.Dataset: Dataset with monthly total precipitation values.
"""
# check inputs
if time_coord not in dataset.coords:
raise ValueError(f"Time coordinate '{time_coord}' not found in dataset.")

if var_name not in dataset.data_vars:
raise ValueError(f"Variable '{var_name}' not found in dataset.")

times = dataset[time_coord]

if not _check_month_start_data(times):
raise ValueError("The dataset does not have month start data.")

# calculate number of days in each month
days_in_month = times.dt.days_in_month

# calculate monthly total precipitation
org_attrs = dataset[var_name].attrs.copy()
dataset[var_name] = dataset[var_name] * days_in_month
dataset[var_name].attrs = org_attrs

return dataset

Copy link

Copilot AI Dec 30, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The function mutates the input dataset in place without an option to preserve the original dataset. This is inconsistent with similar functions in the codebase like 'convert_to_celsius_with_attributes' and 'convert_m_to_mm_with_attributes', which have an 'inplace' parameter that defaults to False. Consider adding an 'inplace' parameter and creating a copy of the dataset when it's False to maintain consistency with the existing API design pattern.

Copilot uses AI. Check for mistakes.
@codecov
Copy link

codecov bot commented Dec 30, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 98.65%. Comparing base (b1b716d) to head (15dd455).

Additional details and impacted files
@@            Coverage Diff             @@
##             main      #89      +/-   ##
==========================================
+ Coverage   98.61%   98.65%   +0.04%     
==========================================
  Files           9        9              
  Lines        2159     2226      +67     
==========================================
+ Hits         2129     2196      +67     
  Misses         30       30              

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@sonarqubecloud
Copy link

sonarqubecloud bot commented Jan 2, 2026

@kimlee87 kimlee87 marked this pull request as ready for review January 2, 2026 10:42
@kimlee87 kimlee87 requested a review from iulusoy January 2, 2026 10:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants