@maxwhitemet maxwhitemet commented Dec 9, 2025

Addresses #1007

This PR adds quantile mapping to the IMPROVER repo: a quantile mapping module, a CLI, unit tests, and acceptance tests.

A demonstration of the plugin's functionality is available here.

Testing:

  • Ran tests and they passed OK
  • Added new tests for the new feature(s)

codecov bot commented Dec 9, 2025

Codecov Report

❌ Patch coverage is 96.10390% with 3 lines in your changes missing coverage. Please review.
✅ Project coverage is 95.19%. Comparing base (84a8944) to head (ae2a5ad).
⚠️ Report is 154 commits behind head on master.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| improver/calibration/quantile_mapping.py | 96.10% | 3 Missing ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #2264      +/-   ##
==========================================
- Coverage   98.39%   95.19%   -3.20%     
==========================================
  Files         124      150      +26     
  Lines       12212    15323    +3111     
==========================================
+ Hits        12016    14587    +2571     
- Misses        196      736     +540     


@maxwhitemet maxwhitemet force-pushed the mobt_1007_quantile_mapping_plugin branch from 73363ed to ae2a5ad Compare December 10, 2025 16:11

@gavinevans gavinevans left a comment

Thanks @maxwhitemet 👍

I've added some comments below.

return np.interp(quantiles, empirical_quantiles, sorted_values)


def quantile_mapping(
Contributor

I think it might be good to name this something else to avoid a quantile_mapping function and a QuantileMapping class in the same file.

Contributor Author

Thank you.

I have changed the name to 'map_quantiles'. Please let me know if this needs changing.
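
For readers following the thread, here is a minimal sketch of empirical quantile mapping built around the same `np.interp` call shown in the diff context. It is an illustration only, not the PR's implementation: the function names `empirical_inverse_cdf` and `map_quantiles_sketch`, the evenly spaced plotting positions, and the restriction to 1-D unmasked inputs are all assumptions.

```python
import numpy as np


def empirical_inverse_cdf(values, quantiles):
    """Interpolate the values found at the requested quantiles (0 to 1)."""
    sorted_values = np.sort(values)
    # Assumed plotting positions: evenly spaced on [0, 1] over the sorted sample.
    empirical_quantiles = np.linspace(0.0, 1.0, sorted_values.size)
    return np.interp(quantiles, empirical_quantiles, sorted_values)


def map_quantiles_sketch(reference, forecast):
    """Map each forecast value to the reference value at the same quantile.

    Hypothetical helper: both inputs are assumed to be 1-D, unmasked arrays.
    """
    sorted_forecast = np.sort(forecast)
    # Quantile of each forecast value within its own distribution:
    # the fraction of forecast values less than or equal to it.
    quantiles = (
        np.searchsorted(sorted_forecast, forecast, side="right") / forecast.size
    )
    return empirical_inverse_cdf(reference, quantiles)
```

Because the quantile of each value is computed within the forecast sample and then looked up in the reference sample, the rank order of the forecast values is preserved while their distribution is pulled towards that of the reference.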

Comment on lines 247 to 289
        # Create a copy of the forecast_cube or forecast_to_calibrate cube to hold
        # output data and preserve metadata.
        output_cube = (
            forecast_cube.copy()
            if forecast_to_calibrate is None
            else forecast_to_calibrate.copy()
        )

        # Extract data, handling masked arrays
        if np.ma.is_masked(reference_cube.data):
            reference_data_flat = reference_cube.data.filled().flatten()
        else:
            reference_data_flat = reference_cube.data.flatten()

        if np.ma.is_masked(forecast_cube.data):
            forecast_data_flat = forecast_cube.data.filled().flatten()
        else:
            forecast_data_flat = forecast_cube.data.flatten()

        # Determine values to map and output shape
        if forecast_to_calibrate is None:
            # Use forecast_cube data
            if np.ma.is_masked(output_cube.data):
                values_to_map_flat = output_cube.data.filled().flatten()
            else:
                values_to_map_flat = output_cube.data.flatten()
            output_shape = forecast_cube.shape
            output_mask = (
                forecast_cube.data.mask if np.ma.is_masked(forecast_cube.data) else None
            )
        else:
            # Use provided cube's data
            output_cube = forecast_to_calibrate.copy()
            if np.ma.is_masked(forecast_to_calibrate.data):
                values_to_map_flat = forecast_to_calibrate.data.filled().flatten()
            else:
                values_to_map_flat = forecast_to_calibrate.data.flatten()
            output_shape = forecast_to_calibrate.shape
            output_mask = (
                forecast_to_calibrate.data.mask
                if np.ma.is_masked(forecast_to_calibrate.data)
                else None
            )
Contributor

I think that you could put this into a separate method / function, so that the process method is simpler. You could even put the pattern below into its own method / function, given that you re-use it a number of times:

        if np.ma.is_masked(forecast_cube.data):
            forecast_data_flat = forecast_cube.data.filled().flatten()
        else:
            forecast_data_flat = forecast_cube.data.flatten()
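
One possible shape for such a helper is sketched below. The name `flatten_filled` is hypothetical and the body simply mirrors the repeated pattern quoted above, so that each call site reduces to a single line; it is a sketch of the suggestion, not code from the PR.

```python
import numpy as np


def flatten_filled(data):
    """Return a flattened copy of data, filling masked points if any exist.

    Hypothetical helper mirroring the repeated if/else pattern: masked
    points are replaced with the array's fill value before flattening.
    """
    if np.ma.is_masked(data):
        return data.filled().flatten()
    return data.flatten()
```

With this in place, the three blocks in the diff would become calls such as `reference_data_flat = flatten_filled(reference_cube.data)`.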

Contributor Author

I have now removed the use of .filled() as I was concerned this would introduce changes to the statistics. Instead, the code now only processes unmasked data points, and later reinserts the mask where it was.
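
A minimal sketch of this approach, under stated assumptions: the helper name `apply_to_unmasked` and the `func` callback are hypothetical, and `func` is assumed to return one output value per input value. Only the valid points are passed to `func`, so fill values never enter the statistics, and the original mask is reattached afterwards.

```python
import numpy as np


def apply_to_unmasked(data, func):
    """Apply func to the unmasked points of data and keep the original mask."""
    mask = np.ma.getmaskarray(data)  # all False for a plain ndarray
    values = np.ma.getdata(data)
    result = np.array(values, dtype=float)  # copy, so the input is not modified
    # Only valid points are processed; masked points keep their stored values.
    result[~mask] = func(values[~mask])
    if np.ma.isMaskedArray(data):
        return np.ma.masked_array(result, mask=mask)
    return result
```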

Contributor

I think that you may as well move these tests into a quantile_mapping directory to match the pattern of the other tests for calibration methods.

- Move functionality into QuantileMapping class
- Remove redundancy
- Increase variable name clarity
- Refactor into smaller functions
2. Additions:
- Improved readability of docstrings
- Fixed improper masked array handling
@maxwhitemet maxwhitemet force-pushed the mobt_1007_quantile_mapping_plugin branch from ae2a5ad to 6ec215f Compare December 29, 2025 16:22

@maxwhitemet maxwhitemet left a comment

In addition to addressing the feedback received, I have made the following modifications:

  • Reworked the docstrings, such that:
    • More extensive documentation has moved from private to public methods.
    • Redundant Args sections have been removed from private methods, as these arguments are defined elsewhere.
  • Masked arrays
    • I was concerned about what would happen if the reference cube and the post-processed forecast cube had differing mask locations. Thus I have added handling that may require further discussion: combine the masks such that only points that are valid in both cubes are used to build the CDFs.
    • Removed redundant use of np.where for non-masked arrays as I discovered this is implicitly handled in np.ma.where
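
The mask-combining idea can be sketched as follows. This is an illustration rather than the PR's code: the function name `valid_in_both` is hypothetical, and both inputs are assumed to have the same shape. `np.ma.getmaskarray` returns an all-False mask for plain arrays, so masked and non-masked inputs follow the same path.

```python
import numpy as np


def valid_in_both(reference, forecast):
    """Return the values of each array at points that are unmasked in both.

    Hypothetical helper: assumes reference and forecast share a shape.
    """
    combined_mask = np.ma.getmaskarray(reference) | np.ma.getmaskarray(forecast)
    valid = ~combined_mask
    # Boolean indexing yields 1-D arrays of matching length, ready for
    # building the two empirical CDFs from the same set of points.
    return np.ma.getdata(reference)[valid], np.ma.getdata(forecast)[valid]
```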

*,
mapping_method: str = "floor",
preservation_threshold: float = None,
forecast_to_calibrate: cli.inputcube = None,
Contributor Author

I have removed the option to provide the third cube from the plugin and this CLI script. Thank you.

Comment on lines 14 to 15
reference_cube: cli.inputcube,
forecast_cube: cli.inputcube,
Contributor Author

I have implemented your suggestion, though I have excluded the portion of the 'cubes' docstring on land-sea masking handled by the estimate_emos_coefficients plugin here.

Please could you let me know if I should add this?
