Warn about geographic CRS in raster_band_percentile #429
dcdenu4
left a comment
Thanks @megannissel, just a quick variable name suggestion.
I just wanted to add a suggested change, and I also had one other thought that I wanted to pose (and @dcdenu4, I'd love your input on this as well): aren't there cases where a raster with a geographic coordinate system is perfectly legitimate? I think this warning is a nice feature for some use cases, but I really don't think we can always assume that pixels have area-dependent values. Example: InVEST Scenic Quality has a value raster (admittedly in a projected coordinate system) that is segmented by percentiles, without consideration of value per unit area.
If you both agree, perhaps we could do the following?
- update the language of the warning to eliminate the presumption that all pixel values are per unit area
- allow the warning to be disabled with an optional parameter
Sorry to jump in here!
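For context on the "pixels do not have equal area" wording under discussion, here is a rough sketch of how pixel area shrinks with latitude in a geographic CRS. It assumes a spherical Earth; the function name and the 1-degree pixel size are illustrative and are not part of pygeoprocessing.

```python
import math

# Mean Earth radius in meters; a spherical approximation, for illustration only.
EARTH_RADIUS_M = 6371008.8


def geographic_pixel_area(lat_south_deg, pixel_size_deg):
    """Approximate area in square meters of a lat/lon pixel on a sphere.

    The pixel spans `pixel_size_deg` in both latitude and longitude, with its
    southern edge at `lat_south_deg`.
    """
    lat_south = math.radians(lat_south_deg)
    lat_north = math.radians(lat_south_deg + pixel_size_deg)
    width = math.radians(pixel_size_deg)
    # Area of a spherical cell: R^2 * delta_lon * (sin(lat_n) - sin(lat_s)).
    return EARTH_RADIUS_M ** 2 * width * (
        math.sin(lat_north) - math.sin(lat_south))


equator_area = geographic_pixel_area(0.0, 1.0)
high_lat_area = geographic_pixel_area(60.0, 1.0)
ratio = high_lat_area / equator_area
```

Under these assumptions a 1-degree pixel at 60 degrees latitude covers roughly half the ground area of one at the equator. That matters only when pixel values scale with area, which is the point raised in the comment above.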
    base_raster = gdal.OpenEx(base_raster_path_band[0], gdal.OF_RASTER)
    srs = base_raster.GetSpatialRef()
    if srs.IsGeographic():
        LOGGER.warning(
            f'Raster {base_raster_path_band[0]} has a geographic CRS (pixels '
            'do not have equal area). Because `raster_band_percentile` calculates '
            'percentiles of pixel values, percentile results will be skewed.')
    base_raster = None
Sorry to interject here, but I just found out that gdal.Dataset can be used as a context manager! Doing something like the below would guard against the case where an error is raised while checking the spatial reference (for example, if no spatial reference is defined and None is returned) before the raster has been closed. If that happened, then at least on Windows a test case would not be able to remove the file because it would still be open. With a context manager, the raster should be closed on exit, no matter what happens.
    -base_raster = gdal.OpenEx(base_raster_path_band[0], gdal.OF_RASTER)
    -srs = base_raster.GetSpatialRef()
    -if srs.IsGeographic():
    -    LOGGER.warning(
    -        f'Raster {base_raster_path_band[0]} has a geographic CRS (pixels '
    -        'do not have equal area). Because `raster_band_percentile` calculates '
    -        'percentiles of pixel values, percentile results will be skewed.')
    -base_raster = None
    +with gdal.OpenEx(base_raster_path_band[0], gdal.OF_RASTER) as base_raster:
    +    srs = base_raster.GetSpatialRef()
    +    if srs.IsGeographic():
    +        LOGGER.warning(
    +            f'Raster {base_raster_path_band[0]} has a geographic CRS (pixels '
    +            'do not have equal area). Because `raster_band_percentile` calculates '
    +            'percentiles of pixel values, percentile results will be skewed.')
Oh, that's neat; I didn't know you could use it as a context manager!
Thanks @phargogh! I was looking at this as well. Am I right in thinking that this was introduced in GDAL 3.8, and that if we wanted to go this route we'd need to update our GDAL version requirement?
Oooo good catch @dcdenu4! Yeah, GDAL 3.8 seems like a pretty recent version for this, as much as I want to be able to use it. I'm not sure we should bump the requirement that high just yet, so @megannissel here's an alternative that doesn't use a context manager:
    -base_raster = gdal.OpenEx(base_raster_path_band[0], gdal.OF_RASTER)
    -srs = base_raster.GetSpatialRef()
    -if srs.IsGeographic():
    -    LOGGER.warning(
    -        f'Raster {base_raster_path_band[0]} has a geographic CRS (pixels '
    -        'do not have equal area). Because `raster_band_percentile` calculates '
    -        'percentiles of pixel values, percentile results will be skewed.')
    -base_raster = None
    +try:
    +    base_raster = gdal.OpenEx(base_raster_path_band[0], gdal.OF_RASTER)
    +    srs = base_raster.GetSpatialRef()
    +    if srs.IsGeographic():
    +        LOGGER.warning(
    +            f'Raster {base_raster_path_band[0]} has a geographic CRS (pixels '
    +            'do not have equal area). Because `raster_band_percentile` calculates '
    +            'percentiles of pixel values, percentile results will be skewed.')
    +finally:
    +    base_raster = None
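The try/finally version still raises `AttributeError` when `GetSpatialRef()` returns None, although the dataset reference is released first. A minimal sketch of a None-safe variant follows. The helper name `raster_is_geographic` and the injected `open_func` argument are hypothetical; they are used here only so the example runs without GDAL installed.

```python
def raster_is_geographic(raster_path, open_func):
    """Return True if the raster at `raster_path` has a geographic CRS.

    `open_func` is a hypothetical injection point: any callable that returns
    a dataset-like object, for example
    `lambda path: gdal.OpenEx(path, gdal.OF_RASTER)`.
    """
    raster = None
    srs = None
    try:
        raster = open_func(raster_path)
        srs = raster.GetSpatialRef()
        # A raster with no spatial reference yields None; treat that as
        # "not geographic" instead of raising AttributeError.
        return srs is not None and bool(srs.IsGeographic())
    finally:
        # If an exception propagates, its traceback can keep these locals
        # alive, so clear them explicitly to drop the dataset reference.
        srs = None
        raster = None
```

With GDAL available, the call would look like `raster_is_geographic(path, lambda p: gdal.OpenEx(p, gdal.OF_RASTER))`.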
Thanks, @phargogh! I think the question still remains about whether we think this warning is a good idea at all, or if we want to simply stick with the updated docstring. I don't have a strong opinion either way; what do you think?
Thank you for jumping in and explaining a potential caveat, @phargogh! I had taken it at face value that this would always be a concern with a geographic CRS. Assuming @dcdenu4 also agrees, I'm on board with adding an optional parameter to disable the warning. As for re-wording the warning itself, how does this sound?
dcdenu4
left a comment
Thanks @megannissel, I just had a question about whether we need to update our GDAL dependency requirement for @phargogh's suggestion of using a context manager, which I completely agree would be great!
Yes, I don't think this warning is always warranted; there are use cases where the raster values are independent of pixel area.
Agree! It's more about what THIS function does in its operation on the input. This function operates on pixel values without considering pixel area.
The more I think about this, the more I'm torn on whether we need to warn at all or whether the docstring update suffices as clarification for PGP users. I don't think there is harm in adding the optional warning, but it may be unnecessary... Thoughts, @megannissel and @phargogh?
The missing piece of information, from my perspective, is a clear understanding of what the units of the raster are. Until this metadata is available to pygeoprocessing (proposal, anyone? 😉) I do not think we can safely make assumptions about the nature of the units and whether the warning is relevant. However, as a programmer, I could see this warning being useful. If I'm writing a scientific workflow, I know things about my inputs and outputs that pygeoprocessing does not, and I absolutely would use a warning like this to help guard against using the function in the wrong way. So I'd suggest leaving the warning in place, but disabling it by default. Just my
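A minimal sketch of the opt-in behavior suggested in this comment. The keyword `warn_on_geographic`, the helper name, and the message wording are all hypothetical illustrations, not the merged pygeoprocessing API.

```python
import logging

LOGGER = logging.getLogger(__name__)


def maybe_warn_geographic(srs, raster_path, warn_on_geographic=False):
    """Log a warning about a geographic CRS only when the caller opts in.

    `srs` is any object with an `IsGeographic()` method, such as an
    `osr.SpatialReference`, or None. Returns True if a warning was logged.
    `warn_on_geographic` is a hypothetical flag that is off by default.
    """
    if not warn_on_geographic or srs is None or not srs.IsGeographic():
        return False
    # Illustrative wording that avoids presuming values are per unit area.
    LOGGER.warning(
        'Raster %s has a geographic CRS, so its pixels do not have equal '
        'area. Percentiles are computed from pixel values without '
        'weighting by pixel area; results may be misleading if the values '
        'depend on pixel area.', raster_path)
    return True
```

Defaulting the flag to off keeps existing callers quiet, while a workflow author who knows the raster's values depend on pixel area can turn the check on.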
dcdenu4
left a comment
Thanks @megannissel. I like @phargogh's rationale in his last comment and like this solution. Looks like there are some conflicts, but happy to merge once those are fixed.
Fixes #299