feat: drainage area validation with ABS_DIFF metric and gage filtering #124

Merged
taddyb merged 5 commits into DeepGroundwater:master from taddyb:filtering_gages
Feb 8, 2026

@taddyb taddyb commented Feb 8, 2026

Summary

  • Add absolute difference (ABS_DIFF) metric to gage reference files. Unlike PCT_DIFF/REL_ERR, it is not inflated for small basins (e.g. 4 km² vs 8 km² is a 100% relative error but only a 4 km² absolute difference)
  • Add COMID_UNITAREA_SQKM column to all gage CSVs (preparation for future Q' scaling of mid-COMID gages)
  • Add max_area_diff_sqkm config threshold (ExperimentConfig) to filter spatially misaligned gages from training/evaluation
  • Add filter_gages_by_area_threshold() pure function in readers.py, integrated into Merit and LynkerHydrofabric geodataset init
  • Extend read_gage_info() to return optional columns (COMID, COMID_DRAIN_SQKM, ABS_DIFF, COMID_UNITAREA_SQKM) when present in CSV
  • Standardize dhbv2_gages.csv columns to match GAGES-II format via new patch_dhbv2_gages.py script
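
The contrast between the absolute and relative metrics, and the effect of the threshold, can be sketched as follows. This is a minimal illustration, not the code from this PR: the helper names `area_diff_metrics` and `filter_by_abs_diff` are hypothetical, gages are modeled as plain dicts, and the choices to take the percent difference relative to the gage area, to treat the threshold as inclusive, and to let `None` disable filtering are all assumptions.

```python
from typing import Optional


def area_diff_metrics(gage_area_sqkm: float, comid_area_sqkm: float) -> dict:
    """Return absolute and percent drainage-area differences.

    Assumption: the percent difference is taken relative to the gage area.
    """
    abs_diff = abs(comid_area_sqkm - gage_area_sqkm)
    pct_diff = 100.0 * abs_diff / gage_area_sqkm
    return {"ABS_DIFF": abs_diff, "PCT_DIFF": pct_diff}


def filter_by_abs_diff(gages: list, max_area_diff_sqkm: Optional[float]) -> list:
    """Keep gages whose ABS_DIFF is within the threshold (hypothetical helper).

    Assumptions: the comparison is inclusive, and None disables filtering.
    """
    if max_area_diff_sqkm is None:
        return list(gages)
    return [g for g in gages if g["ABS_DIFF"] <= max_area_diff_sqkm]


# The small basin from the summary: 100% relative error, 4 km² absolute.
small = area_diff_metrics(4.0, 8.0)
# A large basin: only 6% relative error, but 60 km² absolute.
large = area_diff_metrics(1000.0, 1060.0)

# "STAID" is used here only as an illustrative identifier key.
gages = [{"STAID": "small_basin", **small}, {"STAID": "large_basin", **large}]
kept = filter_by_abs_diff(gages, max_area_diff_sqkm=50.0)
print([g["STAID"] for g in kept])
```

With a 50 km² threshold, the small basin is kept despite its 100% relative error, while the large basin is dropped despite its 6% relative error. A percent-based cutoff would make the opposite decision for both.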

Files changed

| Area | Files | What |
| --- | --- | --- |
| Reference build | build_gage_references.py | unitarea in spatial join, ABS_DIFF in error metrics, updated OUTPUT_COLUMNS |
| Reference data | GAGES-II.csv, camels_670.csv, gages_3000.csv, dhbv2_gages.csv | Added ABS_DIFF and COMID_UNITAREA_SQKM columns |
| Script | patch_dhbv2_gages.py | New: derives standard columns for dhbv2 from existing data + MERIT catchment lookup |
| IO | readers.py | Optional column support in read_gage_info(), new filter_gages_by_area_threshold() |
| Config | configs.py | max_area_diff_sqkm field on ExperimentConfig (default 50 km²) |
| Geodatasets | merit.py, lynker_hydrofabric.py | Call filter function in _init_training() and _init_inference() |
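
As a rough picture of the config change, a threshold field with a 50 km² default might look like the sketch below. The class `ExperimentConfigSketch` is a hypothetical stand-in built on a standard-library dataclass; the real `ExperimentConfig` in configs.py may use a different base class, and the validation rule (non-negative, with `None` meaning no filtering) is an assumption rather than something stated in this PR.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ExperimentConfigSketch:
    """Hypothetical stand-in for ExperimentConfig, showing only the new field."""

    # Default of 50 km² comes from the PR description.
    max_area_diff_sqkm: Optional[float] = 50.0

    def __post_init__(self) -> None:
        # Assumed rule: a negative area threshold is meaningless, so reject it.
        if self.max_area_diff_sqkm is not None and self.max_area_diff_sqkm < 0:
            raise ValueError("max_area_diff_sqkm must be non-negative or None")


default_cfg = ExperimentConfigSketch()
strict_cfg = ExperimentConfigSketch(max_area_diff_sqkm=10.0)
print(default_cfg.max_area_diff_sqkm, strict_cfg.max_area_diff_sqkm)
```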

Test plan

  • pytest tests/io/test_readers.py — 14 tests (3 optional columns + 6 filter function + existing)
  • pytest tests/validation/test_configs.py — 3 new threshold tests + existing
  • pytest tests/references/test_build_gage_references.py — 4 tests for compute_error_metrics()
  • pytest tests/ — full suite: 354 passed, 1 skipped, 0 failures
  • pre-commit run --all-files — passes (remaining mypy errors are pre-existing MockGaugeSet issues)

🤖 Generated with Claude Code

@taddyb taddyb merged commit ebba510 into DeepGroundwater:master Feb 8, 2026
4 checks passed
@taddyb taddyb deleted the filtering_gages branch February 8, 2026 04:06