references: added percent and relative error in drainage areas #122

Merged
taddyb merged 1 commit into DeepGroundwater:master from taddyb:master
Feb 8, 2026

@taddyb commented Feb 8, 2026


Summary

  • Added references/geo_io/build_gage_references.py — a reproducible, self-contained pipeline that spatially joins GAGES-II gage points to
    MERIT Hydro catchments and writes three reference CSVs
  • Regenerated all three gage CSVs (GAGES-II.csv, camels_670.csv, gages_3000.csv) with MERIT upstream area (COMID_DRAIN_SQKM) and error
    metrics (PCT_DIFF, REL_ERR)
  • Updated references/gage_info/README.md to document the new columns and reference the build script
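
The PR does not spell out how PCT_DIFF and REL_ERR are computed. As a rough sketch only, assuming PCT_DIFF is the signed percent difference of the MERIT upstream area from the gage's reported drainage area and REL_ERR is the absolute relative error, the two metrics might look like this (the function name and both definitions are assumptions, not taken from build_gage_references.py):

```python
def drainage_area_errors(gage_sqkm, comid_sqkm):
    """Compare a MERIT upstream area against a gage's reported drainage area.

    ASSUMED definitions (the PR does not state them):
      pct_diff = 100 * (comid_sqkm - gage_sqkm) / gage_sqkm   (signed, percent)
      rel_err  = |comid_sqkm - gage_sqkm| / gage_sqkm         (unsigned, fraction)

    Returns (None, None) when either area is missing or the gage area is not positive.
    """
    if gage_sqkm is None or comid_sqkm is None or gage_sqkm <= 0:
        return None, None
    diff = comid_sqkm - gage_sqkm
    pct_diff = 100.0 * diff / gage_sqkm
    rel_err = abs(diff) / gage_sqkm
    return pct_diff, rel_err


# Illustrative values only: a gage reporting 100 km^2 matched to a 110 km^2 COMID.
pct_diff, rel_err = drainage_area_errors(100.0, 110.0)
```

Under these assumed definitions, a large value of either metric would flag a gage that may have been joined to the wrong catchment, for example a tributary gage snapped to a main-stem reach.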

Motivation

The spatial join code that originally assigned MERIT COMIDs to gage locations was never checked into the repo; only the downstream
add_merit_uparea.py enrichment script existed. As a result, the provenance of the gage CSVs was not reproducible. This PR adds the full
pipeline from raw data sources to final CSVs.

Pipeline

  GAGES-II gpkg (5070)  +  MERIT catchments (4326 → 5070)
           │                        │
           └───── sjoin("within") ──┘
                        │
                gages with COMID
                        │
                merge uparea from MERIT rivers
                        │
                compute PCT_DIFF, REL_ERR
                        │
           ┌────────────┼────────────┐
           │            │            │
     filter by      filter by     drop non-standard
     camels_name    gages3000Info   IDs (>8 digits)
           │            │            │
     camels_670    gages_3000    GAGES-II
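
The "drop non-standard IDs (>8 digits)" branch of the diagram can be illustrated with a small filter. This is a sketch under stated assumptions, not the code in build_gage_references.py: the helper name is hypothetical, IDs are assumed to be read as strings so leading zeros survive, and the sample IDs are made up for illustration.

```python
def is_standard_station_id(station_id, max_digits=8):
    """Return True if the ID is all digits and no longer than max_digits.

    Hypothetical helper. IDs should be kept as strings: parsing them as
    integers would strip leading zeros and change their length.
    """
    s = str(station_id).strip()
    return s.isdigit() and len(s) <= max_digits


# Made-up IDs for illustration: one 8-digit ID, one 9-digit, one 15-digit.
sample_ids = ["01013500", "123456789", "393109104464500"]
kept = [sid for sid in sample_ids if is_standard_station_id(sid)]
```

Only the 8-digit ID remains in `kept`; the 9-digit and 15-digit IDs are dropped.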

Verification

┌────────────────┬───────────────────┬──────────────────────┐
│      CSV       │       Rows        │  COMIDs match old?   │
├────────────────┼───────────────────┼──────────────────────┤
│ camels_670.csv │ 670               │ 670/670 (100%)       │
├────────────────┼───────────────────┼──────────────────────┤
│ gages_3000.csv │ 3,211             │ 3,211/3,211 (100%)   │
├────────────────┼───────────────────┼──────────────────────┤
│ GAGES-II.csv   │ 8,931 (was 8,945) │ All shared IDs match │
└────────────────┴───────────────────┴──────────────────────┘

The 14-row difference in GAGES-II arises because the script now filters out all non-standard station IDs (9–15 digits); the original
ad-hoc run missed 14 of them.
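
A check like the one summarized in the table could be sketched as follows. The column names `STAID` and `COMID` for the key and value columns, the function name, and the inline sample rows are all assumptions made for illustration; the PR does not show how the comparison was actually run.

```python
import csv
import io


def comid_match_summary(old_rows, new_rows, id_col="STAID", comid_col="COMID"):
    """Count how many station IDs shared by two tables carry the same COMID.

    Values are compared as strings, so "71000001" and "71000001.0" would
    count as a mismatch; normalize the columns first if the files differ
    in formatting.
    """
    old = {row[id_col]: row[comid_col] for row in old_rows}
    new = {row[id_col]: row[comid_col] for row in new_rows}
    shared = old.keys() & new.keys()
    matched = sum(1 for key in shared if old[key] == new[key])
    return {
        "shared": len(shared),
        "matched": matched,
        "only_old": len(old.keys() - new.keys()),
        "only_new": len(new.keys() - old.keys()),
    }


# Made-up rows standing in for the old and regenerated CSVs.
old_csv = "STAID,COMID\n01013500,71000001\n01022500,71000002\n123456789012,71000003\n"
new_csv = "STAID,COMID\n01013500,71000001\n01022500,71000002\n"

summary = comid_match_summary(
    csv.DictReader(io.StringIO(old_csv)),
    csv.DictReader(io.StringIO(new_csv)),
)
```

In this toy input, both shared IDs match and the one row that exists only in the old table has a non-standard 12-digit ID, which mirrors the pattern reported for GAGES-II.csv. For the real files, `csv.DictReader(open(path, newline=""))` could replace the `io.StringIO` wrappers.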

Usage

  python references/geo_io/build_gage_references.py \
    --gages-gpkg data/conus_3000_gages.gpkg \
    --merit-catchments data/merit/cat_pfaf_7_MERIT_Hydro_v07_Basins_v01_bugfix1.shp \
    --merit-rivers ~/data/merit/riv_pfaf_7_MERIT_Hydro_v07_Basins_v01_bugfix1.shp \
    --camels-ids data/camels/camels_name.txt \
    --gages3000-ids data/gages3000Info.csv \
    --output-dir references/gage_info

Type of Change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Code cleanup/refactor
  • Documentation update

Other (please specify):

Checklist

  • Branch is up to date with master
  • Updated tests or added new tests
  • Tests & pre-commit hooks pass
  • Updated documentation (if applicable)
  • Code follows established style and conventions

@taddyb taddyb merged commit 3f2503c into DeepGroundwater:master Feb 8, 2026
4 checks passed