
feat: added hotstart and headwater benchmarks#121

Merged
taddyb merged 10 commits into DeepGroundwater:master from taddyb:hot_start_headwater
Feb 8, 2026


@taddyb
Collaborator

@taddyb taddyb commented Feb 8, 2026

Issue Addressed

  • flow initialization now computes topologically accumulated upstream discharge instead of using raw local lateral inflow
    • (I - N) @ Q = Q' is the hot-start equation; solving it for Q gives each node an initial discharge equal to the summed upstream Q'
  • Extracted hot-start logic into a standalone, unit-testable function
  • Added documentation explaining the mechanism
  • Added headwater basins into the engine/ code for training/benchmarking
    • DDR routing works on headwaters as Q_{t+1} = C₃·Q_t + C₄·Q', since a headwater has no upstream inflow
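
For illustration, the headwater recurrence above can be stepped forward in a few lines. This is only a sketch: the function name and the coefficient values are made up, and real coefficients would come from the routing model rather than being hard-coded.

```python
def route_headwater(q0, q_prime, c3, c4):
    """Step Q_{t+1} = c3 * Q_t + c4 * Q' for a reach with no upstream inflow.

    q_prime holds one lateral inflow value per time step.
    """
    q = [q0]
    for lateral in q_prime:
        q.append(c3 * q[-1] + c4 * lateral)
    return q


# Hypothetical coefficients, chosen only to keep the arithmetic easy to follow.
series = route_headwater(q0=2.0, q_prime=[2.0, 2.0, 2.0], c3=0.5, c4=0.5)
print(series)  # [2.0, 2.0, 2.0, 2.0]
```

With these made-up values the reach starts at its steady state, so the discharge does not drift.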

Description

  • Added compute_hotstart_discharge() — a module-level function that solves (I - N) @ Q = q_prime[0] via triangular_sparse_solve to accumulate upstream lateral inflows. Each node gets the sum of all upstream inflows as its initial discharge, rather than just its own local contribution.
  • setup_inputs() cold-start branch now calls this function instead of the naive self._discharge_t = self.q_prime[0]. The carry_state=True path (inference batches after the first) is unchanged.
  • Updated test_setup_inputs_basic assertion — no longer expects _discharge_t == streamflow[0]; now verifies shape, headwater equality, and that downstream nodes accumulate upstream flow.
  • Added TestComputeHotstartDischarge class with 6 tests:
    • test_linear_chain_uniform_inflow — uniform inflow=2 on 5-node chain produces [2, 4, 6, 8, 10]
    • test_linear_chain_nonuniform_inflow — varying inflow [3, 1, 2, 4] produces cumsum [3, 4, 6, 10]
    • test_single_reach — single node returns its own inflow
    • test_clamping — tiny inflows get clamped to discharge_lb
    • test_setup_inputs_uses_hotstart — end-to-end through setup_inputs()
    • test_carry_state_skips_hotstart — carry_state=True preserves existing state
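
A minimal sketch of the accumulation, for readers who want to check the test expectations by hand. It is not the code in this PR: it uses a dense np.linalg.solve as a stand-in for triangular_sparse_solve, the function name is hypothetical, the default discharge_lb value is invented, and it assumes N[i, j] = 1 when reach j drains directly into reach i.

```python
import numpy as np


def hotstart_discharge(N, q_prime_0, discharge_lb=1e-4):
    """Solve (I - N) @ Q = q_prime_0, then clamp Q from below.

    Assumption: N[i, j] = 1 when reach j drains directly into reach i.
    """
    n = N.shape[0]
    Q = np.linalg.solve(np.eye(n) - N, q_prime_0)
    return np.maximum(Q, discharge_lb)


# 5-node linear chain: reach i-1 drains into reach i.
chain5 = np.diag(np.ones(4), k=-1)
uniform = hotstart_discharge(chain5, np.full(5, 2.0))

# 4-node chain with varying inflow.
chain4 = np.diag(np.ones(3), k=-1)
nonuniform = hotstart_discharge(chain4, np.array([3.0, 1.0, 2.0, 4.0]))

print(uniform, nonuniform)
```

On a linear chain each row of the system reads Q_i - Q_{i-1} = q'_i, so the solution is a running sum, which is where the [2, 4, 6, 8, 10] and [3, 4, 6, 10] expectations come from.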

Type of Change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Code cleanup/refactor
  • Documentation update

Other (please specify):

Checklist

  • Branch is up to date with master
  • Updated tests or added new tests
  • Tests & pre-commit hooks pass
  • Updated documentation (if applicable)
  • Code follows established style and conventions

@taddyb
Collaborator Author

taddyb commented Feb 8, 2026

Summary of new changes:

Engine Changes

engine/src/ddr_engine/merit/build.py

  • Removed headwater skip in build_gauge_adjacencies() — headwater gages now get zarr subgroups with empty COO matrices (zero edges,
    single-element order)
  • Added isolated COMID inclusion in create_adjacency_matrix() — COMIDs with no upstream/downstream connections in the flowpath data are
    appended to the topological order and included in the matrix shape
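
The isolated-COMID handling can be pictured roughly as follows. This is a sketch with invented COMIDs and a hypothetical helper, not the body of create_adjacency_matrix(); it assumes rows index the downstream reach and columns the upstream reach.

```python
from graphlib import TopologicalSorter


def order_with_isolated(edges, all_comids):
    """Return a topological order (upstream first) with isolated COMIDs appended."""
    sorter = TopologicalSorter()
    for upstream, downstream in edges:
        sorter.add(downstream, upstream)  # downstream depends on upstream
    order = list(sorter.static_order())
    connected = set(order)
    # COMIDs that never appear in an edge go at the end of the order.
    order += [c for c in all_comids if c not in connected]
    return order


edges = [(10, 20), (20, 30)]  # hypothetical (upstream, downstream) pairs
order = order_with_isolated(edges, [10, 20, 30, 99])
index = {comid: i for i, comid in enumerate(order)}
rows = [index[d] for _, d in edges]  # downstream reach -> row
cols = [index[u] for u, _ in edges]  # upstream reach -> column
shape = (len(order), len(order))
print(order, shape)  # [10, 20, 30, 99] (4, 4)
```

Because the isolated COMID contributes a row and column but no entries, the matrix stays strictly lower triangular and gains no spurious edges.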

engine/src/ddr_engine/lynker_hydrofabric/build.py

  • Removed headwater skip. When connections == [], the builder now constructs an empty COO matrix and sets subset_flowpaths = [origin]
    directly instead of skipping the gage
  • Added scipy.sparse import

DDR Core Changes

src/ddr/io/builders.py

  • construct_network_matrix(): Changed empty coordinates from raise ValueError to rows, cols = [], [] — an empty COO with correct shape is
    valid for headwater-only batches
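
To show why an empty coordinate list is harmless, here is a small sketch using scipy.sparse directly. The helper name and its signature are hypothetical and do not mirror construct_network_matrix().

```python
import numpy as np
from scipy import sparse


def build_network_matrix(rows, cols, n):
    """Build an n x n COO adjacency matrix; empty rows/cols give an all-zero matrix."""
    rows = np.asarray(rows, dtype=np.int64)
    cols = np.asarray(cols, dtype=np.int64)
    data = np.ones(rows.shape[0], dtype=np.float32)
    return sparse.coo_matrix((data, (rows, cols)), shape=(n, n))


headwater_only = build_network_matrix([], [], n=3)
print(headwater_only.shape, headwater_only.nnz)  # (3, 3) 0
```

The shape still carries the number of reaches, so downstream code that sizes its state from the matrix keeps working when there are no edges.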

src/ddr/geodatazoo/merit.py (both _collate_gages and _build_routing_data_gages)

  • active_indices now includes gage indices, not just edge indices from coo.row/coo.col — headwater gages were invisible before
  • Guarded compressed row/col construction with if coo.nnz > 0 for empty COO matrices
  • Headwater outflow_idx points to the gage's own compressed index when no downstream edges exist

src/ddr/geodatazoo/lynker_hydrofabric.py (same two methods)

  • Same three fixes as merit.py
  • Modified assertion in _collate_gages to exclude headwater gages from the "to" column validation (headwaters self-reference, which
    doesn't match the flowpath attr pattern)

Benchmark Changes

benchmarks/src/ddr_benchmarks/benchmark.py

  • DiffRoute skips headwater gages (its RivTree requires at least one edge) — predictions array initialized with np.full(..., np.nan),
    headwater rows stay NaN
  • Replaced boolean mask pattern with NaN sentinel filtering throughout: metrics, mass balance, gauge maps, and hydrographs all derive the
    routed set via ~np.isnan()
  • Gauge maps for DiffRoute and summed Q' now exclude unmatched/unrouted gages entirely instead of showing them as zero
  • Fixed shape mismatch in mass balance: sqp_daily covers only common gages (465/670), so DDR totals are now subsetted to match via
    sqp_common_mask
  • load_summed_q_prime() now returns common_mask alongside metrics and predictions
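
The NaN-sentinel pattern looks roughly like this. The array sizes and values are invented, and the mean is only a stand-in for the real metrics.

```python
import numpy as np

n_gages, n_days = 4, 3
predictions = np.full((n_gages, n_days), np.nan)

# Hypothetical: gages 0 and 2 have at least one edge and get routed;
# gages 1 and 3 are headwaters and are skipped, so their rows stay NaN.
predictions[0] = [1.0, 2.0, 3.0]
predictions[2] = [4.0, 5.0, 6.0]

# Derive the routed set from the data instead of carrying a separate mask.
routed = ~np.isnan(predictions).all(axis=1)
mean_flow = predictions[routed].mean(axis=1)
print(routed.tolist(), mean_flow.tolist())  # [True, False, True, False] [2.0, 5.0]
```

Every consumer (metrics, mass balance, maps, hydrographs) can recompute the same routed set from the predictions array, so there is no second mask to keep in sync.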

Style

  • All tqdm() calls across the project now consistently use ncols=140, ascii=True

Tests

  • Added TestIsolatedCOMIDs (6 tests) — verifies isolated COMIDs appear in topological order, correct matrix shape, no spurious edges,
    lower triangularity, zarr roundtrip
  • Added TestHeadwaterGaugeAdjacency (4 tests) — verifies headwater and isolated gages produce zarr subgroups with empty COO and correct
    attrs

Written by Claude Bot 🤖

@taddyb taddyb merged commit a7514b7 into DeepGroundwater:master Feb 8, 2026
4 checks passed
@taddyb taddyb deleted the hot_start_headwater branch February 8, 2026 02:15