This repository contains code to calibrate a multi‐reservoir (tank) hydrologic model against observed runoff data using a Genetic Algorithm (GA). The calibration process is performed over multiple seasonal periods, optimizing storage parameters, runoff coefficients, side‐outlet heights, and infiltration rates to maximize the coefficient of determination (R²) between simulated and observed runoff.
The main steps are:
- Load observed data (precipitation, actual evapotranspiration, runoff) from a CSV file.
- Define seasonal periods for calibration (February–June, June–September, October–January for years 2009–2012).
- Initialize a GA optimizer (`GAOptimizer`) with parameter ranges and genetic‐algorithm settings.
- Run GA to find the best tank‐model coefficients for each period.
- Simulate the tank model forward using the optimized coefficients.
- Visualize both the simulated vs. observed runoff (via `visualize_runoff`) and the GA‐parameter‐history (`visualize_tuning`) for each period.
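
To make the roles of the calibrated parameters concrete, here is a minimal sketch of one timestep of a tank cascade. It is not the repository's `Tank` class. The function name `tank_cascade_step`, the single side outlet per tank, and the rule that evapotranspiration is taken from the top tank only are simplifying assumptions made for illustration.

```python
from typing import List, Sequence, Tuple


def tank_cascade_step(
    storage: Sequence[float],
    runoff_coef: Sequence[float],
    side_height: Sequence[float],
    infil_coef: Sequence[float],
    precip: float,
    aet: float,
) -> Tuple[List[float], float]:
    """Advance a tank cascade by one timestep (illustrative assumption,
    not the repository's implementation).

    Returns the updated storages and the summed side-outlet runoff.
    """
    new_storage: List[float] = []
    total_runoff = 0.0
    inflow = precip  # the top tank is fed by precipitation
    for k, level in enumerate(storage):
        level += inflow
        if k == 0:
            # Assumption: evapotranspiration is removed from the top tank only.
            level = max(level - aet, 0.0)
        # Side outlet: releases water only above the outlet height.
        side = runoff_coef[k] * max(level - side_height[k], 0.0)
        # Bottom outlet: infiltration proportional to the stored water.
        infil = infil_coef[k] * level
        new_storage.append(level - side - infil)
        total_runoff += side
        inflow = infil  # infiltration feeds the next tank down
    return new_storage, total_runoff


# Two-tank example with made-up coefficients (not calibrated values).
storage, runoff = tank_cascade_step(
    storage=[10.0, 0.0],
    runoff_coef=[0.5, 0.1],
    side_height=[5.0, 0.0],
    infil_coef=[0.2, 0.0],
    precip=10.0,
    aet=0.0,
)
```

In this sketch storage stays non‐negative as long as the runoff and infiltration coefficients of a tank sum to at most 1, which holds for the parameter ranges listed in this README.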
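
Requirements:
- Python 3.8 or higher (note that the example pins listed for `requirements.txt`, such as `numpy==1.25.0`, require Python 3.9 or newer)
- The following Python packages (install via `pip`):
  - `numpy`
  - `pandas`
  - `matplotlib`
  - Any other dependencies required by modules under `src/` (e.g., if additional packages are imported there)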

Installation:
- Clone this repository (or download the ZIP):
  ```bash
  git clone https://github.com/yourusername/tank-hydrologic-ga.git
  cd tank-hydrologic-ga
  ```
- Verify the directory structure:
  ```
  tank-hydrologic-ga/
  ├── data/
  │   └── 3009680_p4.csv
  ├── src/
  │   ├── model.py
  │   ├── opt.py
  │   └── vis.py
  ├── run_calibration.py
  ├── README.md
  └── requirements.txt (optional)
  ```
  - `data/3009680_p4.csv` must contain at least the following columns:
    - `date` (YYYY-MM-DD or parseable by `pd.to_datetime`)
    - `P` (precipitation time series)
    - `Q` (observed runoff time series)
    - `AET` (actual evapotranspiration time series)
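
Before starting a long calibration run it can help to confirm that the CSV really has the required columns. The sketch below is a dependency‐free check using only the standard library. The helper name `check_columns` is hypothetical, and it assumes ISO `YYYY-MM-DD` dates, which is stricter than what `pd.to_datetime` accepts.

```python
import csv
import io
from datetime import date

REQUIRED_COLUMNS = ("date", "P", "Q", "AET")


def check_columns(handle) -> int:
    """Validate the header and the date column; return the number of data rows."""
    reader = csv.DictReader(handle)
    header = reader.fieldnames or []
    missing = [name for name in REQUIRED_COLUMNS if name not in header]
    if missing:
        raise ValueError(f"missing required columns: {missing}")
    n_rows = 0
    for row in reader:
        date.fromisoformat(row["date"])  # raises ValueError on a malformed date
        n_rows += 1
    return n_rows


# In-memory sample with invented values; replace with open("data/3009680_p4.csv").
sample = io.StringIO(
    "date,P,Q,AET\n"
    "2009-02-01,0.0,1.2,0.4\n"
    "2009-02-02,3.5,1.6,0.5\n"
)
n_valid_rows = check_columns(sample)
```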

Usage:
- Place your data file (e.g., `3009680_p4.csv`) under the `data/` directory.
- Edit configuration parameters at the top of `run_calibration.py` if needed:
  - `TANK_LEVEL`: Number of tanks in the cascade (e.g., 4).
  - `TIMESTEPS`: Number of timesteps per day. Note that 60 × 60 × 24 = 86,400 is the number of seconds in a day; hourly data corresponds to 24.
  - `AREA`: Catchment area (e.g., 601.61 km², or another unit as long as it is consistent with the runoff data).
  - `DATA_PATH` and `DATA_NAME`: Path and filename under `data/`.
  - Parameter ranges for GA:
    ```python
    STORAGE_RANGE = None
    RUNOFF_RANGE = [
        (0.1, 0.5),      # Tank 1 runoff coefficient
        (0.1, 0.5),      # Tank 2 runoff coefficient
        (0.03, 0.1),     # Tank 3 runoff coefficient
        (0.005, 0.01),   # Tank 4 runoff coefficient
        (0.0005, 0.01),  # Outlet runoff from last tank
    ]
    SIDE_RANGE = [
        (5, 60),  # Side‐outlet height tank 1 (mm or consistent units)
        (5, 60),  # Side‐outlet height tank 2
        (0, 50),  # Side‐outlet height tank 3
        (0, 30),  # Side‐outlet height tank 4
        (0, 0),   # No side‐outlet for last tank (fixed)
    ]
    INFIL_RANGE = [
        (0.1, 0.5),     # Infiltration rate tank 1
        (0.01, 0.1),    # Infiltration rate tank 2
        (0.005, 0.01),  # Infiltration rate tank 3
        (0, 0),         # No infiltration for tanks 4+ (fixed zero)
    ]
    ```
  - GA settings:
    ```python
    GENERATION = 200  # Number of GA generations per period
    POPULATION = 100  # Population size per generation
    ```
- Run the calibration script (`run_calibration.py`). It loops over each defined seasonal period (Feb 1–Jun 14, Jun 15–Sep 30, Oct 1–Jan 31 for 2009–2012). For each period:
  - A new `GAOptimizer` instance is created with the storage fixed to the last storage state of the previous period (no storage is fixed for the first period).
  - The GA searches for the set of tank parameters (storage, runoff coefficients, side‐outlet heights, infiltration rates) that maximizes the R² between simulated and observed runoff.
  - After the GA finishes, the optimized coefficients are printed.
  - The tank model is run forward for that period; `visualize_runoff(...)` plots observed vs. simulated runoff, and `visualize_tuning(...)` plots the GA’s parameter history (fitness vs. generation, parameter distributions over time).
- Results:
  - Each period’s best coefficients are printed to the console.
  - Time‐series plots of the runoff comparison (`visualize_runoff`) and the GA tuning history (`visualize_tuning`) appear interactively (or are saved to disk if `save_path` is provided).
  - The final storage from the last period is printed (and can be reused if extending beyond 2012).
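
The `(low, high)` pairs in the configuration double as a way to pin a parameter: a pair such as `(0, 0)` leaves no freedom. As a rough sketch of how such ranges could be turned into a candidate parameter set, the snippet below draws one value per pair and clips values back into range. The helper names `sample_candidate` and `clip_to_ranges` are hypothetical and are not part of `GAOptimizer`.

```python
import random
from typing import List, Sequence, Tuple

Range = Tuple[float, float]

# Copied from the configuration shown in this README.
SIDE_RANGE: List[Range] = [(5, 60), (5, 60), (0, 50), (0, 30), (0, 0)]
INFIL_RANGE: List[Range] = [(0.1, 0.5), (0.01, 0.1), (0.005, 0.01), (0, 0)]


def sample_candidate(ranges: Sequence[Range], rng: random.Random) -> List[float]:
    """Draw one value per (low, high) pair; a pair with low == high stays fixed."""
    return [rng.uniform(low, high) for low, high in ranges]


def clip_to_ranges(values: Sequence[float], ranges: Sequence[Range]) -> List[float]:
    """Force values back inside their ranges, e.g. after a mutation step."""
    return [min(max(v, low), high) for v, (low, high) in zip(values, ranges)]


rng = random.Random(42)  # fixed seed so the sketch is reproducible
side_heights = sample_candidate(SIDE_RANGE, rng)
infil_rates = sample_candidate(INFIL_RANGE, rng)
clipped = clip_to_ranges([70.0, -3.0, 25.0, 10.0, 1.0], SIDE_RANGE)
```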

- Seasonal Periods

  Defined in `run_calibration.py` as:
  ```python
  years = range(2009, 2013)
  periods = []
  for y in years:
      periods.extend([
          (f"{y}-02-01", f"{y}-06-14"),
          (f"{y}-06-15", f"{y}-09-30"),
          (f"{y}-10-01", f"{y+1}-01-31"),
      ])
  ```
  This covers Feb 1–Jun 14, Jun 15–Sep 30, and Oct 1–Jan 31 for each year from 2009 through 2012 (inclusive), so the last period ends on 2013-01-31. Adjust as needed.
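- Objective Function

  A custom R² score is defined:
  ```python
  import numpy as np

  def r2_score(y_true, y_pred):
      y_true = np.array(y_true)
      y_pred = np.array(y_pred)
      y_mean = np.mean(y_true)                 # mean of the observations
      ss_res = np.sum((y_true - y_pred)**2)    # residual sum of squares
      ss_tot = np.sum((y_true - y_mean)**2)    # total sum of squares
      return 1 - (ss_res / ss_tot)
  ```
  The GA maximizes this metric. Because it is computed against the mean of the observations, this expression has the same form as the Nash–Sutcliffe Efficiency; it is undefined when the observed series is constant (`ss_tot` is zero).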
- Fixed Storage
  - `final_storage` is initially `None`.
  - After each period’s simulation, `final_storage` is set to the last storage state from the best model.
  - In subsequent periods, `fixed_storage=final_storage` ensures continuity between seasons.
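
The storage hand‐over between periods can be sketched as a simple loop. The function `calibrate_period` below is a hypothetical stand‐in for the GA calibration and forward simulation of one period, and its arithmetic is a placeholder so the hand‐over can be followed; only the period definition and the `final_storage` / `fixed_storage` names come from this README.

```python
from typing import List, Optional, Tuple


def calibrate_period(
    start: str, end: str, fixed_storage: Optional[List[float]]
) -> List[float]:
    """Hypothetical stand-in for one GA calibration plus forward simulation.

    Returns the storage state at the end of the period. The arithmetic is a
    placeholder, not a hydrologic model.
    """
    if fixed_storage is None:
        # First period: no previous state exists, so start from an assumed one.
        fixed_storage = [0.0, 0.0, 0.0, 0.0]
    return [level + 1.0 for level in fixed_storage]


# Same period definition as in run_calibration.py.
periods: List[Tuple[str, str]] = []
for y in range(2009, 2013):
    periods.extend([
        (f"{y}-02-01", f"{y}-06-14"),
        (f"{y}-06-15", f"{y}-09-30"),
        (f"{y}-10-01", f"{y+1}-01-31"),
    ])

final_storage: Optional[List[float]] = None  # nothing to carry into the first period
for start, end in periods:
    # Each period starts from the storage left behind by the previous one.
    final_storage = calibrate_period(start, end, fixed_storage=final_storage)
```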
```
tank-hydrologic-ga/
├── data/                     # Input time‐series data (CSV)
│   └── 3009680_p4.csv        # Must contain columns: date, P, Q, AET
│
├── src/                      # Core Python modules
│   ├── model.py              # Tank model implementation
│   ├── opt.py                # Genetic Algorithm optimizer
│   └── vis.py                # Visualization utilities
│
├── run_calibration.py        # Main script that coordinates:
│                             #   • data loading
│                             #   • period splitting
│                             #   • GA calibration
│                             #   • simulation and visualization
│
├── README.md                 # Project documentation (this file)
├── requirements.txt          # (Optional) pinned dependencies, e.g.:
│                             #     numpy==1.25.0
│                             #     pandas==2.1.0
│                             #     matplotlib==3.8.0
│
└── results/                  # (Optional) output directory for saved plots/coefficients
```
- Data Folder (`data/`)
  - Place your CSV file named exactly `3009680_p4.csv` (or update `DATA_NAME` in `run_calibration.py` if you prefer a different name).
  - The CSV must have:
    - `date` (parsable by `pd.to_datetime`),
    - `P` (precipitation),
    - `Q` (observed runoff),
    - `AET` (actual evapotranspiration).
- Source Folder (`src/`)
  - `model.py`:
    - Implements a `Tank` class that encapsulates storage → runoff → infiltration computations per timestep.
    - `make_models(...)` constructs a multi‐tank cascade using the best‐found coefficients.
  - `opt.py`:
    - Implements `GAOptimizer`, including:
      - Parameter encoding/decoding to/from real‐valued vectors.
      - Population initialization, selection (e.g., tournament or roulette), crossover, and mutation operators.
      - Fitness evaluation loop (calls the tank model simulation internally).
  - `vis.py`:
    - Two primary functions: `visualize_runoff` and `visualize_tuning`.
    - By default, uses `matplotlib` to create time‐series and convergence plots.
- Main Script (`run_calibration.py`)
  - This is the entry point. It does not depend on any other external scripts (aside from `src/` modules).
  - To adapt this script for different data or parameter ranges, simply edit the top section (constants and ranges).
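
The operators listed for `opt.py` can be illustrated with a small real‐valued GA. This is a sketch under stated assumptions, not the repository's `GAOptimizer`: the function `run_ga`, the binary tournament, the uniform crossover, the Gaussian mutation, the single elite, and the toy fitness function are all choices made for this example.

```python
import random
from typing import Callable, List, Sequence, Tuple

Bounds = Sequence[Tuple[float, float]]


def run_ga(
    fitness: Callable[[List[float]], float],
    bounds: Bounds,
    population: int = 30,
    generations: int = 60,
    mutation_rate: float = 0.2,
    seed: int = 0,
) -> Tuple[List[float], List[float]]:
    """Maximize `fitness` inside `bounds`; return (best individual, best-fitness history)."""
    rng = random.Random(seed)

    def tournament(scored):
        # Binary tournament: pick two individuals at random, keep the fitter one.
        a, b = rng.sample(scored, 2)
        return a[1] if a[0] >= b[0] else b[1]

    def mutate(genes):
        child = []
        for g, (lo, hi) in zip(genes, bounds):
            if rng.random() < mutation_rate:
                g += rng.gauss(0.0, 0.1 * (hi - lo))  # Gaussian perturbation
            child.append(min(max(g, lo), hi))  # clip back into the range
        return child

    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(population)]
    history = []
    for _ in range(generations):
        scored = sorted(((fitness(ind), ind) for ind in pop),
                        key=lambda pair: pair[0], reverse=True)
        history.append(scored[0][0])
        next_pop = [scored[0][1]]  # elitism: the best individual survives unchanged
        while len(next_pop) < population:
            p1, p2 = tournament(scored), tournament(scored)
            # Uniform crossover: each gene comes from either parent with equal chance.
            child = [x if rng.random() < 0.5 else y for x, y in zip(p1, p2)]
            next_pop.append(mutate(child))
        pop = next_pop
    return max(pop, key=fitness), history


def toy_fitness(x: List[float]) -> float:
    """Toy objective (assumption): negative squared distance to (0.3, 20)."""
    return -((x[0] - 0.3) ** 2 + (x[1] - 20.0) ** 2)


toy_bounds = [(0.1, 0.5), (5.0, 60.0)]
best, history = run_ga(toy_fitness, toy_bounds)
```

Because the best individual is copied unchanged into each new generation, the recorded best fitness can never decrease from one generation to the next.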
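- Change Number of Tanks
  - Modify `TANK_LEVEL` in `run_calibration.py` (e.g., `TANK_LEVEL = 5` for a five‐tank cascade).
  - Ensure you update the `RUNOFF_RANGE`, `SIDE_RANGE`, and `INFIL_RANGE` lists to have length ≥ `TANK_LEVEL + 1` where appropriate. In the four‐tank example above, `RUNOFF_RANGE` and `SIDE_RANGE` have `TANK_LEVEL + 1` entries, while `INFIL_RANGE` has `TANK_LEVEL` entries.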
- Use a Different Objective
  - Replace the `r2_score` function with any user‐defined metric (e.g., Kling–Gupta Efficiency, RMSE, or other statistical metrics; note that `r2_score` as defined above already has the same form as the Nash–Sutcliffe Efficiency).
  - Pass that function to `GAOptimizer(..., objective=<your_function>, direction="maximize" or "minimize")`.
- Save Plots to Disk
  - In each call to `visualize_runoff(...)` or `visualize_tuning(...)`, set `save_path` to a `str` or `pathlib.Path` to write PNG/PDF files automatically:
    ```python
    plot_path = DATA_PATH / "results" / f"runoff_{start.date()}_{end.date()}.png"
    visualize_runoff(dates, runoff_target, total_runoff, show=False, save_path=plot_path)
    ```
- Parallel GA Fitness Evaluations
  - If running on a multi‐core machine, consider parallelizing the fitness evaluation in `GAOptimizer` (e.g., using `multiprocessing.Pool`). Because each individual’s fitness can be evaluated independently of the others, this can substantially reduce run times for large populations.
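
As an illustration of the "Use a Different Objective" option, the sketch below defines two alternative metrics in plain Python: RMSE, which would be minimized, and the Kling–Gupta Efficiency, which would be maximized. The function names are illustrative, and the `(y_true, y_pred)` call signature is an assumption that mirrors `r2_score`; check how `GAOptimizer` actually calls its objective before plugging these in.

```python
import math
from typing import Sequence


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root-mean-square error; lower is better (use with direction="minimize")."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def kling_gupta_efficiency(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Kling-Gupta Efficiency; 1 is a perfect fit (use with direction="maximize").

    Undefined (division by zero) if either series is constant or the
    observed mean is zero.
    """
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    std_t = math.sqrt(sum((t - mean_t) ** 2 for t in y_true) / n)
    std_p = math.sqrt(sum((p - mean_p) ** 2 for p in y_pred) / n)
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred)) / n
    r = cov / (std_t * std_p)  # linear correlation
    alpha = std_p / std_t      # variability ratio
    beta = mean_p / mean_t     # bias ratio
    return 1.0 - math.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)


# Invented series: the simulation is the observation shifted up by 1.
observed = [1.0, 2.0, 3.0, 4.0]
simulated = [2.0, 3.0, 4.0, 5.0]
error = rmse(observed, simulated)
kge = kling_gupta_efficiency(observed, simulated)
```

In this example the shift leaves correlation and variability untouched, so the KGE penalty comes entirely from the bias term.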
Plots appear one by one for each period. Close the figure window (or press “Enter” if using non‐interactive backends) to proceed to the next period’s GA.
# =============================================================================
# Lim, J. and Yang, S. (2025). *Tank model implementation in Python (Sugawara & Maruyama, 1956)* (v1.0).
# [Computer software]. Department of Civil and Environmental Engineering, Seoul National University.
# https://doi.org/10.5281/zenodo.15464005
# =============================================================================
Last updated: June 2025