Fast, accurate watershed delineation using hybrid vector- and raster-based methods and data from MERIT-Hydro

hydrosolutions/delineator


Delineator

Fast, accurate watershed delineation for any point on Earth's land surface using hybrid vector/raster methods with MERIT-Hydro and MERIT-Basins datasets.

Citation: DOI 10.5281/zenodo.7314287

Online Demo: https://mghydro.com/watersheds/ (free, easy to use, good for most users)

Overview

Delineator is a Python CLI tool for watershed delineation that combines:

  • High-resolution MERIT-Hydro raster data (flow direction and accumulation)
  • MERIT-Basins vector data (unit catchments and river networks)
  • Hybrid vector/raster algorithms for optimal speed and accuracy

The tool automatically downloads required data, handles large-scale batch processing, and outputs results in standard GIS formats.

Installation

With uv (recommended)

git clone https://github.com/hydrosolutions/delineator.git
cd delineator
uv sync

With pip

git clone https://github.com/hydrosolutions/delineator.git
cd delineator
pip install -e .

Requirements: Python 3.12+

Quick Start

1. Create an outlets file (TOML format)

# outlets.toml
[[outlets]]
gauge_id = "usgs_12345678"
lat = 47.6062
lng = -122.3321
gauge_name = "Green River near Seattle, WA"  # optional

2. Create a master configuration file

# config.toml
[settings]
output_dir = "./output"
max_fails = 100  # optional

[[regions]]
name = "my_region"
outlets = "outlets.toml"

3. Run delineation

delineator run config.toml

The tool will automatically:

  • Determine required MERIT-Hydro basins from outlet coordinates
  • Download missing data
  • Delineate watersheds
  • Output shapefiles in output/region=my_region/

CLI Commands

Run watershed delineation

delineator run config.toml
delineator run config.toml --dry-run              # Validate config without processing
delineator run config.toml -o ./output            # Override output directory
delineator run config.toml --max-fails 10         # Stop after 10 failures
delineator run config.toml --no-download          # Fail if data is missing (no auto-download)
delineator run config.toml --fill-threshold 50    # Fill holes smaller than 50 pixels
delineator run config.toml --fill-threshold 0     # Fill ALL holes (no size limit)
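
The --fill-threshold semantics are easy to misread, because 0 means "no size limit" rather than "fill nothing". The sketch below is only an illustration of the rule as described in the comments above; holes_to_fill is a hypothetical helper, not a function from this package, and the strict "smaller than" comparison is an assumption based on the wording of the flag description.

```python
def holes_to_fill(hole_sizes_px: list[int], fill_threshold: int = 100) -> list[int]:
    """Return the sizes (in pixels) of the holes that would be filled.

    Assumed rule, taken from the flag descriptions:
    - fill_threshold == 0 fills every hole, whatever its size
    - otherwise only holes strictly smaller than the threshold are filled
    """
    if fill_threshold < 0:
        raise ValueError("fill_threshold must be >= 0")
    if fill_threshold == 0:
        return list(hole_sizes_px)
    return [size for size in hole_sizes_px if size < fill_threshold]


holes = [3, 49, 50, 400]
print(holes_to_fill(holes, 50))  # [3, 49]
print(holes_to_fill(holes, 0))   # [3, 49, 50, 400]
```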

Download MERIT-Hydro data

# Download by bounding box (min_lon,min_lat,max_lon,max_lat)
delineator download --bbox -125,45,-120,50 -o data/

# Download specific basins by Pfafstetter Level 2 code
delineator download --basins 71,72,73 -o data/

# Download only rasters (no Google Drive credentials needed)
delineator download --bbox -125,45,-120,50 --rasters-only

# Preview what would be downloaded
delineator download --bbox -125,45,-120,50 --dry-run
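
If you already have outlet coordinates, the --bbox value can be derived from them rather than typed by hand. This is a minimal sketch, assuming a padding of 0.5 degrees around the outlets; bbox_arg and pad_deg are illustrative names, not part of the CLI. Note that a bounding box around the outlets may not cover their full upstream area, so a --dry-run preview is still worth doing.

```python
def bbox_arg(points: list[tuple[float, float]], pad_deg: float = 0.5) -> str:
    """Build a 'min_lon,min_lat,max_lon,max_lat' string from (lat, lng) pairs.

    The box is padded by pad_deg on every side and clamped to valid
    longitude/latitude ranges.
    """
    if not points:
        raise ValueError("need at least one (lat, lng) point")
    lats = [lat for lat, _ in points]
    lngs = [lng for _, lng in points]
    min_lon = max(min(lngs) - pad_deg, -180.0)
    min_lat = max(min(lats) - pad_deg, -90.0)
    max_lon = min(max(lngs) + pad_deg, 180.0)
    max_lat = min(max(lats) + pad_deg, 90.0)
    return f"{min_lon:.4f},{min_lat:.4f},{max_lon:.4f},{max_lat:.4f}"


# Two example outlets given as (lat, lng)
print(bbox_arg([(47.6062, -122.3321), (46.0, -121.0)]))
```

The printed string can be passed directly after --bbox.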

List available basins

delineator list-basins

Displays all 61 Pfafstetter Level 2 basin codes grouped by continent.

Configuration

Master Config (for example, config.toml)

[settings]
output_dir = "./output"               # Required: base output directory
data_dir = "~/data/merit-hydro"       # Optional: path to MERIT-Hydro data (see below)
max_fails = 100                       # Optional: stop after N failures (default: unlimited)
fill_threshold = 100                  # Optional: fill holes smaller than N pixels (default: 100, 0 = fill all)

[[regions]]
name = "region_name"         # Required: used for hive partitioning (region=name/)
outlets = "outlets.toml"     # Required: path to outlets file

Data Directory

The tool needs MERIT-Hydro rasters and MERIT-Basins vectors on disk. You can configure where it looks for this data:

Option 1: Config file (per-project)

[settings]
data_dir = "~/data/merit-hydro"

Option 2: Environment variable (global default)

export DELINEATOR_DATA_DIR="$HOME/data/merit-hydro"

Fallback chain: config file > env var > {output_dir}/../data
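
The fallback chain can be pictured as the following lookup. This is a sketch of the documented order only, not the package's actual code; resolve_data_dir is a hypothetical name, and expanding "~" in the paths is an assumption.

```python
import os
from pathlib import Path


def resolve_data_dir(config_data_dir, output_dir, env=None) -> Path:
    """Pick the data directory: config file, then env var, then default.

    env defaults to os.environ; passing a plain dict makes this testable.
    """
    env = os.environ if env is None else env
    if config_data_dir:                       # 1. [settings] data_dir
        return Path(config_data_dir).expanduser()
    env_value = env.get("DELINEATOR_DATA_DIR")
    if env_value:                             # 2. environment variable
        return Path(env_value).expanduser()
    # 3. {output_dir}/../data, i.e. a "data" folder next to output_dir
    return Path(output_dir).parent / "data"


print(resolve_data_dir(None, "./output", env={}))
print(resolve_data_dir(None, "./output", env={"DELINEATOR_DATA_DIR": "/srv/merit"}))
print(resolve_data_dir("/mnt/merit", "./output", env={"DELINEATOR_DATA_DIR": "/srv/merit"}))
```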

For a central data directory used across all projects, add the env var to your shell profile (for example ~/.zshrc or ~/.bashrc):

export DELINEATOR_DATA_DIR="$HOME/data/merit-hydro"

Outlets File Format

[[outlets]]
gauge_id = "unique_id"       # Required: unique identifier (used in output filenames)
lat = 47.6062                # Required: latitude (decimal degrees, EPSG:4326)
lng = -122.3321              # Required: longitude (decimal degrees, EPSG:4326)
gauge_name = "River Name"    # Optional: descriptive name

See examples/ directory for complete configuration examples.

Output Structure

output/
├── region=my_region/
│   └── my_region.shp        # Shapefile with all watersheds for this region
└── FAILED.csv               # Log of failed outlets (if any)

Each watershed shapefile includes attributes:

  • gauge_id: Unique identifier
  • gauge_name: Descriptive name (if provided)
  • area: Watershed area (km²)
  • country: Country code
  • Geometry: Watershed polygon
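
The region=name/ layout follows the hive-partitioning convention, so output paths can be built or discovered from the region name alone. A small sketch under that assumption; region_shapefile and list_regions are illustrative helpers, not part of the package:

```python
from pathlib import Path


def region_shapefile(output_dir, region_name: str) -> Path:
    """Path of the shapefile for one region, per the layout shown above."""
    return Path(output_dir) / f"region={region_name}" / f"{region_name}.shp"


def list_regions(output_dir) -> list[str]:
    """Names of all region=<name> directories found under output_dir."""
    return sorted(
        p.name.split("=", 1)[1]
        for p in Path(output_dir).glob("region=*")
        if p.is_dir()
    )


print(region_shapefile("output", "my_region"))
```

The resulting path can then be opened with any GIS library or desktop GIS that reads shapefiles.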

Data Sources

The tool uses data from:

  • MERIT-Hydro: High-resolution flow direction and accumulation rasters (3-arcsecond, ~90m resolution)
  • MERIT-Basins: Vector unit catchments and river networks

Data is organized by Pfafstetter Level 2 basins (61 continental-scale basins worldwide). The tool automatically downloads required data on first use.
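
The "~90m" figure for 3-arcsecond data is the pixel size near the equator; pixels get narrower in the east-west direction at higher latitudes. The short calculation below shows where the number comes from, assuming a spherical Earth with an equatorial circumference of 40,075.017 km. It is a back-of-the-envelope estimate, not something the tool computes.

```python
import math

EQUATORIAL_CIRCUMFERENCE_M = 40_075_017.0  # spherical-Earth approximation


def pixel_size_m(lat_deg: float, arcsec: float = 3.0) -> tuple[float, float]:
    """Approximate (width, height) in metres of one raster pixel at a latitude."""
    metres_per_degree = EQUATORIAL_CIRCUMFERENCE_M / 360.0
    height = arcsec / 3600.0 * metres_per_degree          # north-south extent
    width = height * math.cos(math.radians(lat_deg))      # east-west extent
    return width, height


for lat in (0.0, 47.6, 60.0):
    width, height = pixel_size_m(lat)
    print(f"lat {lat:5.1f}: {width:5.1f} m wide x {height:5.1f} m high")
```

At the equator this gives roughly 92.8 m on each side; at 60 degrees latitude the width is about half that.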

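For manual download or offline use, fetch the data ahead of time with the delineator download command described above, then run with --no-download. Vector downloads need the Google Drive setup described below.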

Google Drive Setup (for Vector Downloads)

Raster data (flow direction, accumulation) downloads automatically from mghydro.com — no credentials needed.

Vector data (MERIT-Basins catchments and rivers) is hosted on Google Drive and requires authentication. If you only need rasters, use --rasters-only to skip this setup.

1. Set the MERIT-Basins Folder ID

The vector data is hosted in a Google Drive folder. Set this environment variable:

export MERIT_BASINS_FOLDER_ID="1owkvZQBMZbvRv3V4Ff3xQPEgmAC48vJo"

This is the pfaf_level_02 folder from the MERIT-Basins bugfix1 release.

2. Create Google Cloud Service Account Credentials

You need a service account to authenticate with Google Drive API.

Option A: Using gcloud CLI (recommended)

# If you have an existing project and service account:
gcloud iam service-accounts keys create ~/drive-credentials.json \
    --iam-account=YOUR_SERVICE_ACCOUNT@YOUR_PROJECT.iam.gserviceaccount.com

# Or create everything from scratch:
gcloud projects create my-delineator-project --set-as-default
gcloud services enable drive.googleapis.com
gcloud iam service-accounts create delineator-sa \
    --display-name="Delineator Service Account"
gcloud iam service-accounts keys create ~/drive-credentials.json \
    --iam-account=delineator-sa@my-delineator-project.iam.gserviceaccount.com

Option B: Using Google Cloud Console

  1. Go to Google Cloud Console
  2. Create or select a project
  3. Enable the Google Drive API
  4. Go to IAM & Admin > Service Accounts
  5. Create a service account and download the JSON key

3. Set the Credentials Environment Variable

export GOOGLE_APPLICATION_CREDENTIALS="$HOME/drive-credentials.json"

4. Add to Shell Profile (Recommended)

Add these lines to your ~/.zshrc or ~/.bashrc for persistence:

export GOOGLE_APPLICATION_CREDENTIALS="$HOME/drive-credentials.json"
export MERIT_BASINS_FOLDER_ID="1owkvZQBMZbvRv3V4Ff3xQPEgmAC48vJo"

Then reload the file you edited: source ~/.zshrc (or source ~/.bashrc)

Verification

Test that everything works:

delineator download --basins 12 --dry-run

This should show the files that would be downloaded without errors.
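
If the dry run fails, a quick check of the two environment variables from the steps above often narrows down the cause. The helper below is a hypothetical diagnostic, not part of delineator; it only confirms that the variables are set and that the key file exists, not that the credentials are valid for Google Drive.

```python
import os
from pathlib import Path


def drive_setup_problems(env=None) -> list[str]:
    """Return a list of problems with the Google Drive environment setup."""
    env = os.environ if env is None else env
    problems = []
    if not env.get("MERIT_BASINS_FOLDER_ID"):
        problems.append("MERIT_BASINS_FOLDER_ID is not set")
    creds = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not creds:
        problems.append("GOOGLE_APPLICATION_CREDENTIALS is not set")
    elif not Path(creds).expanduser().is_file():
        problems.append(f"credentials file not found: {creds}")
    return problems


issues = drive_setup_problems()
print("setup looks complete" if not issues else "\n".join(issues))
```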

Development

Package Structure

src/delineator/
├── cli/          # Typer CLI (run, download, list-basins commands)
├── config/       # Pydantic configuration schema for TOML configs
├── core/         # Delineation logic (watershed algorithms, dissolve, raster ops)
└── download/     # MERIT-Hydro data download (HTTP and Google Drive)

Running Tests

uv run pytest

Formatting and Linting

uv run ruff format
uv run ruff check --fix

See CLAUDE.md for detailed development guidelines.

License

MIT License. See LICENSE file for details.

Citation

If you use this tool in research, please cite:

Heberger, M. (2022). delineator: Fast watershed delineation using MERIT-Hydro (Version 1.0.0) [Software]. Zenodo. https://doi.org/10.5281/zenodo.7314287

Acknowledgments

Original author: Matthew Heberger
