Code and configuration files accompanying:
Vlah et al. 2023. "Virtual gauges: the surprising potential to reconstruct continuous streamflow from strategic measurements". HESS.
All data for this study can be found in this Figshare collection. Within the collection you'll find:
- Input data
- Output data
- Figures
- Composite discharge series for each NEON site, including the best estimates from this study and NEON's published data. You can also visualize the composite series here.
If you use our LSTM configurations with the NeuralHydrology library, you'll need to add the "pbias" metric to NeuralHydrology (see src/lstm_dungeon/metrics.py below), or else remove that metric from the configs.
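For orientation, percent bias can be computed as sketched below. This is a generic illustration only, not the contents of src/lstm_dungeon/metrics.py: the function name, signature, and sign convention are assumptions (conventions for pbias vary, and NeuralHydrology's real metric functions operate on xarray DataArrays rather than plain arrays).

```python
import numpy as np

def pbias(obs, sim) -> float:
    """Percent bias of simulated vs. observed discharge.

    With this sign convention, positive values mean the simulation
    overestimates total flow. Pairs with a missing value are dropped.
    """
    obs = np.asarray(obs, dtype=float)
    sim = np.asarray(sim, dtype=float)
    mask = ~np.isnan(obs) & ~np.isnan(sim)
    return float(100.0 * np.sum(sim[mask] - obs[mask]) / np.sum(obs[mask]))
```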
Structure of this repo:
.                                        top level of this repo
├── cfg                                  configuration files
│   ├── donor_gauges.yml                 donor gauge IDs for each NEON site
│   ├── google_auth.cfg                  if you're going to rebuild NEON climate forcings, put your auth email for Google Earth Engine AND Google Drive here
│   └── model_ranks.csv                  adjust inclusion, omission, and order of precedence for model estimates used in building composite discharge series
│
│       N = NEON, L = linear regression, G = generalist LSTM, S = specialist LSTM
│       PG = process-guided generalist LSTM, PS = process-guided specialist LSTM
│
├── figs                                 all figures generated by this analysis; download from https://dx.doi.org/10.6084/m9.figshare.23169362
├── in                                   input data; download from https://dx.doi.org/10.6084/m9.figshare.22349377 and place here
├── log                                  log files generated during model runs will be saved here
├── out                                  all predictions, output data, and LSTM model runs; can be downloaded from https://dx.doi.org/10.6084/m9.figshare.22344589
├── q_sim.Rproj                          open this to start the project in RStudio at the correct working directory, or start with src/01_data_retrieval.R
├── README.md                            this file
└── src                                  all R, python, and shell scripts; they're set up to run only what they need, so you should be able to start anywhere, with or without /in
    ├── 00_helpers.R                     sourced in several other files
    ├── 01_data_retrieval.R              the script to start with; if you've already downloaded input data from figshare, it will only run what it needs to
    ├── 02_regression.R                  regression analysis
    ├── 03_organize_camels_macrosheds_nhm.R  data prep for LSTM runs
    ├── 04_run_lstms.R                   code for running LSTMs; if you don't have a good GPU and weeks to spare, you'll need HPC (see src/lstm_dungeon/slurm_example_config.sh)
    ├── 05_map_fig.R                     fig 2 code
    ├── 06_barplot.R                     fig 3 code
    ├── 07_build_composite_discharge.R   code for cleaning and splicing together the best estimates from all sources
    ├── 08_ts_plot.R                     fig 5 code
    ├── 09_gap_plot.R                    fig 4 code
    └── lstm_dungeon                     code you only need to visit if you're going to rerun/adapt the LSTMs from this study
        ├── camels_helpers.R             sourced in ../03_organize_camels_macrosheds_nhm.R
        ├── environment.yml              python package data for building the conda environment
        ├── metrics.py                   insert this file into the NeuralHydrology library repo at neuralhydrology/evaluation/metrics.py, then reinstall NeuralHydrology
        ├── pickle_traintest_periods.py  sourced in ../00_helpers.R
        ├── recompute_camels_climate.R   sourced in ../03_organize_camels_macrosheds_nhm.R
        ├── recompute_camels_soil.R      sourced in ../03_organize_camels_macrosheds_nhm.R
        ├── re-evaluate_models.py        sourced in ../00_helpers.R
        ├── run_lstms_hpc.py             example script for running LSTMs on HPC
        ├── run_lstms_local.py           sourced in ../00_helpers.R; may need modification on your machine
        ├── slurm_example_config.sh      example slurm batch script for running LSTMs on HPC
        ├── summarize_neon_daymet.R      sourced in ../03_organize_camels_macrosheds_nhm.R
        └── summarize_neon_pet.R         sourced in ../03_organize_camels_macrosheds_nhm.R
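To illustrate how a precedence table like cfg/model_ranks.csv might drive the compositing step, here is a hedged sketch of a per-day "highest-ranked available estimate wins" rule. The function name, data layout, and the rule itself are assumptions for illustration; the actual splicing logic lives in src/07_build_composite_discharge.R (in R, not python).

```python
# Hypothetical sketch: splice daily discharge estimates from several models
# into one composite series. Model codes follow the README legend
# (N, L, G, S, PG, PS).

def build_composite(estimates, precedence):
    """estimates:  {model_code: {date: value_or_None}}
    precedence: model codes, best first (e.g. as ranked in model_ranks.csv).
    Returns {date: (value, source_model)} for every date with any estimate."""
    dates = sorted({d for series in estimates.values() for d in series})
    composite = {}
    for d in dates:
        for model in precedence:
            v = estimates.get(model, {}).get(d)
            if v is not None:  # take the first (highest-ranked) estimate
                composite[d] = (v, model)
                break
    return composite
```

For example, if NEON's published value is missing on a given day, the composite would fall back to the next-ranked model's estimate for that day.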