Conversation
Pull request overview
This PR appears to be a merge attempt that updates SNOTEL retrieval to use `requests`, adds a Landsat-derived NDSI time series helper using Google Earth Engine, and introduces an Earth Engine exercise notebook demonstrating data download/merging/plotting.
Changes:
- Switch the SNOTEL CSV download in `getSNOTELData` from `urllib3` to `requests`.
- Add a `get_Landsat_NDSI(...)` Earth Engine helper to compute basin-mean NDSI over time.
- Add a new `EE_Exercise.ipynb` notebook to run an end-to-end workflow (NLDI basin, SNOTEL download, Landsat NDSI, merge + plot).
Reviewed changes
Copilot reviewed 2 out of 4 changed files in this pull request and generated 7 comments.
| File | Description |
|---|---|
| supporting_scripts/getData.py | Uses requests for SNOTEL retrieval and adds an EE-based Landsat NDSI helper. |
| EE_Exercise.ipynb | New notebook demonstrating EE login, basin setup, SNOTEL download helper, Landsat NDSI helper, merge, plot, and save. |
First definition of `download_snotel_csv` (code cell):

```python
def download_snotel_csv(SiteName, SiteID, StateAbb, StartDate, EndDate, OutputFolder, tries=5):
    out_csv = f"./{OutputFolder}/df_{SiteID}_{StateAbb}_SNTL.csv"

    if os.path.exists(out_csv):
        df = pd.read_csv(out_csv)
        df["Date"] = pd.to_datetime(df["Date"])
        return df.set_index("Date")

    url = (
        "https://wcc.sc.egov.usda.gov/reportGenerator/view_csv/"
        f"customSingleStationReport/daily/start_of_period/{SiteID}:{StateAbb}:SNTL"
        f"%7Cid=%22%22%7Cname/{StartDate},{EndDate}/WTEQ::value"
        "?fitToScreen=false"
    )

    for attempt in range(tries):
        try:
            r = requests.get(url, timeout=60)
            r.raise_for_status()

            with open(out_csv, "w", encoding="utf-8") as f:
                f.write(r.text)

            df = pd.read_csv(out_csv)
            df["Date"] = pd.to_datetime(df["Date"])
            return df.set_index("Date")

        except Exception as e:
            if attempt == tries - 1:
                raise
            wait = 2 ** attempt
            print(f"SNOTEL download failed: {e}")
            print(f"Retrying in {wait} seconds...")
            time.sleep(wait)
```

Markdown cell (id `25e42b8b`): "Snowtel Data"

Second definition (code cell, id `43a4d936`):

```python
def download_snotel_csv(SiteName, SiteID, StateAbb, StartDate, EndDate, OutputFolder, tries=5):
    os.makedirs(OutputFolder, exist_ok=True)
    out_csv = f"./{OutputFolder}/df_{SiteID}_{StateAbb}_SNTL.csv"
```
`download_snotel_csv` is defined twice in the notebook; the second definition overwrites the first (with different timeout behavior), which is confusing and makes execution order error-prone. Keep a single definition (or rename one of them) and centralize the parameter differences (e.g., as a `timeout` argument).
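If the two notebook copies are collapsed into one definition as suggested, the differences can be parameterized. A minimal sketch follows; the `build_snotel_url` helper and the `timeout` keyword are additions of this sketch, not part of the PR:

```python
import os
import time


def build_snotel_url(site_id, state_abb, start_date, end_date):
    # Hypothetical helper: isolates the report-generator URL so it can be
    # unit-tested without a network call.
    return (
        "https://wcc.sc.egov.usda.gov/reportGenerator/view_csv/"
        f"customSingleStationReport/daily/start_of_period/{site_id}:{state_abb}:SNTL"
        f"%7Cid=%22%22%7Cname/{start_date},{end_date}/WTEQ::value"
        "?fitToScreen=false"
    )


def download_snotel_csv(SiteName, SiteID, StateAbb, StartDate, EndDate,
                        OutputFolder, tries=5, timeout=60):
    # Single definition: the behavioral differences between the two notebook
    # copies (timeout value, makedirs) are now a keyword / always-on step.
    import pandas as pd   # deferred so build_snotel_url stays importable
    import requests       # without third-party dependencies

    os.makedirs(OutputFolder, exist_ok=True)
    out_csv = f"./{OutputFolder}/df_{SiteID}_{StateAbb}_SNTL.csv"

    if os.path.exists(out_csv):
        df = pd.read_csv(out_csv)
        df["Date"] = pd.to_datetime(df["Date"])
        return df.set_index("Date")

    url = build_snotel_url(SiteID, StateAbb, StartDate, EndDate)

    for attempt in range(tries):
        try:
            r = requests.get(url, timeout=timeout)
            r.raise_for_status()
            with open(out_csv, "w", encoding="utf-8") as f:
                f.write(r.text)
            df = pd.read_csv(out_csv)
            df["Date"] = pd.to_datetime(df["Date"])
            return df.set_index("Date")
        except Exception as e:
            if attempt == tries - 1:
                raise
            wait = 2 ** attempt  # exponential backoff between retries
            print(f"SNOTEL download failed: {e}; retrying in {wait} s...")
            time.sleep(wait)
```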
Code cell:

```python
stream = stream_df.copy()
```
`stream_df` is never defined anywhere in this notebook, so `stream = stream_df.copy()` will fail at runtime. Either create `stream_df` in the earlier “Streamflow data” section (e.g., using `getData.get_usgs_streamflow(...)`) or update this cell to use the dataframe you actually created (`daily_NLDAS_df` or similar).
Suggested change:

```diff
-stream = stream_df.copy()
+stream = daily_NLDAS_df.copy()
```
Markdown cell (id `0f5502c9`): "Clean and Allign"
Spelling: "Clean and Allign" → "Clean and Align".
Suggested change:

```diff
-Clean and Allign
+Clean and Align
```
```diff
@@ -1,5 +1,6 @@
 import os
 import sys
+from urllib import response
```
`from urllib import response` is unused here and also collides with the local variable name `response` used later in `getSNOTELData`, which makes the code harder to follow. Remove this import (or rename the local variable if you truly need `urllib.response`).
```python
response = requests.get(url, timeout=60)
response.raise_for_status()
data = response.text
```
`getSNOTELData` now calls `requests.get(...)`, but the `requests` module isn't imported anywhere in this file, so this will raise `NameError: name 'requests' is not defined` at runtime. Add an `import requests` (top-level or inside the function) before using it.
```python
ee.Authenticate()
ee.Initialize()
```
`get_Landsat_NDSI` performs `ee.Authenticate()`/`ee.Initialize()` on every call, which can trigger interactive auth repeatedly and makes this helper hard to use in non-interactive/production contexts. Prefer authenticating/initializing once at application entry (or guard initialization so repeated calls are no-ops).
```python
n = col.size().getInfo()
img_list = col.toList(n)

records = []

for i in range(n):
    img = ee.Image(img_list.get(i))

    qa = img.select("QA_PIXEL")
    cloud_mask = (
        qa.bitwiseAnd(1 << 1).eq(0)
        .And(qa.bitwiseAnd(1 << 2).eq(0))
        .And(qa.bitwiseAnd(1 << 3).eq(0))
        .And(qa.bitwiseAnd(1 << 4).eq(0))
    )

    green = img.select("SR_B3").multiply(2.75e-05).add(-0.2)
    swir1 = img.select("SR_B6").multiply(2.75e-05).add(-0.2)

    ndsi = green.subtract(swir1).divide(green.add(swir1)).rename("NDSI")
    ndsi = ndsi.updateMask(cloud_mask)

    stat = ndsi.reduceRegion(
        reducer=ee.Reducer.mean(),
        geometry=basin,
        scale=30,
        maxPixels=1e9,
    ).getInfo()

    if stat and stat.get("NDSI") is not None:
        records.append({
            "Date": pd.to_datetime(img.date().format("YYYY-MM-dd").getInfo()),
            "NDSI": stat["NDSI"],
```
This implementation makes many client↔server round-trips to Earth Engine (`col.size().getInfo()`, `img_list.get(i)`, `reduceRegion(...).getInfo()`, and `img.date()....getInfo()` inside the loop). For larger date ranges this will be very slow and may hit EE quotas/timeouts. Consider doing the reduction server-side (e.g., map over the collection to a FeatureCollection with date+mean NDSI properties) and only call `getInfo()` once (or export the table).
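A sketch of that server-side variant (the function names, including the `features_to_df` helper, are hypothetical; band scaling and QA bits mirror the diffed code):

```python
import pandas as pd


def basin_mean_ndsi(col, basin):
    # Map the per-image work over the collection server-side; nothing is
    # pulled to the client until the single getInfo() below.
    import ee

    def per_image(img):
        qa = img.select("QA_PIXEL")
        mask = (
            qa.bitwiseAnd(1 << 1).eq(0)
            .And(qa.bitwiseAnd(1 << 2).eq(0))
            .And(qa.bitwiseAnd(1 << 3).eq(0))
            .And(qa.bitwiseAnd(1 << 4).eq(0))
        )
        green = img.select("SR_B3").multiply(2.75e-05).add(-0.2)
        swir1 = img.select("SR_B6").multiply(2.75e-05).add(-0.2)
        ndsi = (green.subtract(swir1)
                .divide(green.add(swir1))
                .rename("NDSI")
                .updateMask(mask))
        mean = ndsi.reduceRegion(
            reducer=ee.Reducer.mean(), geometry=basin, scale=30, maxPixels=1e9
        )
        return ee.Feature(None, {
            "Date": img.date().format("YYYY-MM-dd"),
            "NDSI": mean.get("NDSI"),
        })

    fc = ee.FeatureCollection(col.map(per_image))
    return features_to_df(fc.getInfo())  # single client<->server round-trip


def features_to_df(info):
    # Pure post-processing: turn the getInfo() payload into a dated frame,
    # dropping images whose basin mean was fully masked out.
    rows = [f["properties"] for f in info["features"]
            if f["properties"].get("NDSI") is not None]
    df = pd.DataFrame(rows)
    df["Date"] = pd.to_datetime(df["Date"])
    return df.set_index("Date").sort_index()
```

Splitting out `features_to_df` keeps the client-side post-processing testable without Earth Engine credentials.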