Commit 5e65ad9

Merge pull request #173 from vlahm/master
ready to publish full dataset and eml
2 parents 92a715f + 65bd1e9 commit 5e65ad9

82 files changed: +3379 -1584 lines changed

.gitignore

Lines changed: 3 additions & 0 deletions
@@ -31,3 +31,6 @@ eml/data_links
 eml/eml_out
 vault/*
 .Rdata
+misc_backups
+eml/data_links_truncated
+eml/data_links_old_justforreference/

CHANGELOG.txt

Lines changed: 2 additions & 0 deletions
@@ -0,0 +1,2 @@
+for v2:
+for watersheds smaller than 10 ha, GEE reducers sometimes fail to produce a value. these watershed boundary features have been replaced with a 10 ha circle for the purposes of GEE reductions, resulting in cc_precip, cc_temp, and aet values where previously they were missing for calhoun weir_4, baltimore MCDN, etc. as a byproduct, other watershed summary values for these sites may have changed slightly from version 1
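
The changelog note above describes substituting a 10 ha circle for any watershed smaller than 10 ha. As a rough illustration of the geometry only (not the project's actual code, and saying nothing about where the circle is centred), the sketch below computes the radius of such a circle. The function names and the `min_area_ha` parameter are hypothetical.

```python
import math

HECTARE_M2 = 10_000.0  # one hectare expressed in square metres


def circle_radius_m(area_ha: float) -> float:
    """Radius in metres of a circle whose area is `area_ha` hectares."""
    return math.sqrt(area_ha * HECTARE_M2 / math.pi)


def reduction_area_ha(watershed_area_ha: float, min_area_ha: float = 10.0) -> float:
    """Area used for the reduction: small watersheds are bumped up to the minimum.

    Assumption: the 10 ha threshold from the changelog is applied as a floor.
    """
    return max(watershed_area_ha, min_area_ha)


radius = circle_radius_m(10.0)  # roughly 178 m
```

A 10 ha circle therefore spans a few hundred metres across, which gives a sense of how coarse the gridded products are relative to the smallest watersheds.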
Lines changed: 39 additions & 0 deletions
@@ -0,0 +1,39 @@
+------------------
+Full file contents
+------------------
+
+data_use_agreements.docx  Terms and conditions for using MacroSheds data.
+attribution_and_intellectual_rights_ts.xlsx  Specific license requirements and expectations associated with each
+    primary time-series dataset.
+attribution_and_intellectual_rights_ws_attr.csv  Information about fair use of watershed attribute data
+glossary.txt  Glossary of terms related to the MacroSheds dataset.
+timeseries_X.csv  Time-series (streamflow, precip if available, chemistry) for domain X
+ws_attr_summaries.csv  Watershed attribute data, summarized across time, for all domains
+ws_attr_timeseries.csv  Watershed attribute data, temporally explicit, for all domains
+CAMELS_compliant_Daymet_forcings.csv  Daymet climate forcings for all domains; interoperable with the CAMELS
+    dataset (https://ral.ucar.edu/solutions/products/camels)
+CAMELS_compliant_ws_attr_summaries.csv  Watershed attribute data, temporally explicit, for all domains, and
+    interoperable with the CAMELS dataset
+    (https://ral.ucar.edu/solutions/products/camels)
+sites.csv  Stream site metadata
+shapefiles.zip  Watershed boundaries, stream gauge locations, and precip gauge
+    locations, for all domains.
+variables_time_series.csv  Time-series variable metadata (standard units, etc.)
+variables_ws_attr_timeseries.csv  Variable metadata (standard units and definitions) for temporally explicit watershed attributes
+variable_category_codes_ws_attr.csv  Watershed attribute category codes (the second letter of the variable
+    code prefix)
+variable_data_source_codes_ws_attr.csv  Watershed attribute data source codes (the first letter of the
+    variable code prefix)
+detection_limits.csv  Primary data source detection limits
+data_coverage_breakdown.csv  Number of observations, timespan of observation, by variable and site
+disturbance_record.csv  A register of known watershed experiments and significant natural
+    disturbances
+range_check_limits.csv  Minimum and maximum values allowed to pass through our range filter.
+    Values exceeding these limits are omitted from the MacroSheds dataset.
+data_irregularities.csv  Any notable inconsistencies within the MacroSheds dataset
+changelog.txt  List of changes made since the last version of the MacroSheds dataset.
+code_autodocumentation.zip  Programmatically assembled pseudo-scripts intended to help users
+    recreate/edit specific MacroSheds data products (Also see our code on
+    GitHub).
+timeseries_refs.bib  Complete bibliographic references for time-series data.
+ws_attr_refs.bib  Complete bibliographic references for watershed attribute data.
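
The listing above says that the first letter of a watershed attribute's variable code prefix is its data source code and the second letter is its category code. A minimal sketch of splitting such a code follows. It assumes the prefix is exactly two letters followed by an underscore, as in `cc_precip` from the changelog; that layout is inferred rather than documented here, and `split_ws_attr_code` is a hypothetical helper.

```python
def split_ws_attr_code(variable: str) -> dict:
    """Split a watershed-attribute variable code into its prefix parts.

    Assumption: codes look like '<source letter><category letter>_<name>'.
    """
    prefix, sep, name = variable.partition("_")
    if not sep or len(prefix) != 2:
        raise ValueError(f"no two-letter prefix found in {variable!r}")
    return {
        "data_source_code": prefix[0],  # look up in variable_data_source_codes_ws_attr.csv
        "category_code": prefix[1],     # look up in variable_category_codes_ws_attr.csv
        "name": name,
    }


parts = split_ws_attr_code("cc_precip")
```

The two letters would then be looked up in the two code tables named in the listing; the sketch deliberately stops short of guessing what any individual letter means.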
Lines changed: 119 additions & 22 deletions
@@ -1,40 +1,137 @@
------
-Glossary of Terms (*as used in this documentation)
------
+------------------
+Links
+------------------

-watershed - All land area contributing runoff to a point of interest along a stream. Does not necessarily account for inputs from subsurface flow or human-constructed diversions. We avoid the terms "catchment" and "basin," though they are sometimes used in this way.
+MacroSheds data portal

-site* - An individual gauging station or stream sampling location and its watershed.
+⠀⠀⠀⠀https://macrosheds.org

-domain* - One or more sites under common management.
+MacroSheds code

-network* - One or more domains under common funding/leadership.
+⠀⠀⠀⠀https://github.com/MacroSHEDS⠀

-provider* - Also sometimes referred to as a "source" -- a primary source of data assimilated into MacroSheds. May be a network, domain, or third party.
+MacroSheds data paper (preprint)

-product* - A collection of data, possibly including multiple datasets/tables. Providers may separate products by temporal extent/interval, scientific category, detection method, and/or sampling location.
+⠀⠀⠀⠀https://eartharxiv.org/repository/view/3499/

-prodname - One of the 7 core product categories included in the MacroSheds dataset. See below.
+------------------
+Full file contents
+------------------

-prodcode - An alphanumeric string associated with a product. Providers have their own. MacroSheds uses its own scheme internally. This won't be relevant for most users, but see detailed documentation included with core data downloads for more information.
+data_use_agreements.docx

-site-product, site-year, etc. - Terms like these are used to designate various subdivisions of the overall MacroSheds dataset. A site-product, for example, is the collection of all data for a single MacroSheds product, available at a single site.
+⠀⠀⠀⠀Terms and conditions for using MacroSheds data.

------
-MacroSheds data are organized into the following products:
------
+attribution_and_intellectual_rights_ts.xlsx

-discharge - Streamflow; water volume over time; reported in L/s.
+⠀⠀⠀⠀Specific license requirements and expectations associated with

-stream chemistry - Concentration of chemical constituents in stream water; reported in mg/L or mEq/L.
+⠀⠀⠀⠀each primary time-series dataset.

-stream flux - Mass of chemical constituents in stream water, per watershed area, over time; reported in kg/ha/d. (not currently included with this dataset, but can be generated via the macrosheds package for R)
+attribution_and_intellectual_rights_ws_attr.csv

-precipitation - Rainfall, snowfall, or both combined; reported per watershed in mm.
+⠀⠀⠀⠀Information about fair use of watershed attribute data

-precipitation chemistry - Concentration of chemical constituents in precipitation; reported in mg/L or mEq/L; averaged across watershed area.
+glossary.txt

-precipitation flux - Mass of chemical constituents in precipitation, per watershed area, over time; reported in kd/ha/d. (not currently included with this dataset, but can be generated via the macrosheds package for R)
+⠀⠀⠀⠀Glossary of terms related to the MacroSheds dataset.

-watershed attributes - Areal watershed summary statistics, variables available are common to all MacroSheds sites.
+timeseries_X.csv

+⠀⠀⠀⠀Time-series (streamflow, precip if available, chemistry) for domain X
+
+ws_attr_summaries.csv
+
+⠀⠀⠀⠀Watershed attribute data, summarized across time, for all domains
+
+ws_attr_timeseries.csv
+
+⠀⠀⠀⠀Watershed attribute data, temporally explicit, for all domains
+
+CAMELS_compliant_Daymet_forcings.csv
+
+⠀⠀⠀⠀Daymet climate forcings for all domains; interoperable with the CAMELS
+
+⠀⠀⠀⠀dataset (https://ral.ucar.edu/solutions/products/camels)
+
+CAMELS_compliant_ws_attr_summaries.csv
+
+⠀⠀⠀⠀Watershed attribute data, temporally explicit, for all domains, and
+
+⠀⠀⠀⠀interoperable with the CAMELS dataset
+
+⠀⠀⠀⠀(https://ral.ucar.edu/solutions/products/camels)
+
+sites.csv
+
+⠀⠀⠀⠀Stream site metadata
+
+shapefiles.zip
+
+⠀⠀⠀⠀Watershed boundaries, stream gauge locations, and precip gauge
+
+⠀⠀⠀⠀locations, for all domains.
+
+variables_time_series.csv
+
+⠀⠀⠀⠀Time-series variable metadata (standard units, etc.)
+
+variables_ws_attr_timeseries.csv
+
+⠀⠀⠀⠀Variable metadata (standard units and definitions) for temporally explicit watershed attributes
+
+variable_category_codes_ws_attr.csv
+
+⠀⠀⠀⠀Watershed attribute category codes (the second letter of the variable
+
+⠀⠀⠀⠀code prefix)
+
+variable_data_source_codes_ws_attr.csv
+
+⠀⠀⠀⠀Watershed attribute data source codes (the first letter of the
+
+⠀⠀⠀⠀variable code prefix)
+
+detection_limits.csv
+
+⠀⠀⠀⠀Primary data source detection limits
+
+data_coverage_breakdown.csv
+
+⠀⠀⠀⠀Number of observations, timespan of observation, by variable and site
+
+disturbance_record.csv
+
+⠀⠀⠀⠀A register of known watershed experiments and significant natural
+
+⠀⠀⠀⠀disturbances
+
+range_check_limits.csv
+
+⠀⠀⠀⠀Minimum and maximum values allowed to pass through our range filter.
+
+⠀⠀⠀⠀Values exceeding these limits are omitted from the MacroSheds dataset.
+
+data_irregularities.csv
+
+⠀⠀⠀⠀Any notable inconsistencies within the MacroSheds dataset
+
+changelog.txt
+
+⠀⠀⠀⠀List of changes made since the last version of the MacroSheds dataset.
+
+code_autodocumentation.zip
+
+⠀⠀⠀⠀Programmatically assembled pseudo-scripts intended to help users
+
+⠀⠀⠀⠀recreate/edit specific MacroSheds data products (Also see our code on
+
+⠀⠀⠀⠀GitHub).
+
+timeseries_refs.bib
+
+⠀⠀⠀⠀Complete bibliographic references for time-series data.
+
+ws_attr_refs.bib
+
+⠀⠀⠀⠀Complete bibliographic references for watershed attribute data.
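
The `range_check_limits.csv` entry above describes a range filter: values outside a variable's minimum and maximum are omitted. The sketch below shows one way such a filter might behave. Treating the limits as inclusive is an assumption, the numbers are made up for illustration, and `range_filter` is a hypothetical name rather than a MacroSheds function.

```python
from typing import Iterable, List


def range_filter(values: Iterable[float], minimum: float, maximum: float) -> List[float]:
    """Keep values within [minimum, maximum]; drop everything else.

    Assumption: a value exactly equal to a limit passes the filter.
    """
    return [v for v in values if minimum <= v <= maximum]


# Illustrative limits only; real limits would come from range_check_limits.csv.
kept = range_filter([-1.0, 0.0, 5.0, 12.0], minimum=0.0, maximum=10.0)
```

Because out-of-range values are dropped rather than flagged, a user comparing MacroSheds data against a primary source may see fewer observations than the source reports.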

eml/eml_templates/attributes_CAMELS_compliant_Daymet_forcings.txt

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 "attributeName" "attributeDefinition" "class" "unit" "dateTimeFormatString" "missingValueCode" "missingValueCodeExplanation"
 "date" "date as UTC timestamp with 00:00:00 time" "Date" "Y-M-DTh:m:sZ"
-"site_code" "Short name for MacroSheds site. See site_metadata.csv" "character"
+"site_code" "Short name for MacroSheds site. See sites.csv" "character"
 "dayl(s)" "Watershed average seconds of daylight" "numeric" "second" "NA" "missing value"
 "prcp(mm/day)" "Watershed average precipitation" "numeric" "millimetersPerDay" "NA" "missing value"
 "srad(W/m2)" "Watershed average solar radiation" "numeric" "wattPerMeterSquared" "NA" "missing value"

eml/eml_templates/attributes_CAMELS_compliant_ws_attr.txt renamed to eml/eml_templates/attributes_CAMELS_compliant_ws_attr_summaries.txt

Lines changed: 16 additions & 3 deletions
@@ -1,7 +1,7 @@
 "attributeName" "attributeDefinition" "class" "unit" "dateTimeFormatString" "missingValueCode" "missingValueCodeExplanation"
-"site_code" "Short name for MacroSheds site. See site_metadata.csv" "character"
+"site_code" "Short name for MacroSheds site. See sites.csv" "character"
 "p_mean" "mean daily precipitation 1989-10-01 to 2009-09-30 (Daymet)" "numeric" "millimeterPerDay" "NA" "missing value"
-"pet_mean" "mean daily PET (estimated using Priestley-Taylor formulation with gridded alpha product from Aschonitis et al. 2017) (Daymet)" "numeric" "millimeterPerDay" "NA" "missing value"
+"pet_mean" "mean daily PET (estimated using Priestley-Taylor formulation with gridded alpha product from Aschonitis et al. 2017)" "numeric" "millimeterPerDay" "NA" "missing value"
 "aridity" "aridity (ratio of pet_mean to p_mean) (Daymet)" "numeric" "dimensionless" "NA" "missing value"
 "p_seasonality" "seasonality and timing of precipitation (estimated using sine curves to represent the annual temperature and precipitation cycles, positive [negative] values indicate that precipitation peaks in summer [winter], values close to 0 indicate uniform precipitation throughout the year) (Daymet)" "numeric" "dimensionless" "NA" "missing value"
 "frac_snow" "fraction of precipitation falling as snow (i.e., on days colder than 0°C) (Daymet)" "numeric" "dimensionless" "NA" "missing value"
@@ -27,8 +27,21 @@
 "area" "watershed area" "numeric" "squareKilometers" "NA" "missing value"
 "elev_mean" "watershed mean elevation" "numeric" "meter" "NA" "missing value"
 "slope_mean" "watershed mean slope" "numeric" "metersPerKilometer" "NA" "missing value"
-"frac_forest" "forest fraction" "numeric" "!Add units here!" "NA" "missing value"
+"frac_forest" "forest fraction" "numeric" "dimensionless" "NA" "missing value"
 "dom_land_cover_frac" "fraction of the catchment area associated with the dominant land cover" "numeric" "dimensionless" "NA" "missing value"
 "dom_land_cover" "dominant land cover type (Noah-modified 20-category IGBP-MODIS land cover)" "categorical" "NA" "missing value"
 "root_depth_50" "root depth (percentile 50% extracted from a root depth distribution based on IGBP land cover) (MODIS)" "numeric" "meter" "NA" "missing value"
 "root_depth_99" "root depth (percentile 99% extracted from a root depth distribution based on IGBP land cover) (MODIS)" "numeric" "meter" "NA" "missing value"
+"q_mean" "mean daily discharge as runoff" "numeric" "millimeterPerDay" "NA" "missing value"
+"runoff_ratio" "runoff ratio (ratio of mean daily runoff to mean daily precipitation)" "numeric" "dimensionless" "NA" "missing value"
+"stream_elas" "streamflow precipitation elasticity (sensitivity of streamflow to changes in precipitation at the annual time scale)" "numeric" "dimensionless" "NA" "missing value"
+"slope_fdc" "slope of the flow duration curve (between the log-transformed 33rd and 66th streamflow percentiles)" "numeric" "dimensionless" "NA" "missing value"
+"baseflow_index_landson" "baseflow index (ratio of mean daily baseflow to mean daily discharge, hydrograph separation performed using Ladson et al. [2013] digital filter)" "numeric" "dimensionless" "NA" "missing value"
+"hfd_mean" "mean half flow date (date on which the cumulative discharge since October 1st reaches half of the annual discharge)" "numeric" "dayOfYear" "NA" "missing value"
+"Q5" "5% flow quantile (low flow)" "numeric" "millimeterPerDay" "NA" "missing value"
+"Q95" "95% flow quantile (high flow)" "numeric" "millimeterPerDay" "NA" "missing value"
+"high_q_freq" "frequency of high-flow days ( > 9 times the median daily flow)" "numeric" "daysPerYear" "NA" "missing value"
+"high_q_dur" "average duration of high-flow events (number of consecutive days > 9 times the median daily flow)" "numeric" "day" "NA" "missing value"
+"low_q_freq" "frequency of low-flow days ( < 0.2 times the mean daily flow)" "numeric" "daysPerYear" "NA" "missing value"
+"low_q_dur" "average duration of low-flow events (number of consecutive days < 0.2 times the mean daily flow)" "numeric" "day" "NA" "missing value"
+"zero_q_freq" "frequency of days with Q = 0 mm/day" "numeric" "percent" "NA" "missing value"
Lines changed: 28 additions & 0 deletions
@@ -0,0 +1,28 @@
+"attributeName" "attributeDefinition" "class" "unit" "dateTimeFormatString" "missingValueCode" "missingValueCodeExplanation"
+"network" "Short name for the data network associated with a MacroSheds site. A network includes one or more domains under common funding/leadership." "character" "NA" "missing value"
+"domain" "Short name for the data domain associated with a MacroSheds site. A domain includes one or more sites under common management." "character" "NA" "missing value"
+"macrosheds_prodcode" "An identifier for a primary source data product, used internally during MacroSheds data processing. These are the same as the original product codes given by a primary source, unless designated 'VERSIONLESS...'" "character" "NA" "missing value"
+"macrosheds_prodname" "One of the time-series or shapefile data products provided by MacroSheds" "categorical" "NA" "missing value"
+"doi" "The digital object identifier of the original data source, if it exists and we were able to find it" "character" "NA" "missing value"
+"data_status" "Any special statuses associated with this data product, such as “provisional”" "categorical" "NA" "missing value"
+"license" "The license that governs reuse of the original data product" "categorical" "NA" "missing value"
+"license_type" "The general category of license, taking into account both the license itself and any custom IR terms attached to the original data product" "categorical" "NA" "missing value"
+"license_sharealike" "Details of the “share-alike” clause, if applicable" "categorical" "NA" "missing value"
+"IR_acknowledgement_text" "Where specific acknowledgement text is given without 'for example' (i.e. phrasing of the acknowledgement must be verbatim), we include that text in this column." "character" "NA" "missing value"
+"IR_acknowledge_domain" "Acknowledgment includes the name of the domain" "categorical" "NA" "missing value"
+"IR_acknowledge_funding_sources" "Acknowledgment includes funding sources" "categorical" "NA" "missing value"
+"IR_acknowledge_grant_numbers" "Acknowledgment includes the grant numbers of associated funding awards (these are in column 'funding')" "categorical" "NA" "missing value"
+"IR_notify_of_intentions" "Primary source authors want to know what you intend to do with their data? Will you publish, etc.?" "categorical" "NA" "missing value"
+"IR_notify_on_distribution" "Primary source authors want to know of any publications or derivative works you made with their data, after the fact." "categorical" "NA" "missing value"
+"IR_provide_online_access" "Primary source authors want you to provide online access to any digital products derived from the data" "categorical" "NA" "missing value"
+"IR_provide_two_reprints" "This rule is likely vestigial. Providing online access should be fine. But it still does exist in a few places." "categorical" "NA" "missing value"
+"IR_collaboration_consultation" "You are encouraged to contact the dataset creator, e.g. to prevent duplication of work, or to collaborate where appropriate" "categorical" "NA" "missing value"
+"IR_questions" "You are encouraged to contact the dataset creator with any questions about methodology or results" "categorical" "NA" "missing value"
+"IR_needs_clarification" "We are still in the process of figuring out the meaning of one or more IR clauses" "character" "NA" "missing value"
+"contact" "The email address of primary dataset contact" "character" "NA" "missing value"
+"contact_name1" "The name of primary dataset contact" "character" "NA" "missing value"
+"creator_name1" "Name of dataset creator" "character" "NA" "missing value"
+"funding" "Grant award numbers associated with the dataset. All awards not necessarily included here." "character" "NA" "missing value"
+"citation" "The citation for the primary data product" "character" "NA" "missing value"
+"link" "A link to the primary data product, or a landing page thereof" "character" "NA" "missing value"
+"link_download_datetime" "When this product was last retrieved by MacroSheds" "Date" "Y-M-D h:m:s UTC" "NA" "missing value"
Lines changed: 9 additions & 0 deletions
@@ -0,0 +1,9 @@
+"attributeName" "attributeDefinition" "class" "unit" "dateTimeFormatString" "missingValueCode" "missingValueCodeExplanation"
+"prodname" "MacroSheds product name" "character"
+"primary_source" "Data creator" "character" "NA" "missing value"
+"retrieved_from_GEE" "If TRUE, data were not retrieved directly from primary source, but via Google Earth Engine." "character" "NA" "missing value"
+"doi" "DOI for data source" "character" "NA" "missing value"
+"license" "Link to license, or name of license governing data use" "character" "NA" "missing value"
+"citation" "Citation for data product" "character" "NA" "missing value"
+"url" "URL for more information" "character" "NA" "missing value"
+"addtl_info" "Even more information" "character" "NA" "missing value"

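The attribute templates in this commit all share the same seven-column header. As a hedged sketch of how such a template might be read, the example below parses a two-line sample with Python's `csv` module. It assumes the files are tab-delimited with double-quoted fields and that the `unit` and `dateTimeFormatString` columns are present but empty for character attributes; the rendered diff collapses whitespace, so neither detail is confirmed by the page itself.

```python
import csv
import io

# Sample modelled on the "doi" row above; tabs and empty fields are assumptions.
TEMPLATE = (
    '"attributeName"\t"attributeDefinition"\t"class"\t"unit"\t'
    '"dateTimeFormatString"\t"missingValueCode"\t"missingValueCodeExplanation"\n'
    '"doi"\t"DOI for data source"\t"character"\t""\t""\t"NA"\t"missing value"\n'
)

# DictReader uses the first row as column names and strips the quoting.
rows = list(csv.DictReader(io.StringIO(TEMPLATE), delimiter="\t"))
first = rows[0]
```

Reading the template into dictionaries keyed by column name makes it straightforward to check, for example, that every numeric attribute has a non-empty `unit`, which is the kind of gap the `frac_forest` fix in this commit closes.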