This component contains a .zip archive of the raw, intermediate, and final data used in the manuscript, including all Bayesian draws. The archive is approximately 21 GB in total and has been split into 206 files of approximately 100 MB each to facilitate upload to OSF. These files were created using the 7zip program on a Windows 10 machine. To access the data, you will need 7zip, or a similar program, to "unzip" the archive. Unzipping produces a folder called `Data`.
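As a rough sketch of the extraction step, the snippet below builds the command for 7zip's command-line tool (`7z`), which only needs to be pointed at the first volume of a split archive when all volumes sit in the same directory. The volume names (`Data.zip.001`, `Data.zip.002`, ...) and the output directory are assumptions made for illustration; check the names of the files you actually downloaded.

```python
def build_extract_command(volumes, out_dir):
    """Return the 7z command that extracts a split archive.

    Assumes the volumes follow 7zip's numbered-suffix convention
    (e.g. Data.zip.001, Data.zip.002, ...), so the lexically first
    name is the first volume. These names are hypothetical.
    """
    if not volumes:
        raise ValueError("no archive volumes supplied")
    first_volume = sorted(str(v) for v in volumes)[0]
    # `x` extracts with full paths; `-o` (no space) sets the output directory.
    return ["7z", "x", first_volume, f"-o{out_dir}"]


# Example with hypothetical file names; pass the result to subprocess.run(...)
# on a machine where the `7z` executable is installed.
command = build_extract_command(["Data.zip.002", "Data.zip.001"], "extracted")
print(" ".join(command))
```

Graphical tools work equally well: open the first volume and extract it as usual.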
Below is a description of the various folders and files located in the `Data` folder.
@[toc]
# `Raw_Source/`
This folder contains raw data files that either include important non-text sources (e.g., shapefiles) or have a complicated structure (e.g., multi-sheet Excel files).
Key among these are the files located in `/Isotope/SIF_reports/`. The `.xlsx` files in this directory contain the analytical isotope results from the [University of Wyoming's Stable Isotope Facility (SIF)](http://www.uwyo.edu/sif/).
# `Raw_Text/`
Files in this folder are largely text-based, or have generally simple data structures.
The `/Environmental/` subdirectory contains environmental data collected from the GLEES (Glacier Lakes Ecosystem Experiments Site) AmeriFlux tower, as well as data from NADP (National Atmospheric Deposition Program) sites in the area.
The `/Isotopes/` subdirectory contains isotope data from both precipitation and vapor sources. Specifically, precipitation isotope data are recorded in the `WaterIsotopes_WilliamsLab_PrecipOnly_2017-11-09.xlsx` file. These data are synthesized from the raw SIF reports. The `/ForestService/` subdirectory contains raw vapor measurement data and information. The `/WISER/` subdirectory contains precipitation and vapor data downloaded from IAEA's (International Atomic Energy Agency) [Water Isotope System for Data Analysis, Visualization, and Electronic Retrieval](http://www-naweb.iaea.org/napc/ih/IHS_resources_isohis.html). Similarly, the `/SWVID/` subdirectory contains vapor isotope data downloaded from the [Stable Water Vapor Isotope Database](https://vapor-isotope.yale.edu/). All data in the `/Isotopes/` folder were used either directly or as priors for the models assessed.
# `Edit_Script/`
Data in this directory are intermediate outputs from R scripts. The number prefixed to each file name identifies the source script, which helps clarify the context in which the data were generated.
`/ExpandedScenarios/` contains model outputs from the 576 "expanded" scenarios analyzed using both the R and Stan statistical programming languages. Files are named by the combination of factors used to generate their processing scenario.
Similarly, the `/CrossValidation/` subdirectory contains posterior outputs for the environmental covariate analysis, which were evaluated using leave-one-out cross-validation information criteria. Again, files are named based on their processing scenario.
# `Edit_Manual/`
Some processing steps were difficult to implement in R. Thus, this folder contains some geographic data that were processed manually in ArcGIS (10.3).
# `Output/`
A copy of these files and information is available on HydroShare: [https://www.hydroshare.org/resource/7c7267d19dc94e1b834d0fc82bdffa50/](https://www.hydroshare.org/resource/7c7267d19dc94e1b834d0fc82bdffa50/).
These data were generated via the R script `Scripts/015_OutputGeneration.R` using both raw and intermediate data (again, see the compendium for more on these files).
This README contains descriptions of the following data sets included in this resource:
* `vapor_isotopes.xlsx`
* `precipitation_isotopes.xlsx`
* `high_resolution_environmental.csv`
* `low_resolution_environmental.csv`
* `expanded_posteriors.csv`
* `core_posteriors.csv`
* `environmental_posteriors.csv`
Additional files used for reference:
* `WI_Template.xlsx`
## `vapor_isotopes.xlsx`
Vapor data measured at GLEES.
### Column definitions
These data were formatted for submission to [waterisotopes.org](http://waterisotopes.org). To satisfy the requirements of submitting to that site, we used their submission template: `WI_Template.xlsx`. The template includes a complete set of column definitions, by sheet.
## `precipitation_isotopes.xlsx`
Precipitation data collected at several sites throughout the Snowy Range of the Medicine Bow National Forest.
### Column definitions
These data were formatted for submission to [waterisotopes.org](http://waterisotopes.org). To satisfy the requirements of submitting to that site, we used their submission template: `WI_Template.xlsx`. The template includes a complete set of column definitions, by sheet.
## `high_resolution_environmental.csv`
Half-hourly environmental/weather data measured at the US-GLE AmeriFlux site and relevant to this study. For more about these data: [https://ameriflux.lbl.gov/sites/siteinfo/US-GLE](https://ameriflux.lbl.gov/sites/siteinfo/US-GLE).
### Column definitions
* date_time: The date and time a particular observation was made. Data are half-hourly; each timestamp marks the end of a half-hour measurement period.
    * format: yyyy-mm-dd hh:mm:ss
* air_pressure: Air pressure.
    * unit: kPa
* air_temperature: Air temperature.
    * unit: degrees C
* relative_humidity: Relative humidity.
    * unit: %
* wind_direction: Wind direction.
    * unit: degrees
* wind_speed: Wind speed.
    * unit: m/s
* mixing_ratio: Mixing or humidity ratio.
    * unit: ppm
* latitude: Latitude.
    * unit: decimal degrees
    * datum: WGS 84 (EPSG:4326)
* longitude: Longitude.
    * unit: decimal degrees
    * datum: WGS 84 (EPSG:4326)
* elevation: Elevation above mean sea level.
    * unit: m
    * datum: WGS 84 (EPSG:4326)
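Because each `date_time` value marks the end of a half-hour measurement period, it can be convenient to recover the start of that period when matching these data to other records. The following is a minimal sketch using only the Python standard library. The sample rows are made up for illustration and are not values from the file, and the code assumes every `air_temperature` cell is numeric (missing-value handling is left out).

```python
import csv
import io
from datetime import datetime, timedelta

HALF_HOUR = timedelta(minutes=30)


def read_high_resolution(handle):
    """Yield one dict per row with the parsed measurement interval."""
    for row in csv.DictReader(handle):
        # Timestamps follow the documented format: yyyy-mm-dd hh:mm:ss
        end = datetime.strptime(row["date_time"], "%Y-%m-%d %H:%M:%S")
        yield {
            "interval_start": end - HALF_HOUR,  # start of the half-hour period
            "interval_end": end,                # value stored in the file
            "air_temperature": float(row["air_temperature"]),  # degrees C
        }


# Made-up sample rows (illustration only). For the real file, replace the
# StringIO object with open("high_resolution_environmental.csv", newline="").
sample = io.StringIO(
    "date_time,air_temperature\n"
    "2017-01-01 00:30:00,-8.5\n"
    "2017-01-01 01:00:00,-8.9\n"
)
rows = list(read_high_resolution(sample))
```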
## `low_resolution_environmental.csv`
Daily precipitation data measured at or near the WY95 National Atmospheric Deposition Program site. More about the site: [https://nadp.slh.wisc.edu/data/sites/siteDetails.aspx?net=NTN&id=WY95](https://nadp.slh.wisc.edu/data/sites/siteDetails.aspx?net=NTN&id=WY95).
### Column definitions
* date: Date of an observation.
    * format: yyyy-mm-dd
* precipitation: Amount of precipitation recorded.
    * unit: mm
* latitude: Latitude.
    * unit: decimal degrees
    * datum: WGS 84 (EPSG:4326)
* longitude: Longitude.
    * unit: decimal degrees
    * datum: WGS 84 (EPSG:4326)
* elevation: Elevation above mean sea level.
    * unit: m
    * datum: WGS 84 (EPSG:4326)
## `expanded_posteriors.csv`
This file contains estimates of the parameters determined in all 576 processing scenarios (the "expanded" scenarios; see Table 1 in the manuscript). That is, rather than providing the full set of posterior draws for every scenario, only the point estimates are reported, for use in future estimation efforts.
The basic form of the model used for each scenario was:
$$
\delta_{A} \sim \mathcal{N}(\mu_{eq} = \delta_{A,eq}, \ \sigma_{eq} \sim \mathcal{N}(\mu, \sigma))
$$
Such that:
* $\delta_{A}$: The measured vapor at US-GLE, given some scenario.
* $\mathcal{N}$: Symbolic representation of the normal distribution, with two parameters: mean ($\mu$) and standard deviation ($\sigma$).
* $\mu_{eq}$: Mean for a given equilibrium scenario.
* $\delta_{A, eq}$: The expected equilibrium value, given a scenario.
* $\sigma_{eq}$: The standard deviation in the uncertainty/error for a given equilibrium scenario. This is the parameter indicated in the __parameter__ column.
* $\mu$: Mean uncertainty; the value estimated in the __mean__ column.
* $\sigma$: Standard deviation of the uncertainty; the value estimated in the __sd__ column.
(Note: This equation is a slightly more explicit form than that used in the manuscript. This version does not include uncertainty in the vapor measurements, $\delta_A$, which was included in the analysis.)
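As one possible use of these point estimates, the sketch below computes a central interval for the measured vapor value implied by the model above, given an equilibrium estimate and a value of $\sigma_{eq}$ taken from the __mean__ column. It is a simplification under stated assumptions: $\sigma_{eq}$ is treated as fixed (its own uncertainty, the __sd__ column, is ignored), and measurement uncertainty in $\delta_A$ is not included. The function name and the input values are hypothetical.

```python
from statistics import NormalDist


def vapor_interval(delta_a_eq, sigma_eq, level=0.95):
    """Central interval for delta_A under N(delta_a_eq, sigma_eq).

    sigma_eq is treated as a fixed point estimate; this ignores the
    uncertainty in sigma_eq itself and in the vapor measurements.
    """
    if sigma_eq <= 0:
        raise ValueError("sigma_eq must be positive")
    if not 0 < level < 1:
        raise ValueError("level must be between 0 and 1")
    dist = NormalDist(mu=delta_a_eq, sigma=sigma_eq)
    tail = (1 - level) / 2
    return dist.inv_cdf(tail), dist.inv_cdf(1 - tail)


# Hypothetical inputs: an equilibrium estimate of -20.0 and sigma_eq of 2.0.
lower, upper = vapor_interval(-20.0, 2.0)
print(f"{lower:.2f} to {upper:.2f}")
```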
### Column definitions
* isotope: The isotope to which a given scenario applies.
    * values: H2, O18
* resolution: The temporal resolution at which a set of measurements or equilibrium estimates were aggregated.
    * values: daily, weekly, monthly
* season: The season over which a scenario was run.
    * values: annual, summer, winter
    * note: annual (Jan-Dec), summer (Jun-Oct), winter (Nov-May)
* equation: The different equations (see the manuscript) used to estimate $\delta_{A,eq}$.
    * values: Equation (3), Equation (4), Equation (5)
* flux_weighting: Whether or not the equilibrium estimate was flux weighted using the relative masses of precipitation associated with a precipitation isotope sample. No flux weighting was associated with the daily __resolution__ data, as we did not have higher-resolution precipitation samples.
    * values: unweighted, weighted
* temporal_lag: The temporal lag used to match a set of vapor measurements and equilibrium estimates.
    * values: 0, 1, 2, 3, 4, 5, 6, 7
    * note: lags are expressed in units of the __resolution__ (days, weeks, or months). Lags 0 and 1 apply to all three resolutions; lags 2 through 7 apply to the daily resolution only.
* temperature: Whether air temperature or an estimate of cloud (hydrometeor) temperature was used to calculate the equilibrium vapor value.
    * values: air, hydrometeor
* parameter: The parameter estimated. In this case, sigma ($\sigma_{eq}$) was the value determined.
* mean: The estimated mean uncertainty ($\mu$).
* sd: The estimated standard deviation in the uncertainty ($\sigma$).
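The column constraints above can be cross-checked against the stated total of 576 scenarios. The sketch below enumerates the combinations under one reading of those constraints: daily data are unweighted only and use lags 0 through 7, weekly and monthly data use lags 0 and 1 with both weighting options, and all other factors are fully crossed. That reading is an assumption, although it does reproduce the documented count.

```python
from itertools import product

ISOTOPES = ["H2", "O18"]
SEASONS = ["annual", "summer", "winter"]
EQUATIONS = ["Equation (3)", "Equation (4)", "Equation (5)"]
TEMPERATURES = ["air", "hydrometeor"]


def timing_options():
    """Return the allowed (resolution, flux_weighting, temporal_lag) triples."""
    # Daily: no flux weighting, lags of 0-7 days.
    options = [("daily", "unweighted", lag) for lag in range(8)]
    # Weekly and monthly: both weighting options, lags of 0 or 1.
    for resolution in ("weekly", "monthly"):
        for weighting in ("unweighted", "weighted"):
            for lag in (0, 1):
                options.append((resolution, weighting, lag))
    return options


def expanded_scenarios():
    """Cross the timing options with the remaining scenario factors."""
    return [
        (isotope, resolution, season, equation, weighting, lag, temperature)
        for isotope, season, equation, temperature in product(
            ISOTOPES, SEASONS, EQUATIONS, TEMPERATURES
        )
        for resolution, weighting, lag in timing_options()
    ]


scenarios = expanded_scenarios()
print(len(scenarios))  # 576 = 2 * 3 * 3 * 2 * 16
```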
## `core_posteriors.csv`
This file contains the posterior point estimates for the "core" scenarios, which investigate the importance of apparent disequilibrium (separation) between the observed/measured vapor value and the equilibrium vapor value, given a processing scenario. The model took the form:
$$
\delta_A \sim \mathcal{N}(\mu_{eq} = \delta_{A,eq} - \Delta, \ \sigma_{eq} \sim \mathcal{N}(\mu, \sigma))
$$
Where:
$$
\Delta \sim \mathcal{N} (\mu_{\Delta}, \sigma_{\Delta})
$$
Such that:
* $\Delta$: Estimated apparent disequilibrium (separation).
* $\mu_{\Delta}$: Mean apparent disequilibrium.
* $\sigma_{\Delta}$: Standard deviation in the estimate of apparent disequilibrium.
All other parameters are similar to those defined for the file _expanded_posteriors.csv_. (Note: This equation is a slightly more explicit form than that used in the manuscript.)
Some of the columns below contain similar information to the expanded_posteriors.csv data, so some details are omitted for the sake of brevity.
### Column definitions
* isotope: The isotope to which a given scenario applies.
    * values: H2, O18
* resolution: The temporal resolution at which a set of measurements or equilibrium estimates were aggregated.
    * values: daily, weekly, monthly
* season: The season over which a scenario was run.
    * values: annual, summer, winter
* equation: The different equations (see the manuscript) used to estimate $\delta_{A,eq}$.
    * values: Equation (3), Equation (4), Equation (5)
* parameter: The parameter of interest, given the model identified above.
    * values: apparent disequilibrium ($\mu_{\Delta}$; $\Delta$), sigma ($\sigma_{\Delta}$)
* mean: The estimated mean uncertainty ($\mu$).
* sd: The estimated standard deviation in the uncertainty ($\sigma$).
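To illustrate how these columns fit the model above, the sketch below looks up the apparent disequilibrium estimate for one scenario and applies it as $\mu_{eq} = \delta_{A,eq} - \Delta$, using the posterior mean of $\Delta$. The sample rows and numbers are made up for illustration, and the exact spelling of the values in the __parameter__ column (here `apparent disequilibrium` and `sigma`) is an assumption that should be checked against the file.

```python
import csv
import io


def find_estimate(rows, parameter, **scenario):
    """Return (mean, sd) for the first row matching a parameter and scenario."""
    for row in rows:
        if row["parameter"] == parameter and all(
            row[column] == value for column, value in scenario.items()
        ):
            return float(row["mean"]), float(row["sd"])
    raise KeyError(f"no row for {parameter!r} with {scenario!r}")


def shifted_equilibrium(delta_a_eq, disequilibrium_mean):
    """Model mean: mu_eq = delta_A_eq - Delta (posterior mean of Delta)."""
    return delta_a_eq - disequilibrium_mean


# Made-up rows (illustration only); column names follow the definitions above.
sample = io.StringIO(
    "isotope,resolution,season,equation,parameter,mean,sd\n"
    "O18,daily,annual,Equation (3),apparent disequilibrium,1.5,0.25\n"
    "O18,daily,annual,Equation (3),sigma,2.0,0.1\n"
)
rows = list(csv.DictReader(sample))

delta_mean, delta_sd = find_estimate(
    rows, "apparent disequilibrium", isotope="O18", resolution="daily"
)
mu_eq = shifted_equilibrium(-20.0, delta_mean)  # hypothetical delta_A_eq
```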
## `environmental_posteriors.csv`
The posterior estimates for the environmental covariate model. This includes scenarios in which _no_ environmental covariates were found to be predictive (model contains only an intercept and sigma term) and others in which the covariates were predictive.
When no environmental covariates were found to be predictive, the model took the form described for the core_posteriors.csv data.
When covariates were predictive, the model took approximately the form:
$$
\delta_A \sim \mathcal{N} (\mu_{eq} = \delta_{A,eq} - E \vec \beta, \sigma_{eq})
$$
Where each element, $i$, in the vector $\vec \beta$ is distributed as:
$$
\beta_i \sim \mathcal{N} (\mu_i, \sigma_i)
$$
Such that:
* $E$: The design matrix (matrix containing all the environmental covariate data).
* $\vec \beta$: A vector of slope parameters, including an intercept. In this context, the intercept takes on the function of the apparent disequilibrium value.
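The mean of this model, $\mu_{eq} = \delta_{A,eq} - E \vec \beta$, can be sketched with plain Python lists. The example assumes the intercept is represented by a leading column of ones in the design matrix, which is a common convention but is not stated here; whether the covariates were centered or scaled before fitting is also not stated, so consult the scripts before applying slopes to new data. All numbers are hypothetical.

```python
def linear_predictor(delta_a_eq, design_matrix, beta):
    """Return mu_eq = delta_a_eq - E @ beta, one value per observation."""
    if len(delta_a_eq) != len(design_matrix):
        raise ValueError("need one design-matrix row per equilibrium value")
    means = []
    for eq_value, row in zip(delta_a_eq, design_matrix):
        if len(row) != len(beta):
            raise ValueError("each row needs one entry per coefficient")
        # Dot product of one row of E with the coefficient vector.
        shift = sum(x * b for x, b in zip(row, beta))
        means.append(eq_value - shift)
    return means


# Hypothetical inputs: intercept (column of ones) plus one covariate.
E = [[1.0, -5.0],
     [1.0, 2.0]]
beta = [1.5, 0.5]            # [intercept, slope], e.g. from the mean column
delta_a_eq = [-20.0, -18.0]  # equilibrium estimates
mu_eq = linear_predictor(delta_a_eq, E, beta)
```

With an intercept-only design matrix (a single column of ones), this reduces to subtracting the apparent disequilibrium from each equilibrium estimate.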
Some of the columns below contain similar information to the expanded_posteriors.csv data, so some details are omitted for the sake of brevity.
### Column definitions
* isotope: The isotope to which a given scenario applies.
    * values: H2, O18
* resolution: The temporal resolution at which a set of measurements or equilibrium estimates were aggregated.
    * values: daily, weekly, monthly
* season: The season over which a scenario was run.
    * values: annual, summer, winter
* equation: The different equations (see the manuscript) used to estimate $\delta_{A,eq}$.
    * values: Equation (3), Equation (4), Equation (5)
* parameter: The parameter estimated.
    * values: intercept, air_temperature, air_pressure, snow, vapor_gradient, wind_speed, sigma
    * note: air_temperature, air_pressure, snow, vapor_gradient, and wind_speed represent slopes for the linear model indicated above. In that context, the __mean__ column represents the mean slope value for a given environmental covariate, while the __sd__ column represents the standard deviation in that slope parameter.
    * note: The intercept parameter can be similarly interpreted as apparent disequilibrium, accounting for the linear influence of environmental covariates (if found predictive). In the absence of predictive covariates, the intercept parameter is equal to apparent disequilibrium.
* mean: The estimated mean for a parameter ($\mu$).
* sd: The estimated standard deviation for a parameter ($\sigma$).