---
title: "Data Science Labs"
title-block-style: "none"
---

This page highlights skills developed through labs in 523 C: Environmental Data Science Applications for Water Resources, taught at Colorado State University by Mike Johnson, PhD.

## COVID Trends

```{r}
#| warning: FALSE
#| echo: FALSE
#| fig-cap: "Weighted Mean Center of the COVID-19 Outbreak"
knitr::include_graphics("media/weighted_means.png")
```

Analyzed and visualized COVID-19 vector data, including data wrangling, spatial and temporal analysis, and linear model development using `dplyr`, `tidyverse`, `lubridate`, `sf`, `zoo`, `ggplot2`, and `flextable`.

View the lab here: [Lab 1 - COVID Trends](https://cmhoskins.github.io/csu_523c/lab-01.html)

## Distances and Projections

```{r}
#| warning: FALSE
#| echo: FALSE
#| fig-cap: "Map of the Most Populous Cities in Each State Within 100 Miles of the U.S. Border"
knitr::include_graphics("media/border-2.png")
```

Examined the structure and properties of `sf`, `sfc`, and `sfg` objects and applied the `sf` package for spatial data workflows, including coordinate reference system transformations, distance calculations, and geometry handling. Visualized spatial data using `mapview` and `gghighlight`.

View the lab here: [Lab 2 - Distances and Projections](https://cmhoskins.github.io/csu_523c/lab-02.html)

## Tessellations, Point-in-Polygon

```{r}
#| warning: FALSE
#| echo: FALSE
#| fig-cap: "Distribution of Hydroelectric Dams across the USA with Hexagonal Grid Tessellation (above mean + 1 standard deviation)"
knitr::include_graphics("media/hydroelectric.png")
```

Implemented multiple tessellation approaches (Voronoi, triangulated, square grid, and hexagonal grid) to analyze the spatial distribution of U.S. dams within a CONUS area of interest defined using the `AOI` package. Conducted point-in-polygon analysis, simplified geometries with `rmapshaper`, and developed reusable functions to streamline analysis and visualization. Produced interactive maps with `leaflet` and formatted summary tables using `kableExtra`, with attention to the implications of the Modifiable Areal Unit Problem (MAUP).

View the lab here: [Lab 3 - Tessellations, Point-in-Polygon](https://cmhoskins.github.io/csu_523c/lab-03.html)

## Rasters & Remote Sensing

```{r}
#| warning: FALSE
#| echo: FALSE
#| fig-cap: "Satellite imagery raster layers of five indices for Palo, Iowa, during the flooding event on September 26, 2016"
knitr::include_graphics("media/indices.png")
```

Created flood extent maps using satellite imagery from a flooding event along the Cedar and Wapsipinicon Rivers. Accessed and processed Landsat Collection 2 imagery with `terra` and `rstac`, generated RGB composites for initial visualization, and applied raster algebra to compute spectral indices, threshold flood signals, and extract flood clusters using the `stats` package. Classified flood extent and flood certainty.

View the lab here: [Lab 4 - Raster & Remote Sensing](https://cmhoskins.github.io/csu_523c/lab-04.html)

## Machine Learning in Hydrology

```{r}
#| warning: FALSE
#| echo: FALSE
#| fig-cap: "Random Forest Model Using PET and Precipitation to Predict Streamflow"
knitr::include_graphics("media/model-pet.png")
```

Applied predictive modeling workflows using the `tidymodels` framework and the CAMELS dataset, implementing and comparing linear regression, random forest, `xgboost`, and neural network models using `baguette`. Evaluated model performance using RMSE, R², and MAE, and visualized results using `vip` and `workflows`.

View the lab here: [Lab 5 - Machine Learning in Hydrology](https://cmhoskins.github.io/csu_523c/lab-05.html)