-
Notifications
You must be signed in to change notification settings - Fork 0
Expand file tree
/
Copy pathraster-masking-example.qmd
More file actions
208 lines (151 loc) · 11.1 KB
/
raster-masking-example.qmd
File metadata and controls
208 lines (151 loc) · 11.1 KB
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
---
title: "Creating a Mask for Raster Data in R"
author: "Elke Windschitl"
date: "2023-09-29"
format: html
editor: source
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE,
warning = FALSE,
message = FALSE)
```
Description: In this qmd, I showcase how to create a mask for raster data and apply that mask to depth data in the Santa Barbara Channel.
## Introduction
Sometimes when combining spatial data, one needs to change the spatial resolution or extent, or reproject in order to match other datasets. This is necessary if all of the data need to match the same grid for analysis like a Maxent species distribution model. One approach to doing this is to create a common data "mask" and apply that mask to other raster data. In this Rmd, I create a mask by defining the extent, crs of the target data, resolution, and I remove areas from the mask that are not relevant to my analysis. I will be applying the mask to ocean depth data in the Santa Barbara Channel. This analysis was completed by Elke Windschitl as part of the kelpGeoMod Bren School of Environmental Science and Management 2023 capstone project authored by Erika Egg, Jessica French, Javier Parton, and Elke Windschitl. Some of the choices made here were in context of the whole project, such as the resolution and extent chosen.
## The Data
I used data from two sources in this process. The first is vector data from California.gov, the second is depth data from NOAA. The kelpGeoMod repository describes the data as following:
#### Land Bounds:
The [California County Boundaries](https://gis.data.ca.gov/datasets/CALFIRE-Forestry::california-county-boundaries/explore) data contain shape file for California lands to be masked out. County data was used to make the mask which had the Channel Islands already.
The original data source is described as: This layer provides an initial offering as "best available" at 1:24,000 scale. Hosted on CAL FIRE AGOL.
In late 1996, the Dept of Conservation (DOC) surveyed state and federal agencies about the county boundary coverage they used. As a result, DOC adopted the 1:24,000 (24K) scale U.S. Bureau of Reclamation (USBR) dataset (USGS source) for their Farmland Mapping and Monitoring Program (FMMP) but with several modifications. Detailed documentation of these changes is provided by FMMP and included in the lineage section of the metadata.
A dataset was made available (approximately 2004) through CALFIRE - FRAP and the California Spatial Information Library (CaSIL), with additional updates throughout subsequent years. More recently, an effort was made to improve the coastal linework by using the previous interior linework from the 24k data, but replacing the coastal linework based on NOAA's ERMA coastal dataset (which used NAIP 2010).
In this dataset, all bays (plus bay islands and constructed features) are merged into the mainland, and coastal features (such as islands and constructed features) are not included, with the exception of the Channel Islands which ARE included.
This service represents the latest released version, and is updated when new versions are released. As of June, 2019 it represents cnty19_1.
#### Depth:
The [ETOPO Global Relief Model 2022 (Bedrock 15 arcseconds) dataset](https://www.ncei.noaa.gov/products/etopo-global-relief-model) contains ocean depth data in meters for the Santa Barbara Channel. The data were downloaded via the gid extract tool at the original source and transformed using the ETOPO_2022_v1_15s_N45W120_geoid.tif and ETOPO_2022_v1_15s_N45W135_geoid.tif geoid height tiles as recommended in the original user guide section 2.2.
The original data source is described as: ETOPO 2022, a global relief model with 15 arc-second resolution seamlessly integrating topographic and bathymetric data. The ETOPO 2022 model uses a combination of numerous airborne lidar, satellite-derived topography, and shipborne bathymetry datasets from U.S. national and global sources. ETOPO 2022 uses bare-earth topographic data from NASAs ICESat-2 and other vetted data sources to independently validate both the input datasets and the final ETOPO 2022 model. ETOPO 2022 is available in "Ice Surface" (top of Antarctic and Greenland ice sheets) and "Bedrock" (base of the ice sheets) versions.
## Methods
To get started, there are several packages I will be using. I will use *sf* for handling vector, *terra* and *raster* for handling raster data, and *mapview* for viewing my data.
```{r message=FALSE}
#---- Load libraries
library(sf)
library(terra)
library(raster)
library(mapview)
#---- Set up the data directory
#data_dir <- insert your file path to your data
```
```{r include=FALSE}
data_dir <- "/Users/elkewindschitl/Documents/MEDS/kelpGeoMod/final-data"
```
First, I make an empty raster with my desired resolution, extent, crs. I then explore the size of the raster and assign all cells a value of 1.
```{r}
#---- Make an empty raster of 0.008 degree resolution in WGS84 with desired AOI
empty_rast <- rast() # make an empty raster
crs(empty_rast) <- "EPSG:4326" # confirm/set WGS84
crs(empty_rast) # check crs value
ext(empty_rast) <- c(-120.65, -118.80, 33.85, 34.59) # set extent
ext(empty_rast) # check ext
res(empty_rast) <- c(0.008, 0.008) # set resolution
res(empty_rast) # check resoultion
# Explore the size of the raster
nrow(empty_rast) # identify how many rows we have
ncol(empty_rast) # identify how many columns we have
ncell(empty_rast) # identify how many total cells we have
values(empty_rast) <- 1 # fill in values of 1
mask <- empty_rast # rename to mask
mask
```
```{r message=FALSE}
mapview(raster(mask),
col.regions = list("blue4","#c4fbff"),
layer.name = "Raster Value") # this should be a raster with only values of one in our AOI
```
At this point the SpatRaster has the dimensions, resolution, and crs that I would expect. Next, I used land boundary data from California to remove data from the raster where we would expect to find land based on the vector shape provided. I have to transform the crs, convert to a terra object, then rasterize the layer.
```{r}
#---- Remove land boundaries from the mask
# Read in land shapefile
boundaries <- st_read(file.path(data_dir, "01-raw-data/02-ca-county-land-boundaries-raw/California_County_Boundaries/cnty19_1.shp"))
boundaries <- st_transform(x = boundaries, crs = 4326) # Transform crs to exact crs we want
boundaries <- vect(boundaries) # Make a terra vector object
boundaries <- terra::rasterize(x = boundaries,
y = mask) # Rasterize to a terra raster object with ext of mask
mapview(raster(boundaries),
col.regions = list("blue4","#c4fbff"),
layer.name = "Raster Value")
# Right now land = 1. Create a reclassification matrix so land = 0
reclassification_matrix <- matrix(c(NaN, 1, 1, NaN),
ncol = 2,
byrow = TRUE)
# Apply classification matrix
land_bounds <- classify(boundaries, rcl = reclassification_matrix)
mapview(raster(land_bounds),
col.regions = list("blue4","#c4fbff"),
layer.name = "Raster Value") # plot to check
mask <- land_bounds # make this the new mask
```
Great! I now have a mask with values of 1 for the ocean surrounding Santa Barbara and the Channel Islands. This will be used to mask the depth data below. First, though I need read in and manipulate some of the depth data as per the user documentation by NOAA.
```{r message=FALSE, warning=FALSE}
# Read in depth data
depth_dat <- terra::rast(file.path(data_dir, "01-raw-data/06-depth-noaa-raw/exportImage.tiff"))
# Plot to visualize
mapview(raster(depth_dat),
col.regions = list("blue4","green"),
layer.name = "Depth-Elevation (m)")
```
The depth data covers the area I am interested in. However, the [ETOPO User Guide](https://www.ngdc.noaa.gov/mgg/global/relief/ETOPO2022/docs/1.2%20ETOPO%202022%20User%20Guide.pdf) mentions that the data come with "an accompanying "geoid" tile for converting EGM2008 geoid heights into WGS84 ellipsoid elevation heights (EPSG:4979). Since most other geoid, ellipsoid, and/or tidal vertical datums are defined by grids in reference to the WGS84 ellipsoid, this eases the conversion of ETOPO 2022 tiles into other vertical reference datums of the user's choice." We will do that here following their guidelines. My AOI is split by two tiles, so I will need to use both and combine them.
```{r}
# Read in Geoid tiles for depth correction
geoid1 <- terra::rast(file.path(data_dir, "01-raw-data/06-depth-noaa-raw/ETOPO_2022_v1_15s_N45W135_geoid.tif")) # left tile
mapview(raster(geoid1),
col.regions = list("purple","green"),
layer.name = "Conversion Number")
geoid2 <- terra::rast(file.path(data_dir, "01-raw-data/06-depth-noaa-raw/ETOPO_2022_v1_15s_N45W120_geoid.tif")) # right tile
mapview(raster(geoid2),
col.regions = list("purple","green"),
layer.name = "Conversion Number")
geoid_tile <- merge(geoid1, geoid2) # merge the tiles
mapview(raster(geoid_tile),
col.regions = list("purple","green"),
layer.name = "Conversion Number")
# Crop geoid tile to match depth extent
geoid_tile_c <- crop(x = geoid_tile,
y = depth_dat)
mapview(raster(geoid_tile_c),
col.regions = list("purple","green"),
layer.name = "Conversion Number")
# Add the tiles together as per the user guide instructions
depth_wgs84 <- depth_dat + geoid_tile_c
mapview(raster(depth_wgs84),
col.regions = list("blue4","green"),
layer.name = "Depth-Elevation WGS84 Ellipsoid (m)")
```
Now I have the depth data I want, and I have my mask. It is time to get the depth data in line with with the mask. I have to crop and resample. For the resampling, I use bilinear reseampling because I have a continuous numerical variable. Then I can mask.
```{r}
# Check the crs, and reproject the crs if different
crs(depth_wgs84) == crs(mask)
# Crop to mask
depth_wgs84 <- crop(x = depth_wgs84,
y = mask) # crop to set extent to mask extent
# Resample to mask resolution
resampled_depth <- terra::resample(x = depth_wgs84,
y = mask,
method = "bilinear")
# Mask to remove land
resampled_depth <- mask(resampled_depth, mask) #convert to package raster to utilize mapview
# Confirm integrity of raster
crs(resampled_depth) == crs(mask)
ext(resampled_depth) == ext(mask)
res(resampled_depth) == res(mask)
nrow(resampled_depth) == nrow(mask)
ncol(resampled_depth) == ncol(mask)
```
## Results
```{r}
mapview(raster(resampled_depth),
col.regions=list("blue4","#c4fbff"),
layer.name="Resampled & Masked Depth (m)")
```
## Conclusion
A mask is a great way to standardize multiple raster datasets for combining. While this example shows one masking process, this process could be applied to numerous datasets with the same mask. Once data are standardized, they can be combined into a stack or a brick to be used for future analyses. Read the [kelpGeoMod technical documentation](https://github.com/kelpGeoMod/kelpGeoMod-capstone-project/blob/main/04-README-images/technical-doc.pdf) to understand at length how this project standardized the spatiotemporal resolution of many datasets in preparation for kelp forest modeling.