::install_github("BigelowLab/maxnet")
devtoolsinstall.packages(c("dplyr", "sf", "stars", "geodata",
"dismo", "lubridate", "sdmpredictors",
"ggplot2", "cmocean", "janitor", "DT",
"here"))
Fit SDM
We will use the presence and absence data and the environmental layers that we created in the previous steps to assemble the data frames needed to fit our SDM model.
Pre-SDM Set-up
You will need to install a new version of maxnet for this tutorial. No need to do this on the JupyterHub. It is already installed.
Load the needed packages.
suppressPackageStartupMessages({
library(maxnet)
library(dplyr)
library(sf)
library(stars)
library(geodata)
library(dismo)
library(lubridate)
library(sdmpredictors)
#library(zoon)
library(ggplot2)
library(cmocean)
library(janitor)
library(DT)
})
Set the file location.
::i_am("tutorial/Steps_sdm_maxnet.Rmd") here
here() starts at /Users/eli.holmes/Documents/GitHub/ohw23_proj_marinesdms
Load region files. We created our region objects in a separate file Region data and saved these in data/region
. We will load these now.
Load the bounding box polygon and create a bounding box.
#Loading bounding box for the area of interest
<- here::here("data", "region", "BoundingBox.shp")
fil <- sf::read_sf(fil)
extent_polygon <- sf::st_bbox(extent_polygon) bbox
Load occurence data
Here we load the data prepared in the previous step.
# presence data
<- here::here("data", "raw-bio", "io-sea-turtles-clean.csv")
fil <- read.csv(fil) occ.sub
Create sf_points data frame.
<- sf::st_as_sf(occ.sub, coords = c("lon", "lat"), crs = 4326)
occ.points head(occ.points)
Simple feature collection with 6 features and 7 fields
Geometry type: POINT
Dimension: XY
Bounding box: xmin: 53.083 ymin: 6.40193 xmax: 59.87883 ymax: 24.577
Geodetic CRS: WGS 84
sci.name obsv.datetime life.stage bathy SST SSS common.name
1 Caretta caretta 2011-04-12T19:12:41Z Juvenile 3051 28.67 35.74 Loggerhead
2 Chelonia mydas 2018-03-31T06:44:00Z <NA> 14 27.96 38.71 Green
3 Chelonia mydas 2019-05-28T12:15:00Z <NA> 2 27.93 38.65 Green
4 Caretta caretta 2011-04-11T04:03:19Z Juvenile 2736 26.92 36.10 Loggerhead
5 Caretta caretta 2011-03-24T14:33:23Z Juvenile 2030 26.68 36.05 Loggerhead
6 Chelonia mydas 2018-04-05T23:33:00Z <NA> 2 27.94 38.67 Green
geometry
1 POINT (59.87883 6.40193)
2 POINT (53.201 24.508)
3 POINT (53.083 24.577)
4 POINT (54.28696 16.05778)
5 POINT (53.83602 16.52469)
6 POINT (53.092 24.549)
Load background data
# absence data
<- here::here("data", "raw-bio", "pts_absence.csv")
fil <- read.csv(fil) # X is lon and Y is lat pts.abs
Check column names.
colnames(pts.abs)
[1] "X" "Y"
Change columns names to lon and lat and remove any NAs in the data.
colnames(pts.abs) <- c("lon","lat")
<- na.omit(pts.abs) pts.abs
Convert to sf_points object. Set the crs to 4326.
<- sf::st_as_sf(pts.abs, coords = c("lon", "lat"), crs = 4326) abs.points
Get the environment for the lat/lon locations
Here we create the data frame with the environmental variables for our presence and absence locations.
Load environmental layers
Set the location of the data directory.
<- here::here("data", "env")
dir_env options(sdmpredictors_datadir = dir_env)
Specify the layers that we want. The layers were saved to the data/env
directory.
<- c("BO_sstmean", "BO_bathymean", "BO22_ph", "BO2_dissoxmean_bdmean", "BO2_salinitymean_ss", "BO2_chlomean_ss", "BO21_nitratemean_ss") layercodes
Load the layers into the env
object. We want to set rasterstack equal true to get one file for our variables.
<- sdmpredictors::load_layers(layercodes, rasterstack = TRUE) env
Create a stars object
There are a few ways that we can get the values of the environmental variables in our raster stack for the lat/lon locations. We will use the stars and terra functions as these are the new (2023) packages in R for this purpose. You will find older approaches if you search and possibly AI will suggest older approaches.
Step one is to convert our raster stack to a stars object and split the stars object into layers.
<- stars::st_as_stars(env) # convert to stars object
env.stars <- terra::split(env.stars) env.stars
Extract the variables for our points
Now we can extract the variables for our presence and absence points.
<- stars::st_extract(env.stars, sf::st_coordinates(occ.points)) %>%
occ.env ::as_tibble()
dplyr
<- stars::st_extract(env.stars, sf::st_coordinates(abs.points)) %>%
abs.env ::as_tibble() dplyr
Now we have a data frame with the variables for presence and absence.
head(abs.env)
# A tibble: 6 × 7
BO_sstmean BO_bathymean BO22_ph BO2_dissoxmean_bdmean BO2_salinitymean_ss
<dbl> <dbl> <dbl> <dbl> <dbl>
1 28.4 -2445 8.16 111. 36.2
2 26.9 -3142 8.12 119. 36.5
3 28.3 -4549 8.19 159. 36.0
4 27.1 -4970 8.17 180. 35.5
5 26.8 -3869 8.16 156. 36.0
6 28.0 -5095 8.17 183. 35.4
# ℹ 2 more variables: BO2_chlomean_ss <dbl>, BO21_nitratemean_ss <dbl>
Let’s check the sizes. They should have the same number of rows.
dim(occ.points)
[1] 2064 8
dim(occ.env)
[1] 2064 7
Let’s check the sizes. They should have the same number of rows.
dim(abs.points)
[1] 1000 1
dim(abs.env)
[1] 1000 7
Our environmental data have some NAs. We need to remove these.
any(is.na(occ.env))
[1] TRUE
any(is.na(abs.env))
[1] TRUE
Remove the NAs.
<- na.omit(occ.env)
occ.env <- na.omit(abs.env) abs.env
SDM Model
We will fit with MaxEnt using the maxnet package. The function is
mod <- maxnet::maxnet(pres, env_df)
env_df
is a data frame of environmental data (in columns) for each location (in rows) of the presences and absences. presis a string of 0s and 1s specifying if the row in
environ` is a presence (1) or absence (0).
Set up the env_df
We combining the two data frames with rbind()
.
<- rbind(occ.env, abs.env) env_df
Set up the pres
string
The rows of occ.env
are all 1 (presence) and the rows of abs.env
are all 0 (absent).
<- c(rep(1, nrow(occ.env)), rep(0, nrow(abs.env))) pres
Run the model
<- maxnet::maxnet(pres, env_df) sdm.model
Model metrics
<- plot(sdm.model, type = "cloglog") responses
Save model
<- env.stars %>% sf::st_crop(bbox)
env.stars.crop <- here::here("data", "models", "io-turtle.RData")
fil save(sdm.model, env.stars.crop, occ.points, abs.points, file=fil)