Marine SDM Variables

Here we discuss an approach for finding a set of environmental variables to use for our sea turtle SDM. In this case, we are not sea turtle experts so we used AI to help us search for variables to include. The workflow has two steps. We are using the sdmpredictors R package to extract data layers for the marine environment.

  1. Define some variables
  2. Extract the variables for our presence and absence location data

Querying AI

The sdmpredictors package was created in 2022 and ChatGPT does not have data this recent so we used Google Bard to help us find marine environmental data variables for sea turtles.

To obtain the marine environmental data for our SDMs for predicting the occurrence of green sea turtle in the Arabian Sea, we used the sdmpredictors R package. There are over 100 environmental data layers that can be found in the sdmpredictors package. These data layers cover a wide range of environmental variables, including climate, land cover, marine, freshwater, and soil properties.

In order to define the best variables for using in our sea turtle prediction, we used the assistance of the artificial intelligence chatbot developed by Google - Google Bard. Google Bard has access to a vast amount of information on sea turtles and their habitat and so can understand complex relationships between environmental variables and sea turtle distribution.

The prompt used for achieving the best variables for green sea turtles was:

What is the R package sdmpredictors?
How to use it to obtain environmental datasets from an especific region?
Which marine datasets available in sdmpreditors can be used for create a sdm to sea turtles? give me some examples
List all the names of these layers
Which are all the variables I should use to create a sdm for sea turtles?
Give me a table with all the information I need to create a sdm for green sea turtles
Create the most exaustive list possible

The generated table with all the relevant information:

Variable Description Data source
Bathymetry The depth of the ocean floor is important for green sea turtles because they need to be able to reach the seafloor to feed and nest. Bio-ORACLE, ENVIREM, MARSPEC
Sea surface temperature Green sea turtles prefer warmer waters, but they can also tolerate cooler temperatures. Bio-ORACLE, ENVIREM, MARSPEC
Chlorophyll concentration Green sea turtles eat algae, so chlorophyll concentration is a useful predictor of their food availability. Bio-ORACLE, ENVIREM, MARSPEC
Seagrass cover Seagrass is an important food source and nesting habitat for green sea turtles. Bio-ORACLE, ENVIREM, MARSPEC
Salinity Green sea turtles are adapted to specific ranges of salinity. Bio-ORACLE, ENVIREM, MARSPEC
Nitrate concentration Nitrate is a nutrient that is essential for the growth of algae, which is a food source for green sea turtles. Bio-ORACLE, ENVIREM, MARSPEC
Distance to shore Green sea turtles need to be able to reach the ocean floor to feed and nest. Bio-ORACLE, ENVIREM, MARSPEC
Mean annual precipitation Green sea turtles are adapted to a variety of precipitation regimes, but they avoid areas that are too dry or too wet. WorldClim
Mean annual temperature Green sea turtles are adapted to a variety of temperature regimes, but they avoid areas that are too cold or too hot. WorldClim
Mean monthly temperature Green sea turtles are adapted to a variety of temperature regimes, but they avoid areas that have extreme variations in monthly temperatures. WorldClim
Mean monthly precipitation Green sea turtles are adapted to a variety of precipitation regimes, but they avoid areas that have extreme variations in monthly precipitation. WorldClim
Solar radiation Solar radiation is a source of energy for algae and other organisms that green sea turtles eat. WorldClim
Wind speed Wind can stir up sediment and make the water murky, which green sea turtles prefer to avoid. ERA5
Wave height Wave height can make it difficult for green sea turtles to feed and nest, so they avoid areas with high wave heights. ERA5
Turbidity Turbidity is the amount of suspended sediment in the water. Green sea turtles prefer clear water, so turbidity can be a useful predictor of their distribution. Bio-ORACLE, ENVIREM, MARSPEC
Light availability Light is essential for photosynthesis, which is the process that algae use to produce food. Green sea turtles eat algae, so light availability is an important factor in their distribution. Bio-ORACLE, ENVIREM, MARSPEC
Oxygen concentration Oxygen is essential for all life. Green sea turtles avoid areas with low oxygen concentrations. Bio-ORACLE, ENVIREM, MARSPEC
Food availability The amount of food available to green sea turtles in an area is a major factor that influences their distribution. Green sea turtles eat a variety of organisms, including algae, seagrass, jellyfish, and fish. Surveys of sea turtle prey organisms
Predation risk Green sea turtles are preyed upon by a variety of animals, including sharks, crocodiles, and birds. The risk of predation can be a major factor that influences the distribution of green sea turtles. Studies of sea turtle predators
Human activity Human activities, such as pollution, habitat destruction, and overfishing, can have a negative impact on the distribution of green sea turtles. Government reports and scientific studies

With this information in hands, we are now able to find these variables in the sdmpredictors.

View the sdmpredictors variables

sdmpredictors has many variables.

library(tidyverse)
library(DT)
library(sdmpredictors)
library(sf)

There are two marine data sources.

env_datasets <- list_datasets(terrestrial = FALSE, marine = TRUE)
env_datasets %>% 
  select(dataset_code, description, citation) %>% 
  DT::datatable()
env_datasets_vec <- c("Bio-ORACLE", "MARSPEC")
env_layers <- sdmpredictors::list_layers(env_datasets_vec)
DT::datatable(env_layers)

For this tutorial, we chose the “mean” variable from the sdmpredictors package for each marine environmental parameter recommended by Bard.

Variable Data source
Bathymetry BO_bathymean
Sea surface temperature BO2_tempmean_ss
Chlorophyll concentration BO2_chlomean_ss
Salinity BO2_salinitymean_ss
Nitrate concentration BO21_nitratemean_ss
Distance to shore MS_biogeo05_dist_shore_5m
Mean annual temperature MS_biogeo13_sst_mean_5m
Solar radiation BO22_parmean
Turbidity BO22_damean
Oxygen concentration BO2_dissoxmean_bdmean

Extract these variables

First we will load a bounding box for our area of interest. This was saved in []

#Loading bounding box for the area of interest
fil <- here::here("data", "region", "BoundingBox.shp")
extent_polygon <- read_sf(fil)

#Extract polygon geometry 
pol_geometry <- st_as_text(extent_polygon$geometry)

The objective is to extract all these variables for our presence and absence locations.

We will show this in the SDM examples. This is where the team ran into a numbers of hiccups.