startSpecies function

startSpecies: Function used to initialize a multi-species integrated species distribution model.

This function creates an object containing all the data, metadata and components required to fit a multi-species integrated species distribution model with INLA. Most of its arguments therefore either name the relevant variables in the datasets or specify which terms to include in the formula for the integrated model. The output is an R6 object whose public methods can be used to further specify the model (see ?specifySpecies or .$help() for a comprehensive description of these public methods).

startSpecies(
  ...,
  spatialCovariates = NULL,
  Projection,
  Mesh,
  speciesSpatial = "replicate",
  speciesIntercept = TRUE,
  speciesEnvironment = TRUE,
  speciesName,
  IPS = NULL,
  Boundary = NULL,
  pointCovariates = NULL,
  Offset = NULL,
  pointsIntercept = TRUE,
  pointsSpatial = "copy",
  responseCounts = "counts",
  responsePA = "present",
  trialsPA = NULL,
  temporalName = NULL,
  Formulas = list(covariateFormula = NULL, biasFormula = NULL)
)

Arguments

  • ...: The datasets to be used in the model. Must come as either sf objects, or as a list of named sf objects.
  • spatialCovariates: The spatial covariates used in the model. These covariates must be measured at every location (pixel) in the study area, and must be a SpatialRaster object. Can be either numeric, factor or character data. Defaults to NULL which includes no spatial effects in the model.
  • Projection: The coordinate reference system used by both the spatial points and spatial covariates. Must be of class character.
  • Mesh: An fm_mesh_2d object required for the spatial random fields and the integration points in the model (see fm_mesh_2d_inla from the fmesher package for more details).
  • speciesSpatial: Argument to specify whether each species should have its own spatial effect, with separate hyperparameters estimated using INLA's "replicate" feature, or whether the fields should be estimated per species and copied across datasets using INLA's "copy" feature. Possible values include: 'replicate', 'copy', 'shared' or NULL if no species-specific spatial effects should be estimated.
  • speciesIntercept: Argument to control the species intercept term. Defaults to TRUE which creates a random intercept term, FALSE creates a fixed intercept term, and NULL removes the intercept term.
  • speciesEnvironment: Argument to control the species environmental term. Defaults to TRUE which creates species-level environmental effects. To create shared effects across the species, use FALSE.
  • speciesName: Name of the species variable (class character). Specifying this argument turns the model into a stacked species distribution model, and calculates covariate values for the individual species, as well as a species group model in the shared spatial field. Defaults to NULL. Note that if this argument is non-NULL and pointsIntercept is missing, pointsIntercept will be set to FALSE.
  • IPS: The integration points to be used in the model (that is, the points on the map where the intensity of the model is calculated). See fm_int from the fmesher package for more details regarding these points; however defaults to NULL which will create integration points from the Mesh object.
  • Boundary: A sf object of the study area. If not missing, this object is used to help create the integration points.
  • pointCovariates: The non-spatial covariates to be included in the integrated model (for example, in the field of ecology the distance to the nearest road or time spent sampling could be considered). These covariates must be included in the same data object as the points.
  • Offset: Name of the offset variable (class character) in the datasets. Defaults to NULL; if the argument is non-NULL, the variable name needs to be standardized across datasets (but does not need to be included in all datasets). The offset variable will be transformed onto the log-scale in the integrated model.
  • pointsIntercept: Logical argument: should the points be modeled with intercepts? Defaults to TRUE. Note that if speciesName is non-NULL and pointsIntercept is missing, pointsIntercept will be set to FALSE.
  • pointsSpatial: Argument to determine whether the spatial field is shared between the datasets, or if each dataset has its own unique spatial field. The datasets may share a spatial field with INLA's "copy" feature if the argument is set to "copy". May take on the values: "shared", "individual", "copy", "correlate" or NULL if no spatial field is required for the model. Defaults to "copy".
  • responseCounts: Name of the response variable in the counts/abundance datasets. This variable name needs to be standardized across all counts datasets used in the integrated model. Defaults to 'counts'.
  • responsePA: Name of the response variable (class character) in the presence absence/detection non-detection datasets. This variable name needs to be standardized across all presence absence datasets. Defaults to 'present'.
  • trialsPA: Name of the trials response variable (class character) for the presence absence datasets. Defaults to NULL.
  • temporalName: Name of the temporal variable (class character) in the model. This variable is required to be in all the datasets. Defaults to NULL.
  • Formulas: A named list with two objects. The first one, covariateFormula, is a formula for the covariates and their transformations for the distribution part of the model. Defaults to NULL which includes all covariates specified in spatialCovariates in the model. The second, biasFormula, specifies which covariates are used for the presence-only (PO) datasets. Defaults to NULL which includes no covariates for the PO datasets.
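
As a minimal sketch of the input that the ... argument expects, the snippet below builds two toy sf point datasets and bundles them into a named list. The data values, the dataset names (surveyPA, surveyCounts) and the coordinate columns (x, y) are hypothetical and chosen only for illustration. The response columns follow the documented defaults ('present' and 'counts'), and the species column is given the same name in both datasets so that it could be passed as speciesName.

```r
library(sf)

# Coordinate reference system shared by all datasets (passed as Projection)
proj <- "+proj=longlat +ellps=WGS84"

# Hypothetical presence absence data: response column named 'present'
surveyPA <- data.frame(
  x = c(-84.1, -83.9, -83.5),
  y = c(9.9, 10.2, 10.0),
  present = c(1, 0, 1),
  speciesName = c("speciesA", "speciesA", "speciesB")
)

# Hypothetical counts data: response column named 'counts'
surveyCounts <- data.frame(
  x = c(-84.3, -83.7),
  y = c(10.1, 9.8),
  counts = c(4, 0),
  speciesName = c("speciesA", "speciesB")
)

# Convert each data frame to an sf point object and collect them in a named list
datasets <- list(
  surveyPA = st_as_sf(surveyPA, coords = c("x", "y"), crs = proj),
  surveyCounts = st_as_sf(surveyCounts, coords = c("x", "y"), crs = proj)
)

# Check that the species column is present in every dataset
hasSpecies <- vapply(datasets, function(d) "speciesName" %in% names(d), logical(1))
```

A list built this way could then be supplied as the first argument of startSpecies, together with a Mesh, Projection = proj and speciesName = 'speciesName'.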

Returns

A specifySpecies object (class R6). Use ?specifySpecies to get a comprehensive description of the slot functions associated with this object.

Note

The idea with this function is to describe the full model: that is, all the covariates and spatial effects will appear in all the formulas for the datasets and species. If some of these terms should not be included in certain observation models in the integrated model, they can be thinned out using the .$updateFormula function. Note: point covariates are only included in the formulas of the datasets in which they are present, and so these terms do not need to be thinned out if they are not required by certain observation models.
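
To illustrate what thinning a term out of a full formula means, the sketch below uses base R's update() on an ordinary formula. It does not call .$updateFormula, and the term names (Forest, Altitude, shared_spatial) are hypothetical. Whether .$updateFormula accepts the same "~ . - term" syntax is an assumption that should be checked against ?specifySpecies or .$help().

```r
# Full formula with two hypothetical covariates and a hypothetical spatial term
fullFormula <- ~ Forest + Altitude + shared_spatial

# Drop one term with base R's update(); '.' stands for the existing terms
thinnedFormula <- update(fullFormula, ~ . - Altitude)

# Labels of the terms that remain after thinning
remainingTerms <- attr(terms(thinnedFormula), "term.labels")
```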

Examples

## Not run: 
if (requireNamespace('INLA')) {

  ##REDO WITH OTHER DATA

  #Get Data
  data("SolitaryTinamou")
  proj <- "+proj=longlat +ellps=WGS84"
  data <- SolitaryTinamou$datasets
  mesh <- SolitaryTinamou$mesh
  mesh$crs <- proj

  #Set base model up
  baseModel <- startSpecies(data, Mesh = mesh,
                            Projection = proj, responsePA = 'Present',
                            speciesName = 'speciesName')

  #Print summary
  baseModel

  #Set up model with dataset specific spatial fields
  indSpat <- startSpecies(data, Mesh = mesh,
                          Projection = proj, pointsSpatial = 'individual',
                          responsePA = 'Present', speciesName = 'speciesName')

  #Model with offset variable
  offSet <- startSpecies(data, Mesh = mesh,
                         Projection = proj, Offset = 'area',
                         responsePA = 'Present', speciesName = 'speciesName')

  #Non-random effects for the species
  speciesInt <- startSpecies(data, Mesh = mesh,
                             Projection = proj, speciesIntercept = FALSE,
                             responsePA = 'Present', speciesName = 'speciesName')

  #Turn off species level field
  speciesInt <- startSpecies(data, Mesh = mesh,
                             Projection = proj, speciesSpatial = NULL,
                             responsePA = 'Present', speciesName = 'speciesName')

}
## End(Not run)