spatiotemp_block() R function from [dynamicSDM]

Split occurrence records into spatial and temporal blocks for model fitting.

Splits occurrence records into spatial and temporal sampling units and groups sampling units into multiple blocks that have similar mean and range of environmental explanatory variables and sample size.


spatiotemp_block(
  occ.data,
  vars.to.block.by,
  spatial.layer,
  spatial.split.degrees,
  temporal.block,
  n.blocks = 10,
  iterations = 5000
)

Arguments

occ.data: a data frame, with columns for occurrence record co-ordinates and dates with column names as follows; record longitude as "x", latitude as "y", year as "year", month as "month", and day as "day", and associated explanatory variable data.
vars.to.block.by: a character string or vector, the explanatory variable column names to group sampling units based upon.
spatial.layer: optional; a SpatRaster object, a categorical spatial layer for sample unit splitting.
spatial.split.degrees: a numeric value, the grid cell resolution in degrees to split spatial.layer by. Required if spatial.layer given.
temporal.block: optional; a character string or vector, the time step for sampling unit splitting. Any combination of day, month, year or quarter. See details.
n.blocks: optional; a numeric value of two or more, the number of blocks to group occurrence records into. Default; 10.
iterations: optional; a numeric value, the number of random block groupings to trial before selecting the optimal grouping. Default; 5000.

Returns

Returns occurrence data frame with column "BLOCK.CATS", assigning each record to a spatiotemporal block.

Blocking for autocorrelation

Blocking is an established method to account for spatial autocorrelation in SDMs. Following Bagchi et al., (2013), the blocking method involves splitting occurrence data into sampling units based upon non-contiguous ecoregions, which are then grouped into spatially disaggregated blocks of approximately equal sample size, within which the mean and range of explanatory variable data are similar. When species distribution model fitting, blocks are left out in-turn in a jack-knife approach for model training and testing.

We adapt this approach to account for temporal autocorrelation by enabling users to split records into sampling units based upon spatial and temporal characteristic before blocking occurs.

Spatial splitting

If the spatial.layer has categories that take up large contiguous areas, spatiotemp_block() will split categories into smaller units using grid cells at specified resolution (spatial.split.degrees).

Temporal splitting

If temporal.block is given, then occurrence records with unique values for the given level are considered unique sampling unit. For instance, if temporal.block = year, then records from the same year are considered a sampling unit to be grouped into blocks.

Note: If spatial splitting is also used, then spatial characteristics may split these further into separate sampling units.

The temporal.block option quarter splits occurrence records into sampling units based on which quarter of the year the record month belongs to: (1) January-March, (2) April-June, (3) July-September and (4) October-December. This could be employed if seasonal biases in occurrence record collection are driving autocorrelation.

Block generation

Once split into sampling units based upon temporal and spatial characteristics, these units are then assigned into given number of blocks (n.blocks), so that the mean and range of explanatory variables (vars.to.block.by) and total sample size are similar across each. The number of iterations specifies how many random shuffles are used to optimise block equalisation.

Examples


data("sample_explan_data")
data("sample_extent_data")
random_cat_layer <- terra::rast(sample_extent_data)
random_cat_layer <- terra::setValues(random_cat_layer,
                                    sample(0:10, terra::ncell(random_cat_layer),
                                           replace = TRUE))

spatiotemp_block(
 occ.data = sample_explan_data,
 spatial.layer = random_cat_layer,
 spatial.split.degrees = 3,
 temporal.block = c("month"),
 vars.to.block.by = colnames(sample_explan_data)[14:16],
 n.blocks = 3,
 iterations = 30
)

References

Bagchi, R., Crosby, M., Huntley, B., Hole, D. G., Butchart, S. H. M., Collingham, Y., Kalra, M., Rajkumar, J., Rahmani, A. & Pandey, M. 2013. Evaluating the effectiveness of conservation site networks under climate change: accounting for uncertainty. Global Change Biology, 19, 1236-1248.

dynamicSDM package Read PDF manual

Maintainer: Rachel Dobson
License: GPL (>= 3)
Last published: 2024-06-28

Useful links

Downloads (last 30 days):

spatiotemp_block function