Split occurrence records into spatial and temporal blocks for model fitting.
Splits occurrence records into spatial and temporal sampling units and groups sampling units into multiple blocks that have similar mean and range of environmental explanatory variables and sample size.
occ.data: a data frame containing occurrence record co-ordinates and dates, with columns named as follows: longitude as "x", latitude as "y", year as "year", month as "month", and day as "day", together with the associated explanatory variable data.
vars.to.block.by: a character string or vector, the explanatory variable column names to group sampling units based upon.
spatial.layer: optional; a SpatRaster object, a categorical spatial layer for sample unit splitting.
spatial.split.degrees: a numeric value, the grid cell resolution in degrees to split spatial.layer by. Required if spatial.layer given.
temporal.block: optional; a character string or vector, the time step for sampling unit splitting. Any combination of day, month, year or quarter. See details.
n.blocks: optional; a numeric value of two or more, the number of blocks to group occurrence records into. Default; 10.
iterations: optional; a numeric value, the number of random block groupings to trial before selecting the optimal grouping. Default; 5000.
Returns
Returns occurrence data frame with column "BLOCK.CATS", assigning each record to a spatiotemporal block.
Blocking for autocorrelation
Blocking is an established method to account for spatial autocorrelation in species distribution models (SDMs). Following Bagchi et al. (2013), the blocking method splits occurrence data into sampling units based upon non-contiguous ecoregions. These units are then grouped into spatially disaggregated blocks of approximately equal sample size, within which the mean and range of explanatory variable data are similar. When fitting species distribution models, each block is left out in turn in a jackknife approach: the model is trained on the remaining blocks and tested on the held-out block.
We adapt this approach to account for temporal autocorrelation by enabling users to split records into sampling units based upon spatial and temporal characteristics before blocking occurs.
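The leave-one-block-out procedure described above can be sketched in base R as follows. This is a minimal illustration and not part of the package: the toy data, the `temp` variable and the use of a binomial `glm()` are assumptions made for the example; only the "BLOCK.CATS" column name comes from this function's return value.

```r
set.seed(42)

# Toy data (hypothetical): 60 records already assigned to 3 blocks.
dat <- data.frame(
  presence   = rbinom(60, 1, 0.5),
  temp       = rnorm(60, mean = 20, sd = 5),
  BLOCK.CATS = rep(1:3, each = 20)
)

# Hold out each block in turn: train on the rest, test on the held-out block.
results <- lapply(sort(unique(dat$BLOCK.CATS)), function(b) {
  train <- dat[dat$BLOCK.CATS != b, ]
  test  <- dat[dat$BLOCK.CATS == b, ]
  fit   <- glm(presence ~ temp, data = train, family = binomial)
  pred  <- predict(fit, newdata = test, type = "response")
  data.frame(block = b, n_train = nrow(train), n_test = nrow(test),
             mean_pred = mean(pred))
})
results <- do.call(rbind, results)
```

Each row of `results` summarises one fold; in practice the held-out predictions would be passed to whatever evaluation metric the modelling workflow uses.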
Spatial splitting
If spatial.layer has categories that cover large contiguous areas, spatiotemp_block() will split these categories into smaller units using grid cells at the specified resolution (spatial.split.degrees).
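As an illustration only, grid-based splitting can be sketched in base R by assigning each record to a grid cell of the chosen resolution and combining that cell with the layer category. The helper `grid_cell_id()` and the `category` column are hypothetical and are not part of the package; the way spatiotemp_block() builds its grid internally may differ.

```r
# Hypothetical helper: label each record with the grid cell it falls in.
grid_cell_id <- function(x, y, degrees) {
  col <- floor(x / degrees)
  row <- floor(y / degrees)
  paste(col, row, sep = "_")
}

# Toy records (hypothetical), all within the same layer category.
occ <- data.frame(
  x = c(10.2, 11.9, 13.1, 10.4),
  y = c(-3.5, -3.1, -3.9, -8.2),
  category = rep("savanna", 4)
)

# Records in the same category but in different 3-degree cells
# become separate sampling units.
occ$cell <- grid_cell_id(occ$x, occ$y, degrees = 3)
occ$unit <- paste(occ$category, occ$cell, sep = ":")
```

Here the first two records share a cell and therefore a sampling unit, while the other two fall into cells of their own.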
Temporal splitting
If temporal.block is given, then occurrence records that share a value for the given level are treated as one sampling unit. For instance, if temporal.block = "year", then records from the same year form a sampling unit to be grouped into blocks.
Note: If spatial splitting is also used, then spatial characteristics may split these further into separate sampling units.
The temporal.block option quarter splits occurrence records into sampling units based on which quarter of the year the record month belongs to: (1) January-March, (2) April-June, (3) July-September and (4) October-December. This could be employed if seasonal biases in occurrence record collection are driving autocorrelation.
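The quarter mapping listed above can be expressed with integer division. The sketch below uses toy data and assumes, for illustration, that several temporal levels are combined by pasting them into one label; the package may combine them differently.

```r
# Toy records (hypothetical).
occ <- data.frame(
  year  = c(2001, 2001, 2001, 2002),
  month = c(1, 3, 7, 12)
)

# Months 1-3 -> quarter 1, 4-6 -> 2, 7-9 -> 3, 10-12 -> 4.
occ$quarter <- (occ$month - 1) %/% 3 + 1

# Assumed way of combining levels such as "year" and "quarter"
# into a single sampling unit label.
occ$unit <- paste(occ$year, occ$quarter, sep = "_")
```

With these labels, the two records from January and March 2001 belong to the same sampling unit, while the July 2001 and December 2002 records each form their own.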
Block generation
Once records have been split into sampling units based upon temporal and spatial characteristics, these units are assigned to the given number of blocks (n.blocks) so that the mean and range of the explanatory variables (vars.to.block.by) and the total sample size are similar across blocks. The iterations argument specifies how many random groupings are trialled; the grouping with the most equal blocks is kept.
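The random-search idea can be sketched in base R as below. This is not the package's implementation: the toy `units` table, the `temp` variable and the scoring function (a sum of coefficients of variation) are assumptions chosen to show how repeated random groupings can be compared and the most balanced one kept.

```r
set.seed(1)

# Toy sampling units (hypothetical): sample size and the value of one
# explanatory variable per unit.
units <- data.frame(
  unit = paste0("u", 1:12),
  n    = c(5, 8, 3, 10, 6, 7, 4, 9, 5, 6, 8, 3),
  temp = c(12, 18, 25, 14, 20, 16, 22, 13, 19, 24, 15, 21)
)

n.blocks   <- 3
iterations <- 200

# Coefficient of variation: spread relative to the mean.
cv <- function(v) sd(v) / mean(v)

# Assumed score: lower values mean blocks are more alike in sample size
# and in the mean and range of the explanatory variable.
score_grouping <- function(block) {
  size       <- tapply(units$n, block, sum)
  mean_temp  <- tapply(units$temp * units$n, block, sum) / size
  range_temp <- tapply(units$temp, block, function(v) diff(range(v)))
  cv(size) + cv(mean_temp) + cv(range_temp)
}

best_score <- Inf
best_block <- NULL
for (i in seq_len(iterations)) {
  # Shuffle a balanced vector of block labels so that no block is empty.
  block <- sample(rep_len(seq_len(n.blocks), nrow(units)))
  s <- score_grouping(block)
  if (s < best_score) {
    best_score <- s
    best_block <- block
  }
}

units$BLOCK.CATS <- best_block
```

More iterations give the search more chances to find a well-balanced grouping, at the cost of longer run time.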
Bagchi, R., Crosby, M., Huntley, B., Hole, D. G., Butchart, S. H. M., Collingham, Y., Kalra, M., Rajkumar, J., Rahmani, A. & Pandey, M. 2013. Evaluating the effectiveness of conservation site networks under climate change: accounting for uncertainty. Global Change Biology, 19, 1236-1248.