Hmsc: Hierarchical Model of Species Communities
Hierarchical Modelling of Species Communities (Hmsc) is a flexible framework for Joint Species Distribution Modelling (JSDMs). The framework can be used to relate species occurrences or abundances to environmental covariates, species traits and phylogenetic relationships. JSDMs are a special case of species distribution models (SDMs) that take into account the multivariate nature of communities which allows us to estimate community level responses as well capture biotic interactions and the influence of missing covariates in residual species associations. The Hmsc package contains functions to fit JSDMs, analyze the output and to generate predictions with these JSDMs. package
A typical workflow for a Hmsc analysis constists of 5 steps:
Step 1: Setting the model structure and fitting the model
The obligatory data for a Hmsc analysis includes a matrix of species occurrences or abundances (Y) and a data frame of environmental covariates (XData). The species matrix Y consists of ns columns representing the species and np rows representing the sampling units. Species data for different species does not have to be on the same scale, i.e. species 1 may be recorded as presence/absence while species 2 is recorded as abundance. XData consists of nc columns representing the environmental variables and np rows representing the sampling units. Y and XData need to have the same amount of rows (sampling units).
Optionally, the user can include information species traits, phylogenetic relationships and information on the spatiotemporal or hierarchical context of the sampling design to account for dependencies among the sampling units.
The model structure is created using the Hmsc
function. As input, this function needs at least the matrix with species data (Y) and the dataframe with environmental variables. This is also the place where the data model is specified, as a default Hmsc assumes normally distributed species data. Other options are 'probit' for binary data, 'Poisson' and 'overdispersedPoisson' for count data. Additionally, you can specify the study design, the random effects structure, the species traits to be used, species phylogeny, how the covariate should be scales and how viable selection should be applied.
The random levels supplied to Hmsc
are generated using HmscRandomLevel
. Here, the user can specify the spatial or temporal information for the units as well as the covariates for covariate-depedent associations. If there is no spatial, temporal or coviate information for this level, the user should provide here the names of the units for this level.
After setting the model structure, the model is fitted by running sampleMcmc
to sample from the posterior distributions of the model parameters.
Step 2: Examining MCMC convergence
After fitting the model, the MCMC convergence needs to be evaluated. The easiest way to do this is generate a coda object from the fitted model using convertToCodaObject
. Convergence of the chains should be assessed qualitatively using trace plots of the model parameters and quantitatively using gelman diagnostics gelman.diag
and by calculating the effective sample size with effectiveSize
.
Step 3: Evaluating model fit
If MCMC convergence is satisfacotry, the model fit can be evaluated. Using the computePredictedValues
function. It is also recommended to also compute predictive power using cross validation. Cross validation is done by supplying the partitioning created using using createPartition
to computePredictedValues
.
Step 4: Exploring parameter estimates
When the fitted model has satisfactory convergence and fit, the next step is generally to investigate the parameter estimates of the model. The posterior distribution of the parameter of choice is extracted from the model object using getPostEstimate
. plotBeta
and plotGamma
can be used to visuzalize the beta and gamma parameter estimates. Additionally, the function biPlot
can be used to construct an ordination plot from the eta and lamda parameters
Additionally, at this stage of the analysis, the user may want to look at how the variance explained by the model is partitioned using the computeVariancePartitioning
and the plotVariancePartitioning
function.
Step 5: Making predictions
Hmsc comes with a generic predict
function that can be used to predict species occurences or counts in new units. These predictions can be unconditional or conditional on the occurence of other species. For conditional predictions, the user needs to supply species data for at least some species in the new units. Additionally, the package specific tools to make predictions along environmental gradients. These predictions can be vizualized at both the species and the community level (constructGradient
, plotGradient
).
Tikhonov, G., Opedal, O.H., Abrego, N., Lehikoinen, A., de Jonge, M.M.J, Oksanen, J. and Ovaskainen, O. (2020) Joint species distribution modelling with the R-package Hmsc. Methods in Ecology and Evolution 11, 442--447. tools:::Rd_expr_doi("10.1111/2041-210X.13345")
Useful links:
Useful links