Stacked species regression models, possibly fitted in parallel
stackedsdm(y, formula_X = ~1, data = NULL, family = "negative.binomial", trial_size = 1, do_parallel = FALSE, ncores = NULL, trace = FALSE)
Arguments
y: A matrix of species responses
formula_X: An object of class formula representing the relationship to the covariates to be fitted. There should be nothing to the left hand side of the "~" sign.
data: Data frame of the covariates
family: Either a single character string, in which case all responses are assumed to be from this family, or a vector of character strings of the same length as the number of columns of y. Families must be given as strings, not as actual family class objects; this could be changed in the future if desired, e.g., to allow custom link functions. Currently, the following families are supported (hopefully properly!): "gaussian", "negative.binomial" (with quadratic mean-variance relationship), "poisson", "binomial" (with logit link), "tweedie", "Gamma" (with log link), "exponential", "beta" (with logit link), "ordinal" (cumulative logit model), "ztpoisson", "ztnegative.binomial", "zipoisson", "zinegative.binomial".
trial_size: The trial size if any of the responses are binomial. Is either a single number or a matrix with the same dimension as y. If the latter, then all columns that do not correspond to binomial responses are ignored.
do_parallel: Do the separate species model fits in parallel? Defaults to FALSE
ncores: The number of cores to use if the separate species model fits are done in parallel. If do_parallel = TRUE, then it defaults to detectCores() - 2
trace: Print information. This is not actually used currently
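The matrix form of trial_size and the default for ncores can be sketched in base R as follows. This is only an illustration: the response matrix y, its column names, the trial count of 10, and the guard that keeps ncores at 1 or more are all assumptions made here, not behaviour documented for stackedsdm.

```r
library(parallel)

# Simulated response matrix: 5 sites x 4 species (illustrative data only)
set.seed(1)
y <- cbind(
  sp1 = rbinom(5, size = 10, prob = 0.3),  # binomial counts out of 10 trials
  sp2 = rbinom(5, size = 10, prob = 0.6),
  sp3 = rpois(5, lambda = 2),              # count responses
  sp4 = rpois(5, lambda = 4)
)
myfamily <- c("binomial", "binomial", "poisson", "poisson")

# trial_size as a matrix with the same dimension as y.
# Columns for non-binomial responses are ignored, so they are filled with 1.
trial_size <- matrix(1, nrow = nrow(y), ncol = ncol(y), dimnames = dimnames(y))
trial_size[, myfamily == "binomial"] <- 10

# Mirror the documented default for ncores (detectCores() - 2).
# The NA check and the lower bound of 1 are extra safeguards added here.
n_detected <- detectCores()
ncores <- if (is.na(n_detected)) 1L else max(1L, n_detected - 2L)
```

These objects would then be supplied as family = myfamily, trial_size = trial_size and, with do_parallel = TRUE, ncores = ncores.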
Returns
An object of class stackedsdm with the following components:
call: The function call
fits: A list where the j-th element corresponds to the fitted model for species j, i.e., the j-th column in y
linear_predictor: A matrix of the fitted linear predictors
fitted: A matrix of the fitted values
Details
stackedsdm behaves somewhat like the manyglm or manyany functions in the package mvabund, in the sense that it fits a separate regression to each species response, i.e., each column of y. The main difference is that a different family can be specified for each species, which allows for mixed response types.
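The stacking idea described above can be illustrated with base R alone. The sketch below is not the implementation used by stackedsdm: it uses glm, is limited to families that glm supports, and runs on simulated data. The object names (families, fits, linear_predictor, fitted_values) are chosen here to echo the arguments and returned components.

```r
# Simulated data: 30 sites, one covariate, three species (illustrative only)
set.seed(42)
X <- data.frame(x1 = rnorm(30))
y <- cbind(
  sp1 = rbinom(30, size = 1, prob = plogis(0.5 * X$x1)),  # presence/absence
  sp2 = rpois(30, lambda = exp(0.3 + 0.4 * X$x1)),        # counts
  sp3 = rpois(30, lambda = exp(1.0 - 0.2 * X$x1))         # counts
)

# One family per column of y, which is what allows mixed response types
families <- list(binomial(), poisson(), poisson())

# Fit a separate regression to each species, i.e., each column of y
fits <- lapply(seq_len(ncol(y)), function(j) {
  glm(y[, j] ~ x1, family = families[[j]], data = X)
})
names(fits) <- colnames(y)

# Collect the results into sites-by-species matrices
linear_predictor <- sapply(fits, predict, type = "link")
fitted_values    <- sapply(fits, fitted)
```

Because each species is fitted independently, the loop over columns is also the natural unit to distribute across cores when fitting in parallel.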
Examples
data(spider)
X <- spider$x
abund <- spider$abund

# Example 1: Simple example
myfamily <- "negative.binomial"
# Fit models including all covariates as linear terms, but exclude bare sand
fit0 <- stackedsdm(abund, formula_X = ~. - bare.sand, data = X, family = myfamily, ncores = 2)

# Example 2: Funkier example where species are assumed to have different distributions
abund[, 1:3] <- (abund[, 1:3] > 0) * 1 # First three columns converted to presence-absence
myfamily <- c(rep("binomial", 3), rep("negative.binomial", ncol(abund) - 3))
fit0 <- stackedsdm(abund, formula_X = ~ bare.sand, data = X, family = myfamily, ncores = 2)