samples: the number of MCMC samples to be obtained in each chain
transient: the number of MCMC steps that are executed before the recording of posterior samples starts
thin: the number of MCMC steps between each recording of samples from the posterior
initPar: a named list of parameter values used for initialization of MCMC states, or alternatively text "fixed effects" to use linear Maximum Likelihood model instead of randomizing from prior; the "fixed effects" option can shorten the transient phase of sampling, but will initialize all chains to the same starting values
verbose: the interval between MCMC steps printed to the console (default is an interval that prints ca. 50 reports)
adaptNf: a vector of length nr with the number of MCMC steps at which the adaptation of the number of latent factors is conducted
nChains: number of independent MCMC chains to be run
nParallel: number of parallel processes by which the chains are executed.
useSocket: (logical) use socket clusters in parallel processing; on Windows this is the only option, but on other operating systems fork clusters are a better alternative, and there this should be set to FALSE
dataParList: a named list with pre-computed Qg, iQg, RQg, detQg, rLPar parameters
updater: a named list, specifying which conditional updaters should be omitted
fromPrior: whether prior (TRUE) or posterior (FALSE) is to be sampled
alignPost: boolean flag indicating whether the posterior of each chain should be aligned
Returns
An Hmsc-class object with chains of posterior samples added to the postList field
Details
The number of samples needed for a proper estimate of the full posterior with Gibbs MCMC algorithms, as well as the required thinning and length of the transient, is very problem-specific and depends on both the model structure and the data. It is therefore generally very difficult to recommend a priori which values to use for a particular problem. A commonly recommended strategy is to run the posterior sampling with an initial guess for these arguments, check the properties of the obtained samples (primarily the potential scale reduction factor and the effective sample size), and adjust the guess accordingly.
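The check-and-adjust strategy can be sketched as below. This is a minimal sketch, assuming the coda package is installed and using the bundled test model TD$m; the run lengths are illustrative guesses, not recommendations, and the Beta parameters are only one of several parameter groups that could be inspected.

```r
library(Hmsc)

# Initial guess for the sampling arguments (assumed, illustrative values)
m = sampleMcmc(TD$m, samples=100, transient=50, thin=1, nChains=2, nParallel=1)

# Convert the posterior samples to coda objects for standard MCMC diagnostics
mpost = convertToCodaObject(m)

# Effective sample size of the Beta parameters
ess = coda::effectiveSize(mpost$Beta)

# Potential scale reduction factor of the Beta parameters (needs >= 2 chains)
psrf = coda::gelman.diag(mpost$Beta, multivariate=FALSE)$psrf

summary(ess)
summary(psrf[, "Point est."])
```

If the effective sample sizes are small or the potential scale reduction factors are clearly above 1, the usual next step is to rerun with larger samples, transient or thin and check again.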
A value of 1 for the thin argument means that a sample is recorded at every MCMC step after the transient.
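To make the bookkeeping concrete, the base R sketch below lists which MCMC steps would be recorded for given values of samples, transient and thin. It assumes that each chain runs transient + samples*thin steps and records every thin-th step after the transient, which matches the argument descriptions but is not taken from the Hmsc source. recordedSteps is a hypothetical helper, not part of Hmsc.

```r
# Hypothetical helper (not part of Hmsc): indices of the recorded MCMC steps,
# assuming a chain of transient + samples*thin steps in total
recordedSteps = function(samples, transient, thin) {
   transient + seq_len(samples) * thin
}

recordedSteps(samples=5, transient=10, thin=1)  # 11 12 13 14 15
recordedSteps(samples=5, transient=10, thin=2)  # 12 14 16 18 20
```

With thin=2, one MCMC step is skipped between consecutive recorded samples.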
Typically, setting nParallel equal to nChains makes the most efficient use of the available parallelization capacity. However, this may not be the case if R is configured with multi-threaded linear algebra libraries. For debugging and testing, nParallel should be set to 1, since only in this case are the details of any encountered errors available.
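The two settings discussed here might be used as follows. This is a sketch with the bundled test model TD$m; the sample counts are assumed, deliberately small values.

```r
library(Hmsc)

# One process per chain: typically the most efficient setting
mPar = sampleMcmc(TD$m, samples=100, transient=50, nChains=2, nParallel=2)

# Sequential execution for debugging, so that error details remain visible
mSeq = sampleMcmc(TD$m, samples=10, nChains=2, nParallel=1)
```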
The dataParList argument may be handy for large problems that need to be refitted multiple times, e.g. with different prior values. In that case, the data parameters that are precomputed for the Hmsc sampling scheme may require an undesirably large amount of storage space if they are saved with each model. Instead, they can be computed only once and then reused directly, reducing storage redundancy.
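A minimal sketch of this reuse, assuming the Hmsc function computeDataParameters is used for the precomputation and the bundled test model TD$m stands in for a large problem:

```r
library(Hmsc)

# Precompute the data parameters once
dataParList = computeDataParameters(TD$m)

# Reuse them in repeated fits instead of storing a copy with each model
m1 = sampleMcmc(TD$m, samples=10, dataParList=dataParList)
m2 = sampleMcmc(TD$m, samples=10, dataParList=dataParList)
```

In a real workflow the repeated fits would differ, for example in their prior values; identical calls are used here only to keep the sketch short.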
Some of the available conditional updaters partially duplicate each other. In certain cases, using all of them may lead to suboptimal performance compared to using a subset. It is then possible to disable some of them manually by adding a $UPDATER_NAME=FALSE pair to the updater argument. This argument can also be used when some of the model parameters are known and have to be fixed. However, such tweaks of the sampling scheme should be made with caution, since a compromised sampling scheme would lead to erroneous results.
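Disabling an updater might look like the sketch below. The updater name GammaEta is an assumption here; check the names against the installed Hmsc version before relying on it.

```r
library(Hmsc)

# Disable a single conditional updater (the name GammaEta is assumed)
m = sampleMcmc(TD$m, samples=10, updater=list(GammaEta=FALSE))
```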
Examples
## you need 1000 or more samples, but that will take too long
## in an example
m = sampleMcmc(TD$m, samples=10)

## Not run:
## Record 1000 posterior samples while skipping 1 MCMC step between samples
## from 2 chains after discarding the first 500 MCMC steps
m = sampleMcmc(TD$m, samples=1000, transient=500, thin=2, nChains=2, nParallel=1)

## End(Not run)