Calculates an estimator for a derived quantity by summing across multiple predictions. This can be used to approximate an integral when estimating area-expanded abundance, abundance-weighting a covariate to calculate distribution shifts, and/or weighting one model variable by another.
newdata: New data-frame of independent variables used to predict the response, where a total value is calculated by combining across these individual predictions. If these locations are randomly drawn from a specified spatial domain, then integrate_output applies midpoint integration to approximate the total over that area. If locations are drawn sysmatically from a domain, then integrate_output is applying a midpoint approximation to the integral.
area: vector of values used for area-weighted expansion of estimated density surface for each row of newdata
with length of nrow(newdata).
type: Integer-vector indicating what type of expansion to apply to each row of newdata, with length of nrow(newdata).
type=1: Area-weighting: weight predictor by argument area
type=2: Abundance-weighted covariate: weight covariate by proportion of total in each row of newdata
type=3: Abundance-weighted variable: weight predictor by proportion of total in a prior row of newdata. This option is used to weight a prediction for one category based on predicted proportional density of another category, e.g., to calculate abundance-weighted condition in a bivariate model.
type=4: Abundance-expanded variable: weight predictor by density in a prior row of newdata. This option is used to weight a prediction for one category based on predicted density of another category, e.g., to calculate abundance-expanded consumption in a bivariate model.
type=0: Exclude from weighting: give weight of zero for a given row of newdata. Including a row of newdata with type=0 is useful, e.g., when calculating abundance at that location, where the eventual index uses abundance as weighting term but without otherwise using the predicted density in calculating a total value.
weighting_index: integer-vector used to indicate a previous row that is used to calculate a weighted average that is then applied to the given row of newdata. Only used for when type=3.
covariate: numeric-vector used to provide a covariate that is used in expansion, e.g., to provide positional coordinates when calculating the abundance-weighted centroid with respect to that coordinate. Only used for when type=2.
getsd: logical indicating whether to get the standard error, where getsd=FALSE is faster during initial exploration
bias.correct: logical indicating if bias correction should be applied using standard methods in TMB::sdreport()
apply.epsilon: Apply epsilon bias correction using a manual calculation rather than using the conventional method in TMB::sdreport ? See details for more information.
intern: Do Laplace approximation on C++ side? Passed to TMB::MakeADFun().
Returns
A vector containing the plug-in estimate, standard error, the epsilon bias-corrected estimate if available, and the standard error for the bias-corrected estimator. Depending upon settings, one or more of these will be NA values, and the function can be repeatedly called to get multiple estimators and/or statistics.
Details
Analysts will often want to calculate some value by combining the predicted response at multiple locations, and potentially from multiple variables in a multivariate analysis. This arises in a univariate model, e.g., when calculating the integral under a predicted density function, which is approximated using a midpoint or Monte Carlo approximation by calculating the linear predictors at each location newdata, applying the inverse-link-trainsformation, and calling this predicted response mu_g. Total abundance is then be approximated by multiplying mu_g by the area associated with each midpoint or Monte Carlo approximation point (supplied by argument area), and summing across these area-expanded values.
In more complicated cases, an analyst can then use covariate
to calculate the weighted average of a covariate for each midpoint location. For example, if the covariate is positional coordinates or depth/elevation, then type=2
measures shifts in the average habitat utilization with respect to that covariate. Alternatively, an analyst fitting a multivariate model might weight one variable based on another using weighting_index, e.g., to calculate abundance-weighted average condition, or predator-expanded stomach contents.
In practice, spatial integration in a multivariate model requires two passes through the rows of newdata when calculating a total value. In the following, we write equations using C++ indexing conventions such that indexing starts with 0, to match the way that integrate_output expects indices to be supplied. Given inverse-link-transformed predictor μg, function argument type as typeg
function argument area as ag, function argument covariate as xg, function argument weighting_index as \eqn{ h_g }
function argument weighting_index as \eqn{ h_g }
the first pass calculates:
νg=μgag
where the total value from this first pass is calculated as:
ν∗=g=0∑G−1νg
The second pass then applies a further weighting, which depends upon typeg, and potentially upon xg and hg.
If typeg=0 then ϕg=0
If typeg=1 then ϕg=νg
If typeg=2 then ϕg=xgν∗νg
If typeg=3 then ϕg=ν∗νhgμg
If typeg=4 then ϕg=νhgμg
Finally, the total value from this second pass is calculated as:
ϕ∗=g=0∑G−1ϕg
and ϕ∗ is outputted by integrate_output, along with a standard error and potentially using the epsilon bias-correction estimator to correct for skewness and retransformation bias.
Standard bias-correction using bias.correct=TRUE can be slow, and in some cases it might be faster to do apply.epsilon=TRUE and intern=TRUE. However, that option is somewhat experimental, and a user might want to confirm that the two options give identical results. Similarly, using bias.correct=TRUE
will still calculate the standard-error, whereas using apply.epsilon=TRUE and intern=TRUE will not.