dfuncEstim - Estimate a distance-based detection function
Fits a detection function using maximum likelihood.
dfuncEstim(data, ...)
Arguments
data: An RdistDf data frame. RdistDf data frames contain one line per transect and a list-based column. The list-based column contains a data frame with detection information. The detection information data frame on each row contains (at least) distances and group sizes of all targets detected on the transect. Function RdistDf creates RdistDf data frames from separate transect and detection data frames. is.RdistDf checks whether data frames are RdistDf's.
...: Arguments passed on to dE.single, dE.multi
formula: A standard formula object (e.g., dist ~ 1 or dist ~ covar1 + covar2). The left-hand side (before ~) is the name of the vector containing off-transect or radial detection distances. The right-hand side contains the names of covariate vectors to fit in the detection function, and potentially group sizes. Covariates can be either detection level or transect level and can appear in data or exist in the global working environment. Regular R scoping rules apply.
likelihood: String specifying the likelihood to fit. Built-in likelihoods at present are "halfnorm", "hazrate", and "negexp".
w.lo: Lower or left-truncation limit of the distances in distance data. This is the minimum possible off-transect distance. Default is 0. If w.lo is greater than 0, it must be assigned measurement units using units(w.lo) <- "<units>" or w.lo <- units::set_units(w.lo, "<units>"). See examples in the help for set_units.
w.hi: Upper or right-truncation limit of the distances in dist. This is the maximum off-transect distance that could be observed. If unspecified (i.e., NULL), right-truncation is set to the maximum of the observed distances. If w.hi is specified, it must have associated measurement units. Assign measurement units using units(w.hi) <- "<units>" or w.hi <- units::set_units(w.hi, "<units>"). See examples in the help for set_units.
expansions: A scalar specifying the number of terms in series to compute. Depending on the series, this could be 0 through 5. The default of 0 equates to no expansion terms of any type. No expansion terms are allowed (i.e., expansions is forced to 0) if covariates are present in the detection function (i.e., right-hand side of formula includes something other than 1).
series: If expansions > 0, this string specifies the type of expansion to use. Valid values at present are 'simple', 'hermite', and 'cosine'.
x.scl: The x coordinate (a distance) at which the detection function will be scaled. x.scl can be a distance or the string "max". When x.scl is specified (i.e., not 0 or "max"), it must have measurement units assigned using either library(units); units(x.scl) <- "<units>" or x.scl <- units::set_units(x.scl, "<units>"). See units::valid_udunits() for valid symbolic units.
g.x.scl: Height of the distance function at coordinate x.scl. The distance function will be scaled so that g(x.scl) = g.x.scl. If g.x.scl is not a data frame, it must be a numeric value (vector of length 1) between 0 and 1.
warn: A logical scalar specifying whether to issue an R warning if the estimation did not converge or if one or more parameter estimates are at their boundaries. For estimation, warn should generally be left at its default value of TRUE. When computing bootstrap confidence intervals, setting warn = FALSE turns off warnings when an iteration does not converge. Regardless of warn, after completion all messages about convergence and boundary conditions are printed by print.dfunc, print.abund, and plot.dfunc.
outputUnits: A string specifying the symbolic measurement units for results. Valid units are listed in units::valid_udunits(). The strings for common distance symbolic units are: "m" - meters, "ft" - feet, "cm" - centimeters, "mm" - millimeters, "mi" - miles, "nmile" - nautical miles ("nm" is nanometers), "in" - inches, "yd" - yards, "km" - kilometers, "fathom" - fathoms, "chains" - chains, and "furlong" - furlongs. If outputUnits is unspecified (NULL), output units will be the same as those on distances in data.
Returns
An object of class 'dfunc'. Objects of class 'dfunc' are lists containing the following components:
par: The vector of estimated parameter values. Length of this vector for built-in likelihoods is one (for the function's parameter) plus the number of expansion terms plus one if the likelihood is 'hazrate' (which has two parameters).
varcovar: The variance-covariance matrix for coefficients of the distance function, estimated by the inverse of the fit's Hessian evaluated at the estimates. Rdistance estimates the Hessian as the second derivative of the log likelihood surface at the final estimates, where second derivatives are estimated by numeric differentiation (see secondDeriv). There is no guarantee that this matrix is positive-definite, and it should be viewed with caution. Error estimates derived from bootstrapping are generally more reliable; that is, re-compute coefficient confidence intervals using the bootstrap values in component $B of an abundance object.
loglik: The maximized value of the log likelihood.
convergence: The convergence code. This code is returned by optim or nlminb. Values other than 0 indicate suspect convergence.
likelihood: The name of the likelihood. This is the value of the argument likelihood.
w.lo: Left-truncation value used during the fit.
w.hi: Right-truncation value used during the fit.
mf: A modelframe of detections within the strip or circle used in the fit. Column 'dist' contains the observed distances. Column 'offset(...)' contains group sizes associated with the values of 'dist'. Group sizes are only used in abundEstim. This model frame contains only non-missing distances between w.lo and w.hi.
model.frame: A model.frame object containing observed distances (the 'response'), covariates specified in the formula, and group sizes if they were specified. If specified, the name of the group size column is "offset(-variable-)", not "groupsize(-variable-)", because internally it is easier to treat group sizes as an offset in the model. This component is a proper model.frame and contains both 'terms' and 'contrasts' attributes.
siteID.cols: A vector containing the transect ID column names in detectionData and siteData. Transect IDs can be a composite of two or more columns and hence this component can have length greater than 1.
expansions: The number of expansion terms used during estimation.
series: The type of expansion used during estimation.
call: The original call of this function.
call.x.scl: The input or user requested distance at which the distance function is scaled.
call.g.x.scl: The input value specifying the height of the distance function at a distance of call.x.scl.
call.observer: The value of input parameter observer. The input observer parameter is only applicable when g.x.scl is a data frame.
fit: The fitted object returned by optim. See documentation for optim.
factor.names: The names of any factors in formula.
pointSurvey: The input value of pointSurvey. This is TRUE if distances are radial from a point, and FALSE if distances are perpendicular off-transect.
formula: The formula specified for the detection function.
control: A list containing values of the 'control' parameters set by RdistanceControls.
outputUnits: The measurement units used for output. All distance measurements are converted to these units internally.
x.scl: The actual distance at which the distance function is scaled, i.e., the actual x at which g(x) = g.x.scl. Note that call.x.scl = x.scl unless call.x.scl == "max", in which case x.scl is the distance at which g() is maximized.
g.x.scl: The actual height of the distance function at a distance of x.scl. Note that g.x.scl = call.g.x.scl unless call.g.x.scl is a multiple observer data frame, in which case g.x.scl is the actual height of the distance function at x.scl computed from the multiple observer data frame.
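The caution attached to varcovar above can be turned into a quick check. The following is a minimal base-R sketch, not part of Rdistance: waldSummary is a hypothetical helper, and the parameter values and matrix are made-up numbers standing in for the par and varcovar components of a fitted object. It tests positive-definiteness and computes Wald-type intervals, which remain only a rough guide next to bootstrap intervals.

```r
# Hypothetical helper (not an Rdistance function): summarize estimates
# using the 'par' and 'varcovar' components described above.
waldSummary <- function(par, varcovar, level = 0.95) {
  # A symmetric matrix is positive-definite when all eigenvalues are > 0
  eig <- eigen(varcovar, symmetric = TRUE, only.values = TRUE)$values
  isPD <- all(eig > 0)

  # Standard errors exist only where the diagonal is positive
  d <- diag(varcovar)
  se <- rep(NA_real_, length(d))
  se[d > 0] <- sqrt(d[d > 0])

  z <- qnorm(1 - (1 - level) / 2)
  tab <- data.frame(
    estimate = unname(par),
    se = se,
    lower = unname(par) - z * se,
    upper = unname(par) + z * se,
    row.names = names(par)
  )
  list(positiveDefinite = isPD, table = tab)
}

# Made-up values for illustration only; with a real fit these would be
# fit$par and fit$varcovar.
par <- c(a = 50, b = 3)
varcovar <- matrix(c(4, 0.5, 0.5, 0.25), nrow = 2)
res <- waldSummary(par, varcovar)
res$positiveDefinite
res$table
```

If positiveDefinite is FALSE or any standard error is NA, the Hessian-based variances should not be trusted and bootstrap intervals are the safer choice.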
Details
Optimization and estimation controls can be modified using options(). See RdistanceControls.
Group Sizes
To specify non-unity group sizes, use groupsize() on the right-hand side of formula. When group sizes are not all 1, they must appear in a column of the 'detections' list-column of data. For example, d ~ habitat + groupsize(number) specifies distances in column d, one covariate named habitat, and that column number contains the number of individuals associated with each detection. If group sizes are not specified, all group sizes are assumed to be 1.
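The roles of the pieces in the example formula above can be inspected with base R alone. This sketch only uses stats::terms and all.vars; it illustrates how a term wrapped in groupsize() can be told apart from ordinary covariates, and it is not a description of Rdistance's internal parsing.

```r
# The example formula from the text: distances in 'd', covariate 'habitat',
# group sizes in 'number'.
f <- d ~ habitat + groupsize(number)

# All variable names the formula refers to
vars <- all.vars(f)

# Flag calls to groupsize() as a "special" term. terms() does not evaluate
# the formula, so no groupsize() function needs to exist for this to run.
tt <- terms(f, specials = "groupsize")
idx <- attr(tt, "specials")$groupsize

# 'variables' is a call of the form list(d, habitat, groupsize(number));
# element 1 is 'list', so the special term sits at position idx + 1.
grpTerm <- attr(tt, "variables")[[idx + 1]]

vars
deparse(grpTerm)
```

The column named inside groupsize() (here, number) is the one that must be present in the 'detections' list-column of data.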
Contrasts
Factor contrasts in Rdistance are specified the same way as in lm or glm. By default, Rdistance uses contrasts in getOption("contrasts"). To change contrasts, use a statement like options(contrasts = c(unordered = "contr.SAS", ordered = "contr.poly")). Or, to set contrasts for a specific factor in the input data frame, use contrasts(df$A) <- "contr.sum" or similar. See contrasts or the contrasts.arg argument of model.matrix.
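The two ways of setting contrasts described above can be tried in base R before fitting. This is a small sketch using a made-up factor A in a toy data frame df; model.matrix is used only to show the resulting coding.

```r
# Toy data frame with one three-level factor (levels sort as high, low, mid)
df <- data.frame(A = factor(c("low", "mid", "high", "low", "mid", "high")))

# Per-factor setting: sum-to-zero contrasts for A only
contrasts(df$A) <- "contr.sum"
mmSum <- model.matrix(~ A, data = df)
colnames(mmSum)

# Global setting: change the session default, then restore it afterwards.
# A factor without its own 'contrasts' attribute follows the global option.
old <- options(contrasts = c(unordered = "contr.SAS", ordered = "contr.poly"))
df2 <- data.frame(B = factor(c("low", "mid", "high")))
mmSAS <- model.matrix(~ B, data = df2)
options(old)
colnames(mmSAS)
```

Restoring the previous options keeps the change from leaking into later fits in the same session.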
Measurement Units
As of Rdistance version 3.0.0, measurement units are required on all physical distances. Requiring units ensures that internal calculations and results (e.g., ESW and abundance) are correct and that output units are clear. Measurement units are required on off-transect distances, radial distances, truncation distances (w.lo, unless it is zero; and w.hi, unless it is NULL), scale locations (x.scl, unless it is zero), line-transect lengths, and study area size. All units are 1-dimensional except those on study area, which are 2-dimensional.
Physical measurement units can vary. For example, off-transect distances can be meters ("m"), w.hi can be inches ("in"), and w.lo can be kilometers ("km"). Internally, all distances are converted to the units specified by outputUnits (or the units of input distances if outputUnits is NULL), and all output is reported in units of outputUnits. Valid conversions must exist between units or an error is thrown. For example, meters cannot be converted into hectares.
Measurement units can be assigned using units()<- after attaching the units package or with x <- units::set_units(x, "<units>"). See units::valid_udunits() for a list of valid symbolic units.
If measurements are truly unit-less, or measurement units are unknown, set options(Rdist_requireUnits = FALSE). This suppresses all unit checks and conversions. Users are on their own to make sure inputs are scaled correctly and that output units are known.
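The unit rules in this section can be sketched with the units package alone. The values below are arbitrary illustrations and the code assumes the units package is installed; it shows both assignment styles, a valid conversion, and the error raised by an invalid one.

```r
library(units)

# Style 1: units()<- on an existing numeric
w.lo <- 0.02
units(w.lo) <- "km"

# Style 2: set_units() in one step
w.hi <- units::set_units(6, "in")

# Mixed input units are fine as long as a conversion exists
w.lo.m <- units::set_units(w.lo, "m")
w.hi.m <- units::set_units(w.hi, "m")
w.lo.m
w.hi.m

# A length cannot be converted to an area; the attempt raises an error
badConversion <- try(units::set_units(w.lo, "ha"), silent = TRUE)
inherits(badConversion, "try-error")
```

Values prepared this way can be passed as w.lo, w.hi, or x.scl; results are then reported in outputUnits, or in the units of the input distances when outputUnits is NULL.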
Examples
# Sparrow line transect example
data(sparrowDetectionData)
data(sparrowSiteData)
sparrowDf <- RdistDf(sparrowSiteData, sparrowDetectionData)
dfunc <- dfuncEstim(sparrowDf, formula = dist ~ 1)
summary(dfunc)

data(sparrowDfuncObserver) # pre-estimated object
## Not run:
# Command to produce 'sparrowDfuncObserver'
sparrowDfuncObserver <- sparrowDf |>
  dfuncEstim(formula = dist ~ observer)
## End(Not run)
sparrowDfuncObserver
summary(sparrowDfuncObserver)
plot(sparrowDfuncObserver)
References
Buckland, S.T., D.R. Anderson, K.P. Burnham, J.L. Laake, D.L. Borchers, and L. Thomas. (2001) Introduction to distance sampling: estimating abundance of biological populations. Oxford University Press, Oxford, UK.
See Also
abundEstim, autoDistSamp. Likelihood-specific help files (e.g., halfnorm.like).