formula: an object of class "Formula" "formula": A symbolic description of the model to be fitted. The details of model specification are given under 'Details'.
data: an optional data frame, list or environment (or object coercible by as.data.frame to a data frame) containing the variables in the model. If not found in data, the variables are taken from environment(formula), typically the environment from which opsr is called.
subset: an optional vector specifying a subset of observations to be used in the fitting process. (See additional details in the 'Details' section of the model.frame documentation.).
weights: an optional vector of weights to be used in the fitting process. Should be NULL or a numeric vector. If non-NULL, then observation-specific log-likelihood contributions are multiplied by their corresponding weight before summing.
na.action: a function which indicates what should happen when the data contain NAs. The default is set by the na.action setting of options, and is na.fail if that is unset. The 'factory-fresh' default is na.omit. Another possible value is NULL, no action. Value na.exclude can be useful.
start: a numeric vector with the starting values (passed to maxLik::maxLik). If no starting values are provided, reasonable values are auto-generated via the Heckman 2-step procedure opsr_2step. The structure of start has to conform with opsr's expectations. See opsr_check_start for further details.
fixed: parameters to be treated as constants at their start values. If present, it is treated as an index vector of start parameters (passed to maxLik::maxLik).
method: maximzation method (passed to maxLik::maxLik).
iterlim: maximum number of iterations (passed to maxLik::maxLik).
printLevel: larger number prints more working information (passed to maxLik::maxLik).
nThreads: number of threads to be used. Do not pass higher number than number of ordinal outcomes. See also opsr_check_omp and opsr_max_threads.
.get2step: if TRUE, returns starting values as generated by opsr_2step. Will not proceed with the maximum likelihood estimation.
.useR: if TRUE usese loglik_R. Go grab a coffe.
.censorRho: if TRUE, rho starting values are censored to lie in the interval [-0.85, 0.85].
...: further arguments passed to maxLik::maxLik.
Returns
An object of class "opsr" "maxLik" "maxim".
Details
Models for opsr are specified symbolically. A typical model has the form ys | yo ~ terms_s | terms_o1 | terms_o2 | .... ys is the ordered (numeric) response vector (starting from 1, in integer-increasing fashion). For the terms
specification the rules of the regular formula interface apply (see also stats::lm ). The intercept in the terms_s (selection process) is excluded automatically (no need to specify -1). If the user wants to specify the same process for all continuous outcomes, two processes are enough (ys | yo ~ terms_s | terms_o). Note that the model is poorly identifiable if terms_s == terms_o (same regressors are used in selection and outcome processes).
Examples
## simulated datasim_dat <- opsr_simulate()dat <- sim_dat$data # 1000 observationssim_dat$sigma # cov matrix of errorssim_dat$params # ground truth## specify a modelmodel <- ys | yo ~ xs1 + xs2 | xo1 + xo2 | xo1 + xo2 | xo1 + xo2
model <- ys | yo ~ xs1 + xs2 | xo1 + xo2 # since we use the same specification...## estimatefit <- opsr(model, dat)## inferencesummary(fit)## using update and model comparisonfit_updated <- update(fit,~ . |1)# only intercepts for the continuous outcomes## null modelfit_null <- opsr_null_model(fit)## likelihood ratio testanova(fit_null, fit_updated, fit)## predictp1 <- predict(fit, group =1, type ="response")p2 <- predict(fit, group =1, counterfact =2, type ="response")plot(p1, p2)abline(a =0, b =1, col ="red")## produce formatted tablestexreg::screenreg(fit, beside =TRUE, include.pseudoR2 =TRUE, include.R2 =TRUE)