Estimate discrete choice model with random parameters
Estimation of discrete choice models, namely binary (logit and probit), Poisson, and ordered (logit and probit) models, with random coefficients for cross-sectional and panel data using simulated maximum likelihood.
Usage
Rchoice(
  formula, data, subset, weights, na.action, family,
  start = NULL, ranp = NULL, R = 40, haltons = NA, seed = NULL,
  correlation = FALSE, panel = FALSE, index = NULL, mvar = NULL,
  print.init = FALSE, init.ran = 0.1, gradient = TRUE, ...
)

## S3 method for class 'Rchoice'
terms(x, ...)

## S3 method for class 'Rchoice'
model.matrix(object, ...)

## S3 method for class 'Rchoice'
coef(object, ...)

## S3 method for class 'Rchoice'
fitted(object, ...)

## S3 method for class 'Rchoice'
residuals(object, ...)

## S3 method for class 'Rchoice'
df.residual(object, ...)

## S3 method for class 'Rchoice'
update(object, new, ...)

## S3 method for class 'Rchoice'
logLik(object, ...)

## S3 method for class 'Rchoice'
print(x, digits = max(3, getOption("digits") - 3), width = getOption("width"), ...)

## S3 method for class 'Rchoice'
summary(object, ...)

## S3 method for class 'summary.Rchoice'
print(x, digits = max(3, getOption("digits") - 3), width = getOption("width"), ...)
Arguments
formula: a symbolic description of the model to be estimated. The formula consists of two parts. The first part is reserved for standard variables with fixed and random parameters. The second part is reserved for variables that enter the mean of the random parameters. See for example rFormula,
data: the data. It may be a pdata.frame object or an ordinary data.frame,
subset: an optional vector specifying a subset of observations,
weights: an optional vector of weights,
na.action: a function which indicates what should happen when the data contain NAs,
family: the distribution to be used. It might be family = binomial("probit") for a Probit Model, family = binomial("logit") for a Logit Model, family = ordinal("probit") for an Ordered Probit Model, family = ordinal("logit") for an Ordered Logit Model, and family = "poisson" for a Poisson Model,
start: a vector of starting values,
ranp: a named vector whose names are the variables with random parameters and whose values are their distributions: "n" for normal, "ln" for log-normal, "cn" for truncated normal, "u" for uniform, "t" for triangular, "sb" for Johnson Sb,
R: the number of draws if ranp is not NULL,
haltons: only relevant if ranp is not NULL. If not NULL, a Halton sequence is used instead of pseudo-random numbers. If haltons = NA, default values are used for the primes of the sequence and for the number of elements dropped. Otherwise, haltons should be a list with elements prime and drop,
seed: the seed for the pseudo-random draws. This is only relevant if haltons = NULL,
correlation: only relevant if ranp is not NULL. If TRUE, the correlation between random parameters is taken into account,
panel: if TRUE a panel data model is estimated,
index: a string indicating the id for individuals in the data. This argument is not required if data is a pdata.frame object,
mvar: only valid if ranp is not NULL. This is a named list, where the names correspond to the variables with random parameters, and the values correspond to the variables that enter the mean of each random parameter,
print.init: if TRUE, the initial values for the optimization procedure are printed,
init.ran: initial value for the standard deviations of the random parameters. Default is 0.1,
gradient: if FALSE, numerical gradients are used in the optimization procedure for models with random parameters,
...: further arguments passed to maxLik,
x, object: an object of class Rchoice,
new: an updated formula for the update method,
digits: number of digits,
width: width,
Returns
An object of class Rchoice, a list with elements:
coefficients: the named vector of coefficients,
family: type of model,
link: distribution of the errors,
logLik: a set of values of the maximum likelihood procedure,
mf: the model frame used,
formula: the formula (a Formula object),
time: proc.time() minus the start time,
freq: frequency of dependent variable,
draws: type of draws used,
R.model: TRUE if a random parameter model is fitted,
R: number of draws used,
bi: an array of dimension $N \times R \times K$ with the individual parameters,
Qir: matrix of dimension $N \times R$ representing $P_{ir} / \sum_r P_{ir}$,
ranp: vector indicating the variables with random parameters and their distribution,
probabilities: the fitted probabilities for each individual,
residuals: the residuals,
call: the matched call.
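As a rough illustration of the Qir element described above, the following base R sketch normalizes a matrix of simulated probabilities row by row. The matrix `P` is filled with made-up numbers and is only a stand-in for $P_{ir}$; it is not taken from a fitted Rchoice object.

```r
# Sketch: Q_ir = P_ir / sum_r P_ir, computed row by row.
# P is a hypothetical N x R matrix of simulated probabilities.
set.seed(42)
N <- 3   # individuals
R <- 5   # draws

P <- matrix(runif(N * R), nrow = N)

# rowSums(P) has length N and recycles down the columns,
# so every entry in row i is divided by the sum of row i.
Q <- P / rowSums(P)

rowSums(Q)   # each row of Q sums to one
```

Each row of `Q` can be read as a set of weights over the R draws for one individual.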
Details
The models are estimated using the maxLik function from the maxLik package.
If ranp is not NULL, a random parameter model is estimated. A random parameter model, or random coefficient model, permits the regression parameters to vary across individuals according to some distribution. A fully parametric random parameter model specifies the latent variable $y^*$, conditional on regressors $x$ and given parameters $\beta_i$, to have conditional density $f(y \mid x, \beta_i)$, where the $\beta_i$ are iid with density $g(\beta_i \mid \theta_i)$. The density is chosen a priori by the user through the argument ranp. If the parameters are assumed to be normally distributed, $\beta_i \sim N(\beta, \Sigma)$, then the random parameters are constructed as:
$$\beta_{ir} = \beta + L\omega_{ir}$$
where $LL' = \Sigma$ and $\omega_{ir}$ is the r-th draw from the standard normal distribution for individual i.
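The construction above can be sketched in base R. This is only an illustration under made-up values: `beta`, `Sigma`, and the dimensions are hypothetical, and the code does not reproduce the internals of Rchoice.

```r
# Sketch: build draws beta_ir = beta + L %*% omega_ir for K = 2 random parameters.
# All numeric values below are hypothetical.
set.seed(123)

N <- 5    # individuals
R <- 40   # draws per individual
K <- 2    # random parameters

beta  <- c(0.5, -1.0)                    # hypothetical means
Sigma <- matrix(c(1.0, 0.3,
                  0.3, 0.5), nrow = K)   # hypothetical covariance matrix

# chol() returns the upper-triangular factor U with t(U) %*% U = Sigma,
# so the lower-triangular L with L %*% t(L) = Sigma is t(U).
L <- t(chol(Sigma))

# omega: K x (N * R) matrix of standard normal draws
omega <- matrix(rnorm(K * N * R), nrow = K)

# beta has length K and recycles down each column of the K x (N * R) matrix
draws <- beta + L %*% omega

# Rearrange into an N x R x K array of individual-specific draws
beta_ir <- aperm(array(draws, dim = c(K, N, R)), c(2, 3, 1))

dim(beta_ir)
```

Taking the transpose of the `chol()` result matters here, because base R returns the upper-triangular factor rather than the lower-triangular one.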
Once the model is specified by the argument family, it is estimated using Simulated Maximum Likelihood (SML). The probabilities, given by $f(y \mid x, \beta_i)$, are simulated using R pseudo-random draws if haltons = NULL, or R Halton draws if haltons = NA. The user can also specify the primes and the number of dropped elements for the Halton draws. For example, if the model consists of two random parameters, the user can specify haltons = list("prime" = c(2, 3), "drop" = c(11, 11)).
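To make the simulation step concrete, here is a minimal base R sketch of a simulated Poisson probability for a single observation with one normally distributed coefficient. The helper `halton()` is a hypothetical textbook radical-inverse generator, not the one used by Rchoice, and the way it treats `drop` (skipping the first `drop` elements) is an assumption; the values of `y`, `x`, the mean, and the standard deviation are made up.

```r
# Hypothetical helper: first n elements of a Halton sequence for a given prime,
# after skipping the first `drop` elements (assumed semantics).
halton <- function(n, prime = 2, drop = 0) {
  idx <- (drop + 1):(drop + n)
  sapply(idx, function(i) {
    f <- 1
    h <- 0
    while (i > 0) {
      f <- f / prime
      h <- h + f * (i %% prime)
      i <- i %/% prime
    }
    h
  })
}

R <- 100
u <- halton(R, prime = 3, drop = 10)   # quasi-random uniforms in (0, 1)
omega <- qnorm(u)                      # transform to standard normal draws

# Hypothetical random coefficient: mean 0.2, standard deviation 0.5
b_r <- 0.2 + 0.5 * omega

# One made-up observation
y <- 2
x <- 1.5

# Simulated probability: average of f(y | x, b_r) over the R draws
p_sim <- mean(dpois(y, lambda = exp(b_r * x)))
p_sim
```

The simulated log-likelihood is then the sum of the logs of such averages over observations, and that is the quantity an optimizer maximizes in SML.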
A hierarchical random parameter model can be estimated by including heterogeneity in the mean of the random parameters:
$$\beta_{ir} = \beta + \pi' s_i + L\omega_{ir}$$
Rchoice manages the variables in the hierarchical model through the formula object: all the hierarchical variables ($s_i$) are included after the | symbol. The argument mvar indicates which variables enter the mean of each random parameter. See examples below.
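The mean shift in the hierarchical specification can be sketched in base R for a single random parameter. Everything here is hypothetical: `s` plays the role of one individual-level variable $s_i$, `pi_coef` the role of $\pi$, and `sigma` replaces $L$ in the one-parameter case.

```r
# Sketch: beta_ir = beta + pi * s_i + sigma * omega_ir for one random parameter.
# All values are made up for illustration.
set.seed(1)

N <- 4   # individuals
R <- 3   # draws per individual

s       <- c(0, 1, 0, 1)   # hypothetical individual-level variable s_i
beta    <- 0.5             # hypothetical baseline mean
pi_coef <- 0.8             # hypothetical effect of s_i on the mean
sigma   <- 0.3             # hypothetical standard deviation

omega <- matrix(rnorm(N * R), nrow = N)   # N x R standard normal draws

# (beta + pi_coef * s) has length N and recycles down the columns,
# so row i is shifted by the mean of individual i.
beta_ir <- beta + pi_coef * s + sigma * omega

beta_ir
```

Individuals with `s = 1` draw their parameter around 1.3 rather than 0.5, which is the kind of heterogeneity in the mean that the variables listed in mvar introduce.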
Examples
## Probit model
data("Workmroz")
probit <- Rchoice(lfp ~ k5 + k618 + age + wc + hc + lwg + inc,
                  data = Workmroz,
                  family = binomial('probit'))
summary(probit)

## Poisson model
data("Articles")
poisson <- Rchoice(art ~ fem + mar + kid5 + phd + ment,
                   data = Articles,
                   family = poisson)
summary(poisson)

## Ordered probit model
data("Health")
oprobit <- Rchoice(newhsat ~ age + educ + hhinc + married + hhkids,
                   data = Health,
                   family = ordinal('probit'),
                   subset = year == 1988)
summary(oprobit)

## Poisson Model with Random Parameters
poisson.ran <- Rchoice(art ~ fem + mar + kid5 + phd + ment,
                       data = Articles,
                       family = poisson,
                       ranp = c(kid5 = "n", phd = "n", ment = "n"))
summary(poisson.ran)

## Poisson Model with Correlated Random Parameters
poissonc.ran <- Rchoice(art ~ fem + mar + kid5 + phd + ment,
                        data = Articles,
                        ranp = c(kid5 = "n", phd = "n", ment = "n"),
                        family = poisson,
                        correlation = TRUE,
                        R = 20)
summary(poissonc.ran)

## Hierarchical Poisson Model
poissonH.ran <- Rchoice(art ~ fem + mar + kid5 + phd + ment | fem + phd,
                        data = Articles,
                        ranp = c(kid5 = "n", phd = "n", ment = "n"),
                        mvar = list(phd = c("fem"), ment = c("fem", "phd")),
                        family = poisson,
                        R = 10)
summary(poissonH.ran)

## Ordered Probit Model with Random Effects and Random Parameters
Health$linc <- log(Health$hhinc)
oprobit.ran <- Rchoice(newhsat ~ age + educ + married + hhkids + linc,
                       data = Health[1:2000, ],
                       family = ordinal('probit'),
                       ranp = c(constant = "n", hhkids = "n", linc = "n"),
                       panel = TRUE,
                       index = "id",
                       R = 10,
                       print.init = TRUE)
summary(oprobit.ran)
References
Greene, W. H. (2012). Econometric Analysis. 7th edition. Prentice Hall.
Train, K. (2009). Discrete Choice Methods with Simulation. Cambridge University Press.