GEM function

Main function for GEM models

Main function to estimate and validate GEneralized Mixture models with uncertainty.

GEM(Formula,family=c("cub","cube","ihg","cush"),data,...)

Arguments

  • Formula: Object of class Formula. Response variable is the vector of ordinal observations - see Details.
  • family: Character string indicating which class of GEM models to fit.
  • data: an optional data frame (or object coercible by as.data.frame to a data frame) containing the variables in the model. If missing, the variables are taken from environment(Formula).
  • ...: Additional arguments to be passed for the specification of the model. See details and examples.

Returns

An object of the class "GEM" is a list containing the following elements:

  • estimates: Maximum likelihood estimates of parameters

  • loglik: Log-likelihood function at the final estimates

  • varmat: Variance-covariance matrix of final estimates

  • niter: Number of executed iterations

  • BIC: BIC index for the estimated model

  • ordinal: Vector of ordinal responses on which the model has been fitted

  • time: Processor time for execution

  • ellipsis: Arguments passed to the call, together with extra arguments generated via the call

  • family: Character string indicating the sub-class of the fitted model

  • formula: Formula of the call for the fitted model

  • call: The executed call

Details

GEM is the main function for estimating GEM models: it calls the fitting function that corresponds to the specified subclass. The number of categories m is retrieved internally, but it is advisable to pass it as an argument to the call if some category has zero frequency.
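To see why an explicit category count can matter, consider a 7-point scale on which nobody chose category 6. The sketch below uses only base R and made-up responses. It does not reproduce how GEM retrieves m internally; it only shows that counting the observed categories undercounts the scale.

```r
# Hypothetical responses on a 7-point scale; category 6 is never observed
ordinal <- c(1, 2, 2, 3, 3, 3, 4, 4, 5, 5, 7, 7)
m_true <- 7

# Counting distinct observed values misses the empty category
m_observed <- length(unique(ordinal))

# Tabulating against the full scale keeps the zero-frequency category
freq <- table(factor(ordinal, levels = 1:m_true))

print(m_observed)        # number of distinct observed categories
print(as.vector(freq))   # frequencies for categories 1..7
```

With the package loaded, the count would presumably be supplied among the extra arguments of the call (for instance m=7, assuming m is the name the fitting routine expects).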

If family="cub", a CUB mixture model is fitted to the data to explain uncertainty, feeling and a possible shelter effect; the shelter effect is requested by passing the extra argument shelter with the corresponding category. Subjects' covariates can be included by specifying covariate matrices in the Formula as ordinal~Y|W|X, to explain uncertainty (Y), feeling (W) or shelter (X). Note that covariates for the shelter effect can be included only if covariates are specified for both feeling and uncertainty (GeCUB models).
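As an illustration of the two components, here is a minimal base-R sketch of the probability distribution of a CUB model without covariates and without shelter: a mixture of a shifted Binomial (feeling) and a discrete Uniform (uncertainty). The function name cub_prob_sketch is hypothetical and is not part of the package; the parameter names pai and csi follow the labels used in the examples of this page.

```r
# Sketch of the CUB probability distribution (no covariates, no shelter)
# m   : number of ordinal categories
# pai : weight of the shifted Binomial component (1 - pai weights the Uniform)
# csi : feeling parameter of the shifted Binomial
cub_prob_sketch <- function(m, pai, csi) {
  r <- 1:m
  # Shifted Binomial: choose(m-1, r-1) * (1-csi)^(r-1) * csi^(m-r)
  shifted_binomial <- dbinom(r - 1, size = m - 1, prob = 1 - csi)
  pai * shifted_binomial + (1 - pai) / m
}

p <- cub_prob_sketch(m = 7, pai = 0.6, csi = 0.3)
print(round(p, 4))
print(sum(p))   # probabilities over the m categories
```

Setting pai to 0 collapses the mixture to the Uniform distribution, which is one way to read 1 - pai as the weight of uncertainty.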

If family="cube", a CUBE mixture model (Combination of Uniform and Beta-Binomial) is fitted to the data to explain uncertainty, feeling and overdispersion. Subjects' covariates can also be included to explain the feeling component or all three components by specifying covariate matrices in the Formula as ordinal~Y|W|Z, to explain uncertainty (Y), feeling (W) or overdispersion (Z). An extra logical argument expinform indicates whether to use the expected (TRUE) or the observed (FALSE, the default) information matrix.

If family="ihg", an IHG model is fitted to the data. IHG models (Inverse Hypergeometric) are nested into CUBE models (see the references below). The parameter θ gives the probability of observing the first category and is therefore a direct measure of preference, attraction, pleasantness toward the investigated item. This is the reason why θ is customarily referred to as the preference parameter of the IHG model. Covariates for the preference parameter θ have to be specified in matrix form in the Formula as ordinal~U.
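The role of θ as the probability of the first category can be checked numerically. The following base-R sketch builds the IHG probabilities with the standard recursion of the inverse hypergeometric distribution, starting from Pr(R=1)=θ. The function name ihg_prob_sketch is hypothetical, and the code is an independent illustration rather than the package's own implementation.

```r
# Sketch of the IHG probability distribution
# m     : number of ordinal categories
# theta : preference parameter, assumed to satisfy 0 < theta <= 1
ihg_prob_sketch <- function(m, theta) {
  pr <- numeric(m)
  pr[1] <- theta   # probability of the first category
  for (j in seq_len(m - 1)) {
    # Ratio Pr(R = j+1) / Pr(R = j) of the inverse hypergeometric distribution
    pr[j + 1] <- pr[j] * (1 - theta) * (m - j) / (m - j - 1 + j * theta)
  }
  pr
}

p <- ihg_prob_sketch(m = 7, theta = 0.3)
print(round(p, 4))
print(sum(p))
```

Larger values of θ move probability mass toward the first category, which is why θ is read as a preference measure.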

If family="cush", a CUSH model (Combination of Uniform and SHelter effect) is fitted to the data. The category corresponding to the inflation should be passed via the argument shelter. Covariates for the shelter parameter δ are specified in matrix form in the Formula as ordinal~X.
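The CUSH structure can be sketched in a few lines of base R: a point mass at the shelter category, weighted by δ, mixed with a discrete Uniform. The function name cush_prob_sketch is hypothetical and the code is only an illustration of the model definition, not the package's fitting routine.

```r
# Sketch of the CUSH probability distribution (no covariates)
# m       : number of ordinal categories
# delta   : shelter parameter (weight of the point mass)
# shelter : category where the inflation occurs
cush_prob_sketch <- function(m, delta, shelter) {
  degenerate <- as.numeric(1:m == shelter)   # 1 at the shelter category, 0 elsewhere
  delta * degenerate + (1 - delta) / m
}

p <- cush_prob_sketch(m = 7, delta = 0.2, shelter = 1)
print(round(p, 4))
print(sum(p))
```

The shelter category exceeds every other category by exactly delta, so δ can be read as the extra probability attached to the inflated category.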

Even if no covariate is included in the model for a given component, the corresponding model matrix must always be specified: in this case, it should be set to 0 (see the examples below). Extra arguments include the maximum number of iterations for the optimization algorithm (maxiter, default: maxiter=500) and the required error tolerance (toler, default: toler=1e-6).

Standard methods are implemented: logLik(), BIC(), vcov(), fitted(), coef(), print(), summary().

The optimization procedure is run via optim() when required. If the estimated variance-covariance matrix is not positive definite, the function returns a warning message and produces a matrix with NA entries.
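Since the variance-covariance matrix may come back filled with NA, downstream code may want to check it before computing standard errors. The base-R sketch below shows one possible check. The helper name is_usable_varmat and the example matrices are hypothetical; in practice the input would be the matrix returned by vcov() on a fitted model.

```r
# Hypothetical helper: TRUE only if V has no NA entries and is positive definite
is_usable_varmat <- function(V) {
  if (anyNA(V)) return(FALSE)
  ev <- eigen(V, symmetric = TRUE, only.values = TRUE)$values
  all(ev > 0)
}

V_ok   <- matrix(c(0.04, 0.01, 0.01, 0.09), nrow = 2)   # made-up valid matrix
V_na   <- matrix(NA_real_, nrow = 2, ncol = 2)          # what a failed fit may return
V_bad  <- matrix(c(1, 2, 2, 1), nrow = 2)               # symmetric but indefinite

print(is_usable_varmat(V_ok))
print(is_usable_varmat(V_na))
print(is_usable_varmat(V_bad))

# Standard errors are only meaningful for a usable matrix
if (is_usable_varmat(V_ok)) print(sqrt(diag(V_ok)))
```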

Examples

library(CUB)
## CUB models with no covariates
model<-GEM(Formula(Walking~0|0|0),family="cub",data=relgoods)
coef(model,digits=5)       # Estimated parameter vector (pai,csi)
logLik(model)              # Log-likelihood function at ML estimates
vcov(model,digits=4)       # Estimated Variance-Covariance matrix
cormat(model)              # Parameter Correlation matrix
fitted(model)              # Fitted probability distribution
makeplot(model)
################
## CUB model with shelter effect
model<-GEM(Formula(officeho~0|0|0),family="cub",shelter=7,data=univer)
BICshe<-BIC(model,digits=4)
################
## CUB model with covariate for uncertainty
modelcovpai<-GEM(Formula(Parents~Smoking|0|0),family="cub",data=relgoods)
fitted(modelcovpai)
makeplot(modelcovpai)
################
## CUB model with covariates for both uncertainty and feeling components
data(univer)
model<-GEM(Formula(global~gender|freqserv|0),family="cub",data=univer,maxiter=50,toler=1e-2)
param<-coef(model)
bet<-param[1:2]            # ML estimates of coefficients for uncertainty covariate: gender
gama<-param[3:4]           # ML estimates of coefficients for feeling covariate: freqserv
##################
## CUBE models with no covariates
model<-GEM(Formula(MeetRelatives~0|0|0),family="cube",starting=c(0.5,0.5,0.1),
           data=relgoods,expinform=TRUE,maxiter=50,toler=1e-2)
coef(model,digits=4)       # Final ML estimates
vcov(model)
fitted(model)
makeplot(model)
summary(model)
##################
## IHG with covariates
modelcov<-GEM(willingn~freqserv,family="ihg",data=univer)
omega<-coef(modelcov)      # ML estimates
maxlik<-logLik(modelcov)   # Log-likelihood at ML estimates
makeplot(modelcov)
summary(modelcov)
###################
## CUSH models without covariate
model<-GEM(Dog~0,family="cush",shelter=1,data=relgoods)
delta<-coef(model)         # ML estimates of delta
maxlik<-logLik(model)      # Log-likelihood at ML estimates
summary(model)
makeplot(model)

References

D'Elia A. (2003). Modelling ranks using the inverse hypergeometric distribution, Statistical Modelling: an International Journal, 3, 65--78

D'Elia A. and Piccolo D. (2005). A mixture model for preferences data analysis, Computational Statistics & Data Analysis, 49, 917--937

Capecchi S. and Piccolo D. (2017). Dealing with heterogeneity in ordinal responses, Quality and Quantity, 51 (5), 2375--2393

Iannario M. (2014). Modelling Uncertainty and Overdispersion in Ordinal Data, Communications in Statistics - Theory and Methods, 43, 771--786

Piccolo D. (2015). Inferential issues for CUBE models with covariates, Communications in Statistics - Theory and Methods, 44 (23), 5023--5036

Iannario M. (2015). Detecting latent components in ordinal data with overdispersion by means of a mixture distribution, Quality & Quantity, 49, 977--987

Iannario M. and Piccolo D. (2016a). A comprehensive framework for regression models of ordinal data, Metron, 74 (2), 233--252

Iannario M. and Piccolo D. (2016b). A generalized framework for modelling ordinal data, Statistical Methods and Applications, 25, 163--189

See Also

logLik, coef, BIC, makeplot, summary, vcov, fitted, cormat

  • Maintainer: Rosaria Simone
  • License: GPL-2 | GPL-3
  • Last published: 2024-02-23
