GEM() R function from [CUB]

Main function for GEM models

Main function to estimate and validate GEneralized Mixture models with uncertainty.


GEM(Formula,family=c("cub","cube","ihg","cush"),data,...)

Arguments

Formula: Object of class Formula. Response variable is the vector of ordinal observations - see Details.
family: Character string indicating which class of GEM models to fit.
data: an optional data frame (or object coercible by as.data.frame to a data frame) containing the variables in the model. If missing, the variables are taken from environment(Formula).
...: Additional arguments to be passed for the specification of the model. See details and examples.

Returns

An object of class "GEM", which is a list containing the following elements:

estimates: Maximum likelihood estimates of parameters
loglik: Log-likelihood function at the final estimates
varmat: Variance-covariance matrix of final estimates
niter: Number of executed iterations
BIC: BIC index for the estimated model
ordinal: Vector of ordinal responses on which the model has been fitted
time: Processor time for execution
ellipsis: Retrieve the arguments passed to the call and extra arguments generated via the call
family: Character string indicating the sub-class of the fitted model
formula: Returns the Formula of the call for the fitted model
call: Returns the executed call
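
As a rough illustration of how the loglik and BIC elements relate, the sketch below computes a BIC value from a log-likelihood in base R. It assumes the usual definition of the index, BIC = -2*loglik + npar*log(n); the helper name bic_from_loglik and the numbers are hypothetical and do not come from the package.

```r
# Hypothetical helper: BIC under the usual definition (an assumption here,
# not taken from the package source)
bic_from_loglik <- function(loglik, npar, n) {
  -2 * loglik + npar * log(n)
}

# Made-up values: log-likelihood of -100, 2 parameters, 100 observations
bic_example <- bic_from_loglik(loglik = -100, npar = 2, n = 100)
```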

Details

GEM() is the main function for estimating GEM models; it calls the fitting function that corresponds to the specified subclass. The number of categories m is retrieved internally, but it is advisable to pass it as an argument to the call if some category has zero frequency.
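
To see why passing m can matter, consider a hypothetical 5-point item on which category 4 is never chosen. The base R sketch below assumes, purely for illustration, that the number of categories would otherwise be inferred from the distinct observed values; the exact internal rule used by GEM() is not spelled out in this page.

```r
# Hypothetical responses on a 5-point scale; category 4 has zero frequency
ordinal <- c(1, 2, 2, 3, 5, 5, 5, 1, 3)

m_inferred <- length(unique(ordinal))  # counts only the observed categories
m_true <- 5                            # known from the questionnaire design

# Frequency table over the full support, including the empty category
freq <- tabulate(ordinal, nbins = m_true)
names(freq) <- seq_len(m_true)
```

With data like these, supplying m=5 in the call makes the intended support explicit.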

If family="cub", then a CUB mixture model is fitted to the data to explain uncertainty, feeling and a possible shelter effect; the shelter effect is requested by passing the extra argument shelter with the corresponding category. Subjects' covariates can be included by specifying covariate matrices in the Formula as ordinal~Y|W|X, to explain uncertainty (Y), feeling (W) or shelter (X). Note that covariates for the shelter effect can be included only if covariates are specified for both feeling and uncertainty (GeCUB models).
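
For orientation, the probability distribution of a CUB model without covariates and without shelter can be sketched in base R as a mixture of a shifted Binomial (feeling) and a discrete Uniform (uncertainty). This is a minimal sketch under the standard CUB parameterization; the function name cub_prob is hypothetical, and the parameter names pai and csi simply follow the labels used in the examples.

```r
# Sketch of CUB probabilities on categories 1..m (no shelter, no covariates)
# pai: mixing weight of the shifted Binomial component
# csi: feeling parameter; the shifted Binomial has success probability 1 - csi
cub_prob <- function(m, pai, csi) {
  r <- seq_len(m)
  shifted_binom <- dbinom(r - 1, size = m - 1, prob = 1 - csi)
  pai * shifted_binom + (1 - pai) / m
}

# Illustrative values only
p_cub <- cub_prob(m = 7, pai = 0.6, csi = 0.3)
```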

If family="cube", then a CUBE mixture model (Combination of Uniform and Beta-Binomial) is fitted to the data to explain uncertainty, feeling and overdispersion. Subjects' covariates can also be included to explain either the feeling component alone or all three components, by specifying covariate matrices in the Formula as ordinal~Y|W|Z to explain uncertainty (Y), feeling (W) or overdispersion (Z). An extra logical argument expinform indicates whether to use the expected information matrix instead of the observed one (default is FALSE, i.e. the observed information matrix).
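
The role of the overdispersion parameter can be illustrated with a base R sketch of CUBE probabilities, obtained by replacing the shifted Binomial of the CUB model with a shifted Beta-Binomial. The function name cube_prob is hypothetical, and the mapping of (csi, phi) to the two Beta shape parameters is one common way to write the Beta-Binomial, stated here as an assumption; the sketch requires phi > 0 and 0 < csi < 1.

```r
# Sketch of CUBE probabilities on categories 1..m (no covariates)
# Assumed mapping to Beta shape parameters: a = (1 - csi) / phi, b = csi / phi
cube_prob <- function(m, pai, csi, phi) {
  stopifnot(phi > 0, csi > 0, csi < 1)
  r <- seq_len(m)
  a <- (1 - csi) / phi
  b <- csi / phi
  # Beta-Binomial pmf for r - 1 successes out of m - 1 trials, on the log scale
  log_bb <- lchoose(m - 1, r - 1) + lbeta(r - 1 + a, m - r + b) - lbeta(a, b)
  pai * exp(log_bb) + (1 - pai) / m
}

# Illustrative values only
p_cube <- cube_prob(m = 7, pai = 0.6, csi = 0.3, phi = 0.1)
```

As phi gets close to 0 the Beta-Binomial component approaches the shifted Binomial, which is the sense in which CUB is a limiting case of CUBE.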

If family="ihg", then an IHG model is fitted to the data. IHG models (Inverse Hypergeometric) are nested into CUBE models (see the references below). The parameter $\theta$ gives the probability of observing the first category and is therefore a direct measure of preference, attraction, pleasantness toward the investigated item. This is the reason why $\theta$ is customarily referred to as the preference parameter of the IHG model. Covariates for the preference parameter $\theta$ have to be specified in matrix form in the Formula as ordinal~U.
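
A minimal base R sketch of the IHG probabilities may help to see how $\theta$ drives the whole distribution. It uses a recursion that starts from Pr(R=1)=$\theta$ and follows from the inverse hypergeometric form of the model; the function name ihg_prob is hypothetical and the sketch assumes 0 < theta < 1.

```r
# Sketch of IHG probabilities on categories 1..m via a forward recursion
ihg_prob <- function(m, theta) {
  pr <- numeric(m)
  pr[1] <- theta  # probability of the first category
  for (j in seq_len(m - 1)) {
    pr[j + 1] <- pr[j] * (1 - theta) * (m - j) / (m - j - 1 + j * theta)
  }
  pr
}

# Illustrative values only
p_ihg <- ihg_prob(m = 7, theta = 0.4)
```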

If family="cush", then a CUSH model (Combination of Uniform and SHelter effect) is fitted to the data. The category corresponding to the inflation should be passed via the argument shelter. Covariates for the shelter parameter $\delta$ are specified in matrix form in the Formula as ordinal~X.
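
The CUSH distribution is simple enough to sketch directly in base R: a point mass at the shelter category with weight $\delta$, mixed with a discrete Uniform over the m categories. The function name cush_prob is hypothetical and the values are illustrative only.

```r
# Sketch of CUSH probabilities on categories 1..m (no covariates)
# delta: weight of the point mass at the shelter category
cush_prob <- function(m, delta, shelter) {
  r <- seq_len(m)
  delta * as.numeric(r == shelter) + (1 - delta) / m
}

# Illustrative values only: inflation at the first of 7 categories
p_cush <- cush_prob(m = 7, delta = 0.2, shelter = 1)
```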

Even if no covariate is included in the model for a given component, the corresponding model matrix must always be specified: in this case, it should be set to 0 (see examples below). Extra arguments include the maximum number of iterations for the optimization algorithm (maxiter, default: maxiter=500) and the required error tolerance (toler, default: toler=1e-6).

Standard methods are implemented: logLik(), BIC(), vcov(), fitted(), coef(), print(), summary().

The optimization procedure is run via optim() when required. If the estimated variance-covariance matrix is not positive definite, the function returns a warning message and produces a matrix with NA entries.
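
One way to check positive definiteness of a variance-covariance matrix in base R is through its eigenvalues. The sketch below is not the package's internal check; the function name is_positive_definite, the tolerance and the example matrices are all hypothetical.

```r
# Hypothetical check: TRUE if all eigenvalues of the symmetrized matrix
# exceed a small tolerance; matrices with NA entries are reported as FALSE
is_positive_definite <- function(v, tol = 1e-8) {
  if (anyNA(v)) return(FALSE)
  ev <- eigen((v + t(v)) / 2, symmetric = TRUE, only.values = TRUE)$values
  all(ev > tol)
}

good <- matrix(c(0.04, 0.01, 0.01, 0.09), 2, 2)
bad  <- matrix(c(0.04, 0.10, 0.10, 0.09), 2, 2)  # negative determinant
```

A check of this kind can be applied to the output of vcov() before standard errors are derived from it.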

Examples


library(CUB)
## CUB models with no covariates
model<-GEM(Formula(Walking~0|0|0),family="cub",data=relgoods)
coef(model,digits=5)     # Estimated parameter vector (pai,csi)
logLik(model)            # Log-likelihood function at ML estimates
vcov(model,digits=4)     # Estimated Variance-Covariance matrix
cormat(model)            # Parameter Correlation matrix
fitted(model)            # Fitted probability distribution
makeplot(model)
################
## CUB model with shelter effect
model<-GEM(Formula(officeho~0|0|0),family="cub",shelter=7,data=univer)
BICshe<-BIC(model,digits=4)
################
## CUB model with covariate for uncertainty
modelcovpai<-GEM(Formula(Parents~Smoking|0|0),family="cub",data=relgoods)
fitted(modelcovpai)
makeplot(modelcovpai)
################
## CUB model with covariates for both uncertainty and feeling components
data(univer)
model<-GEM(Formula(global~gender|freqserv|0),family="cub",data=univer,maxiter=50,toler=1e-2)
param<-coef(model)
bet<-param[1:2]      # ML estimates of coefficients for uncertainty covariate: gender
gama<-param[3:4]     # ML estimates of coefficients for feeling covariate: freqserv
##################
## CUBE models with no covariates
model<-GEM(Formula(MeetRelatives~0|0|0),family="cube",starting=c(0.5,0.5,0.1),
  data=relgoods,expinform=TRUE,maxiter=50,toler=1e-2)
coef(model,digits=4)       # Final ML estimates
vcov(model)
fitted(model)
makeplot(model)
summary(model)
##################
## IHG with covariates
modelcov<-GEM(willingn~freqserv,family="ihg",data=univer)
omega<-coef(modelcov)      ## ML estimates 
maxlik<-logLik(modelcov)   ## Log-likelihood at ML estimates
makeplot(modelcov)
summary(modelcov)
###################
## CUSH models without covariate
model<-GEM(Dog~0,family="cush",shelter=1,data=relgoods)
delta<-coef(model)      # ML estimates of delta
maxlik<-logLik(model)   # Log-likelihood at ML estimates
summary(model)
makeplot(model)

References

D'Elia A. (2003). Modelling ranks using the inverse hypergeometric distribution, Statistical Modelling: an International Journal, 3, 65--78.

D'Elia A. and Piccolo D. (2005). A mixture model for preferences data analysis, Computational Statistics & Data Analysis, 49, 917--937.

Capecchi S. and Piccolo D. (2017). Dealing with heterogeneity in ordinal responses, Quality & Quantity, 51 (5), 2375--2393.

Iannario M. (2014). Modelling Uncertainty and Overdispersion in Ordinal Data, Communications in Statistics - Theory and Methods, 43, 771--786.

Piccolo D. (2015). Inferential issues for CUBE models with covariates, Communications in Statistics - Theory and Methods, 44 (23), 771--786.

Iannario M. (2015). Detecting latent components in ordinal data with overdispersion by means of a mixture distribution, Quality & Quantity, 49, 977--987.

Iannario M. and Piccolo D. (2016a). A comprehensive framework for regression models of ordinal data, Metron, 74 (2), 233--252.

Iannario M. and Piccolo D. (2016b). A generalized framework for modelling ordinal data, Statistical Methods and Applications, 25, 163--189.
