reglca function

Regularized Latent Class Analysis

Estimates a regularized latent class model for dichotomous responses (Chen, Liu, Xu, & Ying, 2015; Chen, Li, Liu, & Ying, 2017). The SCAD and MCP penalty functions are available.

reglca(dat, nclasses, weights=NULL, group=NULL, regular_type="scad",
    regular_lam=0, sd_noise_init=1, item_probs_init=NULL, class_probs_init=NULL,
    random_starts=1, random_iter=20, conv=1e-05, h=1e-04, mstep_iter=10,
    maxit=1000, verbose=TRUE, prob_min=.0001)

## S3 method for class 'reglca'
summary(object, digits=4, file=NULL, ...)

Arguments

  • dat: Matrix with dichotomous item responses. NAs are allowed.

  • nclasses: Number of classes

  • weights: Optional vector of sampling weights

  • group: Optional vector for grouping variable

  • regular_type: Regularization type. Can be scad or mcp. See gdina for more information.

  • regular_lam: Regularization parameter $\lambda$

  • sd_noise_init: Standard deviation for amount of noise in generating random starting values

  • item_probs_init: Optional matrix of initial item response probabilities

  • class_probs_init: Optional vector of class probabilities

  • random_starts: Number of random starts

  • random_iter: Number of initial iterations for random starts

  • conv: Convergence criterion

  • h: Numerical differentiation parameter

  • mstep_iter: Number of iterations in the M-step

  • maxit: Maximum number of iterations

  • verbose: Logical indicating whether convergence progress should be displayed

  • prob_min: Lower bound for probabilities in estimation

  • object: A required object of class reglca, obtained from a call to the function reglca.

  • digits: Number of digits after decimal separator to display.

  • file: Optional name of a file to which the summary output is written (sinked).

  • ...: Further arguments to be passed.

Details

The regularized latent class model for dichotomous item responses assumes $C$ latent classes. The item response probabilities $P(X_i=1|c)=p_{ic}$ are estimated such that the number of distinct $p_{ic}$ values per item is minimized. This eases interpretation and makes it possible to recover the structure of a true (but unknown) cognitive diagnostic model.
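To give a feel for the two penalty types named above, the following sketch evaluates the SCAD and MCP penalties in their standard textbook forms. It is an illustration only: the function names scad_penalty and mcp_penalty are hypothetical, the shape constants a=3.7 and a=3 are assumed conventional values, and this page does not state which constants or which exact parameterization reglca uses internally.

```r
# SCAD penalty in its standard piecewise form.
# 'a' is the shape constant; 3.7 is an assumed conventional value.
scad_penalty <- function(x, lambda, a = 3.7) {
  ax <- abs(x)
  ifelse(ax <= lambda,
         lambda * ax,                                    # linear part (lasso-like)
         ifelse(ax <= a * lambda,
                (2 * a * lambda * ax - ax^2 - lambda^2) / (2 * (a - 1)),  # quadratic part
                lambda^2 * (a + 1) / 2))                 # constant part
}

# MCP penalty in its standard piecewise form.
# 'a' is the shape constant; 3 is an assumed illustrative value.
mcp_penalty <- function(x, lambda, a = 3) {
  ax <- abs(x)
  ifelse(ax <= a * lambda,
         lambda * ax - ax^2 / (2 * a),                   # tapering part
         a * lambda^2 / 2)                               # constant part
}

# Hypothetical differences between two class-specific item probabilities
d <- c(0, 0.04, 0.08, 0.2, 0.5)
data.frame(d = d,
           scad = scad_penalty(d, lambda = 0.08),
           mcp = mcp_penalty(d, lambda = 0.08))
```

Both penalties grow like $\lambda |x|$ near zero and become constant for large $|x|$, so small differences are pushed toward zero while large differences are not penalized further. This is consistent with the goal of few distinct $p_{ic}$ values per item.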

Returns

A list containing the following elements (selection):

  • item_probs: Item response probabilities

  • class_probs: Latent class probabilities

  • p.aj.xi: Individual posterior

  • p.xi.aj: Individual likelihood

  • loglike: Log-likelihood value

  • Npars: Number of estimated parameters

  • Nskillpar: Number of skill class parameters

  • G: Number of groups

  • n.ik: Expected counts

  • Nipar: Number of item parameters

  • n_reg: Number of regularized parameters

  • n_reg_item: Number of regularized parameters per item

  • item: Data frame with item parameters

  • pjk: Item response probabilities (in an array)

  • N: Number of persons

  • I: Number of items
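The relation between item response probabilities, class probabilities, individual likelihood, and individual posterior can be sketched with a toy example. The values below are made up and are not output of reglca. The sketch assumes local independence within classes, an items-by-classes layout for item_probs, and a persons-by-classes layout for the likelihood and posterior; the helper lca_likelihood is hypothetical, and the actual layout of the returned elements should be checked on a fitted object.

```r
# Hypothetical toy values (not produced by reglca):
# rows = items, columns = latent classes (assumed orientation)
item_probs <- matrix(c(0.2, 0.2, 0.9, 0.9,
                       0.1, 0.8, 0.1, 0.8,
                       0.3, 0.3, 0.3, 0.9),
                     nrow = 3, byrow = TRUE)
class_probs <- c(0.4, 0.2, 0.2, 0.2)

# Two response patterns; NA marks a missing response
dat <- rbind(c(1, 0, 0),
             c(1, 1, NA))

# Likelihood of each person's responses given each class.
# Missing responses contribute a factor of 1 (they are skipped).
lca_likelihood <- function(dat, item_probs) {
  n_classes <- ncol(item_probs)
  like <- matrix(1, nrow = nrow(dat), ncol = n_classes)
  for (i in seq_len(ncol(dat))) {
    x <- dat[, i]
    for (cc in seq_len(n_classes)) {
      p <- item_probs[i, cc]
      f <- ifelse(is.na(x), 1, ifelse(x == 1, p, 1 - p))
      like[, cc] <- like[, cc] * f
    }
  }
  like
}

like <- lca_likelihood(dat, item_probs)       # persons x classes likelihood
joint <- sweep(like, 2, class_probs, "*")     # multiply by class probabilities
posterior <- joint / rowSums(joint)           # persons x classes posterior
loglike <- sum(log(rowSums(joint)))           # marginal log-likelihood
map_class <- max.col(posterior, ties.method = "first")  # most probable class
```

In this sketch, like plays the role of an individual likelihood, posterior that of an individual posterior, and loglike that of the log-likelihood value in the list above.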

References

Chen, Y., Liu, J., Xu, G., & Ying, Z. (2015). Statistical analysis of Q-matrix based diagnostic classification models. Journal of the American Statistical Association, 110, 850-866.

Chen, Y., Li, X., Liu, J., & Ying, Z. (2017). Regularized latent class analysis with application in cognitive diagnosis. Psychometrika, 82, 660-692.

See Also

See also the gdina and slca functions for regularized estimation.

Examples

## Not run:
#############################################################################
# EXAMPLE 1: Estimating a regularized LCA for DINA data
#############################################################################

#---- simulate data
I <- 12  # number of items
# define Q-matrix
q.matrix <- matrix(0,I,2)
q.matrix[ 1:(I/3), 1 ] <- 1
q.matrix[ I/3 + 1:(I/3), 2 ] <- 1
q.matrix[ 2*I/3 + 1:(I/3), c(1,2) ] <- 1
N <- 1000  # number of persons
guess <- rep(seq(.1,.3,length=I/3), 3)
slip <- .1
rho <- 0.3  # skill correlation
set.seed(987)
dat <- CDM::sim.din( N=N, q.matrix=q.matrix, guess=guess, slip=slip,
            mean=0*c( .2, -.2 ), Sigma=matrix( c( 1, rho,rho,1), 2, 2 ) )
dat <- dat$dat

#--- Model 1: Four latent classes without regularization
mod1 <- CDM::reglca(dat=dat, nclasses=4, regular_lam=0, random_starts=3,
            random_iter=10, conv=1E-4)
summary(mod1)

#--- Model 2: Four latent classes with regularization and lambda=.08
mod2 <- CDM::reglca(dat=dat, nclasses=4, regular_lam=0.08, regular_type="scad",
            random_starts=3, random_iter=10, conv=1E-4)
summary(mod2)

#--- Model 3: Four latent classes with regularization and lambda=.05 with warm start
# "warm start" -> use initial parameters from fitted model with higher lambda value
item_probs_init <- mod2$item_probs
class_probs_init <- mod2$class_probs
mod3 <- CDM::reglca(dat=dat, nclasses=4, regular_lam=0.05, regular_type="scad",
            item_probs_init=item_probs_init, class_probs_init=class_probs_init,
            random_starts=3, random_iter=10, conv=1E-4)

## End(Not run)