indscal() R function from the multiway package

Individual Differences Scaling

Fits Carroll and Chang's Individual Differences Scaling (INDSCAL) model to 3-way dissimilarity or similarity data. Parameters are estimated via alternating least squares with optional constraints.


Usage

indscal(X, nfac, nstart = 10, const = NULL, control = NULL,
        type = c("dissimilarity", "similarity"),
        Bfixed = NULL, Bstart = NULL, Bstruc = NULL, Bmodes = NULL,
        Cfixed = NULL, Cstart = NULL, Cstruc = NULL, Cmodes = NULL,
        maxit = 500, ctol = 1e-4, parallel = FALSE, cl = NULL,
        output = c("best", "all"), verbose = TRUE, backfit = FALSE)

Arguments

X: Three-way data array with dim=c(J,J,K) where X[,,k] is the k-th dissimilarity matrix. Can also be a list of (dis)similarity matrices or of objects output by dist.
nfac: Number of factors.
nstart: Number of random starts.
const: Character vector of length 2 giving the constraints for modes B and C (defaults to unconstrained for B and non-negative for C). See const for the 24 available options. Constraints for Mode C weights are limited to one of the 8 possible non-negative options.
control: List of parameters controlling options for smoothness constraints. This is passed to const.control, which describes the available options.
type: Character indicating if X contains dissimilarity data (default) or similarity data.
Bfixed: Used to fit model with fixed Mode B weights.
Bstart: Starting Mode B weights. Default uses random weights.
Bstruc: Structure constraints for Mode B weights. See Note.
Bmodes: Mode ranges for Mode B weights (for unimodality constraints). See Note.
Cfixed: Used to fit model with fixed Mode C weights.
Cstart: Starting Mode C weights. Default uses random weights.
Cstruc: Structure constraints for Mode C weights. See Note.
Cmodes: Mode ranges for Mode C weights (for unimodality constraints). See Note.
maxit: Maximum number of iterations.
ctol: Convergence tolerance.
parallel: Logical indicating if parLapply should be used. See Examples.
cl: Cluster created by makeCluster. Only used when parallel=TRUE.
output: Output the best solution (default) or output all nstart solutions.
verbose: If TRUE, fitting progress is printed via txtProgressBar. Ignored if parallel=TRUE.
backfit: Logical indicating whether the backfitting algorithm should be used for cmls.

Details

Given a 3-way array X = array(x,dim=c(J,J,K)) with X[,,k] denoting the k-th subject's dissimilarity matrix rating J objects, the INDSCAL model can be written as


`Z[i,j,k] = sum B[i,r]B[j,r]C[k,r] + E[i,j,k]`

where Z is the array of scalar products obtained from X, B = matrix(b,J,R) are the object weights, C = matrix(c,K,R) are the non-negative subject weights, and E = array(e,dim=c(J,J,K)) is the 3-way residual array. The summation is for r = seq(1,R).

Weight matrices are estimated using an alternating least squares algorithm with optional constraints.
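The model equation above can be illustrated with a small base R sketch. The sizes and weight matrices below are made up for illustration, and the double-centering step is an assumption about how scalar products are obtained from Euclidean distances (the package's own conversion is ed2sp, which is not called here).

```r
# Hypothetical sizes: J objects, K subjects, R factors
set.seed(1)
J <- 4; K <- 3; R <- 2
B <- matrix(rnorm(J * R), J, R)   # object weights
C <- matrix(runif(K * R), K, R)   # non-negative subject weights

# Structural part of the model: Zhat[,,k] = B diag(C[k,]) B'
Zhat <- array(0, dim = c(J, J, K))
for (k in seq_len(K)) {
  Zhat[, , k] <- B %*% diag(C[k, ], R) %*% t(B)
}

# One element computed directly from the summation over r
i <- 2; j <- 3; k <- 1
elem <- sum(B[i, ] * B[j, ] * C[k, ])
all.equal(Zhat[i, j, k], elem)

# Assumed conversion of one distance matrix to scalar products
# (classical double centering of the squared distances)
P   <- matrix(rnorm(J * 3), J, 3)        # hypothetical object coordinates
D   <- as.matrix(dist(P))                # Euclidean distances
Cen <- diag(J) - matrix(1 / J, J, J)     # centering matrix
Zk  <- -0.5 * Cen %*% D^2 %*% Cen        # scalar products of centered points
```

Each slice Zhat[,,k] is symmetric by construction, which is why a single weight matrix B serves both object modes.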

Returns

If output="best", returns an object of class "indscal" with the following elements:

B: Mode B weight matrix.
C: Mode C weight matrix.
SSE: Sum of Squared Errors.
Rsq: R-squared value.
GCV: Generalized Cross-Validation.
edf: Effective degrees of freedom.
iter: Number of iterations.
cflag: Convergence flag. See Note.
const: See argument const.
control: See argument control.
fixed: Logical vector indicating whether 'fixed' weights were used for each mode.
struc: Logical vector indicating whether 'struc' constraints were used for each mode.

Otherwise returns a list of length nstart where each element is an object of class "indscal".
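As a rough illustration of the two output modes, the sketch below fits a small random data set twice. It runs only if the multiway package is installed. The data sizes are arbitrary, and choosing the solution with the smallest SSE is an assumption about what "best" means, not a statement of how the function selects it.

```r
if (requireNamespace("multiway", quietly = TRUE)) {
  library(multiway)

  # Small random dissimilarity array (hypothetical sizes)
  set.seed(1)
  J <- 5; K <- 6
  X <- array(0, dim = c(J, J, K))
  for (k in seq_len(K)) {
    X[, , k] <- as.matrix(dist(matrix(rnorm(J * 10), J, 10)))
  }

  # output = "best" (default): a single object of class "indscal"
  best <- indscal(X, nfac = 2, nstart = 3, verbose = FALSE)
  class(best)
  best$Rsq

  # output = "all": a list with one fitted object per random start
  allfits <- indscal(X, nfac = 2, nstart = 3, output = "all", verbose = FALSE)
  length(allfits)

  # Compare the starts by SSE and pick one manually (assumed criterion)
  sse <- sapply(allfits, function(fit) fit$SSE)
  chosen <- allfits[[which.min(sse)]]
}
```

Keeping all solutions can be useful for checking whether different random starts reach a similar fit.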

References

Carroll, J. D., & Chang, J-J. (1970). Analysis of individual differences in multidimensional scaling via an n-way generalization of "Eckart-Young" decomposition. Psychometrika, 35, 283-319. doi:10.1007/BF02310791

Author(s)

Nathaniel E. Helwig helwig@umn.edu

Note

Structure constraints should be specified with a matrix of logicals (TRUE/FALSE), such that FALSE elements indicate a weight should be constrained to be zero. Default uses unstructured weights, i.e., a matrix of all TRUE values.
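A structure matrix of this kind might be built as follows. The sizes and the choice of which weights to zero out are hypothetical, and the indscal call is left as a comment because it needs a data object X such as the one in the Examples.

```r
# Hypothetical problem size: J objects, nfac factors
J <- 5; nfac <- 2

# Start from the default (all TRUE = unstructured weights)
Bstruc <- matrix(TRUE, nrow = J, ncol = nfac)

# Constrain objects 1 and 2 to have zero weight on factor 2
Bstruc[1:2, 2] <- FALSE
Bstruc

# Possible usage (X as in the Examples):
# imod <- indscal(X, nfac = nfac, Bstruc = Bstruc)
```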

When using unimodal constraints, the *modes inputs can be used to specify the mode search range for each factor. These inputs should be matrices with dimension c(2,nfac) where the first row gives the minimum mode value and the second row gives the maximum mode value (with respect to the indices of the corresponding weight matrix).
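A mode-range matrix with the layout described above could be set up as in this sketch. The ranges are hypothetical, and the constraint code "unimod" in the commented call is an assumption that should be checked against the options listed by const.

```r
# Hypothetical problem size: J objects, nfac factors
J <- 8; nfac <- 3

# Row 1: smallest row index where each factor's mode may fall
# Row 2: largest row index where each factor's mode may fall
Bmodes <- rbind(min = c(1, 3, 6),
                max = c(3, 5, 8))
Bmodes

# Basic sanity checks on the layout
all(dim(Bmodes) == c(2, nfac))          # one column per factor
all(Bmodes["min", ] <= Bmodes["max", ]) # ranges are ordered
all(Bmodes >= 1 & Bmodes <= J)          # indices refer to rows of B

# Possible usage (X as in the Examples; constraint code assumed):
# imod <- indscal(X, nfac = nfac, const = c("unimod", "nonneg"), Bmodes = Bmodes)
```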

Output cflag gives convergence information: cflag = 0 if algorithm converged normally, cflag = 1 if maximum iteration limit was reached before convergence, and cflag = 2 if algorithm terminated abnormally due to a problem with the constraints.
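The flag values above can be turned into readable messages with a small helper. The function describe_cflag is hypothetical and is not part of the multiway package.

```r
# Hypothetical helper: translate a cflag value into a message
describe_cflag <- function(cflag) {
  switch(as.character(cflag),
         "0" = "converged normally",
         "1" = "maximum iteration limit reached before convergence",
         "2" = "terminated abnormally (problem with the constraints)",
         "unknown cflag value")
}

describe_cflag(0)
describe_cflag(1)

# Possible usage with a fitted model:
# if (imod$cflag != 0) warning(describe_cflag(imod$cflag))
```

When cflag is 1, increasing maxit or loosening ctol are the arguments that control the stopping rule.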

Warnings

The algorithm can perform poorly if the number of factors nfac is set too large.

Examples


##########   array example   ##########

# create random data array with INDSCAL structure
set.seed(3)
mydim <- c(50,5,10)
nf <- 2
X <- array(0, dim = c(rep(mydim[2],2), mydim[3]))
for(k in 1:mydim[3]) {
  X[,,k] <- as.matrix(dist(t(matrix(rnorm(prod(mydim[1:2])), mydim[1], mydim[2]))))
}

# fit INDSCAL model
imod <- indscal(X, nfac = nf, nstart = 1)
imod

# check solution
Xhat <- fitted(imod)
sum((array(apply(X,3,ed2sp), dim = dim(X)) - Xhat)^2)
imod$SSE

# reorder and resign factors
imod$B[1:4,]
imod <- reorder(imod, 2:1)
imod$B[1:4,]
imod <- resign(imod, newsign = c(1,-1))
imod$B[1:4,]
Xhat <- fitted(imod)  # recompute fitted values after reordering and resigning
sum((array(apply(X,3,ed2sp), dim = dim(X)) - Xhat)^2)
imod$SSE

# rescale factors
colSums(imod$B^2)
colSums(imod$C^2)
imod <- rescale(imod, mode = "C")
colSums(imod$B^2)
colSums(imod$C^2)
Xhat <- fitted(imod)  # recompute fitted values after rescaling
sum((array(apply(X,3,ed2sp), dim = dim(X)) - Xhat)^2)
imod$SSE

##########   list example   ##########

# create list of random dissimilarity (dist) objects
set.seed(4)
mydim <- c(100, 8, 20)
nf <- 3
X <- vector("list", mydim[3])
for(k in 1:mydim[3]) {
  X[[k]] <- dist(t(matrix(rnorm(prod(mydim[1:2])), mydim[1], mydim[2])))
}

# fit INDSCAL model (orthogonal B, non-negative C)
imod <- indscal(X, nfac = nf, nstart = 1, const = c("orthog", "nonneg"))
imod

# check solution
Xhat <- fitted(imod)
sum((array(unlist(lapply(X,ed2sp)), dim = mydim[c(2,2,3)]) - Xhat)^2)
imod$SSE
crossprod(imod$B)

## Not run:

##########   parallel computation   ##########

# create random data array with INDSCAL structure
set.seed(3)
mydim <- c(50,5,10)
nf <- 2
X <- array(0,dim=c(rep(mydim[2],2), mydim[3]))
for(k in 1:mydim[3]) {
  X[,,k] <- as.matrix(dist(t(matrix(rnorm(prod(mydim[1:2])), mydim[1], mydim[2]))))
}

# fit INDSCAL model (10 random starts -- sequential computation)
set.seed(1)
system.time({imod <- indscal(X, nfac = nf)})
imod

# fit INDSCAL model (10 random starts -- parallel computation)
library(parallel)  # provides makeCluster, detectCores, clusterEvalQ, clusterSetRNGStream
cl <- makeCluster(detectCores())
ce <- clusterEvalQ(cl,library(multiway))
clusterSetRNGStream(cl, 1)
system.time({imod <- indscal(X, nfac = nf, parallel = TRUE, cl = cl)})
imod
stopCluster(cl)
## End(Not run)

Maintainer: Nathaniel E. Helwig
License: GPL (>= 2)
Last published: 2025-04-15
