Fits Timmerman and Kiers's four Simultaneous Component Analysis (SCA) models to a 3-way data array or a list of 2-way arrays with the same number of columns.
X: List of length K where the k-th element contains the I[k]-by-J data matrix X[[k]]. If I[k]=I[1] for all k, can input 3-way data array with dim=c(I,J,K).
nfac: Number of factors.
nstart: Number of random starts.
maxit: Maximum number of iterations.
type: Type of SCA model to fit.
rotation: Rotation to use for type="sca-p" or type="sca-ecp".
ctol: Convergence tolerance.
parallel: Logical indicating if parLapply should be used. See Examples.
cl: Cluster created by makeCluster. Only used when parallel=TRUE.
verbose: If TRUE, fitting progress is printed via txtProgressBar. Ignored if parallel=TRUE.
Details
Given a list of matrices X[[k]] = matrix(xk,I[k],J) for k = seq(1,K), the SCA model is
X[[k]] = tcrossprod(D[[k]],B) + E[[k]]
where D[[k]] = matrix(dk,I[k],R) are the Mode A (first mode) weights for the k-th level of Mode C (third mode), B = matrix(b,J,R) are the Mode B (second mode) weights, and E[[k]] = matrix(ek,I[k],J) is the residual matrix corresponding to k-th level of Mode C.
There are four different versions of the SCA model: SCA with invariant pattern (SCA-P), SCA with Parafac2 constraints (SCA-PF2), SCA with INDSCAL constraints (SCA-IND), and SCA with equal average crossproducts (SCA-ECP). These four models differ with respect to the assumed crossproduct structure of the D[[k]] weights:
where Phi[[k]] is specific to the k-th level of Mode C, Phi is common to all K levels of Mode C, and C = matrix(c,K,R) are the Mode C (third mode) weights. This function estimates the weight matrices D[[k]] and B (and C if applicable) using alternating least squares.
Returns
D: List of length K where k-th element contains D[[k]].
B: Mode B weight matrix.
C: Mode C weight matrix.
Phi: Mode A common crossproduct matrix (if type!="sca-p").
SSE: Sum of Squared Errors.
Rsq: R-squared value.
GCV: Generalized Cross-Validation.
edf: Effective degrees of freedom.
iter: Number of iterations.
cflag: Convergence flag.
type: Same as input type.
rotation: Same as input rotation.
References
Helwig, N. E. (2013). The special sign indeterminacy of the direct-fitting Parafac2 model: Some implications, cautions, and recommendations, for Simultaneous Component Analysis. Psychometrika, 78, 725-739.
Timmerman, M. E., & Kiers, H. A. L. (2003). Four simultaneous component models for the analysis of multivariate time series from more than one subject to model intraindividual and interindividual differences. Psychometrika, 68, 105-121.
Default use is 10 random strarts (nstart=10) with 500 maximum iterations of the ALS algorithm for each start (maxit=500) using a convergence tolerance of 1e-4 (ctol=1e-4). The algorithm is determined to have converged once the change in R^2 is less than or equal to ctol.
Output cflag gives convergence information: cflag=0 if ALS algorithm converged normally, cflag=1 if maximum iteration limit was reached before convergence, and cflag=2 if ALS algorithm terminated abnormally due to problem with non-negativity constraints.
Warnings
The ALS algorithm can perform poorly if the number of factors nfac is set too large.
Computational Details
The least squares SCA-P solution can be obtained from the singular value decomposition of the stacked matrix rbind(X[[1]],...,X[[K]]).
The least squares SCA-PF2 solution can be obtained using the uncontrained Parafac2 ALS algorithm (see parafac2).
The least squares SCA-IND solution can be obtained using the Parafac2 ALS algorithm with orthogonality constraints on Mode A.
The least squares SCA-ECP solution can be obtained using the Parafac2 ALS algorithm with orthogonality constraints on Mode A and the Mode C weights fixed at C[k,] = rep(I[k]^0.5,R).
Examples
########## sca-p ########### create random data list with SCA-P structureset.seed(3)mydim <- c(NA,10,20)nf <-2nk <- rep(c(50,100,200), length.out = mydim[3])Dmat <- matrix(rnorm(sum(nk)*nf),sum(nk),nf)Bmat <- matrix(runif(mydim[2]*nf),mydim[2],nf)Dmats <- vector("list",mydim[3])Xmat <- Emat <- vector("list",mydim[3])dfc <-0for(k in1:mydim[3]){ dinds <-1:nk[k]+ dfc
Dmats[[k]]<- Dmat[dinds,] dfc <- dfc + nk[k] Xmat[[k]]<- tcrossprod(Dmats[[k]],Bmat) Emat[[k]]<- matrix(rnorm(nk[k]*mydim[2]),nk[k],mydim[2])}rm(Dmat)Emat <- nscale(Emat,0, ssnew = sumsq(Xmat))# SNR=1X <- mapply("+",Xmat,Emat)# fit SCA-P model (no rotation)scamod <- sca(X,nfac=nf,nstart=1)scamod
# check solutioncrossprod(scamod$D[[1]]%*% diag(scamod$C[1,]^-1))/ nk[1]crossprod(scamod$D[[5]]%*% diag(scamod$C[5,]^-1))/ nk[5]Xhat <- fitted(scamod)sse <- sumsq(mapply("-",Xmat,Xhat))sse/(sum(nk)*mydim[2])# reorder and resign factorsscamod$B[1:4,]scamod <- reorder(scamod,2:1)scamod$B[1:4,]scamod <- resign(scamod, mode="B", newsign=c(1,-1))scamod$B[1:4,]Xhat <- fitted(scamod)sse <- sumsq(mapply("-",Xmat,Xhat))sse/(sum(nk)*mydim[2])# rescale factorscolSums(scamod$B^2)colSums(scamod$C^2)scamod <- rescale(scamod, mode="C")colSums(scamod$B^2)colSums(scamod$C^2)Xhat <- fitted(scamod)sse <- sumsq(mapply("-",Xmat,Xhat))sse/(sum(nk)*mydim[2])########## sca-pf2 ########### create random data list with SCA-PF2 (Parafac2) structureset.seed(3)mydim <- c(NA,10,20)nf <-2nk <- rep(c(50,100,200), length.out = mydim[3])Gmat <-10*matrix(rnorm(nf^2),nf,nf)Bmat <- matrix(runif(mydim[2]*nf),mydim[2],nf)Cmat <- matrix(runif(mydim[3]*nf),mydim[3],nf)Xmat <- Emat <- Fmat <- vector("list",mydim[3])for(k in1:mydim[3]){ Fmat[[k]]<- svd(matrix(rnorm(nk[k]*nf),nk[k],nf),nv=0)$u
Xmat[[k]]<- tcrossprod(Fmat[[k]]%*%Gmat%*%diag(Cmat[k,]),Bmat) Emat[[k]]<- matrix(rnorm(nk[k]*mydim[2]),nk[k],mydim[2])}Emat <- nscale(Emat,0, ssnew = sumsq(Xmat))# SNR=1X <- mapply("+",Xmat,Emat)# fit SCA-PF2 modelscamod <- sca(X,nfac=nf,nstart=1,type="sca-pf2")scamod
# check solutionscamod$Phi
crossprod(scamod$D[[1]]%*% diag(scamod$C[1,]^-1))/ nk[1]crossprod(scamod$D[[5]]%*% diag(scamod$C[5,]^-1))/ nk[5]Xhat <- fitted(scamod)sse <- sumsq(mapply("-",Xmat,Xhat))sse/(sum(nk)*mydim[2])# reorder and resign factorsscamod$B[1:4,]scamod <- reorder(scamod,2:1)scamod$B[1:4,]scamod <- resign(scamod, mode="B", newsign=c(1,-1))scamod$B[1:4,]Xhat <- fitted(scamod)sse <- sumsq(mapply("-",Xmat,Xhat))sse/(sum(nk)*mydim[2])# rescale factorscolSums(scamod$B^2)colSums(scamod$C^2)scamod <- rescale(scamod, mode="C")colSums(scamod$B^2)colSums(scamod$C^2)Xhat <- fitted(scamod)sse <- sumsq(mapply("-",Xmat,Xhat))sse/(sum(nk)*mydim[2])########## sca-ind ########### create random data list with SCA-IND structureset.seed(3)mydim <- c(NA,10,20)nf <-2nk <- rep(c(50,100,200), length.out = mydim[3])Gmat <- diag(nf)# SCA-IND is Parafac2 with Gmat=identityBmat <- matrix(runif(mydim[2]*nf),mydim[2],nf)Cmat <-10*matrix(runif(mydim[3]*nf),mydim[3],nf)Xmat <- Emat <- Fmat <- vector("list",mydim[3])for(k in1:mydim[3]){ Fmat[[k]]<- svd(matrix(rnorm(nk[k]*nf),nk[k],nf),nv=0)$u
Xmat[[k]]<- tcrossprod(Fmat[[k]]%*%Gmat%*%diag(Cmat[k,]),Bmat) Emat[[k]]<- matrix(rnorm(nk[k]*mydim[2]),nk[k],mydim[2])}Emat <- nscale(Emat,0, ssnew = sumsq(Xmat))# SNR=1X <- mapply("+",Xmat,Emat)# fit SCA-IND modelscamod <- sca(X,nfac=nf,nstart=1,type="sca-ind")scamod
# check solutionscamod$Phi
crossprod(scamod$D[[1]]%*% diag(scamod$C[1,]^-1))/ nk[1]crossprod(scamod$D[[5]]%*% diag(scamod$C[5,]^-1))/ nk[5]Xhat <- fitted(scamod)sse <- sumsq(mapply("-",Xmat,Xhat))sse/(sum(nk)*mydim[2])# reorder and resign factorsscamod$B[1:4,]scamod <- reorder(scamod,2:1)scamod$B[1:4,]scamod <- resign(scamod, mode="B", newsign=c(1,-1))scamod$B[1:4,]Xhat <- fitted(scamod)sse <- sumsq(mapply("-",Xmat,Xhat))sse/(sum(nk)*mydim[2])# rescale factorscolSums(scamod$B^2)colSums(scamod$C^2)scamod <- rescale(scamod, mode="C")colSums(scamod$B^2)colSums(scamod$C^2)Xhat <- fitted(scamod)sse <- sumsq(mapply("-",Xmat,Xhat))sse/(sum(nk)*mydim[2])########## sca-ecp ########### create random data list with SCA-ECP structureset.seed(3)mydim <- c(NA,10,20)nf <-2nk <- rep(c(50,100,200), length.out = mydim[3])Gmat <- diag(nf)Bmat <- matrix(runif(mydim[2]*nf),mydim[2],nf)Cmat <- matrix(sqrt(nk),mydim[3],nf)Xmat <- Emat <- Fmat <- vector("list",mydim[3])for(k in1:mydim[3]){ Fmat[[k]]<- svd(matrix(rnorm(nk[k]*nf),nk[k],nf),nv=0)$u
Xmat[[k]]<- tcrossprod(Fmat[[k]]%*%Gmat%*%diag(Cmat[k,]),Bmat) Emat[[k]]<- matrix(rnorm(nk[k]*mydim[2]),nk[k],mydim[2])}Emat <- nscale(Emat,0, ssnew = sumsq(Xmat))# SNR=1X <- mapply("+",Xmat,Emat)# fit SCA-ECP modelscamod <- sca(X,nfac=nf,nstart=1,type="sca-ecp")scamod
# check solutionscamod$Phi
crossprod(scamod$D[[1]]%*% diag(scamod$C[1,]^-1))/ nk[1]crossprod(scamod$D[[5]]%*% diag(scamod$C[5,]^-1))/ nk[5]Xhat <- fitted(scamod)sse <- sumsq(mapply("-",Xmat,Xhat))sse/(sum(nk)*mydim[2])# reorder and resign factorsscamod$B[1:4,]scamod <- reorder(scamod,2:1)scamod$B[1:4,]scamod <- resign(scamod, mode="B", newsign=c(-1,1))scamod$B[1:4,]Xhat <- fitted(scamod)sse <- sumsq(mapply("-",Xmat,Xhat))sse/(sum(nk)*mydim[2])# rescale factorscolSums(scamod$B^2)colSums(scamod$C^2)scamod <- rescale(scamod, mode="B")colSums(scamod$B^2)colSums(scamod$C^2)Xhat <- fitted(scamod)sse <- sumsq(mapply("-",Xmat,Xhat))sse/(sum(nk)*mydim[2])## Not run:########## parallel computation ########### create random data list with SCA-IND structureset.seed(3)mydim <- c(NA,10,20)nf <-2nk <- rep(c(50,100,200), length.out = mydim[3])Gmat <- diag(nf)# SCA-IND is Parafac2 with Gmat=identityBmat <- matrix(runif(mydim[2]*nf),mydim[2],nf)Cmat <-10*matrix(runif(mydim[3]*nf),mydim[3],nf)Xmat <- Emat <- Fmat <- vector("list",mydim[3])for(k in1:mydim[3]){ Fmat[[k]]<- svd(matrix(rnorm(nk[k]*nf),nk[k],nf),nv=0)$u
Xmat[[k]]<- tcrossprod(Fmat[[k]]%*%Gmat%*%diag(Cmat[k,]),Bmat) Emat[[k]]<- matrix(rnorm(nk[k]*mydim[2]),nk[k],mydim[2])}Emat <- nscale(Emat,0, ssnew = sumsq(Xmat))# SNR=1X <- mapply("+",Xmat,Emat)# fit SCA-PF2 model (10 random starts -- sequential computation)set.seed(1)system.time({scamod <- sca(X,nfac=nf,type="sca-pf2")})scamod
# fit SCA-PF2 model (10 random starts -- parallel computation)cl <- makeCluster(detectCores())ce <- clusterEvalQ(cl,library(multiway))clusterSetRNGStream(cl,1)system.time({scamod <- sca(X,nfac=nf,type="sca-pf2",parallel=TRUE,cl=cl)})scamod
stopCluster(cl)# fit SCA-IND model (10 random starts -- sequential computation)set.seed(1)system.time({scamod <- sca(X,nfac=nf,type="sca-ind")})scamod
# fit SCA-IND model (10 random starts -- parallel computation)cl <- makeCluster(detectCores())ce <- clusterEvalQ(cl,library(multiway))clusterSetRNGStream(cl,1)system.time({scamod <- sca(X,nfac=nf,type="sca-ind",parallel=TRUE,cl=cl)})scamod
stopCluster(cl)# fit SCA-ECP model (10 random starts -- sequential computation)set.seed(1)system.time({scamod <- sca(X,nfac=nf,type="sca-ecp")})scamod
# fit SCA-ECP model (10 random starts -- parallel computation)cl <- makeCluster(detectCores())ce <- clusterEvalQ(cl,library(multiway))clusterSetRNGStream(cl,1)system.time({scamod <- sca(X,nfac=nf,type="sca-ecp",parallel=TRUE,cl=cl)})scamod
stopCluster(cl)## End(Not run)