cpd function

N-way Canonical Polyadic Decomposition

N-way Canonical Polyadic Decomposition

Fits Frank L. Hitchcock's Canonical Polyadic Decomposition (CPD) to N-way data arrays. Parameters are estimated via alternating least squares.

cpd(X, nfac, nstart = 10, maxit = 500, ctol = 1e-4, parallel = FALSE, cl = NULL, output = "best", verbose = TRUE)

Arguments

  • X: N-way data array. Missing data are allowed (see Note).
  • nfac: Number of factors.
  • nstart: Number of random starts.
  • maxit: Maximum number of iterations.
  • ctol: Convergence tolerance (R^2 change).
  • parallel: Logical indicating if parLapply should be used. See Examples.
  • cl: Cluster created by makeCluster. Only used when parallel=TRUE.
  • output: Output the best solution (default) or output all nstart solutions.
  • verbose: If TRUE, fitting progress is printed via txtProgressBar. Ignored if parallel=TRUE.

Details

This is an N-way extension of the parafac function without constraints. The form of the CPD for 3-way and 4-way data is given in the documentation for the parafac function. For N > 4, the CPD has the form

X[i.1, ..., i.N] = sum A.1[i.1,r] * ... * A.N[i.N,r] + E[i.1, ..., i.N]

where A.n are the n-th mode's weights for n = 1, ..., N, and E is the N-way residual array. The summation is for r = seq(1,R).

Returns

  • A: List of length N containing the weights for each mode.

  • SSE: Sum of Squared Errors.

  • Rsq: R-squared value.

  • GCV: Generalized Cross-Validation.

  • edf: Effective degrees of freedom.

  • iter: Number of iterations.

  • cflag: Convergence flag. See Note.

References

Harshman, R. A. (1970). Foundations of the PARAFAC procedure: Models and conditions for an "explanatory" multimodal factor analysis. UCLA Working Papers in Phonetics, 16, 1-84.

Harshman, R. A., & Lundy, M. E. (1994). PARAFAC: Parallel factor analysis. Computational Statistics and Data Analysis, 18, 39-72.

Hitchcock, F. L. (1927). The expression of a tensor or a polyadic as a sum of products. Journal of Mathematics and Physics, 6, 164-189.

Author(s)

Nathaniel E. Helwig helwig@umn.edu

Note

Missing data should be specified as NA values in the input X. The missing data are randomly initialized and then iteratively imputed as a part of the algorithm.

Output cflag gives convergence information: cflag = 0 if algorithm converged normally and cflag = 1 if maximum iteration limit was reached before convergence.

Warnings

The algorithm can perform poorly if the number of factors nfac is set too large.

See Also

The parafac function provides a more flexible implemention for 3-way and 4-way arrays.

The fitted.cpd function creates the model-implied fitted values from a fit "cpd" object.

The resign.cpd function can be used to resign factors from a fit "cpd" object.

The rescale.cpd function can be used to rescale factors from a fit "cpd" object.

The reorder.cpd function can be used to reorder factors from a fit "cpd" object.

Examples

########## 3-way example ########## # create random data array with CPD/Parafac structure set.seed(3) mydim <- c(50, 20, 5) nf <- 3 Amat <- matrix(rnorm(mydim[1]*nf), nrow = mydim[1], ncol = nf) Bmat <- matrix(runif(mydim[2]*nf), nrow = mydim[2], ncol = nf) Cmat <- matrix(runif(mydim[3]*nf), nrow = mydim[3], ncol = nf) Xmat <- tcrossprod(Amat, krprod(Cmat, Bmat)) Xmat <- array(Xmat, dim = mydim) Emat <- array(rnorm(prod(mydim)), dim = mydim) Emat <- nscale(Emat, 0, ssnew = sumsq(Xmat)) # SNR = 1 X <- Xmat + Emat # fit CPD model set.seed(0) cano <- cpd(X, nfac = nf, nstart = 1) cano # fit Parafac model set.seed(0) pfac <- parafac(X, nfac = nf, nstart = 1) pfac ########## 4-way example ########## # create random data array with CPD/Parafac structure set.seed(4) mydim <- c(30,10,8,10) nf <- 4 aseq <- seq(-3, 3, length.out = mydim[1]) Amat <- cbind(dnorm(aseq), dchisq(aseq+3.1, df=3), dt(aseq-2, df=4), dgamma(aseq+3.1, shape=3, rate=1)) Bmat <- svd(matrix(runif(mydim[2]*nf), nrow = mydim[2], ncol = nf), nv = 0)$u Cmat <- matrix(runif(mydim[3]*nf), nrow = mydim[3], ncol = nf) Cstruc <- Cmat > 0.5 Cmat <- Cmat * Cstruc Dmat <- matrix(runif(mydim[4]*nf), nrow = mydim[4], ncol = nf) Xmat <- tcrossprod(Amat, krprod(Dmat, krprod(Cmat, Bmat))) Xmat <- array(Xmat, dim = mydim) Emat <- array(rnorm(prod(mydim)), dim = mydim) Emat <- nscale(Emat, 0, ssnew = sumsq(Xmat)) # SNR = 1 X <- Xmat + Emat # fit CPD model set.seed(0) cano <- cpd(X, nfac = nf, nstart = 1) cano # fit Parafac model set.seed(0) pfac <- parafac(X, nfac = nf, nstart = 1) pfac ########## 5-way example ########## # create random data array with CPD/Parafac structure set.seed(5) mydim <- c(5, 6, 7, 8, 9) nmode <- length(mydim) nf <- 3 Amat <- vector("list", nmode) for(n in 1:nmode) { Amat[[n]] <- matrix(rnorm(mydim[n] * nf), mydim[n], nf) } Zmat <- krprod(Amat[[3]], Amat[[2]]) for(n in 4:5) Zmat <- krprod(Amat[[n]], Zmat) Xmat <- tcrossprod(Amat[[1]], Zmat) Xmat <- array(Xmat, dim = mydim) Emat <- array(rnorm(prod(mydim)), dim = mydim) Emat <- nscale(Emat, 0, ssnew = sumsq(Xmat)) # SNR = 1 X <- Xmat + Emat # fit CPD model set.seed(0) cano <- cpd(X, nfac = nf, nstart = 1) cano
  • Maintainer: Nathaniel E. Helwig
  • License: GPL (>= 2)
  • Last published: 2019-03-13

Useful links