Parafac function

Robust Parafac estimator for compositional data

Compute a robust Parafac model for compositional data

Parafac(X, ncomp = 2, center = FALSE, center.mode = c("A", "B", "C", "AB", "AC", "BC", "ABC"), scale=FALSE, scale.mode=c("B", "A", "C"), const="none", conv = 1e-06, start="svd", maxit=10000, optim=c("als", "atld", "int2"), robust = FALSE, coda.transform=c("none", "ilr", "clr"), ncomp.rpca = 0, alpha = 0.75, robiter = 100, crit=0.975, trace = FALSE)

Arguments

  • X: 3-way array of data

  • ncomp: Number of components

  • center: Whether to center the data

  • center.mode: If centering the data, on which mode to do this

  • scale: Whether to scale the data

  • scale.mode: If scaling the data, on which mode to do this

  • const: Optional constraints for each mode. Can be a three element character vector or a single character, one of "none" for no constraints (default), "orth" for orthogonality constraints, "nonneg" for nonnegativity constraints or "zerocor" for zero correlation between the extracted factors. For example, const="orth" means orthogonality constraints for all modes, while const=c("orth", "none", "none") sets the orthogonality constraint only for mode A.

  • conv: Convergence criterion, defaults to 1e-6

  • start: Initial values for the A, B and C components. Can be "svd" for a starting point computed from SVDs, "random" for a random starting point (orthonormalized component matrices, or nonnegative matrices in the case of a nonnegativity constraint), or a list containing user-specified components.

  • maxit: Maximum number of iterations, default is maxit=10000.

  • optim: How to optimize the CP loss function. The default is ALS, i.e. optim="als". Other options are ATLD (optim="atld") and INT2 (optim="int2"). Please note that ATLD cannot be used with the robust option.

  • robust: Whether to apply a robust estimation

  • coda.transform: If the data are a composition, use an ilr or clr transformation. Default is non-compositional data, i.e. coda.transform="none"

  • ncomp.rpca: Number of components for robust PCA

  • alpha: Measures the fraction of outliers the algorithm should resist. Allowed values are between 0.5 and 1 and the default is 0.75

  • robiter: Maximal number of iterations for robust estimation

  • crit: Cut-off for identifying outliers, default crit=0.975

  • trace: Logical, provide trace output
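The coda.transform argument refers to the clr and ilr log-ratio transformations for compositional data. The following base R sketch illustrates what these transformations do to the rows of a small matrix of compositions. The helper names clr, ilr_basis and ilr are hypothetical, and the Helmert-type orthonormal basis is only one possible choice; the basis used internally by Parafac() may differ.

```r
# Centered log-ratio (clr): log of each part minus the mean log of its row.
clr <- function(x) {
  lx <- log(x)
  sweep(lx, 1, rowMeans(lx))
}

# One orthonormal basis of the clr hyperplane (Helmert-type contrasts).
# Assumption: illustrative choice only, not necessarily the package's basis.
ilr_basis <- function(D) {
  V <- matrix(0, nrow = D, ncol = D - 1)
  for (j in seq_len(D - 1)) {
    V[seq_len(j), j] <- 1 / j
    V[j + 1, j] <- -1
    V[, j] <- V[, j] * sqrt(j / (j + 1))
  }
  V
}

# Isometric log-ratio (ilr): clr coordinates expressed in that basis.
ilr <- function(x) clr(x) %*% ilr_basis(ncol(x))

# Two made-up three-part compositions (each row sums to 1).
x <- rbind(c(0.2, 0.3, 0.5),
           c(0.1, 0.6, 0.3))

z_clr <- clr(x)   # same number of columns as x, rows sum to zero
z_ilr <- ilr(x)   # one column fewer than x

print(z_clr)
print(z_ilr)
```

The clr coordinates keep one column per part but are constrained to sum to zero in each row, while the ilr coordinates drop one dimension and preserve distances between compositions.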

Details

The function can compute four versions of the Parafac model:

  1. Classical Parafac,
  2. Parafac for compositional data,
  3. Robust Parafac and
  4. Robust Parafac for compositional data.

This is controlled through the parameters robust=TRUE and coda.transform=c("none", "ilr").
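As a rough sketch, the four versions can be laid out as the combinations of the two switches. The table below is built in base R; the final call mirrors the one used in the Examples section. It assumes that Parafac() and the ulabor data set are provided by the rrcov3way package, and it is skipped when that package is not installed.

```r
# The two switches and the model variant each combination selects.
variants <- expand.grid(robust = c(FALSE, TRUE),
                        coda.transform = c("none", "ilr"),
                        stringsAsFactors = FALSE)
variants$model <- c("Classical Parafac",
                    "Robust Parafac",
                    "Parafac for compositional data",
                    "Robust Parafac for compositional data")
print(variants)

# Fit the last variant, which matches the call in the Examples section.
# Assumption: Parafac() and ulabor come from the rrcov3way package.
if (requireNamespace("rrcov3way", quietly = TRUE)) {
  data(ulabor, package = "rrcov3way")
  v <- variants[4, ]
  res <- rrcov3way::Parafac(ulabor, robust = v$robust,
                            coda.transform = v$coda.transform)
  print(res$fp)   # fit percentage, see the Returns section
}
```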

Returns

An object of class "parafac", which is basically a list with components:

  • fit: Fit value

  • fp: Fit percentage

  • ss: Sum of squares

  • A: Orthogonal loading matrix for the A-mode

  • B: Orthogonal loading matrix for the B-mode

  • Bclr: Orthogonal loading matrix for the B-mode, clr transformed. Available only if coda.transform="ilr", otherwise NULL

  • C: Orthogonal loading matrix for the C-mode

  • Xhat: (Robustly) reconstructed array

  • const: Optional constraints (same as the input parameter)

  • iter: Number of iterations

  • rd: Residual distances

  • sd: Score distances

  • flag: Observations whose residual distance rd is larger than cutoff.rd, or whose score distance sd is larger than cutoff.sd, can be considered outliers and receive a flag equal to zero. Regular observations receive a flag equal to 1

  • robust: The parameter robust, i.e. whether the robust method was used or not

  • coda.transform: Which coda transformation was used, one of "none", "ilr" or "clr".
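The flag component can be used to pick out the suspected outliers. The sketch below uses a hypothetical stand-in list with made-up values in place of a real fitted object, so that it runs without any package; only the component names rd, sd and flag are taken from the list above.

```r
# Hypothetical stand-in for a fitted "parafac" object: only the fields used
# here are present, and all numbers are made up for illustration.
res <- list(
  rd   = c(0.8, 1.1, 4.9, 0.7, 1.0),  # residual distances
  sd   = c(1.2, 0.9, 1.5, 3.8, 1.1),  # score distances
  flag = c(1, 1, 0, 0, 1)             # 0 = flagged as outlier, 1 = regular
)

# Indices of the observations flagged as outliers and of the regular ones.
outliers <- which(res$flag == 0)
regular  <- which(res$flag != 0)

print(outliers)

# Distances of the flagged observations, for closer inspection.
print(cbind(rd = res$rd[outliers], sd = res$sd[outliers]))
```

Comparing against zero rather than testing for a specific type keeps the selection valid whether the flags are stored as numbers or as logical values.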

References

Harshman, R.A. (1970). Foundations of Parafac procedure: models and conditions for an "explanatory" multi-mode factor analysis. UCLA Working Papers in Phonetics, 16: 1--84.

Engelen, S., Frosch, S. and Jorgensen, B.M. (2009). A fully robust PARAFAC method analyzing fluorescence data. Journal of Chemometrics, 23(3): 124--131.

Kroonenberg, P.M. (1983). Three-mode principal component analysis: Theory and applications (Vol. 2), DSWO press.

Rousseeuw, P.J. and Driessen, K.V. (1999). A fast algorithm for the minimum covariance determinant estimator. Technometrics, 41(3): 212--223.

Egozcue, J.J., Pawlowsky-Glahn, V., Mateu-Figueras, G. and Barceló-Vidal, C. (2003). Isometric logratio transformations for compositional data analysis. Mathematical Geology, 35(3): 279--300.

Author(s)

Valentin Todorov valentin.todorov@chello.at and Maria Anna Di Palma madipalma@unior.it and Michele Gallo mgallo@unior.it

Examples

#############
##
## Example with the UNIDO Manufacturing value added data

data(va3way)
dim(va3way)

## Treat quickly and dirty the zeros in the data set (if any)
va3way[va3way==0] <- 0.001

##
res <- Parafac(va3way)
res

print(res$fit)
print(res$A)

## Distance-distance plot
plot(res, which="dd", main="Distance-distance plot")

data(ulabor)
res <- Parafac(ulabor, robust=TRUE, coda.transform="ilr")
res

## Plot Orthonormalized A-mode component plot
plot(res, which="comp", mode="A", main="Component plot, A-mode")

## Plot Orthonormalized B-mode component plot
plot(res, which="comp", mode="B", main="Component plot, B-mode")

## Plot Orthonormalized C-mode component plot
plot(res, which="comp", mode="C", main="Component plot, C-mode")

  • Maintainer: Valentin Todorov
  • License: GPL (>= 3)
  • Last published: 2024-02-06