cp_gen function

Generate PARAFAC data sets, optionally with outliers

Generates nsim data sets according to the given parameters. If eps > 0, the specified fraction of random outliers of the kind selected by the parameter type is added to each data set.

cp_gen(
  I = 20, J = 20, K = 20,
  nsim = 200, nf = 3,
  noise = 0.05, noise1 = 0,
  Acol = TRUE, Bcol = TRUE, Ccol = TRUE,
  congA = 0.5, congB = 0.5, congC = 0.5,
  eps = 0,
  type = c("none", "bl", "gl", "og"),
  c1 = 10, c2 = 0.1,
  silent = FALSE
)

Arguments

  • I: number of observations

  • J: number of variables

  • K: number of occasions

  • nsim: number of data sets to generate

  • nf: number of PARAFAC components

  • noise: level of homoscedastic (HO) noise

  • noise1: level of heteroscedastic (HE) noise

  • Acol: whether to apply collinearity with factor congA to mode A

  • Bcol: whether to apply collinearity with factor congB to mode B

  • Ccol: whether to apply collinearity with factor congC to mode C

  • congA: collinearity factor for mode A

  • congB: collinearity factor for mode B

  • congC: collinearity factor for mode C

  • eps: fraction of outliers (contamination level), e.g. eps=0.2 for 20% outliers

  • type: type of outliers: one of "none" for no outliers (possible only if eps==0), "bl" for bad leverage points, "gl" for good leverage points, and "og" for orthogonal outliers

  • c1: parameter for outlier generation (c1=10 for type="bl" or type="gl" and c1=1 for type="og")

  • c2: parameter for outlier generation (c2=0.1 for type="bl" or type="og" and c2=0 for type="gl")

  • silent: whether to issue warnings
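The collinearity factors congA, congB and congC are described in the example as congruence factors. As a minimal sketch of what that quantity means, the base R code below computes the Tucker congruence (the cosine between columns) of a component matrix and builds a matrix whose columns all have a chosen congruence. The helper `congruence()` and the Cholesky-based construction are assumptions made for illustration; they are not taken from the package and may differ from what cp_gen does internally.

```r
# Tucker congruence coefficient between every pair of columns of A
# (hypothetical helper, not part of rrcov3way)
congruence <- function(A) {
  norms <- sqrt(colSums(A^2))
  crossprod(A) / tcrossprod(norms)
}

# Illustrative construction of a 20 x 3 matrix whose columns all have
# pairwise congruence 0.5 (an assumption, not the package's algorithm)
set.seed(1)
I <- 20; nf <- 3; cong <- 0.5
R <- matrix(cong, nf, nf)
diag(R) <- 1
Q <- qr.Q(qr(matrix(rnorm(I * nf), I, nf)))  # orthonormal columns
A <- Q %*% chol(R)                           # so that t(A) %*% A equals R

round(congruence(A), 3)
```

A congruence of 0 corresponds to orthogonal columns, while values close to 1 give nearly collinear components, which are typically harder to recover.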

Returns

A list consisting of the following lists:

  • As: list of nsim matrices for mode A
  • Bs: list of nsim matrices for mode B
  • Cs: list of nsim matrices for mode C
  • Xs: list of nsim PARAFAC data sets, each with dimension IxJxK
  • Os: list of nsim vectors containing the added outliers (if any)
  • param: list of parameters used for generation of the data sets
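To relate the component matrices to a data set, recall that the PARAFAC model writes each entry of the array as a sum over components of products of the three mode loadings. The following sketch in base R builds the noise-free array from three small matrices. The function `cp_reconstruct()` is a hypothetical helper written for this illustration; the data sets returned in Xs additionally contain noise and, if requested, outliers, so they will not match such a reconstruction exactly.

```r
# Noise-free PARAFAC array: X[i, j, k] = sum_r A[i, r] * B[j, r] * C[k, r]
# (hypothetical helper, not part of rrcov3way)
cp_reconstruct <- function(A, B, C) {
  X <- array(0, dim = c(nrow(A), nrow(B), nrow(C)))
  for (r in seq_len(ncol(A))) {
    # outer product of the r-th columns gives a rank-one I x J x K array
    X <- X + outer(outer(A[, r], B[, r]), C[, r])
  }
  X
}

set.seed(2)
A <- matrix(rnorm(4 * 2), 4, 2)  # mode A: 4 observations, 2 components
B <- matrix(rnorm(3 * 2), 3, 2)  # mode B: 3 variables
C <- matrix(rnorm(2 * 2), 2, 2)  # mode C: 2 occasions
X <- cp_reconstruct(A, B, C)
dim(X)
```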

Examples

## Generate one PARAFAC data set (nsim=1) with R=2 components (nf=2) and dimensions
## 50x100x10. Apply 0.15 homoscedastic noise and 0.10 heteroscedastic noise, apply
## collinearity with congruence factor 0.5 to all modes. Add 20% bad leverage points.
library(rrcov3way)
xdat <- cp_gen(I=50, J=100, K=10, nsim=1, nf=2,
    noise=0.15, noise1=0.10,
    Acol=TRUE, Bcol=TRUE, Ccol=TRUE,
    congA=0.5, congB=0.5, congC=0.5,
    eps=0.2, type="bl")
names(xdat)

References

Todorov, V. and Simonacci, V. and Gallo, M. and Trendafilov, N. (2023). A novel estimation procedure for robust CANDECOMP/PARAFAC model fitting. Econometrics and Statistics. In press.

Tomasi, G. and Bro, R. (2006). A comparison of algorithms for fitting the PARAFAC model. Computational Statistics & Data Analysis 50 (7), 1700–1734.

Faber, N.M. and Bro, R. and Hopke, P.K. (2003). Recent developments in CANDECOMP/PARAFAC algorithms: A critical review. Chemometrics and Intelligent Laboratory Systems 65, 119–137.

  • Maintainer: Valentin Todorov
  • License: GPL (>= 3)
  • Last published: 2024-02-06