kappa_tools function

Compute kappa tools for data dimensionality analysis

Compute kappa tools for data dimensionality analysis

For a given data set and a given Archetypal Analysis (AA) solution, it finds a set of useful proxies for the dimensionality.

kappa_tools(aa, df = NULL, numBins = 100, chvertices = NULL, verbose = FALSE, ...)

Arguments

  • aa: An object of the class 'archetypal'
  • df: The data frame that was used for AA
  • numBins: The number of bins to be used for computing entropy
  • chvertices: The Convex Hull vertices, if they are given
  • verbose: Logical, set to TRUE if details must be printed
  • ...: Other areguments, not used.

Details

The ECDF for the Squared Errors (SE) is computed and then the relevant curve is classified as 'convex' or 'concave' and its UIK & inflcetion point is found. Then the number of used rows for cfreating archetypes is found. A procedure for creating BIC and andjusted BIC is used. Finally the pecentage of used points that lie on the exact Convex Hull is given.

Returns

A list with next arguments: - ecdf: The ECDF of SE

  • Convexity: The convex or concave classification for ECDF curve

  • UIK: The UIK points of ECDF curve by using [1]

  • INFLECTION: The inflection points of ECDF curve by using [2]

  • NumberRowsUsed: The number of rows used for creating archetypes

  • RowsUsed: The exact rows used for creating archetypes

  • SSE: The Sum of SE

  • BIC: The computed BIC by using [3], [4]

  • adjBIC: The computed adjusted BIC by using [3], [4]

  • CXHE: The percentage of used points that lie on the exact Convex Hull

References

[1] Demetris T. Christopoulos, Introducing Unit Invariant Knee (UIK) As an Objective Choice for Elbow Point in Multivariate Data Analysis Techniques (March 1, 2016). Available at SSRN: https://ssrn.com/abstract=3043076 or http://dx.doi.org/10.2139/ssrn.3043076

[2] Demetris T. Christopoulos, On the efficient identification of an inflection point,International Journal of Mathematics and Scientific Computing,(ISSN: 2231-5330), vol. 6(1), 2016.

[3] Felix Abramovich, Yoav Benjamini, David L. Donoho, Iain M. Johnstone. "Adapting to unknown sparsity by controlling the false discovery rate." The Annals of Statistics, 34(2) 584-653 April 2006. https://doi.org/10.1214/009053606000000074

[4] Murari, Andrea, Emmanuele Peluso, Francesco Cianfrani, Pasquale Gaudio, and Michele Lungaroni. 2019. "On the Use of Entropy to Improve Model Selection Criteria" Entropy 21, no. 4: 394. https://doi.org/10.3390/e21040394

Author(s)

Demetris T. Christopoulos, David F. Midgley (creator of BIC and adjBIC procedures)

Examples

{ ## Use the sample data "wd2" data(wd2) require("geometry") ch=convhulln(as.matrix(wd2),'Fx') chlist=as.list(ch) chvertices = unique(do.call(c,chlist)) aa=archetypal(wd2, 3) out=kappa_tools(aa , df = wd2, numBins = 100, chvertices, verbose = T ) out }
  • Maintainer: Demetris Christopoulos
  • License: GPL (>= 2)
  • Last published: 2024-05-23

Useful links