Robust Distance based observation orderings based on robust "Six pack"
Compute six initial robust estimators of multivariate location and scatter (scale); then, for each estimator, compute the distances d_ij and take the h (h > n/2) observations with the smallest distances. Then compute the statistical distances of all observations based on these h observations.
Return the indices of the observations, sorted in increasing order of these distances.
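The ordering step described above can be sketched in base R for a single initial estimate. This is a simplified illustration, not the package's implementation: order_from_init() is a hypothetical helper, the initial estimate (coordinatewise median with a diagonal MAD-based scatter) is an assumed stand-in for one of the six estimators, and the simulated data are made up for the example.

```r
# Simplified sketch: observation ordering from ONE initial (location, scatter) estimate.
# order_from_init() is a hypothetical helper, not a function of any package.
order_from_init <- function(x, center, scatter, h, full.h = FALSE) {
  d0  <- mahalanobis(x, center, scatter)        # squared distances from the initial estimate
  idx <- order(d0)[seq_len(h)]                  # the h observations with smallest distances
  xh  <- x[idx, , drop = FALSE]
  d1  <- mahalanobis(x, colMeans(xh), cov(xh))  # distances based on these h observations
  ord <- order(d1)                              # indices in increasing order of distance
  if (full.h) ord else ord[seq_len(h)]
}

set.seed(1)
x <- matrix(rnorm(200), ncol = 2)   # n = 100 simulated observations
x[1:5, ] <- x[1:5, ] + 10           # five planted outliers
med <- apply(x, 2, median)
S0  <- diag(apply(x, 2, mad)^2)     # crude diagonal initial scatter (assumption)

o_h <- order_from_init(x, med, S0, h = 60)                 # length h
o_n <- order_from_init(x, med, S0, h = 60, full.h = TRUE)  # length n
```

In this sketch, full.h only controls how much of the same ordering is returned, so the first h entries agree in both calls.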
h: integer, typically around (and slightly larger than) n/2.
full.h: logical specifying if the full (length n) observation ordering should be returned; otherwise only the first h are. For .detmcd(), full.h=FALSE is typical.
scaled: logical indicating if the data x is already scaled; if false, we apply x <- doScale(x, median, scalefn).
scalefn: a function(u) to compute a robust univariate scale of u.
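As a rough illustration of what the scaling step amounts to, the following base-R sketch centers each column by its median and divides by a robust scale. robust_standardize() is a hypothetical stand-in written for this example, and mad() is only an assumed choice for scalefn; the sketch does not reproduce doScale() itself.

```r
# Hypothetical base-R stand-in for the scaling step.
# mad() is an assumed choice of scalefn, used here only for illustration.
robust_standardize <- function(x, center = median, scalefn = mad) {
  x   <- as.matrix(x)
  ctr <- apply(x, 2, center)    # robust location of each column
  scl <- apply(x, 2, scalefn)   # robust univariate scale of each column
  if (any(scl <= 0)) stop("a column has zero robust scale")
  sweep(sweep(x, 2, ctr, "-"), 2, scl, "/")
}

set.seed(2)
x <- cbind(rnorm(50, mean = 5, sd = 2), rexp(50))
z <- robust_standardize(x)      # columns now have median 0 and MAD 1
```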
Details
The six initial estimators are
Hyperbolic tangent of standardized data
Spearman correlation matrix
Tukey normal scores
Spatial sign covariance matrix
BACON
Raw OGK estimate for scatter
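The first four estimators in the list can be sketched in a few lines of base R on robustly standardized data. This is an illustration under stated assumptions, not the package's code: the normal-scores transformation (r - 1/3)/(n + 1/3) is taken to follow the description in Hubert et al. (2012), the simulated data are made up, and BACON and the raw OGK estimate are omitted because they need more machinery.

```r
set.seed(3)
n <- 80
z <- matrix(rnorm(n * 3), ncol = 3)
# robustly standardize the columns (median 0, MAD 1)
z <- sweep(z, 2, apply(z, 2, median), "-")
z <- sweep(z, 2, apply(z, 2, mad), "/")

# 1. Hyperbolic tangent of standardized data
S_tanh <- cor(tanh(z))

# 2. Spearman correlation matrix (correlation of columnwise ranks)
S_spearman <- cor(z, method = "spearman")

# 3. Tukey normal scores (rank transformation assumed as in Hubert et al., 2012)
r <- apply(z, 2, rank)
S_nscore <- cor(qnorm((r - 1/3) / (n + 1/3)))

# 4. Spatial sign covariance matrix: scale every row to unit length
#    (rows with zero norm would need special handling)
nrm <- sqrt(rowSums(z^2))
k <- z / nrm
S_sign <- crossprod(k) / n
```

Each result is a symmetric 3 x 3 matrix; the first three have unit diagonal, while the spatial sign matrix has trace 1 because every row of k has unit length.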
References
Hubert, M., Rousseeuw, P. J. and Verdonck, T. (2012) A deterministic algorithm for robust location and scatter. Journal of Computational and Graphical Statistics 21, 618–637.
Returns
an h′ x 6 matrix of observation indices, i.e., with values from 1..n. If full.h is true, h′ = n; otherwise h′ = h.
Author(s)
Valentin Todorov, based on the original Matlab code by Tim Verdonck and Mia Hubert. Martin Maechler for tweaks (performance etc.) and for full.h.
Examples
data(pulpfiber)
dim(m.pulp <- data.matrix(pulpfiber)) # 62 x 8
dim(fr6 <- r6pack(m.pulp, h = 40, full.h = FALSE)) # h x 6 = 40 x 6
dim(fr6F <- r6pack(m.pulp, h = 40, full.h = TRUE)) # n x 6 = 62 x 6
stopifnot(identical(fr6, fr6F[1:40,]))