x: A matrix containing the data set. Note that the rows are sample observations and the columns are variables.
L: A factor with the group labels.
lambda.var: Shrinkage intensity for the variances. If not specified, it is estimated from the data (see Details below). lambda.var=0 implies no shrinkage and lambda.var=1 complete shrinkage.
lambda.freqs: Shrinkage intensity for the frequencies. If not specified, it is estimated from the data. lambda.freqs=0 implies no shrinkage (i.e. empirical frequencies) and lambda.freqs=1 complete shrinkage (i.e. uniform frequencies).
var.groups: Estimate group-specific variances.
centered.data: Return column-centered data matrix.
verbose: Provide some messages while computing.
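To make the two extremes of lambda.freqs concrete, here is a minimal base-R sketch that shrinks empirical group frequencies toward the uniform distribution with a fixed intensity. The helper shrink_freqs and the example counts are hypothetical and not part of sda. The package itself relies on freqs.shrink, which also estimates the intensity from the data when none is supplied.

```r
# Hypothetical helper (not part of sda): shrink empirical frequencies
# toward the uniform distribution with a fixed intensity lambda.
shrink_freqs <- function(counts, lambda) {
  stopifnot(lambda >= 0, lambda <= 1)
  empirical <- counts / sum(counts)
  uniform <- rep(1 / length(counts), length(counts))
  # convex combination of the uniform target and the empirical estimate
  lambda * uniform + (1 - lambda) * empirical
}

# made-up group counts, for illustration only
counts <- c(a = 10, b = 30, c = 60)

shrink_freqs(counts, lambda = 0)    # no shrinkage: empirical frequencies
shrink_freqs(counts, lambda = 1)    # complete shrinkage: uniform frequencies
shrink_freqs(counts, lambda = 0.5)  # halfway between the two
```

Any intermediate value of lambda yields frequencies that still sum to one, since both the empirical and the uniform vectors do.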
Details
As the estimator of the variance we employ var.shrink, as described in Opgen-Rhein and Strimmer (2007). For the estimates of the frequencies we rely on freqs.shrink, as described in Hausser and Strimmer (2009). Note that the pooled mean is computed using the estimated frequencies.
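The remark about the pooled mean can be illustrated with a short base-R sketch in which the pooled mean is the frequency-weighted average of the group means. This is an assumption about the computation, written for illustration rather than taken from the sda source, and it uses empirical frequencies (the lambda.freqs=0 case) on the iris data. The variable names are hypothetical.

```r
data(iris)
X <- as.matrix(iris[, 1:4])
Y <- iris[, 5]

# group sizes, in the order of the factor levels
n_k <- as.vector(table(Y))

# group means: one row per group, one column per variable
group_means <- rowsum(X, Y) / n_k

# empirical group frequencies (no shrinkage)
freqs <- n_k / length(Y)

# pooled mean as the frequency-weighted average of the group means
pooled_mean <- colSums(group_means * freqs)

# with empirical frequencies this agrees with the overall column means
all.equal(pooled_mean, colMeans(X))
```

With shrunken frequencies the weights move toward equal group weights, so for unbalanced groups the pooled mean would in general differ from the plain column means.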
Returns
centroids returns a list with the following components:
samples: a vector containing the sample sizes in each group,
freqs: a vector containing the estimated frequency in each group,
means: the group means and the pooled mean,
variances: the group-specific and the pooled variances, and
centered.data: a matrix containing the centered data.
Examples
# load sda library
library("sda")

## prepare data set
data(iris) # good old iris data
X = as.matrix(iris[,1:4])
Y = iris[,5]

## estimate centroids and empirical pooled variances
centroids(X, Y, lambda.var=0)

## also compute group-specific variances
centroids(X, Y, var.groups=TRUE, lambda.var=0)

## use shrinkage estimator for the variances
centroids(X, Y, var.groups=TRUE)

## return centered data
xc = centroids(X, Y, centered.data=TRUE)$centered.data
apply(xc, 2, mean)

## useful, e.g., to compute the inverse pooled correlation matrix
powcor.shrink(xc, alpha=-1)