cluster_means() R function from [adproclus]

Cluster Means based on Original Variables

Obtain a cluster-by-variable data frame whose entries are the cluster means of the given variables. The function takes an ADPROCLUS model of class adpc (a low dimensional model is also accepted) and a dataset. The dataset must have the same number of rows as the cluster membership matrix $A$ of the model, but its variables may differ from the ones the model was trained on. Using the membership matrix, the function computes the column means of the dataset separately for each cluster. If any observations were not assigned to a cluster, they form the baseline cluster, which is reported as the last row, Cl0, of the output.
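The computation described above can be sketched in base R. This is a minimal illustration, not the package's implementation: the toy `data` values and the binary membership matrix `A` are made up, and the object names (`means`, `unassigned`, `sketch_means`) are hypothetical.

```r
# Toy data: 5 observations, 2 variables (hypothetical values).
data <- matrix(c(1, 2, 3, 4, 5,
                 10, 20, 30, 40, 50),
               ncol = 2,
               dimnames = list(NULL, c("v1", "v2")))

# Hypothetical binary membership matrix A (5 observations x 2 clusters).
# Observation 2 belongs to both clusters; observation 5 belongs to none.
A <- matrix(c(1, 1, 0, 0, 0,
              0, 1, 1, 1, 0),
            ncol = 2,
            dimnames = list(NULL, c("Cl1", "Cl2")))

# The dataset and the membership matrix must describe the same observations.
stopifnot(nrow(data) == nrow(A))

# Column means of `data` over the members of each cluster.
means <- t(apply(A, 2, function(members) {
  colMeans(data[members == 1, , drop = FALSE])
}))

# Baseline cluster: observations assigned to no cluster, added only if non-empty.
unassigned <- rowSums(A) == 0
if (any(unassigned)) {
  means <- rbind(means, Cl0 = colMeans(data[unassigned, , drop = FALSE]))
}

# Round and return a cluster-by-variable data frame.
sketch_means <- as.data.frame(round(means, digits = 3))
print(sketch_means)
```

Because clusters may overlap, observation 2 contributes to the means of both Cl1 and Cl2, while observation 5 contributes only to Cl0.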


cluster_means(data, model, digits = 3)

Arguments

data: Object-by-variable matrix. May contain variables other than those the ADPROCLUS model was fitted on. IMPORTANT: The number of rows must be equal to the number of observations in the ADPROCLUS model.
model: ADPROCLUS solution (class: adpc). A low dimensional model is also accepted.
digits: Integer. The number of decimal places that all decimal numbers will be rounded to.

Returns

Cluster-by-variable data frame where the values are the cluster means for the given variables.

Details

The output of this function differs from the last output matrix of the summary() method applied to an ADPROCLUS model. cluster_means() computes the means over the original variable values, whereas summary() computes them over the variable values approximated by the model.
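The difference can be illustrated with a small base R sketch. It assumes that the approximated values are the fitted matrix $M = AP$ of the ADPROCLUS model, where $P$ is the profile matrix. The toy matrices and the helper `cluster_col_means()` are hypothetical and are not taken from the package's summary() code.

```r
# Toy observed data: 4 observations, 1 variable (hypothetical values).
X <- matrix(c(1.2, 3.9, 3.3, 0.1), ncol = 1,
            dimnames = list(NULL, "v1"))

# Hypothetical binary membership matrix A (4 observations x 2 clusters).
A <- matrix(c(1, 1, 0, 0,
              0, 1, 1, 0),
            ncol = 2,
            dimnames = list(NULL, c("Cl1", "Cl2")))

# Hypothetical profile matrix P (2 clusters x 1 variable).
P <- matrix(c(1, 3), ncol = 1,
            dimnames = list(c("Cl1", "Cl2"), "v1"))

# Values approximated by the model: each row is the sum of the profiles
# of the clusters the observation belongs to.
M <- A %*% P

# Hypothetical helper: per-cluster column means via t(A) %*% values,
# divided row-wise by the cluster sizes.
cluster_col_means <- function(values, A) {
  crossprod(A, values) / colSums(A)
}

means_original <- cluster_col_means(X, A)  # over the observed values
means_model    <- cluster_col_means(M, A)  # over the approximated values

print(means_original)
print(means_model)
```

In this toy case the two sets of means differ because the model reproduces the data only approximately.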

Examples


# Load the package that provides adproclus(), cluster_means() and CGdata
library(adproclus)

# Obtain data, compute model, report cluster means
x <- CGdata
model <- adproclus(x, 3)
cluster_means(data = x, model = model)

adproclus package

Maintainer: Henry Heppe
License: GPL (>= 3)
Last published: 2024-08-17
