Obtain a cluster-by-variable dataframe whose values are the cluster means of the given variables. Takes as input a (possibly low dimensional) ADPROCLUS model of class adpc and a dataset. The dataset must have the same number of rows as the cluster membership matrix A of the model, but its variables can differ from the ones the model was trained on. The function uses the cluster membership matrix of the model to compute, per cluster, the mean of each variable in the dataset; in effect, it computes the column means of the dataset separately for each cluster. If some observations were not assigned to any cluster, the last row of the output, Cl0, holds the means of this baseline cluster.
cluster_means(data, model, digits = 3)
Arguments
data: Object-by-variable matrix. Can contain variables other than those the ADPROCLUS model was fitted on. IMPORTANT: The number of rows must be equal to the number of observations in the ADPROCLUS model.
model: ADPROCLUS solution (class: adpc). A low dimensional model is also accepted.
digits: Integer. The number of decimal places that all decimal numbers will be rounded to.
Returns
Cluster-by-variable dataframe where the values are the cluster means for the given variables.
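As a rough illustration of the shape of this result, the computation can be sketched in base R. This is a minimal sketch, not the package implementation: the helper name sketch_cluster_means is hypothetical, the membership matrix A is assumed to be a binary object-by-cluster matrix, and the row labels Cl1, Cl2, ... are an assumption (only Cl0 is named in this documentation).

```r
# Hypothetical helper illustrating the idea; not the code used by cluster_means().
sketch_cluster_means <- function(data, A, digits = 3) {
  data <- as.matrix(data)
  A <- as.matrix(A)
  stopifnot(nrow(data) == nrow(A))

  # Column means of the data, restricted to the members of each cluster.
  # Clusters may overlap, so an observation can contribute to several rows.
  per_cluster <- lapply(seq_len(ncol(A)), function(k) {
    colMeans(data[A[, k] == 1, , drop = FALSE])
  })
  means <- do.call(rbind, per_cluster)
  rownames(means) <- paste0("Cl", seq_len(ncol(A)))  # assumed labels

  # Baseline cluster: observations that belong to no cluster at all.
  baseline <- rowSums(A) == 0
  if (any(baseline)) {
    means <- rbind(means, Cl0 = colMeans(data[baseline, , drop = FALSE]))
  }

  as.data.frame(round(means, digits))
}

# Toy data: 5 observations, 2 variables, 2 overlapping clusters.
X <- cbind(v1 = c(1, 2, 3, 4, 5), v2 = c(10, 20, 30, 40, 50))
A <- cbind(c(1, 1, 0, 0, 0), c(0, 1, 1, 0, 0))
res <- sketch_cluster_means(X, A)
print(res)
```

In this toy example observations 4 and 5 belong to no cluster, so the result gains a final Cl0 row; had every observation been assigned to at least one cluster, that row would be absent.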
Details
It is worth noting that the output of this function is different from the last output matrix in the summary() method applied to an ADPROCLUS model. The former computes the means over the original variable values while the latter computes them over the approximated model variable values.
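The distinction can be illustrated with a small base R sketch. It assumes the usual ADPROCLUS decomposition, in which the data are approximated by the product of the membership matrix A and a profile matrix P; the toy values of X, A and P below are made up for illustration and are not the output of a fitted model.

```r
# Toy original data: 4 observations, 2 variables.
X <- cbind(v1 = c(1, 2, 3, 4), v2 = c(2, 4, 5, 9))

# Hypothetical membership matrix (2 overlapping clusters) and profile matrix.
A <- cbind(c(1, 1, 0, 0), c(0, 1, 1, 0))
P <- rbind(c(1, 2), c(2, 3))

# Approximated (model) values of the variables: M = A P.
M <- A %*% P
colnames(M) <- colnames(X)

# Members of the first cluster.
in_cl1 <- A[, 1] == 1

# Means over the original values (what cluster_means() reports) ...
means_original <- colMeans(X[in_cl1, , drop = FALSE])
# ... versus means over the approximated values (as in the summary() output).
means_model <- colMeans(M[in_cl1, , drop = FALSE])

print(rbind(original = means_original, model = means_model))
```

The two rows differ whenever the model does not reproduce the data exactly, which is why the two outputs should not be expected to match.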
Examples
# Obtain data, compute model, report cluster means
x <- CGdata
model <- adproclus(x, 3)
cluster_means(data = x, model = model)