maxpear function

Maximize/Compute Posterior Expected Adjusted Rand Index

Based on a posterior similarity matrix of a sample of clusterings, maxpear finds the clustering that maximizes the posterior expected adjusted Rand index (PEAR) with the true clustering, while pear computes PEAR for several provided clusterings.

maxpear(psm, cls.draw = NULL, method = c("avg", "comp", "draws", "all"), max.k = NULL)
pear(cls, psm)

Arguments

  • psm: a posterior similarity matrix, usually obtained from a call to comp.psm.
  • cls, cls.draw: a matrix in which every row corresponds to a clustering of the ncol(cls) objects. cls.draw contains the clusterings that were used to compute psm; it must be provided if method="draws" or "all".
  • method: the maximization method used. Should be one of "avg", "comp", "draws" or "all". The default is "avg".
  • max.k: integer; if method="avg" or "comp", the maximum number of clusters up to which the hierarchical clustering is cut. Defaults to ceiling(nrow(psm)/8).

Details

For method="avg" and "comp", 1-psm is used as a distance matrix for hierarchical clustering with average or complete linkage, respectively. The hierarchical clustering is cut into 1:max.k clusters, and PEAR is computed for each of the resulting clusterings.

Method "draws" simply computes PEAR for each row of cls.draw and takes the maximum.

If method="all", all maximization methods are applied.
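The procedure described above can be sketched in base R. This is an illustrative sketch, not the package's code: the helper `pear_one`, the toy matrix `cls_draw`, and the variable names are hypothetical, and the PEAR formula in the helper is an assumption based on the criterion of Fritsch and Ickstadt (2009). It shows only the "avg" and "draws" strategies; "comp" would differ by using complete linkage.

```r
# Hypothetical helper (not part of mcclust): PEAR of one clustering `cl`
# given a posterior similarity matrix `psm`. Formula is an assumption.
pear_one <- function(cl, psm) {
  n <- length(cl)
  lt <- lower.tri(psm)
  same <- outer(cl, cl, "==")          # indicator that i and j share a cluster
  sum_I   <- sum(same[lt])             # number of co-clustered pairs
  sum_pi  <- sum(psm[lt])              # sum of posterior similarities
  sum_Ipi <- sum(psm[lt][same[lt]])    # similarities of co-clustered pairs
  correction <- sum_I * sum_pi / choose(n, 2)
  (sum_Ipi - correction) / (0.5 * (sum_I + sum_pi) - correction)
}

# Toy sample of clusterings, made up for illustration
# (rows = draws, columns = objects)
cls_draw <- rbind(
  c(1, 1, 1, 2, 2, 2),
  c(1, 1, 1, 2, 2, 2),
  c(1, 1, 2, 2, 2, 2),
  c(1, 1, 1, 1, 2, 2)
)

# Posterior similarity matrix: share of draws in which i and j co-cluster
n <- ncol(cls_draw)
psm <- matrix(0, n, n)
for (r in seq_len(nrow(cls_draw))) {
  psm <- psm + outer(cls_draw[r, ], cls_draw[r, ], "==")
}
psm <- psm / nrow(cls_draw)

# "avg": cut an average-linkage tree into k = 1, ..., max_k clusters
max_k <- 3
tree <- hclust(as.dist(1 - psm), method = "average")
cls_avg <- t(sapply(seq_len(max_k), function(k) cutree(tree, k = k)))
pear_avg <- apply(cls_avg, 1, pear_one, psm = psm)
best_avg <- cls_avg[which.max(pear_avg), ]

# "draws": evaluate every sampled clustering and keep the best one
pear_draws <- apply(cls_draw, 1, pear_one, psm = psm)
best_draw <- cls_draw[which.max(pear_draws), ]
```

In this toy case both strategies select the same two-cluster partition; on real output they can disagree, which is the situation method="all" is meant to cover.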

Returns

  • cl: clustering with maximal value of PEAR. If method="all", a matrix containing the clustering with the highest value of PEAR over all methods in the first row and the clusterings of the individual methods in the next rows.

  • value: value of PEAR. A vector corresponding to the rows of cl if method="all".

  • method: the maximization method used.
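As a rough illustration of how the returned list might be inspected when method="all", the following sketch assumes the mcclust package is installed and uses the `cls.draw1.5` data set from the Examples section. The value `max.k = 20` and the names `res` and `best` are arbitrary choices for this illustration.

```r
library(mcclust)  # assumes the mcclust package is installed

data(cls.draw1.5)                 # example draws used in the Examples section
psm <- comp.psm(cls.draw1.5)

# method = "all" needs the original draws in addition to psm
res <- maxpear(psm, cls.draw = cls.draw1.5, method = "all", max.k = 20)

dim(res$cl)      # first row: overall best; following rows: individual methods
res$value        # PEAR values corresponding to the rows of res$cl
res$method       # the maximization method used

best <- res$cl[1, ]   # clustering with the highest PEAR over all methods
```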

References

Fritsch, A. and Ickstadt, K. (2009) An improved criterion for clustering based on the posterior similarity matrix, Bayesian Analysis, accepted.

Author(s)

Arno Fritsch, arno.fritsch@tu-dortmund.de

See Also

comp.psm for computing the posterior similarity matrix; minbinder, medv, relabel for other possibilities for processing a sample of clusterings.

Examples

library(mcclust)
data(cls.draw1.5)                   # sample of 500 clusterings from a Bayesian cluster model
tru.class <- rep(1:8, each = 50)    # the true grouping of the observations
psm1.5 <- comp.psm(cls.draw1.5)
mpear1.5 <- maxpear(psm1.5)
table(mpear1.5$cl, tru.class)

# Does hierarchical clustering with Ward's method lead
# to a better value of PEAR?
# ("ward" was renamed to "ward.D" in R 3.1.0)
hclust.ward <- hclust(as.dist(1 - psm1.5), method = "ward.D")
cls.ward <- t(apply(matrix(1:20), 1, function(k) cutree(hclust.ward, k = k)))
ward1.5 <- pear(cls.ward, psm1.5)
max(ward1.5) > mpear1.5$value

  • Maintainer: Arno Fritsch
  • License: GPL (>= 2)
  • Last published: 2022-05-02
