Fast and Robust Hierarchical Clustering with Noise Point Detection
Internal Cluster Validity Measures
External Cluster Validity Measures and Pairwise Partition Similarity S...
Hierarchical Clustering Algorithm Genie
The Genie Hierarchical Clustering Algorithm (with Extras)
Inequality Measures
Minimum Spanning Tree of the Pairwise Distance Graph
The Genie algorithm (Gagolewski, 2021 <DOI:10.1016/j.softx.2021.100722>) is a robust, outlier-resistant hierarchical clustering method (Gagolewski, Bartoszuk, Cena, 2016 <DOI:10.1016/j.ins.2016.05.003>). This package features a faster and more powerful version of it. It supports clustering with respect to mutual reachability distances, so it can act as a noise point detector or as a version of 'HDBSCAN*' that identifies a predefined number of clusters. The package also features implementations of the Gini and Bonferroni inequality indices, external cluster validity measures (e.g., the normalised clustering accuracy, the adjusted Rand index, the Fowlkes-Mallows index, and normalised mutual information), and internal cluster validity indices (e.g., the Calinski-Harabasz, Davies-Bouldin, Ball-Hall, Silhouette, and generalised Dunn indices). The 'Python' version of 'genieclust' is available via 'PyPI'.
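To make the Gini inequality index mentioned above concrete, here is a minimal pure-Python sketch of its normalised form, which is 0 when all values are equal and 1 when a single element holds the whole total. The function name and error handling are illustrative assumptions; this is not the package's own implementation, which should be used in practice.

```python
def gini_index(values):
    """Normalised Gini index of a sequence of non-negative numbers.

    Illustrative sketch only (hypothetical helper, not the package's code).
    Computes sum_{i<j} |x_i - x_j| / ((n - 1) * sum(x)).
    """
    x = sorted(values)  # ascending order
    n = len(x)
    total = sum(x)
    if n < 2 or total <= 0:
        raise ValueError("need at least two values with a positive sum")
    # For ascending x, the sum of all pairwise absolute differences equals
    # sum_{k=1..n} (2k - n - 1) * x_k, which avoids the O(n^2) double loop.
    pairwise = sum((2 * k - n - 1) * v for k, v in enumerate(x, start=1))
    return pairwise / ((n - 1) * total)


# Example: cluster sizes in a partition of 10 points
balanced = gini_index([5, 5])      # equal sizes
skewed = gini_index([1, 1, 1, 7])  # one dominant cluster
```

In the Genie algorithm, an index of this kind is applied to the cluster sizes, so that a highly unbalanced partition (a large value) can be detected during merging.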