model_kselect function

Methods for selecting clusters

These functions help select the number of clusters to return from hc, a hierarchical clustering object:

  • k_strict() selects a number of clusters in which there is no distance between cluster members.
  • k_elbow() selects a number of clusters in which there is a fair trade-off between parsimony and fit according to the elbow method.
  • k_silhouette() selects a number of clusters that optimises the silhouette score.

These functions are generally not user-facing; they are used internally, for example by the *_equivalence() functions.
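To make the strict and silhouette criteria described above concrete, here is a minimal base-R sketch. It is not the package's implementation: it works on a toy coordinate matrix rather than a network, and the names x, strict_k, mean_silhouette, and best_k are hypothetical, introduced only for illustration. It assumes complete-linkage clustering via stats::hclust(), so a merge at height zero joins members that are all at zero distance from one another.

```r
# Toy data (hypothetical): two tight groups, each containing one exact duplicate.
x <- rbind(a = c(0, 0), b = c(0, 0), c = c(0, 1),
           d = c(5, 5), e = c(5, 5), f = c(5, 6))
d  <- dist(x)
hc <- hclust(d)                 # complete linkage by default
n  <- attr(d, "Size")

# "Strict" idea: only accept merges at (numerically) zero height.
# Each accepted merge reduces the number of clusters by one.
tol <- 1e-8
strict_k <- n - sum(hc$height <= tol)

# Mean silhouette width for one membership vector, computed by hand.
mean_silhouette <- function(d, membership) {
  dm <- as.matrix(d)
  widths <- numeric(nrow(dm))
  for (i in seq_len(nrow(dm))) {
    own <- membership == membership[i]
    own[i] <- FALSE                        # exclude the point itself
    if (!any(own)) next                    # singleton: width 0 by convention
    a <- mean(dm[i, own])                  # mean distance within own cluster
    others <- setdiff(unique(membership), membership[i])
    b <- min(vapply(others,                # mean distance to nearest other cluster
                    function(g) mean(dm[i, membership == g]),
                    numeric(1)))
    widths[i] <- if (max(a, b) > 0) (b - a) / max(a, b) else 0
  }
  mean(widths)
}

# Silhouette idea: score each candidate k and keep the best one.
range <- 8                                 # same role as the `range` argument
candidates <- 2:min(range, n)
scores <- vapply(candidates,
                 function(k) mean_silhouette(d, cutree(hc, k = k)),
                 numeric(1))
best_k <- candidates[which.max(scores)]
```

On this toy data the strict criterion keeps the duplicates together and everything else apart, while the silhouette criterion prefers the coarser split into the two well-separated groups.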

k_strict(hc, .data)
k_elbow(hc, .data, census, range)
k_silhouette(hc, .data, range)

Arguments

  • hc: A hierarchical clustering object.

  • .data: An object of a manynet-consistent class:

    • matrix (adjacency or incidence) from {base} R
    • edgelist, a data frame from {base} R or tibble from {tibble}
    • igraph, from the {igraph} package
    • network, from the {network} package
    • tbl_graph, from the {tidygraph} package
  • census: A motif census object.

  • range: An integer giving the maximum number of cluster counts to consider. The smaller of this value and the number of nodes in the network is used.
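The capping rule for range can be written in one line. The sketch below uses hypothetical values, and the assumption that candidates start at 1 is mine rather than something the documentation states.

```r
# Hypothetical inputs, for illustration only.
n_nodes <- 6      # number of nodes in the network
range   <- 8      # user-supplied upper bound

# The smaller of the two is the effective upper bound.
max_k <- min(range, n_nodes)

# Assumed candidate set: every cluster count from 1 up to the bound.
candidate_k <- seq_len(max_k)
```

Here the requested bound of 8 exceeds the 6 available nodes, so only cluster counts up to 6 would be considered.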

References

On the elbow method

Thorndike, Robert L. 1953. “Who Belongs in the Family?” Psychometrika, 18(4): 267–76. doi:10.1007/BF02289263.

On the silhouette method

Rousseeuw, Peter J. 1987. “Silhouettes: A Graphical Aid to the Interpretation and Validation of Cluster Analysis.” Journal of Computational and Applied Mathematics, 20: 53–65. doi:10.1016/0377-0427(87)90125-7.

  • Maintainer: James Hollway
  • License: MIT + file LICENSE
  • Last published: 2024-11-05