utils_cluster_hclust_optimizer function

Optimize the Silhouette Width of Hierarchical Clustering Solutions

Performs a parallelized grid search to find the number of clusters that maximizes the overall silhouette width of the clustering solution (see utils_cluster_silhouette()). When method = NULL, the grid search also covers all agglomeration methods available in stats::hclust(). This function supports parallelization via future::plan() and a progress bar generated by the progressr package (see Examples).
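To make the idea of the grid search concrete, the sketch below evaluates every combination of number of clusters and agglomeration method with base R and the recommended package cluster, and sorts the results by mean silhouette width. It is only an illustration under stated assumptions, not the implementation used by this function: the helper name hclust_grid_search(), the column name silhouette_mean, the reduced set of methods, and the toy data are all hypothetical, and the loop runs sequentially rather than in parallel.

```r
library(cluster) # recommended package shipped with R; provides silhouette()

# Hypothetical helper for illustration only (not part of distantia).
# d: square distance matrix; methods: hclust methods to include in the grid.
hclust_grid_search <- function(d, methods = c("ward.D2", "complete", "average")) {
  d_dist <- stats::as.dist(d)
  n <- attr(d_dist, "Size")

  # silhouette width is defined for 2 to n - 1 clusters
  grid <- expand.grid(
    clusters = 2:(n - 1),
    method = methods,
    stringsAsFactors = FALSE
  )

  # mean silhouette width of each combination in the grid
  grid$silhouette_mean <- vapply(
    seq_len(nrow(grid)),
    function(i) {
      tree <- stats::hclust(d_dist, method = grid$method[i])
      groups <- stats::cutree(tree, k = grid$clusters[i])
      sil <- cluster::silhouette(groups, d_dist)
      mean(sil[, "sil_width"])
    },
    numeric(1)
  )

  # best solution in the first row
  grid[order(grid$silhouette_mean, decreasing = TRUE), ]
}

# toy data: three well separated groups of five points each
set.seed(1)
x <- rbind(
  matrix(rnorm(10, mean = 0, sd = 0.1), ncol = 2),
  matrix(rnorm(10, mean = 5, sd = 0.1), ncol = 2),
  matrix(rnorm(10, mean = 10, sd = 0.1), ncol = 2)
)
d <- as.matrix(stats::dist(x))

result <- hclust_grid_search(d)
head(result)
```

With data this clearly grouped, the three-cluster solutions should rank first for every method in the grid; on real dissimilarity matrices the ranking usually differs between methods.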

utils_cluster_hclust_optimizer(d = NULL, method = NULL)

Arguments

  • d: (required, matrix) distance matrix typically resulting from distantia_matrix(), but any other square matrix should work. Default: NULL

  • method: (optional, character string) Argument of stats::hclust() defining the agglomerative method. One of: "ward.D", "ward.D2", "single", "complete", "average" (= UPGMA), "mcquitty" (= WPGMA), "median" (= WPGMC) or "centroid" (= UPGMC). Unambiguous abbreviations are accepted as well. If NULL, all methods are included in the grid search. Default: NULL

    This function supports a parallelization setup via future::plan(), and progress bars provided by the package progressr.

Returns

Data frame with the evaluated clustering solutions; the best solution is in the first row.

Examples

#weekly covid prevalence
#in 10 California counties
#aggregated by month
tsl <- tsl_initialize(
  x = covid_prevalence,
  name_column = "name",
  time_column = "time"
) |>
  tsl_subset(
    names = 1:10
  ) |>
  tsl_aggregate(
    new_time = "months",
    fun = max
  )

if(interactive()){
  #plotting first three time series
  tsl_plot(
    tsl = tsl_subset(
      tsl = tsl,
      names = 1:3
    ),
    guide_columns = 3
  )
}

#compute dissimilarity matrix
psi_matrix <- distantia(
  tsl = tsl,
  lock_step = TRUE
) |>
  distantia_matrix()

#optimize hierarchical clustering
hclust_optimization <- utils_cluster_hclust_optimizer(
  d = psi_matrix
)

#best solution in first row
head(hclust_optimization)
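The parallelization and progress bar mentioned in the description can be enabled roughly as follows. This is a hedged sketch: the number of workers is an arbitrary assumption, wrapping the call in progressr::with_progress() is only one of the ways progressr offers to display progress, and for a matrix this small the parallel setup is unlikely to be faster than a sequential run.

```r
library(distantia)

#same dissimilarity matrix as in the example above
tsl <- tsl_initialize(
  x = covid_prevalence,
  name_column = "name",
  time_column = "time"
) |>
  tsl_subset(
    names = 1:10
  ) |>
  tsl_aggregate(
    new_time = "months",
    fun = max
  )

psi_matrix <- distantia(
  tsl = tsl,
  lock_step = TRUE
) |>
  distantia_matrix()

#parallelization setup
#assumption: two local workers, adjust to your machine
future::plan(
  future::multisession,
  workers = 2
)

#optimize hierarchical clustering with progress reporting
hclust_optimization <- progressr::with_progress(
  utils_cluster_hclust_optimizer(
    d = psi_matrix
  )
)

#disable parallelization
future::plan(
  future::sequential
)

head(hclust_optimization)
```

Resetting the plan to future::sequential at the end keeps later code in the same session from unexpectedly running on the worker processes.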

See Also

Other distantia_support: distantia_aggregate(), distantia_boxplot(), distantia_cluster_hclust(), distantia_cluster_kmeans(), distantia_matrix(), distantia_model_frame(), distantia_spatial(), distantia_stats(), distantia_time_delay(), utils_block_size(), utils_cluster_kmeans_optimizer(), utils_cluster_silhouette()

  • Maintainer: Blas M. Benito
  • License: MIT + file LICENSE
  • Last published: 2025-02-01