utils_cluster_kmeans_optimizer function

Optimize the Silhouette Width of K-Means Clustering Solutions

Generates one k-means solution for each number of clusters from 2 to nrow(d) - 1 and identifies the number of clusters with the highest median silhouette width. See utils_cluster_silhouette() for more details.
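
The search described above can be sketched in base R. This is only an illustration of the idea, not the package's implementation: the helper median_silhouette(), the toy distance matrix, and the column names of the result are hypothetical, and the sketch assumes k-means is run on the rows of the distance matrix.

```r
# Illustrative sketch only; not the distantia implementation.

# Hypothetical helper: median silhouette width of a clustering,
# computed from a square distance matrix and a vector of labels.
median_silhouette <- function(d, labels) {
  d <- as.matrix(d)
  widths <- vapply(seq_len(nrow(d)), function(i) {
    own <- labels == labels[i]
    own[i] <- FALSE
    # cases alone in their cluster get a width of 0 by convention
    if (!any(own)) return(0)
    # a: mean distance to the other cases in the same cluster
    a <- mean(d[i, own])
    # b: smallest mean distance to the cases of any other cluster
    other <- setdiff(unique(labels), labels[i])
    b <- min(vapply(other, function(k) mean(d[i, labels == k]), numeric(1)))
    (b - a) / max(a, b)
  }, numeric(1))
  stats::median(widths)
}

# toy distance matrix: two well separated groups of five points
set.seed(1)
xy <- rbind(
  matrix(rnorm(10, mean = 0, sd = 0.1), ncol = 2),
  matrix(rnorm(10, mean = 5, sd = 0.1), ncol = 2)
)
d <- as.matrix(dist(xy))

# one k-means solution per candidate number of clusters
candidates <- 2:(nrow(d) - 1)
scores <- vapply(candidates, function(k) {
  set.seed(1)
  km <- stats::kmeans(x = d, centers = k)
  median_silhouette(d = d, labels = km$cluster)
}, numeric(1))

# rank candidates from highest to lowest median silhouette width
result <- data.frame(clusters = candidates, silhouette_median = scores)
result <- result[order(result$silhouette_median, decreasing = TRUE), ]
head(result)
```

In this toy ranking, the first row holds the candidate number of clusters with the highest median silhouette width.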

This function supports parallelization via future::plan() and progress bars via the progressr package.
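
A minimal sketch of such a setup, assuming the packages distantia, future, and progressr are installed. The multisession backend and the two workers are arbitrary choices for illustration, and the data preparation reuses the calls from the Examples section.

```r
library(distantia)

# parallelization setup: two background R sessions
# (backend and number of workers are illustrative choices)
future::plan(
  future::multisession,
  workers = 2
)

# enable progress bars for the whole session
progressr::handlers(global = TRUE)

# dissimilarity matrix, prepared as in the Examples section
tsl <- tsl_initialize(
  x = covid_prevalence,
  name_column = "name",
  time_column = "time"
) |>
  tsl_subset(names = 1:10) |>
  tsl_aggregate(new_time = "months", fun = max)

psi_matrix <- distantia(tsl = tsl, lock_step = TRUE) |>
  distantia_matrix()

# the optimizer runs under the plan defined above
kmeans_optimization <- utils_cluster_kmeans_optimizer(d = psi_matrix)

# return to sequential processing when done
future::plan(future::sequential)
```

Resetting the plan to future::sequential at the end releases the background sessions.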

utils_cluster_kmeans_optimizer(d = NULL, seed = 1)

Arguments

  • d: (required, matrix) distance matrix typically resulting from distantia_matrix(), but any other square matrix should work. Default: NULL
  • seed: (optional, integer) Random seed used during the k-means computation. Default: 1

Returns

data frame

Examples

#weekly covid prevalence
#in 10 California counties
#aggregated by month
tsl <- tsl_initialize(
  x = covid_prevalence,
  name_column = "name",
  time_column = "time"
) |>
  tsl_subset(
    names = 1:10
  ) |>
  tsl_aggregate(
    new_time = "months",
    fun = max
  )

if(interactive()){
  #plotting first three time series
  tsl_plot(
    tsl = tsl_subset(
      tsl = tsl,
      names = 1:3
    ),
    guide_columns = 3
  )
}

#compute dissimilarity matrix
psi_matrix <- distantia(
  tsl = tsl,
  lock_step = TRUE
) |>
  distantia_matrix()

#optimize k-means clustering
kmeans_optimization <- utils_cluster_kmeans_optimizer(
  d = psi_matrix
)

#best solution in first row
head(kmeans_optimization)

See Also

Other distantia_support: distantia_aggregate(), distantia_boxplot(), distantia_cluster_hclust(), distantia_cluster_kmeans(), distantia_matrix(), distantia_model_frame(), distantia_spatial(), distantia_stats(), distantia_time_delay(), utils_block_size(), utils_cluster_hclust_optimizer(), utils_cluster_silhouette()

  • Maintainer: Blas M. Benito
  • License: MIT + file LICENSE
  • Last published: 2025-02-01