Optimize the Silhouette Width of K-Means Clustering Solutions
Generates k-means solutions for every number of clusters from 2 to nrow(d) - 1 and reports the number of clusters with the highest median silhouette width. See utils_cluster_silhouette() for more details.
This function supports a parallelization setup via future::plan(), and progress bars provided by the package progressr.
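The search described above can be sketched in a few lines of R. This is only an illustration of the general idea, not the package's implementation: the helper name `sketch_kmeans_optimizer()` and the output column names are hypothetical, and the sketch assumes that the rows of the distance matrix are passed to `stats::kmeans()` as feature vectors and that silhouette widths come from `cluster::silhouette()`.

```r
# Hypothetical helper, NOT part of distantia: illustrates the general idea only.
sketch_kmeans_optimizer <- function(d, seed = 1) {
  stopifnot(is.matrix(d), nrow(d) == ncol(d), nrow(d) >= 3)

  # candidate numbers of clusters: 2 to nrow(d) - 1
  k_values <- 2:(nrow(d) - 1)

  silhouette_median <- vapply(
    k_values,
    function(k) {
      # same seed for every k, so results are reproducible
      set.seed(seed)
      # assumption: rows of the distance matrix act as feature vectors
      groups <- stats::kmeans(x = d, centers = k)$cluster
      # silhouette width of each observation under this solution
      sil <- cluster::silhouette(x = groups, dmatrix = d)
      stats::median(sil[, "sil_width"])
    },
    numeric(1)
  )

  out <- data.frame(
    clusters = k_values,
    silhouette_median = silhouette_median
  )

  # best solution (highest median silhouette width) in the first row
  out[order(out$silhouette_median, decreasing = TRUE), ]
}

# toy data: two well-separated groups of five points each
set.seed(42)
toy_points <- rbind(
  matrix(rnorm(10, mean = 0), ncol = 2),
  matrix(rnorm(10, mean = 8), ncol = 2)
)
toy_d <- as.matrix(dist(toy_points))

toy_result <- sketch_kmeans_optimizer(d = toy_d)
head(toy_result)
```

Fixing the seed inside the loop keeps each k-means run reproducible, which mirrors the role of the seed argument documented below.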
utils_cluster_kmeans_optimizer(d = NULL, seed = 1)
Arguments
d: (required, matrix) distance matrix typically resulting from distantia_matrix(), but any other square matrix should work. Default: NULL
seed: (optional, integer) Random seed to be used during the K-means computation. Default: 1
Returns
data frame with the best solution in the first row
Examples
# weekly covid prevalence
# in 10 California counties
# aggregated by month
tsl <- tsl_initialize(
  x = covid_prevalence,
  name_column = "name",
  time_column = "time"
) |>
  tsl_subset(
    names = 1:10
  ) |>
  tsl_aggregate(
    new_time = "months",
    fun = max
  )

if (interactive()) {
  # plotting first three time series
  tsl_plot(
    tsl = tsl_subset(
      tsl = tsl,
      names = 1:3
    ),
    guide_columns = 3
  )
}

# compute dissimilarity matrix
psi_matrix <- distantia(
  tsl = tsl,
  lock_step = TRUE
) |>
  distantia_matrix()

# optimize k-means clustering
kmeans_optimization <- utils_cluster_kmeans_optimizer(
  d = psi_matrix
)

# best solution in first row
head(kmeans_optimization)
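The parallelization and progress bar support mentioned in the description could be enabled along these lines. This is a minimal sketch that assumes the future and progressr packages are installed; the choice of `multisession` with two workers is an arbitrary illustration, not a recommendation from the package.

```r
library(distantia)

# rebuild the dissimilarity matrix used in the example above
tsl <- tsl_initialize(
  x = covid_prevalence,
  name_column = "name",
  time_column = "time"
) |>
  tsl_subset(names = 1:10) |>
  tsl_aggregate(new_time = "months", fun = max)

psi_matrix <- distantia(tsl = tsl, lock_step = TRUE) |>
  distantia_matrix()

# assumption: two background R sessions are enough for this small example
future::plan(future::multisession, workers = 2)

# with_progress() displays progress updates signaled by the wrapped call
kmeans_optimization <- progressr::with_progress(
  utils_cluster_kmeans_optimizer(d = psi_matrix)
)

# return to sequential processing when done
future::plan(future::sequential)

head(kmeans_optimization)
```

Resetting the plan to `future::sequential` at the end avoids leaving idle worker sessions running after the optimization finishes.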