Model-based clustering with variable selection and estimation of the number of clusters, based either on VarSelLCM [Marbac/Sedki, 2017], [Marbac et al., 2020] or on clustvarsel [Scrucca and Raftery, 2014].
Data: [1:n,1:d] matrix of dataset to be clustered. It consists of n cases of d-dimensional data points. Every case has d attributes, variables or features.
ClusterNo: Numeric defining the number of clusters to search for.
Type: String, either "VarSelLCM" [Marbac/Sedki, 2017], [Marbac et al., 2020] or "clustvarsel" [Scrucca and Raftery, 2014].
PlotIt: (optional) Boolean. Default: FALSE (no plotting performed).
...: Further arguments passed on to VarSelCluster or clustvarsel.
Returns
List with the following elements:
Cls: [1:n] numerical vector with n numbers defining the classification as the main output of the clustering algorithm. It has k unique numbers representing the arbitrary labels of the clustering.
Object: Object defined by the clustering algorithm as the other output of this algorithm.
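Because the labels in Cls are arbitrary, it can help to inspect and renumber them before comparing against a reference classification. The following is a minimal base R sketch, not part of the package: the vectors Cls and TrueCls are made-up toy values standing in for V$Cls and a known ground truth.

```r
# Hypothetical stand-in for V$Cls: n = 8 cases, k = 3 arbitrary labels.
Cls <- c(2, 2, 5, 5, 5, 9, 9, 2)

# Number of clusters found, i.e. the k unique labels.
k <- length(unique(Cls))

# Number of cases per cluster label.
sizes <- table(Cls)

# Renumber the labels to 1..k in order of first appearance.
ClsRenumbered <- match(Cls, unique(Cls))

# Hypothetical ground truth for the same 8 cases.
TrueCls <- c(1, 1, 2, 2, 2, 3, 3, 1)

# Cross-tabulation: one non-zero cell per row and column indicates
# agreement up to a relabeling of the clusters.
print(table(TrueCls, ClsRenumbered))
```

The renumbering changes only the label names, not the partition itself, so any accuracy measure that is invariant to label permutation is unaffected by it.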
References
[Marbac/Sedki, 2017] Marbac, M. and Sedki, M.: Variable selection for model-based clustering using the integrated complete-data likelihood. Statistics and Computing, 27(4), pp. 1049-1063, 2017.
[Marbac et al., 2020] Marbac, M., Sedki, M., and Patin, T.: Variable selection for mixed data clustering: application in human population genomics. Journal of Classification, 37(1), pp. 124-142, 2020.
Author(s)
Quirin Stier, Michael Thrun
Examples
# Hepta
data("Hepta")
Data = Hepta$Data
V = ModelBasedVarSelClustering(Data, ClusterNo=7, Type="VarSelLCM")
Cls = V$Cls
ClusterAccuracy(Hepta$Cls, Cls, K=7)
V = ModelBasedVarSelClustering(Data, ClusterNo=7, Type="clustvarsel")
Cls = V$Cls
ClusterAccuracy(Hepta$Cls, Cls, K=7)
## Not run:
# Hearts
heart = VarSelLCM::heart
ztrue <- heart[,"Class"]
Data <- heart[,-13]
V <- ModelBasedVarSelClustering(Data, 2, Type="VarSelLCM")
Cls = V$Cls
ClusterAccuracy(ztrue, Cls, K=2)
## End(Not run)