Hierarchical clustering of variables with consolidation
Hierarchical Cluster Analysis of a set of variables with consolidation. Directional or local groups may be defined. Each group of variables is associated with a latent component. Moreover, the latent component may be constrained using external information collected on the observations or on the variables.
CLV( X, Xu = NULL, Xr = NULL, method = NULL, sX = TRUE, sXr = FALSE, sXu = FALSE, nmax = 20, maxiter = 20, graph = TRUE )
X
: : The matrix of variables to be clustered
Xu
: : The external variables associated with the columns of X
Xr
: : The external variables associated with the rows of X
method
: : The criterion to be used in the cluster analysis.
1 or "directional" : the squared covariance is used as a measure of proximity (directional groups).
2 or "local" : the covariance is used as a measure of proximity (local groups).
sX
: : TRUE/FALSE : whether to standardize the columns of X (TRUE by default)
(predefined: cX = TRUE, i.e. X is always column-centered)
sXr
: : TRUE/FALSE : whether to standardize the columns of Xr (FALSE by default)
(predefined: cXr = TRUE, i.e. Xr is always column-centered)
sXu
: : TRUE/FALSE : whether to standardize the columns of Xu (FALSE by default)
(predefined: cXu = FALSE, i.e. no centering, Xu being considered as a weight matrix)
nmax
: : maximum number of partitions for which the consolidation will be done (by default nmax=20)
maxiter
: : maximum number of iterations allowed for the consolidation/partitioning algorithm (by default maxiter=20)
graph
: : TRUE/FALSE (by default TRUE) : whether to display the dendrogram and the variation of the optimization criterion.
These plots can also be obtained with "plot".
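The difference between the two values of `method` can be illustrated on a pair of variables. The sketch below uses base R only and toy data; it is a pairwise illustration of the two proximity measures named above, under the assumption that the columns are centered and standardized as with sX = TRUE, and it is not the package's internal computation. The names `prox_local` and `prox_directional` are illustrative.

```r
# Toy data: x2 is x1 with the sign flipped, x3 is unrelated noise.
set.seed(1)
x1 <- rnorm(50)
X <- cbind(x1 = x1, x2 = -x1, x3 = rnorm(50))

# Column-center and standardize (what sX = TRUE is described as doing).
Xs <- scale(X, center = TRUE, scale = TRUE)

# Pairwise covariances of the standardized columns.
S <- cov(Xs)

# "local": the covariance itself, so the sign of the relationship matters.
prox_local <- S
# "directional": the squared covariance, so the sign is ignored.
prox_directional <- S^2

prox_local["x1", "x2"]        # close to -1: opposite variables are far apart
prox_directional["x1", "x2"]  # close to 1: opposite variables are close
```

With the squared covariance, negatively correlated variables can end up in the same (directional) group, whereas with the covariance they are kept apart (local groups).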
tabres: Results of the clustering algorithm. Each row gives the results of one step of the hierarchical clustering.
Columns 1 and 2: The numbers of the two groups which are merged
Column 3: Name of the new cluster
Column 4: The value of the aggregation criterion for the Hierarchical Ascendant Clustering (HAC)
Column 5: The value of the clustering criterion for the HAC
Column 6: The percentage of the explained initial criterion value
(for method 1, the percentage of variance explained by the latent components)
Column 7: The value of the clustering criterion after consolidation
Column 8: The percentage of the explained initial criterion value after consolidation
Column 9: The number of iterations in the partitioning algorithm.
Remark : A zero in columns 7 to 9 indicates that no consolidation was done
partition K: contains a list for each number of clusters of the partition, K=2 to nmax with
If external variables are used, define either Xr or Xu, but not both. Use the LCLV function when Xr and Xu are simultaneously provided.
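The rule above can be written as a small guard. This is a minimal sketch: `choose_clv_function` is a hypothetical helper, not part of the package, and it only returns the name of the function that the rule points to.

```r
# Hypothetical helper (not part of ClustVarLV): returns the name of the
# function to call, following the rule that CLV accepts Xr or Xu but not
# both, and that LCLV is meant for the case where both are supplied.
choose_clv_function <- function(Xr = NULL, Xu = NULL) {
  if (!is.null(Xr) && !is.null(Xu)) {
    "LCLV"
  } else {
    "CLV"
  }
}

choose_clv_function()                                        # "CLV"
choose_clv_function(Xr = matrix(0, 2, 2))                    # "CLV"
choose_clv_function(Xr = matrix(0, 2, 2), Xu = matrix(0, 2, 2))  # "LCLV"
```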
library(ClustVarLV)
data(apples_sh)

# directional groups
resclvX <- CLV(X = apples_sh$senso, method = "directional", sX = TRUE)
plot(resclvX, type = "dendrogram")
plot(resclvX, type = "delta")

# local groups with external variables Xr
resclvYX <- CLV(X = apples_sh$pref, Xr = apples_sh$senso, method = "local", sX = FALSE, sXr = TRUE)
Vigneau E., Qannari E.M. (2003). Clustering of variables around latent components. Comm. Stat, 32(4), 1131-1150.
Vigneau E., Chen M., Qannari E.M. (2015). ClustVarLV: An R Package for the clustering of Variables around Latent Variables. The R Journal, 7(2), 134-148.
CLV_kmeans, LCLV