Benhc function

Performs bootstrap ensemble hierarchical clustering for categorical data.

This function performs bootstrap ensemble hierarchical clustering of categorical data, as described in Details below.

Benhc(x, En)

Arguments

  • x: An n x p data matrix or data frame, where n is the number of observations and p is the number of dimensions.

  • En: Number of clusterings to include in the ensemble, i.e., cardinality of the ensemble.

Details

The function 'Benhc' generates a dissimilarity matrix via a bootstrap ensemble. The ensemble dissimilarity matrix is generated using the same procedure as described for the function 'enhc', except that each clustering is based on a bootstrap sample of the data. The number of clusters for each clustering is selected randomly from {2, ..., sqrt(n)}.
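The procedure described above can be illustrated with a short base R sketch. It is not the package's implementation: the helper name `boot_ensemble_dissim`, the use of Hamming distance with average linkage for each member clustering, and the rule that a pair's dissimilarity is the share of bootstrap clusterings (among those containing both observations) that separate them are all assumptions made for illustration. Pairs that never appear in the same bootstrap sample are assigned a dissimilarity of 1, which is also an assumption.

```r
# Hypothetical helper, NOT part of EnsCat: a sketch of a bootstrap
# ensemble dissimilarity for categorical data.
boot_ensemble_dissim <- function(x, En = 50) {
  x <- as.matrix(x)
  n <- nrow(x)
  k_max <- max(2, floor(sqrt(n)))
  differ   <- matrix(0, n, n)  # times a pair landed in different clusters
  together <- matrix(0, n, n)  # times a pair was in the same bootstrap sample

  for (b in seq_len(En)) {
    # Bootstrap sample of rows; keep the distinct rows that were drawn
    idx <- unique(sample.int(n, n, replace = TRUE))
    m <- length(idx)
    if (m < 2) next
    xb <- x[idx, , drop = FALSE]

    # Hamming distance: number of variables on which two rows disagree
    ham <- matrix(0, m, m)
    for (j in seq_len(ncol(xb))) {
      ham <- ham + outer(xb[, j], xb[, j], FUN = "!=")
    }

    # Number of clusters drawn at random from {2, ..., sqrt(n)}
    k <- (2:k_max)[sample.int(k_max - 1, 1)]
    k <- min(k, m)

    # Assumed member clustering: average-linkage hierarchical clustering
    labels <- cutree(hclust(as.dist(ham), method = "average"), k = k)

    differ[idx, idx]   <- differ[idx, idx] + outer(labels, labels, FUN = "!=")
    together[idx, idx] <- together[idx, idx] + 1
  }

  # Pairs never sampled together get dissimilarity 1 (an assumption)
  d <- ifelse(together > 0, differ / together, 1)
  diag(d) <- 0
  as.dist(d)
}

# Usage on simulated categorical data
set.seed(1)
toy <- data.frame(
  a = sample(c("x", "y"), 20, replace = TRUE),
  b = sample(c("u", "v", "w"), 20, replace = TRUE),
  stringsAsFactors = FALSE
)
d <- boot_ensemble_dissim(toy, En = 30)
hc <- hclust(d, method = "average")
```

The result is a `dist` object with values in [0, 1], so it can be passed directly to `hclust`, in the same way the output of 'Benhc' is used in the Examples section.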

References

Amiri, S., Clarke, B., and Clarke, J. (2015). Clustering categorical data via ensembling dissimilarity matrices. arXiv preprint arXiv:1506.07930.

Examples

#data('zoo')
### zoo includes the zoo data downloaded from UCI
### Machine Learning Repository
### Calculate ensemble dissimilarities with 150 ensemble members
#disten<-Benhc(zoo$obs,En=150)
### This function performs a hierarchical cluster analysis using
### dissimilarities obtained by the ensembling procedure in Benhc
#en<-hclust(disten,method='average')
### A plot of the dendrogram can be generated by
#plot(en,label=zoo$lab)

  • Maintainer: Saeid Amiri
  • License: GPL (>= 2)
  • Last published: 2017-02-01
