Fast K-Mer Counting and Clustering for Biological Sequence Analysis
Divisive k-means clustering.
K-mer counting.
K-mer distance matrix computation.
Convert sequences to vectors of distances to a subset of seed sequence...
Cluster sequences into operational taxonomic units.
Contains tools for rapidly computing distance matrices and clustering large sequence datasets using alignment-free k-mer counting and recursive k-means partitioning. See Vinga and Almeida (2003) <doi:10.1093/bioinformatics/btg005> for a review of k-mer counting methods and their applications in biological sequence analysis.
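
To make the idea of alignment-free k-mer counting concrete, the following is a minimal sketch in Python, not the package's own implementation or API. The function names kmer_counts and kmer_distance_matrix are hypothetical, the alphabet is assumed to be DNA (A, C, G, T), and Euclidean distance between k-mer frequency vectors is an assumed, simple choice of metric that may differ from what the package computes.

```python
from itertools import product
from math import sqrt


def kmer_counts(seq, k=2, alphabet="ACGT"):
    """Count overlapping k-mers in seq (hypothetical helper, not the package API).

    Returns a list of counts in a fixed lexicographic k-mer order.
    """
    index = {"".join(p): i for i, p in enumerate(product(alphabet, repeat=k))}
    counts = [0] * len(index)
    for i in range(len(seq) - k + 1):
        j = index.get(seq[i:i + k])
        if j is not None:  # skip windows containing other characters, e.g. N
            counts[j] += 1
    return counts


def kmer_distance_matrix(seqs, k=2):
    """Pairwise Euclidean distances between k-mer frequency vectors.

    Euclidean distance is an assumption made for illustration only.
    """
    freqs = []
    for s in seqs:
        c = kmer_counts(s, k)
        total = sum(c) or 1  # avoid division by zero for very short sequences
        freqs.append([x / total for x in c])
    n = len(seqs)
    d = [[0.0] * n for _ in range(n)]
    for a in range(n):
        for b in range(a + 1, n):
            dist = sqrt(sum((x - y) ** 2 for x, y in zip(freqs[a], freqs[b])))
            d[a][b] = d[b][a] = dist
    return d


if __name__ == "__main__":
    seqs = ["ACGTACGT", "ACGTACGA", "TTTTGGGG"]
    for row in kmer_distance_matrix(seqs, k=2):
        print(["%.3f" % x for x in row])
```

Because each sequence is reduced to a fixed-length count vector, no alignment is needed, and the resulting vectors can be passed directly to a partitioning method such as k-means.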