Unsupervised Learning Based Definition of Microbial Rare Biosphere
Check average Silhouette score index
Check Calinski-Harabasz index
Check Davies-Bouldin index
Define Rare Biosphere
Evaluate k from all samples in a dataset
Evaluate sample k
Plot ulrb clustering results and silhouette scores
Plot Rank Abundance Curve of classification results
Plot silhouette scores from clustering results
Prepare data in tidy format
Suggest k
ulrb: Unsupervised Learning Based Definition of Microbial Rare Biosphe...
A tool to define the rare biosphere. 'ulrb' solves the problem of defining rarity by replacing arbitrary thresholds with an unsupervised machine learning algorithm (partitioning around medoids, or k-medoids). The algorithm works for any type of microbiome data, provided there is a species abundance table. For validation of this method on different species abundance tables, see Pascoal et al., 2024 (in peer review). The method also works for non-microbiome data.
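To make the idea concrete, the following is a minimal, self-contained Python sketch of clustering the abundance values of one sample around medoids and naming the clusters by abundance rank. It is an illustration under stated assumptions, not the package's implementation: it uses a simple alternating (assign, then update) k-medoids rather than the full PAM swap procedure, and the function names `k_medoids_1d` and `classify_abundance`, the class labels, and the example abundances are all hypothetical.

```python
def _assign(values, medoids):
    """Index of the nearest medoid for each value (ties go to the lower index)."""
    return [min(range(len(medoids)), key=lambda j: abs(v - medoids[j]))
            for v in values]


def k_medoids_1d(values, k, max_iter=100):
    """Alternating k-medoids on 1-D data (a simplification, not full PAM).

    Returns (medoids, assignments), where assignments[i] is the index of the
    medoid closest to values[i].
    """
    distinct = sorted(set(values))
    if k < 1 or k > len(distinct):
        raise ValueError("k must be between 1 and the number of distinct values")

    # Deterministic start: spread the medoids evenly over the sorted distinct values.
    if k == 1:
        medoids = [distinct[len(distinct) // 2]]
    else:
        medoids = [distinct[i * (len(distinct) - 1) // (k - 1)] for i in range(k)]

    for _ in range(max_iter):
        assignments = _assign(values, medoids)
        new_medoids = []
        for j in range(k):
            members = [v for v, a in zip(values, assignments) if a == j]
            # The medoid is the member with the smallest total distance to its cluster.
            new_medoids.append(
                min(members, key=lambda c: sum(abs(c - m) for m in members))
            )
        if new_medoids == medoids:
            break
        medoids = new_medoids

    return medoids, _assign(values, medoids)


def classify_abundance(abundances, labels=("Rare", "Undetermined", "Abundant")):
    """Label each abundance value by the rank of its cluster's medoid.

    The labels are illustrative: the cluster with the lowest medoid gets
    labels[0], the next lowest gets labels[1], and so on.
    """
    medoids, assignments = k_medoids_1d(abundances, k=len(labels))
    order = sorted(range(len(medoids)), key=lambda j: medoids[j])
    name_of = {cluster: labels[rank] for rank, cluster in enumerate(order)}
    return [name_of[a] for a in assignments]


# Hypothetical abundance counts for ten taxa in a single sample.
example = [1, 1, 2, 3, 2, 40, 55, 60, 400, 520]
print(list(zip(example, classify_abundance(example))))
```

No abundance threshold is supplied anywhere in this sketch: the boundaries between classes come from the data, which is the property the description above attributes to the medoid-based approach. The number of clusters is still a choice, and indices such as the average Silhouette score are the usual way to compare candidate values of k.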