blsd function

Barcode level signal denoising

True taxa are detected on multiple barcodes, with the number of unique k-mer sequences proportional to the total number of k-mer sequences across barcodes. This is measured as a significant Spearman correlation between the numbers of total and unique k-mers across barcodes (padj < 0.05).
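The test described above can be sketched in base R. This is a minimal illustration on made-up counts, not the package's implementation: the data frame `toy`, its column names, and the per-taxon loop are assumptions for the example, while `cor.test()` and `p.adjust()` are the standard `stats` functions.

```r
# Toy per-barcode k-mer counts for two hypothetical taxa (made-up numbers).
toy <- data.frame(
  taxid   = rep(c("A", "B"), each = 6),
  barcode = rep(paste0("bc", 1:6), times = 2),
  total   = c(10, 20, 30, 40, 50, 60, 12, 11, 15, 10, 14, 13),
  unique  = c(4, 9, 13, 18, 22, 30, 5, 3, 2, 6, 4, 1)
)

# One Spearman test per taxon: do total and unique k-mer counts
# rise together across barcodes?
pvals <- vapply(split(toy, toy$taxid), function(d) {
  cor.test(d$total, d$unique, method = "spearman")$p.value
}, numeric(1))

# Benjamini-Hochberg correction across taxa, then the padj < 0.05 cut-off.
padj <- p.adjust(pvals, method = "BH")
kept <- names(padj)[padj < 0.05]
```

In this toy data, taxon "A" has unique counts that track total counts across all six barcodes, so it is the only one that passes the cut-off; taxon "B" shows no such relationship and is dropped.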

blsd(
  kmer,
  method = "spearman",
  ...,
  p.adjust = "BH",
  min_kmer_len = 3L,
  min_number = 3L
)

Arguments

  • kmer: k-mer data returned by prep_dataset().
  • method: A character string indicating which correlation coefficient to use for the test. One of "pearson", "kendall", or "spearman"; can be abbreviated.
  • ...: Other arguments passed to cor.test.
  • p.adjust: P-value correction method, a character string. Can be abbreviated. See p.adjust for details.
  • min_kmer_len: An integer, the minimal number of k-mers used to filter taxa. SAHMI uses 2.
  • min_number: An integer, the minimal number of cells per taxid. SAHMI uses 4.
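How `method`, `...`, and `p.adjust` relate to the underlying base R calls can be sketched as follows. The vectors `x` and `y` are made-up values, and the sketch assumes these arguments reach `cor.test()` and `p.adjust()` unchanged; it does not call `blsd()` itself.

```r
# Made-up paired counts with no ties.
x <- c(3, 8, 15, 21, 40, 52, 60)
y <- c(1, 4, 6, 11, 14, 25, 24)

# `method` may be abbreviated, as in cor.test() itself.
full  <- cor.test(x, y, method = "kendall")
short <- cor.test(x, y, method = "k")

# An extra argument such as `exact = FALSE` is the kind of option
# that `...` would carry through to cor.test().
approx <- cor.test(x, y, method = "spearman", exact = FALSE)

# BH adjustment is applied across all tests at once and never
# lowers a p-value.
p_raw <- c(full$p.value, approx$p.value, 0.04, 0.5)
p_bh  <- p.adjust(p_raw, method = "BH")
```

Because the correction depends on how many tests are adjusted together, the same raw p-value can yield a different padj when taxa are filtered out beforehand.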

Returns

A polars DataFrame

Examples

## Not run:
# 1. `sahmi_datasets` should be the output of all samples from `prep_dataset()`
# 2. `real_taxids_slsd` should be the output of `slsd()`
umi_list <- lapply(sahmi_datasets, function(dataset) {
    # barcode k-mer correlation test
    blsd <- blsd(dataset$kmer)
    real_taxids <- blsd$filter(pl$col("padj")$lt(0.05))$get_column("taxid")
    # only keep taxids that pass sample level signal denoising
    real_taxids <- real_taxids$filter(real_taxids$is_in(real_taxids_slsd))
    # remove contaminants
    real_taxids <- real_taxids$filter(
        real_taxids$is_in(attr(truly_microbe, "truly"))
    )
    # filter UMI data
    dataset$umi$filter(pl$col("taxid")$is_in(real_taxids))
})
## End(Not run)

See Also

https://github.com/sjdlabgroup/SAHMI

  • Maintainer: Yun Peng
  • License: MIT + file LICENSE
  • Last published: 2025-03-24