numbat 1.5.2 package

Haplotype-Aware CNV Analysis from scRNA-Seq

aggregate_counts

Utility function to make reference gene expression profiles

analyze_bulk

Call CNVs in a pseudobulk profile using the Numbat joint HMM

annot_consensus

Annotate a set of segments on a pseudobulk dataframe

annot_haplo_segs

Annotate haplotype segments after HMM decoding

annot_segs

Annotate copy number segments after HMM decoding

annot_theta_mle

Annotate the theta parameter for each segment

annot_theta_roll

Annotate rolling estimate of imbalance level theta

annotate_genes

Annotate genes on allele dataframe

approx_phi_post

Laplace approximation of the posterior of expression fold change phi

approx_theta_post

Laplace approximation of the posterior of allelic imbalance theta

binary_entropy

calculate entropy for a binary variable

calc_allele_lik

Calculate allele likelihoods

calc_allele_LLR

Calculate LLR for an allele HMM

calc_cluster_dist

Calculate expression distance matrix between cell populations

calc_exp_LLR

Calculate LLR for an expression HMM

calc_phi_mle_lnpois

Calculate the MLE of expression fold change phi

check_allele_df

Check the format of an allele dataframe

check_contam

check inter-individual contamination

check_exp_noise

check noise level

check_exp_ref

check the format of lambdas_ref

check_gtf_input

Check and format the GTF input

check_matrix

Check the format of a count matrix

check_segs_fix

check the format of a given consensus segment dataframe

check_segs_loh

Check the format of a given clonal LOH segment dataframe

choose_ref_cor

choose best reference for each cell based on correlation

classify_alleles

classify alleles using Viterbi and forward-backward

cnv_heatmap

Plot CNV heatmap

combine_bulk

Combine allele and expression pseudobulks

compute_posterior

Do Bayesian averaging to get posteriors

contract_nodes

Merge adjacent set of nodes

detect_clonal_loh

Call clonal LOH using SNP density. Recommended for cell lines or tumor ...

exp_hclust

Run smoothed expression-based hclust

expand_states

expand multi-allelic CNVs into separate entries in the single-cell pos...

fill_neu_segs

Fill neutral regions into consensus segments

filter_genes

filter for mutually expressed genes

find_common_diploid

Find the common diploid region in a group of pseudobulks

fit_bbinom

fit a Beta-Binomial model by maximum likelihood

fit_gamma

fit gamma maximum likelihood

fit_lnpois

fit a PLN model by maximum likelihood

fit_ref_sse

Fit a reference profile from multiple references using constrained lea...

fit_snp_rate

negative binomial model

generate_postfix

Generate alphabetical postfixes

genotype

Genotyping main function

get_allele_bulk

Aggregate into pseudobulk allele profile

get_allele_hmm

Get an allele HMM

get_allele_post

get CNV allele posteriors

get_bulk

Aggregate single-cell data into combined bulk expression and allele pr...

get_clone_post

Map cells to the phylogeny (or genotypes) based on CNV posteriors

get_exp_bulk

Aggregate into bulk expression profile

get_exp_likelihoods

get the single cell expression likelihoods

get_exp_post

compute single-cell expression posteriors

get_exp_sc

get the single cell expression dataframe

get_gtree

Get a tidygraph tree with simplified mutational history.

get_haplotype_post

Get phased haplotypes

get_inter_cm

Helper function to get inter-SNP distance

get_internal_nodes

Helper function to get the internal nodes of a dendrogram and the leaf...

get_joint_post

get joint posteriors

get_lambdas_bar

Get average reference expression profile based on single-cell ref choic...

get_move_cost

Get the cost of a mutation reassignment

get_move_opt

Get the least costly mutation reassignment

get_nodes_celltree

Get the internal nodes of a dendrogram and the leaves in each subtree

get_ordered_tips

Get ordered tips from a tree

get_segs_consensus

Extract consensus CNV segments

get_segs_neu

get neutral segments from multiple pseudobulks

get_snps

process VCFs into SNP dataframe

get_tree_post

Find maximum likelihood assignment of mutations on a tree

label_edges

Annotate the direct upstream or downstream mutations on the edges

label_genotype

Label the genotypes on a mutation graph

log_mem

Log memory usage

log_message

Log a message

make_group_bulks

Make a group of pseudobulks

mark_tumor_lineage

Mark the tumor lineage of a phylogeny

Modes

Get the modes of a vector

Numbat

Numbat R6 class

phi_hat_roll

Rolling estimate of expression fold change phi

phi_hat_seg

Estimate of expression fold change phi in a segment

plot_bulks

Plot a group of pseudobulk HMM profiles

plot_consensus

Plot consensus CNVs

plot_exp_roll

Plot single-cell smoothed expression magnitude heatmap

plot_mut_history

Plot mutational history

plot_phylo_heatmap

Plot single-cell CNV calls along with the clonal phylogeny

plot_psbulk

Plot a pseudobulk HMM profile

plot_sc_tree

Plot single-cell smoothed expression magnitude heatmap

pnorm.range.log

Get the total probability from a region of a normal pdf

preprocess_allele

Preprocess allele data

relevel_chrom

Relevel chromosome column

resolve_cnvs

Get unique CNVs from set of segments

retest_bulks

retest consensus segments on pseudobulks

retest_cnv

retest CNVs in a pseudobulk

return_missing_columns

Check the format of a given file

run_group_hmms

Run multiple HMMs

run_numbat

Run workflow to decompose tumor subclones

simes_p

Calculate Simes' p

simplify_history

Simplify the mutational history based on likelihood evidence

smooth_expression

filtering, normalization and capping

smooth_segs

Smooth the segments after HMM decoding

switch_prob_cm

predict phase switch probability as a function of genetic distance

t_test_pval

T-test wrapper, handles error for insufficient observations

test_multi_allelic

test for multi-allelic CNVs

theta_hat_roll

Rolling estimate of imbalance level theta

theta_hat_seg

Estimate of imbalance level theta in a segment

transfer_links

Annotate the direct upstream or downstream node on the edges

upgma

UPGMA and WPGMA clustering

viterbi_loh

Viterbi for clonal LOH detection

A computational method that infers copy number variations (CNVs) in cancer scRNA-seq data and reconstructs the tumor phylogeny. 'numbat' integrates signals from gene expression, allelic ratio, and population haplotype structures to accurately infer allele-specific CNVs in single cells and reconstruct their lineage relationship. 'numbat' can be used to: 1. detect allele-specific copy number variations from single cells; 2. differentiate tumor versus normal cells in the tumor microenvironment; 3. infer the clonal architecture and evolutionary history of profiled tumors. 'numbat' does not require tumor/normal-paired DNA or genotype data, but operates solely on the donor scRNA-seq data (for example, 10x Cell Ranger output). Additional examples and documentation are available at <https://kharchenkolab.github.io/numbat/>. For details on the method please see Gao et al. Nature Biotechnology (2022) <doi:10.1038/s41587-022-01468-y>.

  • Maintainer: Teng Gao
  • License: MIT + file LICENSE
  • Last published: 2026-02-04
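
Two index entries above, binary_entropy and simes_p, name standard statistical quantities. The sketch below illustrates the textbook definitions in Python. It is not the package's R code: the function signatures, the choice of base-2 logarithms, and the input checks are all assumptions made for illustration, and the package's own functions may differ in arguments and scaling.

```python
import math


def binary_entropy(p):
    """Shannon entropy, in bits, of a binary variable with P(1) = p.

    Illustrative only: the log base (2) is an assumption of this sketch.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    # By convention 0 * log(0) = 0, so a certain outcome has zero entropy.
    if p == 0.0 or p == 1.0:
        return 0.0
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))


def simes_p(p_values):
    """Simes combined p-value: min over i of n * p_(i) / i.

    p_(1) <= ... <= p_(n) are the input p-values sorted in ascending order.
    """
    if not p_values:
        raise ValueError("need at least one p-value")
    n = len(p_values)
    ranked = sorted(p_values)
    return min(n * p / rank for rank, p in enumerate(ranked, start=1))


if __name__ == "__main__":
    print(binary_entropy(0.5))            # maximal uncertainty for a binary variable
    print(simes_p([0.01, 0.04, 0.03]))    # combines three p-values into one
```

Entropy peaks at p = 0.5 and falls to zero as the outcome becomes certain. The Simes statistic is never larger than the Bonferroni-adjusted minimum p-value.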