Haplotype-Aware CNV Analysis from scRNA-Seq
Utility function to make reference gene expression profiles
Call CNVs in a pseudobulk profile using the Numbat joint HMM
Annotate a set of segments on a pseudobulk dataframe
Annotate haplotype segments after HMM decoding
Annotate copy number segments after HMM decoding
Annotate the theta parameter for each segment
Annotate rolling estimate of imbalance level theta
Annotate genes on allele dataframe
Laplace approximation of the posterior of expression fold change phi
Laplace approximation of the posterior of allelic imbalance theta
Calculate entropy for a binary variable
Calculate allele likelihoods
Calculate LLR for an allele HMM
Calculate expression distance matrix between cell populations
Calculate LLR for an expression HMM
Calculate the MLE of expression fold change phi
Check the format of an allele dataframe
Check inter-individual contamination
Check noise level
Check the format of lambdas_ref
Check and format the GTF input
Check the format of a count matrix
Check the format of a given consensus segment dataframe
Check the format of a given clonal LOH segment dataframe
Choose best reference for each cell based on correlation
Classify alleles using Viterbi and forward-backward
Plot CNV heatmap
Combine allele and expression pseudobulks
Do Bayesian averaging to get posteriors
Merge adjacent set of nodes
Call clonal LOH using SNP density. Recommended for cell lines or tumor ...
Run smoothed expression-based hclust
Expand multi-allelic CNVs into separate entries in the single-cell pos...
Fill neutral regions into consensus segments
Filter for mutually expressed genes
Find the common diploid region in a group of pseudobulks
Fit a Beta-Binomial model by maximum likelihood
Fit a gamma model by maximum likelihood
Fit a PLN model by maximum likelihood
Fit a reference profile from multiple references using constrained lea...
Negative binomial model
Generate alphabetical postfixes
Genotyping main function
Aggregate into pseudobulk allele profile
Get an allele HMM
Get CNV allele posteriors
Aggregate single-cell data into combined bulk expression and allele pr...
Map cells to the phylogeny (or genotypes) based on CNV posteriors
Aggregate into bulk expression profile
Get the single-cell expression likelihoods
Compute single-cell expression posteriors
Get the single-cell expression dataframe
Get a tidygraph tree with simplified mutational history.
Get phased haplotypes
Helper function to get inter-SNP distance
Helper function to get the internal nodes of a dendrogram and the leaf...
Get joint posteriors
Get average reference expression profile based on single-cell ref choic...
Get the cost of a mutation reassignment
Get the least costly mutation reassignment
Get the internal nodes of a dendrogram and the leaves in each subtree
Get ordered tips from a tree
Extract consensus CNV segments
Get neutral segments from multiple pseudobulks
Process VCFs into SNP dataframe
Find maximum likelihood assignment of mutations on a tree
Annotate the direct upstream or downstream mutations on the edges
Label the genotypes on a mutation graph
Log memory usage
Log a message
Make a group of pseudobulks
Mark the tumor lineage of a phylogeny
Get the modes of a vector
Numbat R6 class
Rolling estimate of expression fold change phi
Estimate of expression fold change phi in a segment
Plot a group of pseudobulk HMM profiles
Plot consensus CNVs
Plot single-cell smoothed expression magnitude heatmap
Plot mutational history
Plot single-cell CNV calls along with the clonal phylogeny
Plot a pseudobulk HMM profile
Plot single-cell smoothed expression magnitude heatmap
Get the total probability from a region of a normal pdf
Preprocess allele data
Relevel chromosome column
Get unique CNVs from set of segments
Retest consensus segments on pseudobulks
Retest CNVs in a pseudobulk
Check the format of a given file
Run multiple HMMs
Run workflow to decompose tumor subclones
Calculate Simes' p
Simplify the mutational history based on likelihood evidence
Filtering, normalization and capping
Smooth the segments after HMM decoding
Predict phase switch probability as a function of genetic distance
T-test wrapper, handles error for insufficient observations
Test for multi-allelic CNVs
Rolling estimate of imbalance level theta
Estimate of imbalance level theta in a segment
Annotate the direct upstream or downstream node on the edges
UPGMA and WPGMA clustering
Viterbi for clonal LOH detection
A computational method that infers copy number variations (CNVs) in cancer scRNA-seq data and reconstructs the tumor phylogeny. 'numbat' integrates signals from gene expression, allelic ratio, and population haplotype structures to accurately infer allele-specific CNVs in single cells and reconstruct their lineage relationship. 'numbat' can be used to: 1. detect allele-specific copy number variations from single cells; 2. differentiate tumor versus normal cells in the tumor microenvironment; 3. infer the clonal architecture and evolutionary history of profiled tumors. 'numbat' does not require tumor/normal-paired DNA or genotype data, but operates solely on the donor scRNA-seq data (for example, 10x Cell Ranger output). Additional examples and documentation are available at <https://kharchenkolab.github.io/numbat/>. For details on the method please see Gao et al. Nature Biotechnology (2022) <doi:10.1038/s41587-022-01468-y>.
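
Two of the statistical helpers named in the index, the entropy of a binary variable and Simes' p, have standard textbook definitions. The Python sketch below illustrates those definitions only; it is not the package's R code. The function names `binary_entropy` and `simes_p` are hypothetical, and the choice of base-2 logarithms for the entropy is an assumption.

```python
import math
from typing import Sequence


def binary_entropy(p: float) -> float:
    """Shannon entropy, in bits, of a Bernoulli(p) variable.

    Assumption: base-2 logarithm; the package may use a different base.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    # By convention 0 * log(0) = 0, so a certain outcome has zero entropy.
    if p in (0.0, 1.0):
        return 0.0
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))


def simes_p(p_values: Sequence[float]) -> float:
    """Simes' combined p-value: min over ranks i of n * p_(i) / i,
    where p_(1) <= ... <= p_(n) are the sorted input p-values."""
    n = len(p_values)
    if n == 0:
        raise ValueError("at least one p-value is required")
    ordered = sorted(p_values)
    return min(n * p / rank for rank, p in enumerate(ordered, start=1))


if __name__ == "__main__":
    print(binary_entropy(0.5))           # maximal uncertainty for a binary variable
    print(simes_p([0.04, 0.01, 0.03]))   # combines three per-test p-values
```

An entropy near zero means a binary call (for example, the presence or absence of an event in a cell) is close to certain, while Simes' procedure summarises several per-test p-values as a single value without assuming the tests are independent of one another in the strict sense required by simpler combinations.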