Measure cell-type specificity of cell-weighted Fold-changes
This function normalizes cwFold-changes within each gene to help visualize the cell-type specificity of DEGs. It then tests whether the cwFold-changes of any cell-type correlate poorly with the bulk DEGs. Finally, it identifies genes that may be specific to each cell-type.
cwFC: A matrix or data frame of cell-weighted fold-changes of DEGs. Rows are DEGs and columns are cell-types.
celltype_prop: A matrix or data frame of cell-type proportions. Rows are different cell-types and columns are different samples. These cell-type proportions can come from any source (not just scMappR).
DEG_list: An object whose first column contains gene symbols from the bulk dataset (they do not have to be in the signature matrix), whose second column contains the adjusted p-value, and whose third column contains the log2FC. A path to a .tsv file containing this information is also acceptable.
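gene_cutoff: Optional additional cut-off on the normalized cwFold-change. If set, a gene's normalized cwFold-change in a cell-type must also exceed this value for the gene to be assigned to that cell-type.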
sd_cutoff: Number of standard deviations or median absolute deviations to calculate outliers.
Returns
List with the following elements:
gene_level_investigation: data frame of genes showing the Euclidean distances between cwFold-change and null vector as well as whether the cwFold-changes are normally distributed.
celltype_level_investigation: data frame of Spearman's and Pearson's correlation between bulk DEGs and cwFold-changes.
cwFoldchange_vs_bulk_rank_change: data frame of the change in rank of DEG between the bulk fold-change and cwFold-change.
cwFoldChange_normalized: cwFold-change normalized such that each gene sums to 1.
cwFoldchange_gene_assigned: List, by cell-type, of the genes designated as differentially expressed in a cell-type specific manner.
cwFoldchange_gene_flagged_FP: Mapped cwFold-changes that are flagged as false positives. These are genes that are driven by the reciprocal ratio of cell-type proportions between case and control. These genes may be DE in a non-cell-type specific manner but are falsely assigned to cell-types with very large differences in proportion between conditions.
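The cell-type level outputs above (correlation with bulk DEGs and change in rank) can be illustrated with a small sketch. This is not the scMappR implementation: it is a standard-library Python illustration that assumes genes are ranked from the largest to the smallest fold-change and that ties receive average ranks. The gene names, the values, and the helper names `ranks`, `pearson`, and `spearman` are hypothetical.

```python
from statistics import mean

def ranks(values):
    """Rank values from largest (rank 1) to smallest; ties share the average rank."""
    order = sorted(range(len(values)), key=lambda i: -values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over a run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

# Hypothetical bulk fold-changes and cwFold-changes for one cell-type.
bulk_fc = {"GeneA": 3.0, "GeneB": 2.0, "GeneC": 1.0, "GeneD": 0.5}
cw_fc = {"GeneA": 0.2, "GeneB": 4.0, "GeneC": 1.5, "GeneD": 0.1}

genes = list(bulk_fc)
bulk_values = [bulk_fc[g] for g in genes]
cw_values = [cw_fc[g] for g in genes]

# Positive values mean the gene ranks higher in this cell-type than in bulk.
rank_change = {
    g: b - c for g, b, c in zip(genes, ranks(bulk_values), ranks(cw_values))
}
rho = spearman(bulk_values, cw_values)
```

A low correlation for one cell-type relative to the others suggests that its cwFold-changes reorder the bulk DEGs substantially, which is the kind of signal the cell-type level investigation reports.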
Details
cwFold-changes are re-normalized and re-processed to interrogate cell-type specificity at the level of the cell-type and at the level of the gene. At the level of the cell-type, cwFold-changes are correlated to bulk DEGs, and the difference in rank between bulk DEGs and cwFold-changes is also computed. At the level of the gene, cwFold-changes are re-normalized so that each gene sums to 1. The normality of each gene's distribution is tested with a Shapiro test. Outlier cell-types for each gene are then identified: a cell-type is an outlier if its normalized cwFold-change is more than sd_cutoff standard deviations above the mean (when the cwFold-changes are normally distributed) or more than sd_cutoff median absolute deviations above the median (when they are not). Cell-types considered outliers are then further filtered so that their normalized cwFold-changes are greater than the cell-type proportion and, if the user sets it, greater than gene_cutoff.
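The gene-level steps described above can be sketched as follows. This is a minimal illustration in standard-library Python, not the scMappR code. It makes several assumptions: absolute values are used for the normalization, the median absolute deviation is left unscaled, and the result of the normality test is passed in as a flag (`is_normal`) rather than computed. The function names, cell-type names, and numbers are hypothetical.

```python
from statistics import mean, median, stdev

def normalize_gene(cwfc_row):
    """Rescale one gene's cwFold-changes so their magnitudes sum to 1."""
    magnitudes = [abs(v) for v in cwfc_row]  # assumption: magnitudes are used
    total = sum(magnitudes)
    if total == 0:
        raise ValueError("all cwFold-changes are zero for this gene")
    return [m / total for m in magnitudes]

def outlier_celltypes(norm_row, sd_cutoff, is_normal):
    """Return indices of cell-types above centre + sd_cutoff * spread."""
    if is_normal:
        # Normally distributed: mean and standard deviation.
        centre, spread = mean(norm_row), stdev(norm_row)
    else:
        # Not normally distributed: median and (unscaled) MAD.
        centre = median(norm_row)
        spread = median([abs(v - centre) for v in norm_row])
    threshold = centre + sd_cutoff * spread
    return [i for i, v in enumerate(norm_row) if v > threshold]

def assign_celltypes(norm_row, proportions, outliers, gene_cutoff=None):
    """Keep outliers whose normalized value exceeds the cell-type proportion
    and, when provided, gene_cutoff."""
    kept = []
    for i in outliers:
        if norm_row[i] <= proportions[i]:
            continue
        if gene_cutoff is not None and norm_row[i] <= gene_cutoff:
            continue
        kept.append(i)
    return kept

# Hypothetical gene with cwFold-changes across four cell-types.
cell_types = ["Hepatocyte", "Kupffer", "Endothelial", "Stellate"]
cwfc_row = [8.0, 1.0, 0.5, 0.5]
proportions = [0.6, 0.2, 0.1, 0.1]  # hypothetical average cell-type proportions

norm_row = normalize_gene(cwfc_row)
outliers = outlier_celltypes(norm_row, sd_cutoff=3, is_normal=False)
assigned = assign_celltypes(norm_row, proportions, outliers, gene_cutoff=0.5)
specific = [cell_types[i] for i in assigned]
```

In this toy example the gene's normalized cwFold-change is concentrated in a single cell-type, so only that cell-type passes both the outlier test and the proportion and gene_cutoff filters.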