process_dgTMatrix_lists function

Count Matrix To Signature Matrix

This function takes a list of count matrices, processes them, calls cell-types, and generates signature matrices.

process_dgTMatrix_lists(
  dgTMatrix_list,
  name,
  species_name,
  naming_preference = -9,
  rda_path = "",
  panglao_set = FALSE,
  haveUMAP = FALSE,
  saveSCObject = FALSE,
  internal = FALSE,
  toSave = FALSE,
  path = NULL,
  use_sctransform = FALSE,
  test_ctname = "wilcox",
  genes_integrate = 2000,
  genes_include = FALSE
)

Arguments

  • dgTMatrix_list: A list of matrices of class dgTMatrix (sparse matrices compatible with Seurat). Row names should come from the same species in every matrix.
  • name: The name of the outputted signature matrices, cell-type preferences, and Seurat objects if you choose to save them.
  • species_name: Whether the gene symbols are "mouse" or "human"; use -9 if internal, as Panglao objects have gene symbol and Ensembl ID combined.
  • naming_preference: For cell-type naming, check whether cell-types from the inputted tissues are more likely to be named within one of the following categories: "brain", "epithelial", "endothelial", "blood", "connective", "eye", "epidermis", "Digestive", "Immune", "pancreas", "liver", "reproductive", "kidney", "respiratory".
  • rda_path: If saved, path to the directory where data from scMappR_data is downloaded.
  • panglao_set: Whether the inputted matrices are from Panglao, i.e. whether they are internal (T/F).
  • haveUMAP: Save the UMAPs - requires additional packages (see Seurat for details).
  • saveSCObject: Save the Seurat object as an RData object (T/F).
  • internal: Whether the function is being used as part of the internal processing of Panglao datasets (T/F).
  • toSave: Allow scMappR to write files in the current directory (T/F).
  • path: If toSave == TRUE, path to the directory where files will be saved.
  • use_sctransform: Whether to use sctransform instead of the Normalize/VariableFeatures/ScaleData pipeline (T/F).
  • test_ctname: The statistical test used to call cell-type markers; it must be a test available in Seurat.
  • genes_integrate: The number of genes to include in the integration anchors feature when combining datasets.
  • genes_include: Whether the signature matrix includes 2000 genes or all genes in the matrix (TRUE/FALSE).
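
As an illustration of the expected input shape, the following is a minimal sketch of building a named list of dgTMatrix objects with the Matrix package. The gene symbols, cell names, and the list name toy_dataset are made up for the example, and real inputs would be full single-cell count matrices.

```r
library(Matrix)

set.seed(1)

# Hypothetical toy counts: 5 genes x 4 cells (all names are made up).
counts <- matrix(
  rpois(20, lambda = 2),
  nrow = 5,
  dimnames = list(
    c("Rho", "Opn1sw", "Pax6", "Gfap", "Actb"),
    paste0("cell", 1:4)
  )
)

# Dense matrix -> sparse matrix -> triplet representation (class dgTMatrix).
sparse_counts <- as(Matrix(counts, sparse = TRUE), "TsparseMatrix")

# One named list element per dataset.
toProcess <- list(toy_dataset = sparse_counts)
class(toProcess$toy_dataset)
```

A list built this way could then be passed as the dgTMatrix_list argument, with species_name chosen to match the gene symbols in the row names.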

Returns

List with the following elements:

  • wilcoxon_rank_mat_t: A dataframe containing the signature matrix of ranks (-log10(Padj) * sign(fold-change)).

  • wilcoxon_rank_mat_or: A dataframe containing the signature matrix of odds-ratios.

  • generes: All cell-type markers for each cell-type with p-value and fold changes.

  • cellLabel: A matrix where each row is a cluster and each column provides information on the cell-type. Columns give the cluster from Seurat, the cell-type labels from CellMarker and Panglao obtained with Fisher's exact test and GSVA, and the top 30 markers per cluster.
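
The rank formula in wilcoxon_rank_mat_t, -log10(Padj) * sign(fold-change), can be worked through on a small table. This is only a sketch: the marker values are invented, and the column names p_val_adj and avg_logFC are assumptions modelled on Seurat-style marker output rather than names confirmed by this page.

```r
# Hypothetical marker statistics for one cell-type (values are made up).
markers <- data.frame(
  gene      = c("Rho", "Gfap", "Actb"),
  p_val_adj = c(1e-10, 1e-3, 0.5),
  avg_logFC = c(2.1, -1.4, 0.05)
)

# Signed rank: the magnitude comes from the adjusted p-value,
# the sign from the direction of the fold-change.
markers$rank <- -log10(markers$p_val_adj) * sign(markers$avg_logFC)
markers
```

Strongly up-regulated markers receive large positive ranks, strongly down-regulated ones receive large negative ranks, and non-significant genes stay close to zero.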

Details

This function is a one-line wrapper that processes count matrices into a signature matrix. It combines process_from_count with two methods of identifying cell-type identities (GSVA and Fisher's exact test). It then takes the resulting cell-type markers and converts them into signature matrices of p-value ranks and odds ratios. It saves the Seurat object (if chosen with saveSCObject), the cell-type identities from GSVA (as their own object), and the signature matrices. Cell-type marker outputs are also saved in the generes .RData list, which contains, for each cell-type, all of the markers found with the FindMarkers function. The names of the generes lists and of the signature matrices are kept.
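
To illustrate the idea behind labelling a cluster with Fisher's exact test, the sketch below tests whether a cluster's markers overlap a reference marker set more than expected by chance. It uses base R only and is not scMappR's implementation; the gene universe and both marker sets are hypothetical.

```r
# Hypothetical gene universe and marker sets (all names are made up).
universe          <- paste0("gene", 1:100)
cluster_markers   <- universe[1:20]   # markers called for one cluster
reference_markers <- universe[11:30]  # reference markers for one cell-type

# 2 x 2 contingency table of membership in each set.
in_cluster   <- universe %in% cluster_markers
in_reference <- universe %in% reference_markers
tab <- table(in_cluster = in_cluster, in_reference = in_reference)

# One-sided test: is the overlap larger than expected by chance?
res <- fisher.test(tab, alternative = "greater")
res$p.value
res$estimate  # conditional maximum-likelihood estimate of the odds ratio
```

Repeating such a test against every reference cell-type and keeping the most significant result is one simple way a cluster could be assigned a label.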

Examples

data(sm)
toProcess <- list(example = sm)
tst1 <- process_dgTMatrix_lists(dgTMatrix_list = toProcess,
                                name = "testProcess",
                                species_name = "mouse",
                                naming_preference = "eye",
                                rda_path = "")
  • Maintainer: Dustin Sokolowski
  • License: GPL-3
  • Last published: 2023-06-30