get_table_from_maf() R function from [ICBioMark]

Produce a Mutation Matrix from a MAF

Given a mutation annotation dataset with columns for sample barcode, gene name, and mutation type, this function reformulates it as a mutation matrix: rows denote samples, columns denote gene/mutation type combinations, and each entry gives the number of mutations observed. The matrix is likely to be very sparse, so it is stored as a sparse matrix for efficiency.
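The reshaping described above can be sketched with the Matrix package. This is a minimal illustration on invented toy data, not the package's actual implementation: the toy values, the gene-major column ordering, and the variable names (toy_maf, toy_matrix, and so on) are all assumptions made for the example.

```r
library(Matrix)  # recommended package shipped with standard R distributions

# Toy MAF-like table; all values are invented for illustration only.
toy_maf <- data.frame(
  Tumor_Sample_Barcode   = c("S1", "S1", "S2", "S2", "S2"),
  Hugo_Symbol            = c("GENE1", "GENE1", "GENE2", "GENE1", "GENE2"),
  Variant_Classification = c("NS", "NS", "S", "S", "S"),
  stringsAsFactors = FALSE
)

samples   <- c("S1", "S2", "S3")   # S3 has no mutations in the toy table
genes     <- c("GENE1", "GENE2")
mut_types <- c("NS", "S")          # hypothetical mutation type labels

# Assumed column layout: every mutation type for GENE1, then for GENE2.
col_names <- paste(rep(genes, each = length(mut_types)), mut_types, sep = "_")

# Map each mutation record to a (row, column) position.
row_idx <- match(toy_maf$Tumor_Sample_Barcode, samples)
col_idx <- match(
  paste(toy_maf$Hugo_Symbol, toy_maf$Variant_Classification, sep = "_"),
  col_names
)

# Drop records whose sample or gene/type combination is not requested.
keep <- !is.na(row_idx) & !is.na(col_idx)

# sparseMatrix() adds the x values of repeated (i, j) pairs,
# so each entry ends up holding a mutation count.
toy_matrix <- sparseMatrix(
  i = row_idx[keep],
  j = col_idx[keep],
  x = rep(1, sum(keep)),
  dims = c(length(samples), length(col_names)),
  dimnames = list(samples, col_names)
)

print(toy_matrix)
```

Only the nonzero counts are stored, which is why a sparse representation suits data in which most sample/gene pairs carry no mutation.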


Usage

get_table_from_maf(
  maf,
  sample_list = NULL,
  gene_list = NULL,
  acceptable_genes = NULL,
  for_biomarker = "TIB",
  include_synonymous = TRUE,
  dictionary = NULL
)

Arguments

maf: (dataframe) A table of annotated mutations containing the columns 'Tumor_Sample_Barcode', 'Hugo_Symbol', and 'Variant_Classification'.
sample_list: (character) Optional parameter specifying the set of samples to include in the mutation matrix.
gene_list: (character) Optional parameter specifying the set of genes to include in the mutation matrix.
acceptable_genes: (character) Optional parameter specifying a set of acceptable genes, for example those which are in an Ensembl database.
for_biomarker: (character) Used for defining a dictionary of mutations. See the function get_mutation_dictionary() for details.
include_synonymous: (logical) Optional parameter specifying whether to include synonymous mutations in the mutation matrix.
dictionary: (character) Optional parameter directly specifying the mutation dictionary to use. See the function get_mutation_dictionary() for details.

Returns

A list with the following entries:

matrix: A mutation matrix, stored as a sparse matrix, giving the number of mutations observed for each sample and each gene/mutation type combination.
sample_list: A vector of characters specifying the samples included in the matrix: the rows of the mutation matrix correspond to each of these.
gene_list: A vector of characters specifying the genes included in the matrix.
mut_types_list: A vector of characters specifying the mutation types (as grouped into an appropriate dictionary) to be included in the matrix.
col_names: A vector of characters identifying the columns of the mutation matrix. Each entry consists of two parts separated by the character '_', the first identifying the gene in question and the second identifying the mutation type, e.g. 'GENE1_NS', where 'GENE1' is an element of gene_list and 'NS' is an element of the dictionary vector.
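As an illustration of the col_names format, the sketch below splits labels back into their gene and mutation type parts. The labels are hypothetical, and splitting at the last underscore assumes that mutation type labels themselves contain no '_' (gene names may).

```r
# Hypothetical column labels in the 'GENE_TYPE' format described above.
col_names <- c("GENE1_NS", "GENE1_S", "HLA_A_NS")

# Split at the LAST underscore so gene names containing '_' stay intact.
gene     <- sub("_[^_]*$", "", col_names)  # remove the trailing '_TYPE'
mut_type <- sub("^.*_", "", col_names)     # keep only what follows the last '_'

parsed <- data.frame(
  col_name = col_names,
  gene     = gene,
  mut_type = mut_type,
  stringsAsFactors = FALSE
)
print(parsed)
```

A table like this can be used to select all columns belonging to one gene or one mutation type when subsetting the matrix.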

Examples


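library(ICBioMark)

# Use the preloaded example dataset example_maf_data and build a mutation
# matrix restricted to samples SAMPLE_1 ... SAMPLE_100.
# (The result is named mutation_table to avoid masking base::table.)
mutation_table <- get_table_from_maf(example_maf_data$maf, sample_list = paste0("SAMPLE_", 1:100))

print(names(mutation_table))               # entries of the returned list
print(mutation_table$matrix[1:10, 1:10])   # top-left corner of the sparse matrix
print(mutation_table$col_names[1:10])      # first ten gene/mutation type labels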

ICBioMark package

Maintainer: Jacob R. Bradley
License: MIT + file LICENSE
Last published: 2021-11-15
