get_mutation_tables() R function from [ICBioMark]

Produce Training, Validation and Test Matrices

This function i) splits a mutation dataset into training, validation and test components, and ii) converts each component from annotated mutation format to a sparse mutation matrix, as described in the function get_table_from_maf().


get_mutation_tables(
  maf,
  split = c(train = 0.7, val = 0.15, test = 0.15),
  sample_list = NULL,
  gene_list = NULL,
  acceptable_genes = NULL,
  for_biomarker = "TIB",
  include_synonymous = TRUE,
  dictionary = NULL,
  seed_id = 1234
)

Arguments

maf: (dataframe) A table of annotated mutations containing the columns 'Tumor_Sample_Barcode', 'Hugo_Symbol', and 'Variant_Classification'.
split: (double) A vector of three positive values with names 'train', 'val' and 'test'. Specifies the proportions into which to split the dataset.
sample_list: (character) Optional parameter specifying the set of samples to include in the mutation matrices.
gene_list: (character) Optional parameter specifying the set of genes to include in the mutation matrices.
acceptable_genes: (character) Optional parameter specifying a set of acceptable genes, for example those which are in an Ensembl database.
for_biomarker: (character) Used for defining a dictionary of mutations. See the function get_mutation_dictionary() for details.
include_synonymous: (logical) Optional parameter specifying whether to include synonymous mutations in the mutation matrices.
dictionary: (character) Optional parameter directly specifying the mutation dictionary to use. See the function get_mutation_dictionary() for details.
seed_id: (numeric) Input value for the function set.seed().
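
To make the roles of split and seed_id concrete, the following base R sketch partitions a set of sample IDs into three groups. It is an illustration only, not ICBioMark's internal code: the variable names (split_props, partition), the normalisation of the proportions and the rounding rule are all assumptions.

```r
# Hypothetical sketch of a seeded train/val/test partition of sample IDs.
# This is NOT the package's implementation; names and rounding are assumed.
split_props <- c(train = 0.7, val = 0.15, test = 0.15)
samples <- paste0("SAMPLE_", 1:100)

set.seed(1234)                            # plays the role of seed_id
props <- split_props / sum(split_props)   # assumed: normalise so the proportions sum to 1

shuffled <- sample(samples)               # random permutation of the sample IDs
n <- length(shuffled)
n_train <- round(props[["train"]] * n)
n_val <- round(props[["val"]] * n)

partition <- list(
  train = shuffled[seq_len(n_train)],
  val   = shuffled[n_train + seq_len(n_val)],
  test  = tail(shuffled, n - n_train - n_val)  # whatever remains goes to test
)

print(lengths(partition))
```

Because the seed is fixed before sampling, rerunning the sketch gives the same partition, which is the reproducibility that a seed argument is meant to provide.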

Returns

A list of three items with names 'train', 'val' and 'test'. Each element contains a sparse mutation matrix for the samples in that branch, alongside the other information described in the output of the function get_table_from_maf().
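
As a rough picture of what a sparse mutation matrix looks like, the sketch below builds one from a toy annotated mutation table using the Matrix package. The layout (samples as rows, gene and variant-classification combinations as columns, entries counting mutations) and the toy data are assumptions made for illustration; see get_table_from_maf() for the structure the package actually returns.

```r
library(Matrix)

# Toy annotated mutation table with the three columns required of 'maf'.
# The data are invented for illustration.
maf <- data.frame(
  Tumor_Sample_Barcode = c("S1", "S1", "S2", "S3", "S3"),
  Hugo_Symbol = c("TP53", "KRAS", "TP53", "EGFR", "EGFR"),
  Variant_Classification = c("Missense_Mutation", "Silent", "Missense_Mutation",
                             "Missense_Mutation", "Missense_Mutation"),
  stringsAsFactors = FALSE
)

# Assumed layout: one row per sample, one column per gene/mutation-type pair.
samples <- unique(maf$Tumor_Sample_Barcode)
feature <- paste(maf$Hugo_Symbol, maf$Variant_Classification, sep = "_")
features <- unique(feature)

# sparseMatrix() sums the values of repeated (i, j) pairs, so each entry
# counts the mutations of that type observed in that sample.
mutation_matrix <- sparseMatrix(
  i = match(maf$Tumor_Sample_Barcode, samples),
  j = match(feature, features),
  x = rep(1, nrow(maf)),
  dims = c(length(samples), length(features)),
  dimnames = list(samples, features)
)

print(mutation_matrix)
```

Only non-zero entries are stored, which matters because most sample and gene combinations in a real mutation dataset carry no mutation at all.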

Examples


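library(ICBioMark)

# Split the example dataset, restricted to samples SAMPLE_1 to SAMPLE_100
tables <- get_mutation_tables(example_maf_data$maf, sample_list = paste0("SAMPLE_", 1:100))

# Names of the three branches, then the contents of the training branch
print(names(tables))
print(names(tables$train))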

ICBioMark package

Maintainer: Jacob R. Bradley
License: MIT + file LICENSE
Last published: 2021-11-15