get_biomarker_tables function

Get True Biomarker Values on Training, Validation and Test Sets

A function, similar to get_mutation_tables(), that returns the true biomarker values for the training, validation and test sets.

get_biomarker_tables(
  maf,
  biomarker = "TIB",
  sample_list = NULL,
  gene_list = NULL,
  biomarker_name = NULL,
  tables = NULL,
  split = c(train = 0.7, val = 0.15, test = 0.15),
  seed_id = 1234
)

Arguments

  • maf: (dataframe) A table of annotated mutations containing the columns 'Tumor_Sample_Barcode', 'Hugo_Symbol', and 'Variant_Classification'.
  • biomarker: (character) Which biomarker needs calculating? If "TMB" or "TIB", then appropriate mutation types will be selected. Otherwise, will be interpreted as a vector of characters denoting mutation types to include.
  • sample_list: (character) Vector of characters giving a list of values of Tumor_Sample_Barcode to include.
  • gene_list: (character) Vector of characters giving a list of genes to include in calculation of biomarker.
  • biomarker_name: (character) Name of biomarker. Only needed if biomarker is not "TMB" or "TIB".
  • tables: (list) Optional parameter, the output of a call to get_mutation_tables(), which already has a train/val/test split.
  • split: (numeric) Optional parameter directly specifying the proportions of a train/val/test split, as a named vector with elements 'train', 'val' and 'test'.
  • seed_id: (numeric) Input value for the function set.seed().
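
To make the roles of split and seed_id concrete, here is a minimal base R sketch of how named proportions and a seed could partition a vector of sample IDs. The helper split_samples() is hypothetical and is not part of the package; the package's own splitting logic may differ, for example in how it rounds group sizes.

```r
# Hypothetical helper (an illustration, not the package's implementation):
# partition sample IDs into train/val/test groups by proportion.
split_samples <- function(sample_ids,
                          split = c(train = 0.7, val = 0.15, test = 0.15),
                          seed_id = 1234) {
  # The proportions are expected to sum to one.
  stopifnot(abs(sum(split) - 1) < 1e-8)

  # Fixing the seed makes the shuffle, and hence the split, reproducible.
  set.seed(seed_id)
  shuffled <- sample(sample_ids)

  n <- length(shuffled)
  n_train <- round(split[["train"]] * n)
  n_val <- round(split[["val"]] * n)
  n_test <- n - n_train - n_val  # the test set takes whatever remains

  list(
    train = shuffled[seq_len(n_train)],
    val = shuffled[n_train + seq_len(n_val)],
    test = shuffled[n_train + n_val + seq_len(n_test)]
  )
}

parts <- split_samples(paste0("SAMPLE_", 1:100))
print(sapply(parts, length))
```

Calling the helper twice with the same seed_id gives the same partition, which is the reason for exposing the seed as an argument.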

Returns

A list of three objects: 'train', 'val' and 'test'. Each is a dataframe with two columns, giving the sample ID and the biomarker value.
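
The shape of the return value can be illustrated with toy data in base R. The sketch below counts mutations of a chosen type per sample and arranges the results as a list of three two-column dataframes. The helper count_biomarker(), the toy mutation table and the output column name missense_count are all assumptions made for illustration; the column names the package actually uses are not stated here.

```r
# Toy mutation table with the three columns named in the 'maf' argument.
toy_maf <- data.frame(
  Tumor_Sample_Barcode = c("S1", "S1", "S2", "S3", "S3", "S3", "S4"),
  Hugo_Symbol = c("TP53", "KRAS", "TP53", "EGFR", "KRAS", "TP53", "BRAF"),
  Variant_Classification = c("Missense_Mutation", "Silent", "Frame_Shift_Del",
                             "Missense_Mutation", "Missense_Mutation", "Silent",
                             "Missense_Mutation"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: for each sample, count mutations whose type is in 'types'.
count_biomarker <- function(maf, sample_ids, types, name) {
  kept <- maf[maf$Variant_Classification %in% types, ]
  counts <- vapply(
    sample_ids,
    function(id) sum(kept$Tumor_Sample_Barcode == id),
    integer(1)
  )
  out <- data.frame(
    Tumor_Sample_Barcode = sample_ids,
    value = unname(counts),
    stringsAsFactors = FALSE
  )
  names(out)[2] <- name  # second column carries the biomarker name
  out
}

# One dataframe per set, mirroring the 'train', 'val' and 'test' elements.
toy_tables <- list(
  train = count_biomarker(toy_maf, c("S1", "S3"), "Missense_Mutation", "missense_count"),
  val = count_biomarker(toy_maf, "S2", "Missense_Mutation", "missense_count"),
  test = count_biomarker(toy_maf, "S4", "Missense_Mutation", "missense_count")
)
print(toy_tables$train)
```

Samples with no mutation of the selected type still appear, with a value of zero, because the counts are taken over the requested sample IDs rather than over the rows of the mutation table.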

Examples

print(head(get_biomarker_tables(example_maf_data$maf, sample_list = paste0("SAMPLE_", 1:100))))
  • Maintainer: Jacob R. Bradley
  • License: MIT + file LICENSE
  • Last published: 2021-11-15
