pred_first_fit() R function from [ICBioMark]

First-Fit Predictive Model with Group Lasso

This function implements the first-fit procedure described in Bradley and Cannings (2021). At a minimum it requires a generative model, a training mutation matrix, and a dataframe of gene lengths as input.


pred_first_fit(
  gen_model,
  lambda = exp(seq(-16, -24, length.out = 100)),
  biomarker = "TMB",
  marker_mut_types = c("NS", "I"),
  training_matrix,
  gene_lengths,
  marker_training_values = NULL,
  K_method = max,
  free_genes = c()
)

Arguments

gen_model: (list) A generative mutation model, fitted by fit_gen_model().
lambda: (numeric) A vector of penalisation weights for input to the group lasso optimiser gglasso.
biomarker: (character) The biomarker in question. If "TMB" or "TIB", the subsequent argument marker_mut_types is defined automatically.
marker_mut_types: (character) The set of mutation type groupings constituting the biomarker being estimated. Should be a vector of elements drawn from the mut_types_list vector in the 'names' attribute of gen_model.
training_matrix: (sparse matrix) A sparse matrix of mutations in the training dataset, produced by get_mutation_tables().
gene_lengths: (dataframe) A table with two columns: Hugo_Symbol and max_cds, providing the lengths of the genes to be modelled.
marker_training_values: (dataframe) A dataframe containing two columns: 'Tumor_Sample_Barcode', containing the sample IDs for the training dataset, and a second column containing training values for the biomarker in question.
K_method: (function) How to select a representative biomarker value from the training dataset. Defaults to max().
free_genes: (character) Which genes should escape penalisation (for example when augmenting a pre-existing panel).
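
As a rough illustration of the input shapes described above, the following base R sketch builds a penalty grid using the same expression as the default for lambda, plus small gene_lengths and marker_training_values tables. The gene symbols, lengths, sample barcodes, and the name of the second column (TMB here) are invented placeholders, not values from the package's example data. The last two lines assume that K_method is applied directly to the vector of training biomarker values, which this page does not state explicitly.

```r
# Penalty grid: 100 log-spaced values, decreasing from exp(-16) to exp(-24)
# (the same expression as the documented default for `lambda`).
lambda_grid <- exp(seq(-16, -24, length.out = 100))

# gene_lengths: one row per gene, using the two documented column names.
# Gene symbols and lengths are invented placeholders.
gene_lengths <- data.frame(
  Hugo_Symbol = c("GENE_A", "GENE_B", "GENE_C"),
  max_cds     = c(1200, 3400, 560),
  stringsAsFactors = FALSE
)

# marker_training_values: sample IDs plus one column of biomarker values.
# The second column name ("TMB") and all values are invented placeholders.
marker_training_values <- data.frame(
  Tumor_Sample_Barcode = c("SAMPLE_1", "SAMPLE_2", "SAMPLE_3"),
  TMB                  = c(12, 85, 240),
  stringsAsFactors = FALSE
)

# Assumption: K_method is applied to the vector of training biomarker values.
# With the default `max`, the representative value is the largest one observed.
K_method <- max
representative_value <- K_method(marker_training_values$TMB)
```

Tables shaped like these could then be passed as the gene_lengths and marker_training_values arguments, provided the gene names match those used when fitting gen_model.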

Returns

A list of six elements:

fit: Output of call to gglasso.
panel_genes: A matrix in which each row corresponds to a gene and each column to an iteration of the group lasso with a different penalty factor. Each element is a boolean indicating whether that gene was selected for the panel in that iteration.
panel_lengths: A vector giving total panel length for each gglasso iteration.
p: The vector of weights used in the optimisation procedure.
K: The bias penalty factor used in the optimisation procedure.
names: Gene and mutation type information as used when fitting the generative model.
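
To make the shape of the return value more concrete, here is a sketch that works on a hand-built stand-in for the returned list rather than on real output. Only panel_genes and panel_lengths are mocked, all values are invented, and the code assumes that the rows of panel_genes are named by gene. It picks the largest panel whose total length fits under a hypothetical length budget.

```r
# Hand-built stand-in for two elements of the returned list (values invented).
# Rows are genes, columns are gglasso iterations (one per lambda value).
mock_fit <- list(
  panel_genes = matrix(
    c(FALSE, TRUE,  TRUE,
      FALSE, FALSE, TRUE,
      FALSE, TRUE,  TRUE),
    nrow = 3, byrow = TRUE,
    dimnames = list(c("GENE_A", "GENE_B", "GENE_C"), NULL)
  ),
  panel_lengths = c(0, 1760, 5160)
)

# Hypothetical upper limit on total panel length.
length_budget <- 2000

# Iterations whose panel fits the budget, then the longest panel among them.
within_budget <- which(mock_fit$panel_lengths <= length_budget)
chosen_iteration <- within_budget[which.max(mock_fit$panel_lengths[within_budget])]

# Genes flagged TRUE in the chosen column (assumes gene names are row names).
selected_genes <- rownames(mock_fit$panel_genes)[mock_fit$panel_genes[, chosen_iteration]]
```

The same indexing should carry over to a real result such as example_first_fit, as long as its panel_genes matrix carries gene names as row names.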

Examples


example_first_fit <- pred_first_fit(example_gen_model, lambda = exp(seq(-9, -14, length.out = 100)),
                                    training_matrix = example_tables$train$matrix,
                                    gene_lengths = example_maf_data$gene_lengths)

ICBioMark package

Maintainer: Jacob R. Bradley
License: MIT + file LICENSE
Last published: 2021-11-15
