A Two-Stage Approach to Maximize Interpretability of Drug Response Models Based on Multiple Molecular Data Types
Returns the regression coefficients from a TANDEM fit
Estimates predictive performance via nested cross-validation
Creates a prediction using a tandem-object
Determines the relative contribution per data type
Fits a TANDEM model by performing a two-stage regression
A two-stage regression method for settings where the input data types are correlated, for example gene expression and methylation in drug response prediction. In the first stage, it uses the upstream features (such as methylation) to predict the response variable (such as drug response). In the second stage, it uses the downstream features (such as gene expression) to predict the residuals of the first stage. In our manuscript (Aben et al., 2016, <doi:10.1093/bioinformatics/btw449>), we show that using TANDEM prevents the model from being dominated by gene expression and that the features selected by TANDEM are more interpretable.
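
The two-stage idea described above can be illustrated with a minimal sketch. This is not the TANDEM package's implementation or API. It assumes one upstream and one downstream feature and swaps in ordinary least squares for whatever regression the package uses. The names `simple_ols`, `two_stage_fit`, and `two_stage_predict`, and the toy data, are hypothetical and exist only for illustration.

```python
def simple_ols(x, y):
    """Fit y ~ intercept + slope * x by ordinary least squares (one feature)."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return intercept, slope


def two_stage_fit(upstream, downstream, response):
    """Stage 1: response ~ upstream. Stage 2: stage-1 residuals ~ downstream."""
    b0_up, b1_up = simple_ols(upstream, response)
    residuals = [yi - (b0_up + b1_up * ui) for ui, yi in zip(upstream, response)]
    b0_down, b1_down = simple_ols(downstream, residuals)
    return {
        "intercept": b0_up + b0_down,
        "upstream_coef": b1_up,
        "downstream_coef": b1_down,
    }


def two_stage_predict(model, upstream, downstream):
    """The prediction is the sum of the two stage-wise fits."""
    return [
        model["intercept"] + model["upstream_coef"] * ui + model["downstream_coef"] * di
        for ui, di in zip(upstream, downstream)
    ]


# Toy data (hypothetical): the downstream feature is correlated with the
# upstream feature, mimicking gene expression that partly reflects methylation.
upstream = [0.0, 1.0, 2.0, 3.0]
extra = [1.0, -1.0, -1.0, 1.0]  # downstream-only signal
downstream = [u + e for u, e in zip(upstream, extra)]
response = [1.0 + 2.0 * u + 0.5 * e for u, e in zip(upstream, extra)]

model = two_stage_fit(upstream, downstream, response)
preds = two_stage_predict(model, upstream, downstream)
```

Because the upstream feature is fitted first, it receives all the signal it can explain, and the downstream feature only accounts for what remains in the residuals. In this toy setup the downstream coefficient is therefore much smaller than the upstream one, even though the two features are correlated.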