Inner and Outer Cross Validations for Lasso Elastic Net Survival predictive models and Classification
The function performs cross validation for Lasso, Elastic net and Ridge regression models based on fixed or top selected metabolites from CVLasoelacox, with the classifier validated on an independent sample for the survival analysis and classification. The survival analysis is based on the selected metabolites in the presence or absence of prognostic factors.
Survival: A vector of survival times with length equal to the number of subjects
Censor: A vector of censoring indicator
Prognostic: A dataframe containing possible prognostic factor(s) and/or treatment effect to be used in the model.
Mdata: A metabolic profile matrix of any size, in which the number of rows equals the number of metabolites and the number of columns equals the number of patients.
Fold: Number of folds to be used for the cross validation. Its value ranges between 3 and the number of subjects in the dataset.
Ncv: Number of validations to be carried out for the outer loop. The default is 25.
Nicv: Number of validations to be carried out for the inner loop. The default is 5.
Alpha: The mixing parameter for glmnet (see glmnet). The range is 0 <= Alpha <= 1. The default is 1.
TopK: The list of top metabolites. Usually these are the metabolites most frequently selected by the function CVLasoelacox.
Weights: A logical flag indicating whether fixed or non-fixed weights should be used during the classifier evaluation. The default is FALSE.
Returns
An object of class fcv is returned with the following values:
Runtime: A vector of runtime for each iteration measured in seconds.
Fold: Number of folds used.
Ncv: Number of outer cross validations used.
Nicv: Number of inner cross validations used.
TopK: The top metabolites used.
HRInner: A 3-way array in which the first, second, and third dimensions correspond to Nicv, 1, and Ncv respectively. It contains the estimated HR for the low risk group on the out-of-bag data.
HRTest: A matrix of survival information for the test dataset based on the out-of-bag data. It has three columns representing the estimated HR and the lower and upper limits of its 95% confidence interval.
Weight: A matrix with as many columns as there are TopK metabolites and Ncv rows. Note that the weights are estimated as the colMeans of the coefficient matrix returned from the inner cross validations.
Details
The function performs cross validation for Lasso, Elastic net and Ridge regression models based on fixed or top selected metabolites from CVLasoelacox, with the classifier validated on an independent sample for the survival analysis and classification. The survival analysis is based on the selected metabolites in the presence or absence of prognostic factors. The classifier is built on the weights obtained from the inner cross validation results and is tested on out-of-bag data. These weights can be fixed or can be updated at each outer iteration. If the weights are not fixed, patients are classified using majority votes. Otherwise, the weights obtained from the inner cross validations are summarized by their mean and used in the classifier. Inner cross validations are performed by calling the function CVLasoelacox. The hazard ratio for the low risk group is estimated using out-of-bag data.
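The two classification strategies described above can be illustrated with a minimal base R sketch. It is not the package's internal code: the simulated coefficient matrix, the object names (coef_inner, test_x and so on), the median cutoff on the risk score, and the convention that scores at or below the cutoff are "low" risk are all assumptions made for illustration.

```r
set.seed(1)
n_met   <- 4   # number of TopK metabolites (hypothetical)
n_inner <- 5   # number of inner cross validations (odd, so votes cannot tie)
n_test  <- 6   # number of out-of-bag patients (hypothetical)

# Hypothetical coefficient matrix: one row per inner cross validation
coef_inner <- matrix(rnorm(n_inner * n_met), nrow = n_inner,
                     dimnames = list(NULL, paste0("Met", 1:n_met)))

# Hypothetical out-of-bag data: patients in rows, metabolites in columns
test_x <- matrix(rnorm(n_test * n_met), nrow = n_test,
                 dimnames = list(paste0("P", 1:n_test), colnames(coef_inner)))

# Mean weights: summarize the inner coefficients by their column means,
# then classify on a single risk score (median cutoff is an assumption)
mean_w     <- colMeans(coef_inner)
score_mean <- as.vector(test_x %*% mean_w)
group_mean <- ifelse(score_mean <= median(score_mean), "low", "high")

# Majority vote: classify once per inner run, then keep the most frequent label
scores     <- test_x %*% t(coef_inner)                    # n_test x n_inner
votes_low  <- apply(scores, 2, function(s) s <= median(s)) # TRUE = "low" vote
group_vote <- ifelse(rowMeans(votes_low) > 0.5, "low", "high")

data.frame(patient = rownames(test_x), group_mean, group_vote)
```

The two columns of the resulting data frame need not agree, since averaging the weights before scoring and voting across separately scored runs are different summaries of the same inner results.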
Examples
## FIRSTLY SIMULATING A METABOLIC SURVIVAL DATA
Data = MSData(nPatients = 100, nMet = 150, Prop = 0.5)

## USING THE FUNCTION
Results = Icvlasoel(Data$Survival, Data$Censor, Data$Prognostic,
                    t(Data$Mdata), Fold = 3, Ncv = 5, Nicv = 7, Alpha = 1,
                    TopK = colnames(Data$Mdata[, 80:100]), Weights = FALSE)

## NUMBER OF Outer CV
Results@Ncv

## NUMBER OF Inner CV
Results@Nicv

## HR of low risk group for the Inner CV
Results@HRInner

## HR of low risk group for the out of bag dataset
Results@HRTest

## The weight for the analysis
Results@Weight