Cross validation for the Metabolite specific analysis
The function performs cross validation for each metabolite. The number of folds determines how the data are divided into training and test datasets. The classifier is built on the training dataset and then validated on the test dataset.
Fold: Number of folds into which the dataset is divided. Default is 3, which implies the dataset is divided into three groups: 2/3 of the dataset forms the training dataset and 1/3 is used to test the results.
Survival: A vector of survival times with length equal to the number of subjects.
Mdata: A large or small metabolic profile matrix, where the number of rows should equal the number of metabolites and the number of columns should equal the number of patients.
Censor: A vector of censoring indicators.
Reduce: A boolean parameter indicating whether the metabolic profile matrix should be reduced. Default is TRUE, in which case a larger metabolic profile matrix is reduced by a supervised PCA approach and the first principal component is extracted from the reduced matrix to be used in the classifier.
Select: Number of metabolites (default is 15) to be selected by supervised PCA. This is valid only if the argument Reduce=TRUE.
Prognostic: A dataframe containing possible prognostic factor(s) and/or treatment effect to be used in the model.
Quantile: The cutoff value for the classifier. Default is the median cutoff.
Ncv: The number of cross validation loops. Default is 50, but at least 100 is recommended.
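The effect of Fold on the split sizes can be illustrated with a minimal base R sketch. The variable names (n.patients, n.test, n.train, test.idx, train.idx) and the use of floor() are assumptions made for illustration and are not taken from the package source.

```r
# Hypothetical illustration of how Fold determines the split sizes;
# this is not the package's internal code.
set.seed(1)
n.patients <- 100
Fold <- 3

n.test  <- floor(n.patients / Fold)   # roughly 1/Fold of the subjects
n.train <- n.patients - n.test        # the remaining subjects

# Randomly assign subjects to the test set; the rest form the training set
test.idx  <- sample(seq_len(n.patients), n.test)
train.idx <- setdiff(seq_len(n.patients), test.idx)

c(train = length(train.idx), test = length(test.idx))
```

With 100 patients and Fold=3, this gives 67 training subjects and 33 test subjects, and a new random split would be drawn in each of the Ncv loops.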
Returns
An object of class cvmm is returned with the following values:
HRTrain: The training dataset HR statistics for each metabolite by the number of CV
HRTest: The test dataset HR statistics for each metabolite by the number of CV
train: The selected subjects for each CV in the training dataset
test: The selected subjects for each CV in the test dataset
n.mets: The number of metabolites used in the analysis
Ncv: The number of cross validations performed
Rdata: The metabolite data matrix that was used for the analysis, either the same as Mdata or a reduced version.
Details
This function performs cross validation for the metabolite by metabolite analysis. The data are first divided into a training dataset and a test dataset. A metabolite-specific model is then fitted on the training data and a classifier is built. The classifier is then evaluated on the test dataset for that particular metabolite. The process is repeated for all metabolites in the full or reduced matrix to obtain the HR statistics of the low risk group. These steps are repeated according to the number of cross validations specified.
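The steps above can be sketched for a single simulated metabolite using the survival package. This is a rough illustration under stated assumptions, not the package's implementation: the simulated data, the variable names, the reuse of the training cutoff on the test set, and the coding of the risk groups are all hypothetical choices.

```r
library(survival)

set.seed(123)
n      <- 100
met    <- rnorm(n)                          # one simulated metabolite
time   <- rexp(n, rate = exp(0.5 * met))    # survival times depending on the metabolite
status <- rbinom(n, 1, 0.8)                 # 1 = event, 0 = censored
dat    <- data.frame(time = time, status = status, met = met)

Fold <- 3
Ncv  <- 5
hr.train <- numeric(Ncv)
hr.test  <- numeric(Ncv)

for (i in seq_len(Ncv)) {
  # Random split: about 1/Fold of the subjects go to the test set
  test.idx <- sample(seq_len(n), floor(n / Fold))
  train <- dat[-test.idx, ]
  test  <- dat[test.idx, ]

  # Metabolite-specific Cox model fitted on the training data
  fit  <- coxph(Surv(time, status) ~ met, data = train)
  beta <- coef(fit)[["met"]]

  # Classifier: risk score split at the median of the training scores (assumption)
  cutoff <- quantile(beta * train$met, probs = 0.5)
  train$group <- factor(ifelse(beta * train$met > cutoff, "high", "low"),
                        levels = c("high", "low"))
  test$group  <- factor(ifelse(beta * test$met > cutoff, "high", "low"),
                        levels = c("high", "low"))

  # HR of the low risk group relative to the high risk group
  hr.train[i] <- exp(coef(coxph(Surv(time, status) ~ group, data = train)))
  hr.test[i]  <- exp(coef(coxph(Surv(time, status) ~ group, data = test)))
}

summary(hr.test)
```

In the full analysis, a loop of this kind would be run for every metabolite in the full or reduced matrix, giving one set of training and test HR estimates per metabolite and per cross validation.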
Examples
## FIRSTLY SIMULATING A METABOLIC SURVIVAL DATA
Data = MSData(nPatients = 100, nMet = 150, Prop = 0.5)

## USING THE FUNCTION
Result = CVMetSpecificCoxPh(Fold = 3, Survival = Data$Survival,
                            Mdata = t(Data$Mdata), Censor = Data$Censor,
                            Reduce = TRUE, Select = 150,
                            Prognostic = Data$Prognostic,
                            Quantile = 0.5, Ncv = 3)

## GET THE CLASS OF THE OBJECT
class(Result)   # A "cvmm" class

## METHODS THAT CAN BE USED FOR THE RESULT
show(Result)
summary(Result)
plot(Result)