plattCalibration function

Function that computes the Platt calibration of predicted probabilities

plattCalibration(r.calib, p.calib, nbins = 10, pl = FALSE)

Arguments

  • r.calib: observed binary phenotype
  • p.calib: predicted probabilities
  • nbins: number of bins used to build the plots
  • pl: logical indicating whether the function should plot the reliability diagram and the histogram of the calibrations

Returns

list with samples, responses, calibrations, ECE and MCE, plus the calibration plots if pl = TRUE

Details

Many popular machine learning algorithms produce inaccurate predicted probabilities, especially when applied to a dataset different from the training set. Platt (1999) proposed an adjustment in which the original probabilities are used as the predictor in a single-variable logistic regression, producing more accurate adjusted predicted probabilities.

The function also helps to evaluate the calibration by plotting reliability diagrams and the distributions of the calibrated and non-calibrated probabilities. A reliability diagram plots the mean predicted value within a certain range of posterior probabilities against the fraction of accurately predicted values in that range.

Finally, the function reports accuracy measures for the calibration: the expected calibration error (ECE), the maximum calibration error (MCE) and the log-loss of the probabilities before and after calibration.
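The idea described above can be sketched in base R on simulated data. This is only an illustration under stated assumptions, not the MiMIR implementation: the helper names calibration_error and log_loss, the equal-width binning and the simulated scores are all hypothetical and may differ from what plattCalibration does internally.

```r
# Minimal sketch of Platt scaling and calibration metrics (base R only).
# All object and function names here are illustrative assumptions.
set.seed(1)
n <- 2000

# Simulated raw scores in (0, 1); the true probabilities are more extreme
# than the raw scores, so the raw scores are poorly calibrated.
p_raw  <- runif(n, 0.01, 0.99)
true_p <- plogis(3 * qlogis(p_raw))
y      <- rbinom(n, 1, true_p)

# Platt scaling: single-variable logistic regression of the observed
# outcome on the original predicted probabilities.
fit   <- glm(y ~ p_raw, family = binomial())
p_cal <- as.vector(predict(fit, type = "response"))

# ECE and MCE from equal-width bins: per bin, compare the mean predicted
# probability with the observed fraction of positives.
calibration_error <- function(y, p, nbins = 10) {
  bins      <- cut(p, breaks = seq(0, 1, length.out = nbins + 1),
                   include.lowest = TRUE)
  mean_pred <- as.vector(tapply(p, bins, mean))
  frac_pos  <- as.vector(tapply(y, bins, mean))
  weight    <- as.vector(table(bins)) / length(p)
  gap       <- abs(mean_pred - frac_pos)
  keep      <- !is.na(gap)  # drop empty bins
  c(ECE = sum(weight[keep] * gap[keep]), MCE = max(gap[keep]))
}

# Log-loss (binary cross-entropy) of a vector of predicted probabilities.
log_loss <- function(y, p) -mean(y * log(p) + (1 - y) * log(1 - p))

err_raw <- calibration_error(y, p_raw)
err_cal <- calibration_error(y, p_cal)
ll_raw  <- log_loss(y, p_raw)
ll_cal  <- log_loss(y, p_cal)

print(rbind(raw = c(err_raw, LogLoss = ll_raw),
            calibrated = c(err_cal, LogLoss = ll_cal)))
```

In this sketch the regression is fitted and evaluated on the same samples, which keeps the example short. In practice the calibration model is usually fitted on a dedicated calibration set, as the r.calib and p.calib argument names suggest.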

Examples

library(stats)
library(plotly)

# Load the dataset
met <- synthetic_metabolic_dataset
phen <- synthetic_phenotypic_dataset

# Calculate the binarized surrogates
b_phen <- binarize_all_pheno(phen)

# Apply the surrogate models
surr <- calculate_surrogate_scores(met, phen, MiMIR::PARAM_surrogates, bin_names = colnames(b_phen))

# Calibration of the surrogate sex
real_data <- as.numeric(b_phen$sex)
pred_data <- surr$surrogates[, "s_sex"]
plattCalibration(r.calib = real_data, p.calib = pred_data, nbins = 10, pl = TRUE)

References

This function was originally written for the eRic package under the name prCalibrate and was modified ad hoc for our purposes (GitHub).

J. C. Platt, 'Probabilistic Outputs for Support Vector Machines and Comparisons to Regularized Likelihood Methods', in Advances in Large Margin Classifiers, 1999, pp. 61-74.

  • Maintainer: Daniele Bizzarri
  • License: GPL-3
  • Last published: 2024-02-01
