Generate performance metrics across probability thresholds
threshold_perf() takes a set of class probability predictions and determines performance characteristics across different values of the probability threshold, within any existing groups.
threshold_perf(.data, ...)

## S3 method for class 'data.frame'
threshold_perf(
  .data,
  truth,
  estimate,
  thresholds = NULL,
  metrics = NULL,
  na_rm = TRUE,
  event_level = "first",
  ...
)
Arguments
.data: A tibble, potentially grouped.
...: Currently unused.
truth: The column identifier for the true two-class results (that is, a factor). This should be an unquoted column name.
estimate: The column identifier for the predicted class probabilities (that is, a numeric). This should be an unquoted column name.
thresholds: A numeric vector of values for the probability threshold. If unspecified, a series of values between 0.5 and 1.0 is used. Note: if this argument is used, it must be named.
metrics: Either NULL or a yardstick::metric_set() with a list of performance metrics to calculate. The metrics should all be oriented towards hard class predictions (e.g. yardstick::sensitivity(), yardstick::accuracy(), yardstick::recall(), etc.) and not class probabilities. A set of default metrics is used when NULL (see Details below). A short sketch of supplying a custom metric set follows this argument list.
na_rm: A single logical: should missing data be removed?
event_level: A single string. Either "first" or "second" to specify which level of truth to consider as the "event".
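As a sketch of supplying custom hard-class metrics (assuming the dplyr and yardstick packages and the segment_logistic data used in the Examples below; the particular metrics chosen here are illustrative, not this function's defaults):

library(dplyr)
library(yardstick)
data("segment_logistic")

# Illustrative only: a metric set built from hard-class metrics
cls_metrics <- metric_set(accuracy, kap)

segment_logistic %>%
  threshold_perf(
    Class,
    .pred_good,
    thresholds = seq(0.5, 0.9, by = 0.1),
    metrics = cls_metrics
  )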
Returns
A tibble with columns: .threshold, .estimator, .metric, .estimate and any existing groups.
Details
Note that the global option yardstick.event_first will be used to determine which level is the event of interest. For more details, see the Relevant level section of yardstick::sens().
The default calculated metrics are:
yardstick::j_index()
yardstick::sens()
yardstick::spec()
distance = (1 - sens) ^ 2 + (1 - spec) ^ 2
If a custom metric is passed that does not compute sensitivity and specificity, the distance metric is not computed.
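To make the distance calculation concrete, here is a minimal sketch that recomputes it by hand from the sens and spec rows of the output (assuming the dplyr and tidyr packages, and that the default metrics above were used):

library(dplyr)
library(tidyr)

perf <- threshold_perf(segment_logistic, Class, .pred_good,
                       thresholds = seq(0.5, 0.9, by = 0.1))

# Recompute distance from the sens and spec estimates at each threshold;
# this should match the `distance` rows that threshold_perf() reports
perf %>%
  filter(.metric %in% c("sens", "spec")) %>%
  pivot_wider(id_cols = .threshold, names_from = .metric,
              values_from = .estimate) %>%
  mutate(distance = (1 - sens)^2 + (1 - spec)^2)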
Examples
library(dplyr)
data("segment_logistic")

# Set the threshold to 0.6
# > 0.6 = good
# < 0.6 = poor
threshold_perf(segment_logistic, Class, .pred_good, thresholds = 0.6)

# Set the threshold to multiple values
thresholds <- seq(0.5, 0.9, by = 0.1)

segment_logistic %>%
  threshold_perf(Class, .pred_good, thresholds)

# ---------------------------------------------------------------------------
# It works with grouped data frames as well

# Let's mock some resampled data
resamples <- 5

mock_resamples <- resamples %>%
  replicate(
    expr = sample_n(segment_logistic, 100, replace = TRUE),
    simplify = FALSE
  ) %>%
  bind_rows(.id = "resample")

resampled_threshold_perf <- mock_resamples %>%
  group_by(resample) %>%
  threshold_perf(Class, .pred_good, thresholds)
resampled_threshold_perf
# Average over the resamples
resampled_threshold_perf %>%
  group_by(.metric, .threshold) %>%
  summarise(.estimate = mean(.estimate))
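A common follow-on step, sketched here against the resampled results above, is to pick the candidate threshold that maximizes one of the metrics, such as the J-index:

# Illustrative: the threshold with the best mean J-index across resamples
resampled_threshold_perf %>%
  filter(.metric == "j_index") %>%
  group_by(.threshold) %>%
  summarise(.estimate = mean(.estimate)) %>%
  slice_max(.estimate, n = 1)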