precision estimates the precision (a.k.a. positive predictive value, PPV) for a nominal/categorical predicted-observed dataset.
ppv estimates the positive predictive value (equivalent to precision) for a nominal/categorical predicted-observed dataset.
FDR estimates the complement of precision (a.k.a. positive predictive value, PPV), i.e. the false discovery rate, for a nominal/categorical predicted-observed dataset.
precision(data = NULL, obs, pred, tidy = FALSE, atom = FALSE, na.rm = TRUE, pos_level = 2)

ppv(data = NULL, obs, pred, tidy = FALSE, atom = FALSE, na.rm = TRUE, pos_level = 2)

FDR(data = NULL, obs, pred, atom = FALSE, pos_level = 2, tidy = FALSE, na.rm = TRUE)
Arguments
data: (Optional) argument to call an existing data frame containing the data.
obs: Vector with observed values (character | factor).
pred: Vector with predicted values (character | factor).
tidy: Logical operator (TRUE/FALSE) to decide the type of return. TRUE returns a data.frame, FALSE returns a list. Default: FALSE.
atom: Logical operator (TRUE/FALSE) to decide if the estimate is made for each class (atom = TRUE) or at a global level (atom = FALSE). Default: FALSE.
na.rm: Logical argument to remove rows with missing values (NA). Default: na.rm = TRUE.
pos_level: Integer, for binary cases, indicating the order (1|2) of the level corresponding to the positive class. Generally, the positive level is the second (2) since, following alpha-numeric order, the most common pairs are (Negative | Positive), (0 | 1), and (FALSE | TRUE). Default: 2.
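As a sketch of the ordering rule behind pos_level (base R only; the vectors are made up for this illustration and are not part of the package):

```r
# Factor levels follow alpha-numeric order, so "True" sorts after "False"
# and lands in position 2 -- the default positive level (pos_level = 2).
obs <- factor(c("False", "True", "True", "False"))
levels(obs)                    # "False" "True"
which(levels(obs) == "True")   # 2
```

If the observed labels were, say, ("Yes" | "No"), "Yes" would sort into position 2 and pos_level = 2 would still pick it as the positive class.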
Returns
An object of class numeric within a list (if tidy = FALSE) or within a data.frame (if tidy = TRUE).
Details
The precision is a non-normalized coefficient representing the ratio of correctly predicted cases (true positives, TP, for binary cases) to the total number of cases predicted for a given class (total predicted positive, PP, for binary cases), estimated either per class or at an overall level.
For binomial cases, precision = TP / PP = TP / (TP + FP)
The precision metric is bounded between 0 and 1. The closer to 1 the better. Values towards zero indicate low precision of predictions. It can be estimated for each particular class or at a global level.
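The binary-case ratio above can be checked by hand with base R; the toy vectors below are illustrative and not taken from the package:

```r
# Toy predicted-observed vectors (illustrative only).
obs  <- c("True", "True", "False", "False", "True")
pred <- c("True", "False", "True", "False", "True")

# True positives and false positives for the positive class "True".
TP <- sum(pred == "True" & obs == "True")   # 2
FP <- sum(pred == "True" & obs == "False")  # 1

# precision = TP / PP = TP / (TP + FP)
precision_manual <- TP / (TP + FP)          # 2/3
```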
The false detection rate or false discovery rate (FDR) represents the proportion of false positives with respect to the total number of cases predicted as positive.
For binomial cases, FDR = 1 - precision = FP / PP = FP / (TP + FP)
The FDR is bounded between 0 and 1. The closer to zero the better. Values towards one indicate a high proportion of false positives among the predicted positives.
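Continuing the same hand computation, the FDR is simply the complement of precision (again a sketch with made-up vectors, not package code):

```r
# Toy predicted-observed vectors (illustrative only).
obs  <- c("True", "True", "False", "False", "True")
pred <- c("True", "False", "True", "False", "True")

TP <- sum(pred == "True" & obs == "True")   # 2
FP <- sum(pred == "True" & obs == "False")  # 1

# FDR = FP / (TP + FP) = 1 - precision
FDR_manual <- FP / (TP + FP)                # 1/3
stopifnot(FDR_manual == 1 - TP / (TP + FP))
```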
Examples
set.seed(123)

# Two-class
binomial_case <- data.frame(labels = sample(c("True", "False"), 100, replace = TRUE),
                            predictions = sample(c("True", "False"), 100, replace = TRUE))

# Multi-class
multinomial_case <- data.frame(labels = sample(c("Red", "Blue", "Green"), 100, replace = TRUE),
                               predictions = sample(c("Red", "Blue", "Green"), 100, replace = TRUE))

# Get precision estimate for two-class case
precision(data = binomial_case, obs = labels, pred = predictions, tidy = TRUE)

# Get FDR estimate for two-class case
FDR(data = binomial_case, obs = labels, pred = predictions, tidy = TRUE)

# Get precision estimate for each class of the multi-class case
precision(data = multinomial_case, obs = labels, pred = predictions, tidy = TRUE, atom = TRUE)

# Get precision estimate for the multi-class case at a global level
precision(data = multinomial_case, obs = labels, pred = predictions, tidy = TRUE, atom = FALSE)
References
Ting K.M. (2017) Precision and Recall. In: Sammut C., Webb G.I. (eds) Encyclopedia of Machine Learning and Data Mining.