distribution_accuracy_measures function

Distribution accuracy measures

These accuracy measures can be used to evaluate how accurately a forecast distribution predicts a given actual value.

Format

An object of class list of length 2.

Usage

percentile_score(.dist, .actual, na.rm = TRUE, ...)

quantile_score(
  .dist,
  .actual,
  probs = c(0.05, 0.25, 0.5, 0.75, 0.95),
  na.rm = TRUE,
  ...
)

CRPS(.dist, .actual, n_quantiles = 1000, na.rm = TRUE, ...)

distribution_accuracy_measures

Arguments

  • .dist: The distribution of fitted values from the model, or forecasted values from the forecast.
  • .actual: A vector of responses matching the fitted values (for forecast accuracy, new_data must be provided).
  • na.rm: Remove the missing values before calculating the accuracy measure.
  • ...: Additional arguments for each measure.
  • probs: A vector of probabilities at which the metric is evaluated.
  • n_quantiles: The number of quantiles to use in approximating CRPS when an exact solution is not available.

Quantile/percentile score (pinball loss)

A quantile (or percentile) score evaluates how accurately a set of quantiles (or percentiles) from the distribution matches the given actual value. This score uses a pinball loss function, and can be calculated via the average of the score function given below:

The score function $s_p(q_p, y)$ is given by $(1-p)(q_p - y)$ if $y < q_p$, and $p(y - q_p)$ if $y \ge q_p$, where $p$ is the quantile probability, $q_p = F^{-1}(p)$ is the quantile with probability $p$, and $y$ is the actual value.

The resulting accuracy measure will average this score over all predicted points at all desired quantiles (defined via the probs argument).

The percentile score uses the same method with probs set to all percentiles: probs = seq(0.01, 0.99, 0.01).
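The pinball loss above can be sketched in a few lines of base R. This is only an illustration of the formula, not the package's implementation: the helper name pinball_loss and the Normal(10, 2) forecast with an actual value of 11 are assumptions made for the example.

```r
# Pinball loss s_p(q_p, y): q is the predicted quantile, y the actual value,
# p the quantile probability. (Hypothetical helper, not part of the package.)
pinball_loss <- function(q, y, p) {
  ifelse(y < q, (1 - p) * (q - y), p * (y - q))
}

# Assumed example: a Normal(10, 2) forecast distribution and an actual of 11.
probs <- c(0.05, 0.25, 0.5, 0.75, 0.95)
q <- qnorm(probs, mean = 10, sd = 2)
y <- 11

# One score per quantile, then the average over all requested quantiles.
scores <- pinball_loss(q, y, probs)
quantile_score_sketch <- mean(scores)
```

With several forecast points, the same average would simply run over every point and every probability in probs.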

Continuous ranked probability score (CRPS)

The continuous ranked probability score (CRPS) is the continuous analogue of the pinball loss quantile score defined above. Its value is twice the integral of the quantile score over all possible quantiles:

$$\mathrm{CRPS}(F, y) = 2 \int_0^1 s_p(q_p, y) \, dp$$

It can be computed directly from the distribution via:

$$\mathrm{CRPS}(F, y) = \int_{-\infty}^{\infty} \left(F(x) - 1\{y \leq x\}\right)^2 \, dx$$

Here $F$ is the forecast distribution and $y$ is the actual value.
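As a rough numerical check that the quantile-score form and the CDF form of the CRPS agree, both integrals can be evaluated with base R's integrate(). The standard Normal forecast, the actual value 0.3, and the helper pinball_loss are assumptions chosen for this sketch.

```r
# Pinball loss s_p(q_p, y) (hypothetical helper for this sketch).
pinball_loss <- function(q, y, p) {
  ifelse(y < q, (1 - p) * (q - y), p * (y - q))
}

# Assumed example: standard Normal forecast, actual value y = 0.3.
mu <- 0
sigma <- 1
y <- 0.3

# Quantile form: twice the integral of the pinball loss over p in (0, 1).
crps_quantile <- integrate(
  function(p) 2 * pinball_loss(qnorm(p, mu, sigma), y, p),
  lower = 0, upper = 1
)$value

# CDF form: the indicator 1{y <= x} is 0 below y and 1 from y onwards,
# so the integrand is F(x)^2 below y and (1 - F(x))^2 above it.
crps_cdf <-
  integrate(function(x) pnorm(x, mu, sigma)^2,
            lower = -Inf, upper = y)$value +
  integrate(function(x) pnorm(x, mu, sigma, lower.tail = FALSE)^2,
            lower = y, upper = Inf)$value

# The two values should match up to numerical integration error.
abs(crps_quantile - crps_cdf)
```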

Calculating the CRPS accuracy measure is computationally difficult for many distributions; however, it can be computed quickly and exactly for Normal and empirical (sample) distributions. For other distributions, the CRPS is approximated using the quantile score of many quantiles (the number of quantiles is set by the n_quantiles argument).
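The three routes described above can be sketched in base R. The closed form for the Normal distribution and the sample formula are standard results; the function names, the midpoint probability grid, and the example values are assumptions for illustration and may differ from what the package does internally.

```r
# Exact CRPS for a Normal(mu, sigma) forecast (standard closed form).
crps_normal <- function(y, mu, sigma) {
  z <- (y - mu) / sigma
  sigma * (z * (2 * pnorm(z) - 1) + 2 * dnorm(z) - 1 / sqrt(pi))
}

# Exact CRPS for an empirical (sample) distribution:
# mean |x_i - y| minus half the mean absolute pairwise difference.
crps_sample <- function(x, y) {
  mean(abs(x - y)) - 0.5 * mean(abs(outer(x, x, "-")))
}

# Approximate CRPS as twice the average pinball loss over n_quantiles
# probabilities. The midpoint grid used here is an assumption of this sketch.
crps_approx <- function(quantile_fn, y, n_quantiles = 1000) {
  probs <- (seq_len(n_quantiles) - 0.5) / n_quantiles
  q <- quantile_fn(probs)
  2 * mean(ifelse(y < q, (1 - probs) * (q - y), probs * (y - q)))
}

# Assumed example: standard Normal forecast, actual value 0.3.
exact_value <- crps_normal(0.3, mu = 0, sigma = 1)
approx_value <- crps_approx(function(p) qnorm(p, 0, 1), y = 0.3)

# A sample drawn from the same distribution gives a nearby, noisier value.
set.seed(1)
sample_value <- crps_sample(rnorm(1000), y = 0.3)
```

Raising n_quantiles tightens the approximation at the cost of more quantile evaluations, which is the trade-off the n_quantiles argument exposes.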