mqdedist function

Minimum Quantile Distance Fit of Univariate Distributions.

Fit of univariate distributions for non-censored data using minimum quantile distance estimation (mqde), which can also be called maximum quantile goodness-of-fit estimation.

Source

Based on the function mledist of the R package: fitdistrplus

Delignette-Muller ML and Dutang C (2015), fitdistrplus: An R Package for Fitting Distributions. Journal of Statistical Software, 64(4), 1-34.

Functions checkparam and startargdefault are needed and were copied from the same package (fitdistrplus version: 1.0-9).

mqdedist( data, distr, probs = (1:length(data) - 0.5)/length(data), qtype = 5, dist = "euclidean", start = NULL, fix.arg = NULL, optim.method = "default", lower = -Inf, upper = Inf, custom.optim = NULL, weights = NULL, silent = TRUE, gradient = NULL, ... )

Arguments

  • data: A numeric vector with the observed values for non-censored data.

  • distr: A character string "name" naming a distribution for which the corresponding quantile function qname and the corresponding density function dname must be defined in the usual way.

  • probs: A numeric vector of the probabilities for which the minimum quantile distance estimation is done. The default is p[k] = (k - 0.5) / n.

  • qtype: The quantile type used by the R quantile function to compute the empirical quantiles. Type 5 (default), i.e. x[k] is both the kth order statistic and the type 5 sample quantile of p[k] = (k - 0.5) / n.

  • dist: The distance measure between observed and theoretical quantiles to be used. This must be one of "euclidean" (default), "maximum", or "manhattan". Any unambiguous substring can be given.

  • start: A named list giving the initial values of parameters of the named distribution or a function of data computing initial values and returning a named list. This argument may be omitted (default) for some distributions for which reasonable starting values are computed (see the 'details' section of mledist).

  • fix.arg: An optional named list giving the values of fixed parameters of the named distribution or a function of data computing (fixed) parameter values and returning a named list. Parameters with fixed value are thus NOT estimated.

  • optim.method: "default" (see details) or optimization method to pass to optim.

  • lower: Left bounds on the parameters for the "L-BFGS-B" method (see optim) or the constrOptim function (as an equivalent linear constraint).

  • upper: Right bounds on the parameters for the "L-BFGS-B" method (see optim) or the constrOptim function (as an equivalent linear constraint).

  • custom.optim: A function carrying out the optimization (see details).

  • weights: An optional vector of weights to be used in the fitting process. Should be NULL or a numeric vector with strictly positive numbers. If non-NULL, weighted mqde is used, otherwise ordinary mqde.

  • silent: A logical to remove or show warnings when bootstrapping.

  • gradient: A function to return the gradient of the optimization objective function for the "BFGS", "CG" and "L-BFGS-B" methods. If it is NULL, a finite-difference approximation will be used, see optim.

  • ...: Further arguments passed to the optim, constrOptim or custom.optim function.
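
The defaults for probs and qtype, and the three choices for dist, can be illustrated with base R alone. The following is a minimal sketch, not the package's code: quantile_distance is a hypothetical helper, and the distance formulas are the usual textbook definitions, so the exact scaling used inside mqdedist is an assumption.

```r
set.seed(1234)
x <- rnorm(10)
n <- length(x)

# Default probabilities p[k] = (k - 0.5) / n
probs <- (1:n - 0.5) / n

# With qtype = 5, the sample quantiles at these probabilities are the
# order statistics of the data
emp_q <- quantile(x, probs = probs, type = 5, names = FALSE)
all.equal(emp_q, sort(x))

# Theoretical quantiles for one candidate parameter value, here N(0, 1)
theo_q <- qnorm(probs, mean = 0, sd = 1)

# Hypothetical helper (assumed definitions, not taken from the package)
quantile_distance <- function(obs, theo,
                              dist = c("euclidean", "maximum", "manhattan")) {
  dist <- match.arg(dist)  # accepts any unambiguous substring, e.g. "man"
  d <- obs - theo
  switch(dist,
    euclidean = sqrt(sum(d^2)),
    maximum   = max(abs(d)),
    manhattan = sum(abs(d))
  )
}

sapply(c("euclidean", "maximum", "manhattan"),
       function(m) quantile_distance(emp_q, theo_q, m))
```

The estimation then amounts to searching for the parameter values that make the chosen distance as small as possible.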

Returns

mqdedist returns a list with the following components:

  • estimate: the parameter estimates.

  • convergence: an integer code for the convergence of optim defined as below or defined by the user in the user-supplied optimization function.

    0 indicates successful convergence.

    1 indicates that the iteration limit of optim has been reached.

    10 indicates degeneracy of the Nelder-Mead simplex.

    100 indicates that optim encountered an internal error.

  • value: the value of the optimization objective function at the solution found.

  • hessian: a symmetric matrix computed by optim as an estimate of the Hessian at the solution found or computed in the user-supplied optimization function.

  • probs: the probability vector on which observed and theoretical quantiles were calculated.

  • dist: the name of the distance between observed and theoretical quantiles used.

  • optim.function: the name of the optimization function used.

  • fix.arg: the named list giving the values of parameters of the named distribution that must be kept fixed rather than estimated, or NULL if there are no such parameters.

  • loglik: the log-likelihood.

  • optim.method: when optim is used, the name of the algorithm used, NULL otherwise.

  • fix.arg.fun: the function used to set the value of fix.arg or NULL.

  • weights: the vector of weights used in the estimation process or NULL.

  • counts: A two-element integer vector giving the number of calls to the objective function and its gradient respectively. This excludes those calls needed to compute the Hessian, if requested, and any calls to the objective function to compute a finite-difference approximation to the gradient. counts is returned by optim or the user-supplied optimization function, or set to NULL.

  • optim.message: A character string giving any additional information returned by the optimizer, or NULL. To understand the message exactly, see the source code.
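
Since the convergence codes above are those of optim, they can be explored without fitting a distribution. The sketch below calls optim directly on a standard test function; describe_convergence is a hypothetical helper introduced only for illustration and is not part of BMT or fitdistrplus.

```r
# Hypothetical helper: translate the convergence code of a fit-like list
# into a short message.
describe_convergence <- function(fit) {
  switch(as.character(fit$convergence),
    "0"   = "successful convergence",
    "1"   = "iteration limit reached",
    "10"  = "degenerate Nelder-Mead simplex",
    "100" = "internal error in optim",
    "code defined by the user-supplied optimizer"
  )
}

# Rosenbrock-type test function, used here only to exercise optim
fr <- function(p) 100 * (p[2] - p[1]^2)^2 + (1 - p[1])^2

full_run  <- optim(c(-1.2, 1), fr, method = "Nelder-Mead")
short_run <- optim(c(-1.2, 1), fr, method = "Nelder-Mead",
                   control = list(maxit = 5))  # deliberately too few steps

describe_convergence(full_run)
describe_convergence(short_run)
short_run$counts  # calls to the objective function and to the gradient
```

The same check can be applied to the list returned by mqdedist, because it exposes a convergence component with the same meaning.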

Details

The mqdedist function carries out the minimum quantile distance estimation numerically, by minimization of a distance between observed and theoretical quantiles.
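
To make the idea concrete, here is a minimal base-R sketch of minimum quantile distance estimation for a normal distribution. It is an illustration under stated assumptions, not the implementation used by mqdedist: the objective is taken to be the plain Euclidean distance, the starting values (median and mad) are an arbitrary choice, and optim is called directly with the Nelder-Mead method.

```r
set.seed(1234)
x <- rnorm(100)
n <- length(x)

# Observed quantiles at the default probabilities (type 5)
probs <- (1:n - 0.5) / n
obs_q <- quantile(x, probs = probs, type = 5, names = FALSE)

# Objective: Euclidean distance between observed and theoretical quantiles
objective <- function(par) {
  if (par[2] <= 0) return(Inf)  # the standard deviation must be positive
  theo_q <- qnorm(probs, mean = par[1], sd = par[2])
  sqrt(sum((obs_q - theo_q)^2))
}

# Numerical minimization from rough starting values (assumed, see above)
fit <- optim(par = c(median(x), mad(x)), fn = objective,
             method = "Nelder-Mead")
est <- setNames(fit$par, c("mean", "sd"))

est                             # minimum quantile distance estimates
c(mean = mean(x), sd = sd(x))   # sample moments, for comparison
```

For the normal distribution the estimates land close to the sample mean and standard deviation, which gives a quick sanity check on the objective.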

The optimization process is the same as mledist, see the 'details' section of that function.

Optionally, a vector of weights can be used in the fitting process. By default (when weights=NULL), ordinary mqde is carried out, otherwise the specified weights are used to compute a weighted distance.
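
One plausible form of a weighted distance is sketched below. How the weights enter the distance inside mqdedist is not stated here, so this is an assumption: weighted_euclidean is a hypothetical helper in which each squared quantile difference is multiplied by its weight, with one weight per probability.

```r
# Hypothetical helper (assumed form of the weighting, not the package's code)
weighted_euclidean <- function(obs, theo, weights = NULL) {
  d2 <- (obs - theo)^2
  if (is.null(weights)) {
    return(sqrt(sum(d2)))  # ordinary, unweighted distance
  }
  stopifnot(is.numeric(weights),
            length(weights) == length(d2),
            all(weights > 0))  # weights must be strictly positive
  sqrt(sum(weights * d2))
}

obs  <- c(1.0, 2.0, 4.0)
theo <- c(1.5, 2.0, 3.0)

weighted_euclidean(obs, theo)                        # ordinary distance
weighted_euclidean(obs, theo, weights = c(1, 1, 1))  # same value
weighted_euclidean(obs, theo, weights = c(4, 1, 1))  # first quantile counts more
```

Under this assumed form, unit weights reproduce the ordinary distance, and larger weights pull the fit toward the corresponding quantiles.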

We believe this function should be added to the package fitdistrplus. Until it is accepted and incorporated into that package, it will remain in the package BMT. This function is internally called in BMTfit.mqde.

Examples

# (1) basic fit of a normal distribution
set.seed(1234)
x1 <- rnorm(n = 100)
mean(x1); sd(x1)
mqde1 <- mqdedist(x1, "norm")
mqde1$estimate

# (2) defining your own distribution functions, here for the Gumbel
# distribution; for other distributions, see the CRAN task view dedicated
# to probability distributions
dgumbel <- function(x, a, b) 1/b*exp((a-x)/b)*exp(-exp((a-x)/b))
pgumbel <- function(q, a, b) exp(-exp((a-q)/b))
qgumbel <- function(p, a, b) a-b*log(-log(p))
mqde1 <- mqdedist(x1, "gumbel", start = list(a = 10, b = 5))
mqde1$estimate

# (3) fit a discrete distribution (Poisson)
set.seed(1234)
x2 <- rpois(n = 30, lambda = 2)
mqde2 <- mqdedist(x2, "pois")
mqde2$estimate

# (4) fit a finite-support distribution (beta)
set.seed(1234)
x3 <- rbeta(n = 100, shape1 = 5, shape2 = 10)
mqde3 <- mqdedist(x3, "beta")
mqde3$estimate

# (5) fit frequency distributions on USArrests dataset.
x4 <- USArrests$Assault
mqde4pois <- mqdedist(x4, "pois")
mqde4pois$estimate
mqde4nbinom <- mqdedist(x4, "nbinom")
mqde4nbinom$estimate

# (6) weighted fit of a normal distribution
set.seed(1234)
w1 <- runif(100)
weighted.mean(x1, w1)
mqde1 <- mqdedist(x1, "norm", weights = w1)
mqde1$estimate

References

LaRiccia, V. N. (1982). Asymptotic Properties of Weighted L^2 Quantile Distance Estimators. The Annals of Statistics, 10(2), 621-624.

Torres-Jimenez, C. J. (2017, September), Comparison of estimation methods for the BMT distribution. ArXiv e-prints.

See Also

mpsedist, mledist, mmedist, qmedist, mgedist, optim, constrOptim, and quantile.

Author(s)

Camilo Jose Torres-Jimenez [aut,cre] cjtorresj@unal.edu.co

  • Maintainer: Camilo Jose Torres-Jimenez
  • License: GPL (>= 2)
  • Last published: 2025-04-17
