fhpdcon function

MLE Fitting of Hybrid Pareto Extreme Value Mixture Model with Single Continuity Constraint

Maximum likelihood estimation for fitting the hybrid Pareto extreme value mixture model with a single continuity constraint: the density is continuous at the threshold but not necessarily continuous in its first derivative. Options are provided for profile likelihood estimation of the threshold and for a fixed threshold approach.

fhpdcon(x, useq = NULL, fixedu = FALSE, pvector = NULL, std.err = TRUE, method = "BFGS", control = list(maxit = 10000), finitelik = TRUE, ...)

lhpdcon(x, nmean = 0, nsd = 1, u = qnorm(0.9, nmean, nsd), xi = 0, log = TRUE)

nlhpdcon(pvector, x, finitelik = FALSE)

profluhpdcon(u, pvector, x, method = "BFGS", control = list(maxit = 10000), finitelik = TRUE, ...)

nluhpdcon(pvector, u, x, finitelik = FALSE)

Arguments

  • x: vector of sample data
  • useq: vector of thresholds (or scalar) to be considered in profile likelihood or NULL for no profile likelihood
  • fixedu: logical, should threshold be fixed (at either scalar value in useq, or estimated from maximum of profile likelihood evaluated at sequence of thresholds in useq)
  • pvector: vector of initial values of parameters or NULL for default values, see below
  • std.err: logical, should standard errors be calculated
  • method: optimisation method (see optim)
  • control: optimisation control list (see optim)
  • finitelik: logical, should log-likelihood return finite value for invalid parameters
  • ...: optional inputs passed to optim
  • nmean: scalar normal mean
  • nsd: scalar normal standard deviation (positive)
  • u: scalar threshold value
  • xi: scalar shape parameter
  • log: logical, if TRUE then log-likelihood rather than likelihood is output

Returns

lhpdcon, nlhpdcon and nluhpdcon give the log-likelihood, negative log-likelihood and profile likelihood for the threshold, respectively. The profile likelihood for a single threshold is given by profluhpdcon. fhpdcon returns a simple list with the following elements:

  • call: optim call
  • x: data vector x
  • init: pvector
  • fixedu: fixed threshold, logical
  • useq: threshold vector for profile likelihood or scalar for fixed threshold
  • nllhuseq: profile negative log-likelihood at each threshold in useq
  • optim: complete optim output
  • mle: vector of MLE of parameters
  • cov: variance-covariance matrix of MLE of parameters
  • se: vector of standard errors of MLE of parameters
  • rate: phiu to be consistent with evd
  • nllh: minimum negative log-likelihood
  • n: total sample size
  • nmean: MLE of normal mean
  • nsd: MLE of normal standard deviation
  • u: threshold (fixed or MLE)
  • sigmau: MLE of GPD scale (estimated from other parameters)
  • xi: MLE of GPD shape
  • phiu: MLE of tail fraction (implied by 1/(1+pnorm(u,nmean,nsd)))

Details

The hybrid Pareto model is fitted to the entire dataset using maximum likelihood estimation, with only continuity at threshold and not necessarily continuous in first derivative. The estimated parameters, variance-covariance matrix and their standard errors are automatically output.

Note that the key difference between this model (hpdcon) and the normal with GPD tail and continuity at threshold (normgpdcon) is that the latter rescales the conditional GPD component by the tail fraction to make it an unconditional tail model. The hybrid Pareto with single continuity constraint instead uses the GPD in its conditional form, with no differential scaling compared to the bulk model.
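The scaling described above can be checked numerically from the tail fraction formula given in the Returns section. The following is a minimal base R sketch; the parameter values and the variable names (r, bulk_mass, normal_tail) are illustrative assumptions, not fitted estimates or evmix objects.

```r
# Illustrative parameter values (assumptions, not fitted estimates)
nmean <- 0
nsd <- 1
u <- qnorm(0.9, nmean, nsd)

# Both components are divided by the same constant r, so the GPD stays in
# its conditional form (it carries total mass 1 before normalisation)
r <- 1 + pnorm(u, nmean, nsd)

bulk_mass <- pnorm(u, nmean, nsd) / r  # probability at or below the threshold
phiu <- 1 / r                          # tail fraction, 1/(1+pnorm(u,nmean,nsd))

# The two pieces together form a proper density
stopifnot(isTRUE(all.equal(bulk_mass + phiu, 1)))

# For comparison: the upper tail mass of the normal component on its own
normal_tail <- 1 - pnorm(u, nmean, nsd)
c(phiu = phiu, normal_tail = normal_tail)
```

With the threshold at the 90% normal quantile the implied tail fraction is about 0.526, whereas the normal component on its own places 0.1 above the threshold.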

See the help for fnormgpd for details, type help(fnormgpd). Only the different features are outlined below for brevity.

The profile likelihood and fixed threshold approach functionality are implemented for this version of the hybrid Pareto because it includes the threshold as a parameter, whereas the usual hybrid Pareto does not naturally have a threshold parameter.
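The idea of profiling over the threshold can be sketched in base R as follows. This is not the evmix implementation (profluhpdcon and nluhpdcon are the package functions); the helper nll_fixed_u, the penalty value 1e6 and the threshold grid are assumptions made for illustration. The sketch also assumes that the continuity constraint gives sigmau = 1/dnorm(u, nmean, nsd) and that both components are normalised by 1 + pnorm(u, nmean, nsd).

```r
# Negative log-likelihood with the threshold u held fixed (hypothetical helper).
# par = c(nmean, nsd, xi); invalid parameters return a large finite penalty.
nll_fixed_u <- function(par, u, x) {
  nmean <- par[1]; nsd <- par[2]; xi <- par[3]
  if (!is.finite(nsd) || nsd <= 0) return(1e6)
  sigmau <- 1 / dnorm(u, nmean, nsd)   # assumed continuity constraint
  r <- 1 + pnorm(u, nmean, nsd)        # common normalising constant
  z <- (x[x > u] - u) / sigmau         # scaled exceedances
  if (any(1 + xi * z <= 0)) return(1e6)
  lgpd <- if (abs(xi) < 1e-8) -z else -(1 / xi + 1) * log1p(xi * z)
  ll <- sum(dnorm(x[x <= u], nmean, nsd, log = TRUE)) +
    sum(lgpd - log(sigmau)) - length(x) * log(r)
  if (is.finite(ll)) -ll else 1e6
}

set.seed(1)
x <- rnorm(1000)
useq <- seq(0, 2, length.out = 9)   # candidate thresholds
start <- c(mean(x), sd(x), 0.1)     # (nmean, nsd, xi), with xi away from 0

# Minimise over the remaining parameters at each candidate threshold
prof_nll <- sapply(useq, function(u) {
  optim(start, nll_fixed_u, u = u, x = x)$value
})

# Threshold with the smallest profile negative log-likelihood
u_best <- useq[which.min(prof_nll)]
```

In fhpdcon the equivalent search is requested through the useq argument, with fixedu controlling whether the selected threshold is then held fixed or used as an initial value.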

The GPD sigmau parameter is now specified as a function of the other parameters; see the help for dhpdcon for details, type help(hpdcon). Therefore, sigmau should not be included in the parameter vector if initial values are provided. The full parameter vector is (nmean, nsd, u, xi) if the threshold is also estimated and (nmean, nsd, xi) for the profile likelihood or fixed threshold approach.
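As a rough illustration of how sigmau can follow from the other parameters, the sketch below assumes the density is the normal density below u and the conditional GPD density above u, both divided by 1 + pnorm(u, nmean, nsd). Since the GPD density at the threshold is 1/sigmau, continuity then requires sigmau = 1/dnorm(u, nmean, nsd). The functions dgpd_sketch and dhpdcon_sketch are hypothetical base R helpers, not the evmix functions; the help for dhpdcon remains the authoritative definition.

```r
# GPD density for exceedances of u (hypothetical helper, scalar xi > -1)
dgpd_sketch <- function(x, u, sigmau, xi) {
  z <- (x - u) / sigmau
  d <- if (abs(xi) < 1e-8) {
    exp(-z) / sigmau                              # exponential limit as xi -> 0
  } else {
    pmax(1 + xi * z, 0)^(-1 / xi - 1) / sigmau
  }
  ifelse(x > u, d, 0)
}

# Hybrid Pareto density with a single continuity constraint (sketch)
dhpdcon_sketch <- function(x, nmean = 0, nsd = 1,
                           u = qnorm(0.9, nmean, nsd), xi = 0) {
  sigmau <- 1 / dnorm(u, nmean, nsd)  # continuity: dnorm(u) = 1 / sigmau
  r <- 1 + pnorm(u, nmean, nsd)       # common normalising constant
  ifelse(x <= u, dnorm(x, nmean, nsd), dgpd_sketch(x, u, sigmau, xi)) / r
}

# Check continuity at the threshold and that the density integrates to one
u0 <- 1
f <- function(x) dhpdcon_sketch(x, nmean = 0, nsd = 1, u = u0, xi = 0.2)
left <- f(u0)
right <- f(u0 + 1e-8)
total <- integrate(f, -Inf, u0)$value + integrate(f, u0, Inf)$value
c(left = left, right = right, total = total)
```

Because sigmau is computed inside the density, only (nmean, nsd, u, xi) need to be supplied, which matches the parameter vector layout described above.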

Note

When pvector=NULL then the initial values are:

  • threshold 90% quantile (not relevant for profile likelihood for threshold or fixed threshold approaches);
  • MLE of normal parameters assuming entire population is normal; and
  • MLE of GPD parameters above threshold.

Avoid setting the starting value for the shape parameter to xi=0, as depending on the optimisation method the optimiser may get stuck.
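The default starting values listed above can be approximated in base R along the following lines. This is a sketch under stated assumptions rather than the code used inside fhpdcon: the helper gpd_nll, the use of the 1/n variance for the normal MLE, the optim starting point and the 0.05 tolerance for moving xi away from zero are all illustrative choices. Only the GPD shape is retained, because sigmau is not part of the parameter vector.

```r
set.seed(1)
x <- rnorm(1000)

# Threshold: 90% sample quantile
u0 <- unname(quantile(x, 0.9))

# Normal MLEs treating the whole sample as normal (1/n variance)
nmean0 <- mean(x)
nsd0 <- sqrt(mean((x - nmean0)^2))

# GPD negative log-likelihood for exceedances y (hypothetical helper)
gpd_nll <- function(par, y) {
  sigmau <- par[1]; xi <- par[2]
  if (!is.finite(sigmau) || sigmau <= 0 || any(1 + xi * y / sigmau <= 0)) {
    return(1e6)                         # penalty for invalid parameters
  }
  z <- y / sigmau
  lgpd <- if (abs(xi) < 1e-8) -z else -(1 / xi + 1) * log1p(xi * z)
  -sum(lgpd - log(sigmau))
}

# GPD MLE above the threshold
y <- x[x > u0] - u0
gpd_fit <- optim(c(mean(y), 0.1), gpd_nll, y = y)
xi0 <- gpd_fit$par[2]

# Move the shape away from zero if the estimate is very close to it
if (abs(xi0) < 0.05) xi0 <- if (xi0 < 0) -0.05 else 0.05

# Initial values in the order (nmean, nsd, u, xi)
pinit <- c(nmean0, nsd0, u0, xi0)
pinit
```

A vector built this way could then be supplied through the pvector argument, for example fhpdcon(x, pvector = pinit).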

Acknowledgments

See Acknowledgments in fnormgpd, type help(fnormgpd).

Examples

## Not run:
set.seed(1)
par(mfrow = c(2, 1))

x = rnorm(1000)
xx = seq(-4, 4, 0.01)
y = dnorm(xx)

# Hybrid Pareto provides reasonable fit for some asymmetric heavy upper tailed distributions
# but not for cases such as the normal distribution

# Continuity constraint
fit = fhpdcon(x)
hist(x, breaks = 100, freq = FALSE, xlim = c(-4, 4))
lines(xx, y)
with(fit, lines(xx, dhpdcon(xx, nmean, nsd, u, xi), col="red"))
abline(v = fit$u, col = "red")

# No continuity constraint
fit2 = fhpd(x)
with(fit2, lines(xx, dhpd(xx, nmean, nsd, xi), col="blue"))
abline(v = fit2$u, col = "blue")
legend("topleft", c("True Density","No continuity constraint","With continuity constraint"),
  col=c("black", "blue", "red"), lty = 1)

# Profile likelihood for initial value of threshold and fixed threshold approach
fitu = fhpdcon(x, useq = seq(-2, 2, length = 20))
fitfix = fhpdcon(x, useq = seq(-2, 2, length = 20), fixedu = TRUE)

hist(x, breaks = 100, freq = FALSE, xlim = c(-4, 4))
lines(xx, y)
with(fit, lines(xx, dhpdcon(xx, nmean, nsd, u, xi), col="red"))
abline(v = fit$u, col = "red")
with(fitu, lines(xx, dhpdcon(xx, nmean, nsd, u, xi), col="purple"))
abline(v = fitu$u, col = "purple")
with(fitfix, lines(xx, dhpdcon(xx, nmean, nsd, u, xi), col="darkgreen"))
abline(v = fitfix$u, col = "darkgreen")
legend("topleft", c("True Density","Default initial value (90% quantile)",
  "Prof. lik. for initial value", "Prof. lik. for fixed threshold"),
  col=c("black", "red", "purple", "darkgreen"), lty = 1)

# Notice that if tail fraction is included a better fit is obtained
fittailfrac = fnormgpdcon(x)

par(mfrow = c(1, 1))
hist(x, breaks = 100, freq = FALSE, xlim = c(-4, 4))
lines(xx, y)
with(fit, lines(xx, dhpdcon(xx, nmean, nsd, u, xi), col="red"))
abline(v = fit$u, col = "red")
with(fittailfrac, lines(xx, dnormgpdcon(xx, nmean, nsd, u, xi), col="blue"))
abline(v = fittailfrac$u)
legend("topright", c("Standard Normal", "Hybrid Pareto Continuous", "Normal+GPD Continuous"),
  col=c("black", "red", "blue"), lty = 1)
## End(Not run)

References

http://www.math.canterbury.ac.nz/~c.scarrott/evmix

http://en.wikipedia.org/wiki/Normal_distribution

http://en.wikipedia.org/wiki/Generalized_Pareto_distribution

Scarrott, C.J. and MacDonald, A. (2012). A review of extreme value threshold estimation and uncertainty quantification. REVSTAT - Statistical Journal 10(1), 33-59. Available from http://www.ine.pt/revstat/pdf/rs120102.pdf

Hu, Y. (2013). Extreme value mixture modelling: An R package and simulation study. MSc (Hons) thesis, University of Canterbury, New Zealand. http://ir.canterbury.ac.nz/simple-search?query=extreme&submit=Go

Carreau, J. and Bengio, Y. (2008). A hybrid Pareto model for asymmetric fat-tailed data: the univariate case. Extremes 12(1), 53-76.

See Also

dnorm, fgpd and gpd

The condmixt package written by one of the original authors of the hybrid Pareto model (Carreau and Bengio, 2008) also has similar functions for the likelihood of the hybrid Pareto (hpareto.negloglike) and fitting (hpareto.fit).

Other hpd: fhpd, hpdcon, hpd

Other hpdcon: fhpd, hpdcon, hpd

Other normgpdcon: fgngcon, flognormgpdcon, fnormgpdcon, fnormgpd, gngcon, gng, hpdcon, hpd, normgpdcon, normgpd

Other fhpdcon: hpdcon

Author(s)

Yang Hu and Carl Scarrott carl.scarrott@canterbury.ac.nz