fitmgng function

MLE Fitting of Normal Bulk and GPD for Both Tails Interval Transition Mixture Model

Maximum likelihood estimation for fitting the extreme value mixture model with a normal bulk distribution between the thresholds, conditional GPDs beyond the thresholds and an interval transition. Profile likelihood estimation is available for both thresholds and the interval half-width, which can alternatively be fixed.

    fitmgng(x, eseq = NULL, ulseq = NULL, urseq = NULL, fixedeu = FALSE,
      pvector = NULL, std.err = TRUE, method = "BFGS",
      control = list(maxit = 10000), finitelik = TRUE, ...)

    litmgng(x, nmean = 0, nsd = 1, epsilon = nsd, ul = 0, sigmaul = 1, xil = 0,
      ur = 0, sigmaur = 1, xir = 0, log = TRUE)

    nlitmgng(pvector, x, finitelik = FALSE)

    profleuitmgng(eulr, pvector, x, method = "BFGS",
      control = list(maxit = 10000), finitelik = TRUE, ...)

    nleuitmgng(pvector, epsilon, ul, ur, x, finitelik = FALSE)

Arguments

  • x: vector of sample data
  • eseq: vector of epsilons (or scalar) to be considered in profile likelihood or NULL for no profile likelihood
  • ulseq: vector of lower thresholds (or scalar) to be considered in profile likelihood or NULL for no profile likelihood
  • urseq: vector of upper thresholds (or scalar) to be considered in profile likelihood or NULL for no profile likelihood
  • fixedeu: logical, should the thresholds and epsilon be fixed (either at the scalar values given in ulseq, urseq and eseq, or at the values that maximise the profile likelihood evaluated over the grid of thresholds and epsilons in ulseq, urseq and eseq)
  • pvector: vector of initial values of parameters or NULL for default values, see below
  • std.err: logical, should standard errors be calculated
  • method: optimisation method (see optim)
  • control: optimisation control list (see optim)
  • finitelik: logical, should log-likelihood return finite value for invalid parameters
  • ...: optional inputs passed to optim
  • nmean: scalar normal mean
  • nsd: scalar normal standard deviation (positive)
  • epsilon: interval half-width
  • ul: lower tail threshold
  • sigmaul: lower tail GPD scale parameter (positive)
  • xil: lower tail GPD shape parameter
  • ur: upper tail threshold
  • sigmaur: upper tail GPD scale parameter (positive)
  • xir: upper tail GPD shape parameter
  • log: logical, if TRUE then log-likelihood rather than likelihood is output
  • eulr: vector of epsilon, lower and upper thresholds considered in profile likelihood

Returns

Log-likelihood is given by litmgng, and its wrappers for the negative log-likelihood are nlitmgng and nleuitmgng. Profile likelihood for the thresholds and interval half-width is given by profleuitmgng. The fitting function fitmgng returns a simple list with the following elements:

  • call: optim call
  • x: data vector x
  • init: pvector
  • fixedeu: fixed epsilon and threshold, logical
  • ulseq: lower threshold vector for profile likelihood or scalar for fixed threshold
  • urseq: upper threshold vector for profile likelihood or scalar for fixed threshold
  • eseq: interval half-width vector for profile likelihood or scalar for fixed half-width
  • nllheuseq: profile negative log-likelihood at each combination in (eseq, ulseq, urseq)
  • optim: complete optim output
  • mle: vector of MLE of parameters
  • cov: variance-covariance matrix of MLE of parameters
  • se: vector of standard errors of MLE of parameters
  • nllh: minimum negative log-likelihood
  • n: total sample size
  • nmean: MLE of normal mean
  • nsd: MLE of normal standard deviation
  • epsilon: MLE of transition half-width
  • ul: lower threshold (fixed or MLE)
  • sigmaul: MLE of lower tail GPD scale
  • xil: MLE of lower tail GPD shape
  • ur: upper threshold (fixed or MLE)
  • sigmaur: MLE of upper tail GPD scale
  • xir: MLE of upper tail GPD shape
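
The usual relationship between a variance-covariance matrix such as cov and standard errors such as se can be illustrated with a generic optim fit. The following is a minimal sketch on a plain normal model, not the evmix implementation: the helper nll_normal, the log-scale parameterisation of the standard deviation and the simulated data are all assumptions made for this example.

```r
# Toy negative log-likelihood for a normal sample
# (hypothetical helper, not part of evmix; par[2] is log(sd) to keep sd positive)
nll_normal <- function(par, x) {
  -sum(dnorm(x, mean = par[1], sd = exp(par[2]), log = TRUE))
}

set.seed(1)
x <- rnorm(500, mean = 2, sd = 3)

# Minimise the negative log-likelihood and request the Hessian at the optimum
start <- c(mean(x), log(sd(x)))
fit <- optim(start, nll_normal, x = x, method = "BFGS", hessian = TRUE)

# The inverse Hessian of the negative log-likelihood approximates the
# variance-covariance matrix of the estimates
cov_hat <- solve(fit$hessian)

# Standard errors are the square roots of the diagonal entries
se_hat <- sqrt(diag(cov_hat))
```

If the Hessian is close to singular, solve() may fail or return unusable values, which is one reason a fitting routine might offer a switch such as std.err to skip this step.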

Details

The extreme value mixture model with the normal bulk and GPD for both tails interval transition is fitted to the entire dataset using maximum likelihood estimation. The estimated parameters, variance-covariance matrix and their standard errors are automatically output.

See ditmgng for explanation of GPD-normal-GPD interval transition model, including mixing functions.

See also the help for fnormgpd for details, type help(fnormgpd). Only the features that differ are outlined below for brevity.

The full parameter vector is (nmean, nsd, epsilon, ul, sigmaul, xil, ur, sigmaur, xir) if thresholds and interval half-width are also estimated and (nmean, nsd, sigmaul, xil, sigmaur, xir) for profile likelihood or fixed threshold approach.

If the profile likelihood approach is used, then a grid search over all combinations of epsilons and both thresholds is carried out. Combinations which lead to fewer than 5 observations in any component outside of the intervals are not considered.
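
The grid of candidate combinations can be sketched in base R. This is only an illustration under stated assumptions, not the package's internal code: the three components are assumed here to be the lower tail below ul - epsilon, the bulk between ul + epsilon and ur - epsilon, and the upper tail above ur + epsilon, and the names grid, n_lower, n_bulk, n_upper and valid_grid are hypothetical.

```r
set.seed(1)
x <- rnorm(1000)

# Candidate values, as they might be passed via eseq, ulseq and urseq
eseq <- seq(0, 2, 0.5)
ulseq <- seq(-2.5, 0, 0.5)
urseq <- seq(0, 2.5, 0.5)

# Every combination of epsilon and the two thresholds
grid <- expand.grid(epsilon = eseq, ul = ulseq, ur = urseq)

# Assumed component definitions: observations lying outside the two
# transition intervals [ul - epsilon, ul + epsilon] and [ur - epsilon, ur + epsilon]
n_lower <- mapply(function(e, ul) sum(x < ul - e), grid$epsilon, grid$ul)
n_bulk <- mapply(function(e, ul, ur) sum(x > ul + e & x < ur - e),
                 grid$epsilon, grid$ul, grid$ur)
n_upper <- mapply(function(e, ur) sum(x > ur + e), grid$epsilon, grid$ur)

# Keep only combinations with at least 5 observations in every component
keep <- pmin(n_lower, n_bulk, n_upper) >= 5
valid_grid <- grid[keep, ]
```

The grid grows multiplicatively with the lengths of the three sequences, so coarse sequences are a sensible starting point before refining around the best combination.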

A fixed pair of thresholds and epsilon approach is achieved by setting a single scalar value for each in ulseq, urseq and eseq respectively.
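
A call for the fixed approach might look as follows. This is a hedged sketch: the scalar values 0.5, -1.5 and 1.5 are arbitrary choices for standard normal data, fixedeu = TRUE is assumed to be needed so that the scalars are held fixed rather than used as starting values, and the code only runs if the evmix package is installed.

```r
set.seed(1)
x <- rnorm(1000)

if (requireNamespace("evmix", quietly = TRUE)) {
  # Scalars in eseq, ulseq and urseq together with fixedeu = TRUE:
  # no grid search is needed, and only the normal and GPD parameters
  # (nmean, nsd, sigmaul, xil, sigmaur, xir) are estimated
  fit_fixed <- evmix::fitmgng(x, eseq = 0.5, ulseq = -1.5, urseq = 1.5,
                              fixedeu = TRUE)

  # Inspect the estimates and the thresholds that were held fixed
  print(fit_fixed$mle)
  print(c(ul = fit_fixed$ul, ur = fit_fixed$ur, epsilon = fit_fixed$epsilon))
}
```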

Note

When pvector=NULL then the initial values are:

  • MLE of normal parameters assuming entire population is normal;
  • lower threshold at the 10% quantile (not relevant for profile likelihood for threshold or fixed threshold approaches);
  • upper threshold at the 90% quantile (not relevant for profile likelihood for threshold or fixed threshold approaches); and
  • MLE of GPD parameters beyond each threshold.
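
The first three defaults are straightforward to reproduce in base R. The sketch below is an approximation of what such defaults could look like, not the package's own code: the variable names are hypothetical, the divisor n for the standard deviation is assumed from the term "MLE", and the GPD fits themselves are left out, with only the exceedances they would use being computed.

```r
set.seed(1)
x <- rnorm(1000)
n <- length(x)

# Normal MLEs treating the whole sample as normal
# (the MLE of the standard deviation divides by n, not n - 1)
nmean0 <- mean(x)
nsd0 <- sqrt(sum((x - nmean0)^2) / n)

# Thresholds at the 10% and 90% sample quantiles
ul0 <- unname(quantile(x, 0.1))
ur0 <- unname(quantile(x, 0.9))

# Exceedances a tail GPD fit would be based on; the lower tail is
# mirrored so that both sets of exceedances are positive
excess_lower <- ul0 - x[x < ul0]
excess_upper <- x[x > ur0] - ur0
```

Supplying pvector explicitly replaces these defaults, which can help when the optimiser struggles to converge from them.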

Acknowledgments

See Acknowledgments in fnormgpd, type help(fnormgpd). Based on code by Xin Zhao produced for MATLAB.

Examples

    ## Not run:
    set.seed(1)
    par(mfrow = c(1, 1))

    x = rnorm(1000)
    xx = seq(-4, 4, 0.01)
    y = dnorm(xx)

    # MLE for complete parameter set (not recommended!)
    fit = fitmgng(x)
    hist(x, breaks = seq(-6, 6, 0.1), freq = FALSE, xlim = c(-4, 4))
    lines(xx, y)
    with(fit, lines(xx, ditmgng(xx, nmean, nsd, epsilon, ul, sigmaul, xil,
      ur, sigmaur, xir), col="red"))
    abline(v = fit$ul + fit$epsilon * seq(-1, 1), col = "red")
    abline(v = fit$ur + fit$epsilon * seq(-1, 1), col = "darkred")

    # Profile likelihood for threshold which is then fixed
    fitfix = fitmgng(x, eseq = seq(0, 2, 0.1), ulseq = seq(-2.5, 0, 0.25),
      urseq = seq(0, 2.5, 0.25), fixedeu = TRUE)
    with(fitfix, lines(xx, ditmgng(xx, nmean, nsd, epsilon, ul, sigmaul, xil,
      ur, sigmaur, xir), col="blue"))
    abline(v = fitfix$ul + fitfix$epsilon * seq(-1, 1), col = "blue")
    abline(v = fitfix$ur + fitfix$epsilon * seq(-1, 1), col = "darkblue")
    legend("topright", c("True Density", "GPD-normal-GPD ITM", "Profile likelihood"),
      col=c("black", "red", "blue"), lty = 1)

    ## End(Not run)

References

http://www.math.canterbury.ac.nz/~c.scarrott/evmix

http://en.wikipedia.org/wiki/Normal_distribution

http://en.wikipedia.org/wiki/Generalized_Pareto_distribution

Scarrott, C.J. and MacDonald, A. (2012). A review of extreme value threshold estimation and uncertainty quantification. REVSTAT - Statistical Journal 10(1), 33-59. Available from http://www.ine.pt/revstat/pdf/rs120102.pdf

Holden, L. and Haug, O. (2013). A mixture model for unsupervised tail estimation. arxiv:0902.4137

See Also

fgng, dnorm, fgpd and gpd

Other itmgng: itmgng

Other itmnormgpd: fitmnormgpd, itmgng, itmnormgpd

Other gng: fgngcon, fgng, fnormgpd, gngcon, gng, itmgng, normgpd

Author(s)

Alfadino Akbar and Carl Scarrott carl.scarrott@canterbury.ac.nz