fkdengpd function

MLE Fitting of Kernel Density Estimate for Bulk and GPD Tail Extreme Value Mixture Model

Maximum likelihood estimation for fitting the extreme value mixture model with a kernel density estimate for the bulk distribution up to the threshold and a conditional GPD above the threshold. Options are available for profile likelihood estimation of the threshold and for a fixed threshold approach.

fkdengpd(x, phiu = TRUE, useq = NULL, fixedu = FALSE, pvector = NULL,
  kernel = "gaussian", add.jitter = FALSE, factor = 0.1, amount = NULL,
  std.err = TRUE, method = "BFGS", control = list(maxit = 10000),
  finitelik = TRUE, ...)

lkdengpd(x, lambda = NULL, u = 0, sigmau = 1, xi = 0, phiu = TRUE,
  bw = NULL, kernel = "gaussian", log = TRUE)

nlkdengpd(pvector, x, phiu = TRUE, kernel = "gaussian", finitelik = FALSE)

proflukdengpd(u, pvector, x, phiu = TRUE, kernel = "gaussian",
  method = "BFGS", control = list(maxit = 10000), finitelik = TRUE, ...)

nlukdengpd(pvector, u, x, phiu = TRUE, kernel = "gaussian",
  finitelik = FALSE)

Arguments

  • x: vector of sample data
  • phiu: probability of being above threshold (0, 1) or logical, see Details in help for fnormgpd
  • useq: vector of thresholds (or scalar) to be considered in profile likelihood or NULL for no profile likelihood
  • fixedu: logical, should threshold be fixed (at either scalar value in useq, or estimated from maximum of profile likelihood evaluated at sequence of thresholds in useq)
  • pvector: vector of initial values of parameters or NULL for default values, see below
  • kernel: kernel name (default = "gaussian")
  • add.jitter: logical, whether jitter is needed for rounded kernel centres
  • factor: see jitter
  • amount: see jitter
  • std.err: logical, should standard errors be calculated
  • method: optimisation method (see optim)
  • control: optimisation control list (see optim)
  • finitelik: logical, should log-likelihood return finite value for invalid parameters
  • ...: optional inputs passed to optim
  • lambda: scalar bandwidth for kernel (as half-width of kernel)
  • u: scalar threshold value
  • sigmau: scalar scale parameter (positive)
  • xi: scalar shape parameter
  • bw: scalar bandwidth for kernel (as standard deviations of kernel)
  • log: logical, if TRUE then log-likelihood rather than likelihood is output

Returns

Log-likelihood is given by lkdengpd and its wrappers for negative log-likelihood from nlkdengpd and nlukdengpd. Profile likelihood for a single threshold is given by proflukdengpd. Fitting function fkdengpd returns a simple list with the following elements:

  • call: optim call
  • x: data vector x
  • init: pvector
  • fixedu: fixed threshold, logical
  • useq: threshold vector for profile likelihood or scalar for fixed threshold
  • nllhuseq: profile negative log-likelihood at each threshold in useq
  • optim: complete optim output
  • mle: vector of MLE of parameters
  • cov: variance-covariance matrix of MLE of parameters
  • se: vector of standard errors of MLE of parameters
  • rate: phiu to be consistent with evd
  • nllh: minimum negative log-likelihood
  • n: total sample size
  • lambda: MLE of lambda (kernel half-width)
  • u: threshold (fixed or MLE)
  • sigmau: MLE of GPD scale
  • xi: MLE of GPD shape
  • phiu: MLE of tail fraction (bulk model or parameterised approach)
  • se.phiu: standard error of MLE of tail fraction
  • bw: MLE of bw (kernel standard deviations)
  • kernel: kernel name
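
The mle and cov elements can be combined into rough Wald-type intervals. The sketch below assumes only a list with elements named mle and cov, as described above; the helper name wald_ci is hypothetical and the numbers in the mock list are made-up placeholders, not results from a real fit. Given the warnings about cross-validation likelihood, such intervals should be read as a crude normal approximation at best.

```r
# Hypothetical helper: approximate Wald intervals from a fit-like list
# holding `mle` (parameter estimates) and `cov` (variance-covariance matrix).
wald_ci <- function(fit, level = 0.95) {
  se <- sqrt(diag(fit$cov))          # standard errors from the covariance diagonal
  z <- qnorm(1 - (1 - level) / 2)    # two-sided standard normal quantile
  cbind(estimate = fit$mle,
        lower = fit$mle - z * se,
        upper = fit$mle + z * se)
}

# Mock list with made-up numbers, standing in for a real fkdengpd() result
mock_fit <- list(
  mle = c(lambda = 0.25, u = 1.3, sigmau = 0.5, xi = -0.1),
  cov = diag(c(0.01, 0.04, 0.0025, 0.0064))
)
ci <- wald_ci(mock_fit)
print(round(ci, 3))
```

With a real fit, the same helper could be called as wald_ci(fit), provided std.err = TRUE was used so that cov is available.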

Details

The extreme value mixture model with kernel density estimate for bulk and GPD tail is fitted to the entire dataset using maximum likelihood estimation. The estimated parameters, variance-covariance matrix and their standard errors are automatically output.
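
To make the model structure concrete, the following is a minimal base R sketch of the mixture density with a Gaussian kernel, written independently of the package. The function names dgpd_sketch and dkdengpd_sketch are hypothetical, and the treatment of the tail fraction (bulk model based when phiu = TRUE, otherwise a supplied probability) is an assumption based on the description of phiu in fnormgpd; it is not the package implementation.

```r
# GPD density for values above threshold u (zero at or below u)
dgpd_sketch <- function(x, u, sigmau, xi) {
  d <- numeric(length(x))
  above <- x > u
  z <- (x[above] - u) / sigmau
  if (abs(xi) < 1e-8) {
    d[above] <- exp(-z) / sigmau                 # exponential limit as xi -> 0
  } else {
    inner <- pmax(1 + xi * z, 0)
    d[above] <- ifelse(inner > 0, inner^(-1 / xi - 1) / sigmau, 0)
  }
  d
}

# Mixture density: Gaussian KDE below u, scaled GPD above u
dkdengpd_sketch <- function(x, centres, lambda, u, sigmau, xi, phiu = TRUE) {
  kde_pdf <- function(q) sapply(q, function(qi) mean(dnorm(qi, centres, lambda)))
  Hu <- mean(pnorm(u, centres, lambda))          # KDE probability of being below u
  if (isTRUE(phiu)) {
    tail_frac <- 1 - Hu                          # bulk model based tail fraction
    bulk_scale <- 1
  } else {
    tail_frac <- phiu                            # parameterised tail fraction
    bulk_scale <- (1 - phiu) / Hu                # renormalise bulk to mass 1 - phiu
  }
  ifelse(x <= u, bulk_scale * kde_pdf(x),
         tail_frac * dgpd_sketch(x, u, sigmau, xi))
}

set.seed(1)
centres <- rnorm(200)
f <- function(q, phiu) dkdengpd_sketch(q, centres, 0.3, 1.2, 0.5, 0.1, phiu)
total_mass <- function(phiu) {
  integrate(f, -10, 1.2, phiu = phiu, subdivisions = 500)$value +
    integrate(f, 1.2, Inf, phiu = phiu, subdivisions = 500)$value
}
mass_bulk <- total_mass(TRUE)    # bulk model based tail fraction
mass_par <- total_mass(0.1)      # parameterised tail fraction of 0.1
```

Both versions should integrate to approximately one, which is a quick check that the bulk and tail components are weighted consistently.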

See help for fnormgpd for details, type help(fnormgpd). Only the different features are outlined below for brevity.

The full parameter vector is (lambda, u, sigmau, xi) if threshold is also estimated and (lambda, sigmau, xi) for profile likelihood or fixed threshold approach.

Cross-validation likelihood is used for the KDE, but standard likelihood is used for the GPD component. See help for fkden for details, type help(fkden).
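
The combination of the two likelihoods can be sketched in base R as follows. This is an illustration only, assuming a Gaussian kernel, the bulk model based tail fraction, and a leave-one-out form of the cross-validation likelihood in which each bulk observation is evaluated against all other kernel centres; the name lkdengpd_sketch is hypothetical and the package code may differ in detail.

```r
# Sketch of the log-likelihood: leave-one-out KDE for the bulk,
# standard GPD likelihood for exceedances of the threshold u.
lkdengpd_sketch <- function(x, lambda, u, sigmau, xi) {
  if (lambda <= 0 || sigmau <= 0) return(-Inf)
  n <- length(x)
  excess <- x[x > u] - u
  inner <- 1 + xi * excess / sigmau
  if (any(inner <= 0)) return(-Inf)              # outside GPD support

  # Leave-one-out KDE at each bulk point: drop the point's own kernel
  loo <- sapply(which(x <= u), function(i) {
    sum(dnorm(x[i], x[-i], lambda)) / (n - 1)
  })

  # Tail fraction implied by the KDE using all kernel centres
  Hu <- mean(pnorm(u, x, lambda))

  # GPD log-density of the excesses
  lgpd <- if (abs(xi) < 1e-8) {
    -log(sigmau) - excess / sigmau
  } else {
    -log(sigmau) - (1 / xi + 1) * log(inner)
  }
  sum(log(loo)) + length(excess) * log(1 - Hu) + sum(lgpd)
}

set.seed(1)
x <- rnorm(200)
u <- unname(quantile(x, 0.9))
ll_ok <- lkdengpd_sketch(x, lambda = 0.3, u = u, sigmau = 0.5, xi = 0.1)
ll_small <- lkdengpd_sketch(x, lambda = 1e-3, u = u, sigmau = 0.5, xi = 0.1)
```

Dropping each point's own kernel is what stops the likelihood from increasing without bound as lambda shrinks towards zero: in this sketch a very small lambda is penalised rather than rewarded.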

The alternative bandwidth definitions are discussed in kernels, with lambda as the default used in the likelihood fitting. The bw specification is the same as that used in the density function.
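
The relationship between the two bandwidth definitions can be sketched as below. The standard deviations are those of the standard kernels on [-1, 1] (and of the standard normal for the Gaussian kernel); treating bw as lambda multiplied by the kernel standard deviation is an assumption made for illustration, and the helper names bw_from_lambda and lambda_from_bw are hypothetical. See kernels for the definitions actually used.

```r
# Standard deviations of some standard kernels (half-width 1, or sd 1 for gaussian)
kernel_sd <- c(gaussian = 1,
               uniform = 1 / sqrt(3),
               triangular = 1 / sqrt(6),
               epanechnikov = 1 / sqrt(5))

# Convert half-width lambda to bandwidth in kernel standard deviations
bw_from_lambda <- function(lambda, kernel = "gaussian") {
  lambda * kernel_sd[[kernel]]
}

# Convert bandwidth in kernel standard deviations back to half-width lambda
lambda_from_bw <- function(bw, kernel = "gaussian") {
  bw / kernel_sd[[kernel]]
}

bw_unif <- bw_from_lambda(2, "uniform")
lambda_back <- lambda_from_bw(bw_unif, "uniform")
```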

The possible kernels are also defined in kernels, with "gaussian" as the default choice.

Note

The data and kernel centres are both vectors. Infinite and missing sample values (and kernel centres) are dropped.

When pvector=NULL then the initial values are:

  • normal reference rule for bandwidth, using the bw.nrd0 function, which is consistent with the density function (at least two kernel centres must be provided as the variance needs to be estimated);
  • threshold at the 90% quantile (not relevant for profile likelihood for threshold or fixed threshold approaches);
  • MLE of GPD parameters above threshold.
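
The default initial values listed above can be approximated in base R as follows. This is a sketch under stated assumptions: a Gaussian kernel (so the bw.nrd0 value is used directly as lambda), a simple GPD negative log-likelihood minimised with optim in place of the package's own GPD fitting, and a hypothetical function name gpd_nll.

```r
set.seed(1)
x <- rnorm(1000)
x <- x[is.finite(x)]                   # infinite and missing values are dropped

lambda0 <- bw.nrd0(x)                  # normal reference rule bandwidth
u0 <- unname(quantile(x, 0.9))         # threshold at the 90% quantile
excess <- x[x > u0] - u0               # exceedances of the threshold

# GPD negative log-likelihood for the excesses, with a large finite
# penalty for invalid parameters so that optim can keep searching
gpd_nll <- function(par, y) {
  sigmau <- par[1]
  xi <- par[2]
  if (sigmau <= 0) return(1e6)
  inner <- 1 + xi * y / sigmau
  if (any(inner <= 0)) return(1e6)
  if (abs(xi) < 1e-8) return(length(y) * log(sigmau) + sum(y) / sigmau)
  length(y) * log(sigmau) + (1 / xi + 1) * sum(log(inner))
}

gpd_fit <- optim(c(mean(excess), 0.1), gpd_nll, y = excess)

# Initial parameter vector in the order (lambda, u, sigmau, xi)
init <- c(lambda = lambda0, u = u0,
          sigmau = gpd_fit$par[1], xi = gpd_fit$par[2])
print(init)
```

A vector of this form could be supplied as pvector when different starting values are wanted; for the profile likelihood or fixed threshold approaches the u element would be omitted.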

Warning

See important warnings about cross-validation likelihood estimation in fkden, type help(fkden).

Acknowledgments

See Acknowledgments in fnormgpd, type help(fnormgpd). Based on code by Anna MacDonald produced for MATLAB.

Examples

## Not run:
set.seed(1)
par(mfrow = c(2, 1))

x = rnorm(1000)
xx = seq(-4, 4, 0.01)
y = dnorm(xx)

# Bulk model based tail fraction
fit = fkdengpd(x)
hist(x, breaks = 100, freq = FALSE, xlim = c(-4, 4))
lines(xx, y)
with(fit, lines(xx, dkdengpd(xx, x, lambda, u, sigmau, xi), col="red"))
abline(v = fit$u, col = "red")

# Parameterised tail fraction
fit2 = fkdengpd(x, phiu = FALSE)
with(fit2, lines(xx, dkdengpd(xx, x, lambda, u, sigmau, xi, phiu), col="blue"))
abline(v = fit2$u, col = "blue")
legend("topright", c("True Density","Bulk Tail Fraction","Parameterised Tail Fraction"),
  col=c("black", "red", "blue"), lty = 1)

# Profile likelihood for initial value of threshold and fixed threshold approach
fitu = fkdengpd(x, useq = seq(0, 2, length = 20))
fitfix = fkdengpd(x, useq = seq(0, 2, length = 20), fixedu = TRUE)

hist(x, breaks = 100, freq = FALSE, xlim = c(-4, 4))
lines(xx, y)
with(fit, lines(xx, dkdengpd(xx, x, lambda, u, sigmau, xi), col="red"))
abline(v = fit$u, col = "red")
with(fitu, lines(xx, dkdengpd(xx, x, lambda, u, sigmau, xi), col="purple"))
abline(v = fitu$u, col = "purple")
with(fitfix, lines(xx, dkdengpd(xx, x, lambda, u, sigmau, xi), col="darkgreen"))
abline(v = fitfix$u, col = "darkgreen")
legend("topright", c("True Density","Default initial value (90% quantile)",
  "Prof. lik. for initial value", "Prof. lik. for fixed threshold"),
  col=c("black", "red", "purple", "darkgreen"), lty = 1)

## End(Not run)

References

http://www.math.canterbury.ac.nz/~c.scarrott/evmix

http://en.wikipedia.org/wiki/Kernel_density_estimation

http://en.wikipedia.org/wiki/Cross-validation_(statistics)

http://en.wikipedia.org/wiki/Generalized_Pareto_distribution

Scarrott, C.J. and MacDonald, A. (2012). A review of extreme value threshold estimation and uncertainty quantification. REVSTAT - Statistical Journal 10(1), 33-59. Available from http://www.ine.pt/revstat/pdf/rs120102.pdf

Hu, Y. (2013). Extreme value mixture modelling: An R package and simulation study. MSc (Hons) thesis, University of Canterbury, New Zealand. http://ir.canterbury.ac.nz/simple-search?query=extreme&submit=Go

Bowman, A.W. (1984). An alternative method of cross-validation for the smoothing of density estimates. Biometrika 71(2), 353-360.

Duin, R.P.W. (1976). On the choice of smoothing parameters for Parzen estimators of probability density functions. IEEE Transactions on Computers C25(11), 1175-1179.

MacDonald, A., Scarrott, C.J., Lee, D., Darlow, B., Reale, M. and Russell, G. (2011). A flexible extreme value mixture model. Computational Statistics and Data Analysis 55(6), 2137-2157.

Wand, M. and Jones, M.C. (1995). Kernel Smoothing. Chapman & Hall.

See Also

kernels, kfun, density, bw.nrd0 and dkde in ks package. fgpd and gpd.

Other kden: bckden, fbckden, fgkgcon, fgkg, fkdengpdcon, fkden, kdengpdcon, kdengpd, kden

Other kdengpd: bckdengpd, fbckdengpd, fgkg, fkdengpdcon, fkden, gkg, kdengpdcon, kdengpd, kden

Other kdengpdcon: bckdengpdcon, fbckdengpdcon, fgkgcon, fkdengpdcon, gkgcon, kdengpdcon, kdengpd

Other gkg: fgkgcon, fgkg, gkgcon, gkg, kdengpd, kden

Other bckdengpd: bckdengpdcon, bckdengpd, bckden, fbckdengpdcon, fbckdengpd, fbckden, gkg, kdengpd, kden

Other fkdengpd: kdengpd

Author(s)

Yang Hu and Carl Scarrott carl.scarrott@canterbury.ac.nz