x: location to evaluate KDE (single scalar or vector)
lambda: bandwidth for kernel (as half-width of kernel) or NULL
bw: bandwidth for kernel (as standard deviations of kernel) or NULL
kerncentres: kernel centres (typically sample data vector or scalar)
z: standardised location put into kernel z = (x-kerncentres)/lambda
kernel: kernel name (default = "gaussian")
Returns
kd* and kp* give the density and cumulative distribution functions for each kernel respectively, where * is the kernel name. kdz and kpz are the equivalent global functions for all of the kernels.
Details
Functions for the commonly used kernels for kernel density estimation. The density and cumulative distribution functions are provided. Each function can accept the bandwidth specified as either:
bw - in terms of number of standard deviations of the kernel, consistent with the defined values in the density function in the R base libraries
lambda - in terms of half-width of kernel
If both bandwidths are given as NULL then the default bandwidth is lambda=1. If only one is specified then it will be used. If both are specified then lambda will be used.
All the kernels have bounded support [−λ,λ], except the normal ("gaussian") which is unbounded. In the latter, both bandwidths are the same bw=lambda and equal to the standard deviation.
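The relationship between the two bandwidth definitions can be sketched in base R. The helpers bw_to_lambda and resolve_lambda below are hypothetical names used only for illustration, not functions documented here. The scaling factors are the standard deviations of the unit half-width kernels (for example 1/sqrt(5) for the Epanechnikov), and only four kernels are covered.

```r
# Hypothetical helper: convert a bandwidth given as a kernel standard
# deviation (bw) into a kernel half-width (lambda).
bw_to_lambda <- function(bw, kernel = c("gaussian", "uniform",
                                        "triangular", "epanechnikov")) {
  kernel <- match.arg(kernel)
  # Standard deviation of each kernel when its half-width is 1.
  # The Gaussian is unbounded, so bw and lambda coincide.
  sd_unit <- switch(kernel,
    gaussian     = 1,
    uniform      = 1 / sqrt(3),
    triangular   = 1 / sqrt(6),
    epanechnikov = 1 / sqrt(5)
  )
  bw / sd_unit
}

# Hypothetical helper: apply the precedence rule described above.
resolve_lambda <- function(lambda = NULL, bw = NULL, kernel = "gaussian") {
  if (!is.null(lambda)) return(lambda)   # lambda wins whenever it is given
  if (!is.null(bw)) return(bw_to_lambda(bw, kernel))
  1                                      # both NULL: default lambda = 1
}

# Numerical check for the Epanechnikov kernel: with bw = 1 the rescaled
# kernel should have unit variance.
lambda <- resolve_lambda(bw = 1, kernel = "epanechnikov")
epan <- function(x) 0.75 * (1 - (x / lambda)^2) / lambda
kernel_var <- integrate(function(x) x^2 * epan(x), -lambda, lambda)$value
```

Here kernel_var should be numerically close to 1, which confirms that the half-width sqrt(5) corresponds to a kernel standard deviation of 1.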
Typically, a single location x at which to evaluate the kernel is given along with a vector of kernel centres. As such, the functions are designed to be used with sapply to loop over a vector of locations at which to evaluate the KDE. Alternatively, a vector of locations x can be given with a single scalar kernel centre kerncentres, which is commonly used when locations are pre-standardised by (x-kerncentres)/lambda and kerncentres=0. A warning is given if both the evaluation locations and kernel centres are vectors, as this is not often needed and so is likely to be a user error.
If no kernel centres are provided then by default they are set to zero (i.e. x is at the middle of the kernel).
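The calling pattern described above can be sketched with base R only. This is a minimal illustration, assuming a Gaussian kernel evaluated with dnorm in place of the kernel functions documented here. The helper kde_at, the simulated sample and the bandwidth value are all hypothetical.

```r
set.seed(1)
centres <- rnorm(50)   # illustrative sample used as the kernel centres
lambda  <- 0.4         # illustrative bandwidth (Gaussian: lambda = bw)

# KDE at a single location x: the average of the kernels centred on each
# data point, evaluated at x.
kde_at <- function(x, centres, lambda) {
  mean(dnorm(x, mean = centres, sd = lambda))
}

# Loop over a vector of evaluation locations with sapply.
xx  <- seq(-4, 4, by = 0.01)
fxx <- sapply(xx, kde_at, centres = centres, lambda = lambda)

# Alternative pattern: a vector of locations with one scalar centre,
# pre-standardised so that the kernel is centred at zero.
z  <- (xx - centres[1]) / lambda
k1 <- dnorm(z) / lambda
```

Each call inside sapply pairs one scalar location with the full vector of centres, so the situation that triggers the warning (two vectors at once) never arises.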
The following kernels are implemented, with relevant ones having definitions consistent with those of the density function, except where specified:
gaussian or normal
uniform or rectangular - same as "rectangular" in density function
triangular
epanechnikov
biweight
triweight
tricube
parzen
cosine
optcosine
The kernel densities are all normalised to unity. See the Wikipedia reference below for their definitions.
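The normalisation can be checked numerically. The sketch below uses the usual textbook forms of four of the bounded kernels on the standardised scale [-1, 1]; these forms are assumed, not confirmed, to match the definitions used by the functions documented here.

```r
# Textbook kernel forms on the standardised support [-1, 1]
# (assumed definitions, written out for illustration only).
kernels <- list(
  epanechnikov = function(z) 0.75 * (1 - z^2),
  biweight     = function(z) 15 / 16 * (1 - z^2)^2,
  triweight    = function(z) 35 / 32 * (1 - z^2)^3,
  tricube      = function(z) 70 / 81 * (1 - abs(z)^3)^3
)

# Integrate each kernel over its support; every area should be close to 1.
areas <- sapply(kernels, function(k) integrate(k, -1, 1)$value)
```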
Each kernel's functions can be called individually, or the global functions kdz and kpz for the density and cumulative distribution function can apply any particular kernel, which is specified by the kernel input. These global functions take the standardised locations z = (x - kerncentres)/lambda.
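The idea of a single function that dispatches on the kernel name and works on standardised locations can be sketched as follows. The names kd_std and kp_std are hypothetical stand-ins written in base R, covering only three kernels. They are not the implementation of kdz and kpz.

```r
# Hypothetical stand-in for a global kernel density on the standardised scale.
kd_std <- function(z, kernel = "gaussian") {
  switch(kernel,
    gaussian     = dnorm(z),
    uniform      = ifelse(abs(z) <= 1, 0.5, 0),
    epanechnikov = ifelse(abs(z) <= 1, 0.75 * (1 - z^2), 0),
    stop("kernel not covered by this sketch")
  )
}

# Hypothetical stand-in for the matching cumulative distribution function.
kp_std <- function(z, kernel = "gaussian") {
  zc <- pmin(pmax(z, -1), 1)   # clamp to the bounded support [-1, 1]
  switch(kernel,
    gaussian     = pnorm(z),
    uniform      = (zc + 1) / 2,
    epanechnikov = (2 + 3 * zc - zc^3) / 4,
    stop("kernel not covered by this sketch")
  )
}

# Standardise first, then evaluate. On the original scale the density is
# the standardised density divided by lambda; the cdf needs no rescaling.
x <- 1.3; centre <- 1; lambda <- 0.5
z    <- (x - centre) / lambda
dens <- kd_std(z, "epanechnikov") / lambda
prob <- kp_std(z, "epanechnikov")
```

Working on the standardised scale keeps each kernel definition free of location and bandwidth arguments, so those two quantities are handled once, outside the kernel.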
Examples
xx = seq(-2, 2, 0.01)
plot(xx, kdgaussian(xx), type = "l", col = "black", ylim = c(0, 1.2))
lines(xx, kduniform(xx), col = "grey")
lines(xx, kdtriangular(xx), col = "blue")
lines(xx, kdepanechnikov(xx), col = "darkgreen")
lines(xx, kdbiweight(xx), col = "red")
lines(xx, kdtriweight(xx), col = "purple")
lines(xx, kdtricube(xx), col = "orange")
lines(xx, kdparzen(xx), col = "salmon")
lines(xx, kdcosine(xx), col = "cyan")
lines(xx, kdoptcosine(xx), col = "goldenrod")
legend("topright",
  c("Gaussian", "uniform", "triangular", "Epanechnikov", "biweight",
    "triweight", "tricube", "Parzen", "cosine", "optcosine"),
  lty = 1,
  col = c("black", "grey", "blue", "darkgreen", "red",
    "purple", "orange", "salmon", "cyan", "goldenrod"))