kepdf() R function from [pdfCluster]

Kernel estimate of a probability density function.

Estimates the density of univariate and multivariate data by the kernel method.


kepdf(x, eval.points = x, kernel = "gaussian", 
      bwtype = "fixed", h = h.norm(x), hx = NULL, alpha = 1/2)

Arguments

x: A vector, a matrix or data-frame of data whose density should be estimated.
eval.points: A vector, a matrix or a data-frame of data points at which the density estimate should be evaluated.
kernel: Either 'gaussian' or 't7', it defines the kernel function to be used. See details below.
bwtype: Either 'fixed' or 'adaptive', corresponding to a kernel estimator with fixed or adaptive bandwidths respectively. See details below.
h: A vector of length NCOL(x), defining the smoothing parameters to be used either to estimate the density in kernel estimation with fixed bandwidth or to estimate the pilot density in kernel estimation with adaptive bandwidths. Default value is the result of h.norm applied to x.
hx: A matrix with the same number of rows and columns as x, where each row defines the vector of smoothing parameters specific to each sample point. To be used when bwtype = "adaptive". Default value is the result of hprop2f applied to x. Set to NULL when bwtype = "fixed".
alpha: Sensitivity parameter to be given to hprop2f when bwtype= "adaptive" and the vectors of smoothing parameters are computed according to Silverman's (1986) approach.

Details

The current version of the pdfCluster package computes estimates with a product kernel estimator of the form:

$$\hat{f}(y) = \sum_{i=1}^n \frac{1}{n h_{i,1} \cdots h_{i,d}} \prod_{j=1}^d K\left(\frac{y_{j} - x_{i,j}}{h_{i,j}}\right).$$

The kernel function $K$ can be either a Gaussian density (if kernel = "gaussian") or a $t_\nu$ density with $\nu = 7$ degrees of freedom (if kernel = "t7"). Although uncommon, the $t$ kernel is offered for reasons of computational efficiency; its use is therefore suggested when either x or eval.points has a very large number of rows.
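To make the formula concrete, here is a minimal base-R sketch of a product kernel estimator. It is not the package's implementation: the name `product_kde` is hypothetical, and the "t7" option is assumed to be the plain, unscaled Student t density with 7 degrees of freedom as returned by `dt`.

```r
# Hand-written product kernel estimator; illustrative only, not pdfCluster code.
# x:           n x d matrix of sample points
# eval.points: m x d matrix of points at which to evaluate the estimate
# hx:          n x d matrix of bandwidths (row i holds h_i)
# kernel:      "gaussian" or "t7" (assumed: unscaled t density with 7 df)
product_kde <- function(x, eval.points, hx, kernel = c("gaussian", "t7")) {
  kernel <- match.arg(kernel)
  x <- as.matrix(x)
  eval.points <- as.matrix(eval.points)
  hx <- as.matrix(hx)
  stopifnot(ncol(x) == ncol(eval.points), all(dim(hx) == dim(x)))
  K <- switch(kernel,
              gaussian = function(u) dnorm(u),
              t7       = function(u) dt(u, df = 7))
  n <- nrow(x)
  d <- ncol(x)
  vapply(seq_len(nrow(eval.points)), function(r) {
    y <- eval.points[r, ]
    # u[i, j] = (y_j - x_ij) / h_ij
    u <- (matrix(y, n, d, byrow = TRUE) - x) / hx
    # Contribution of observation i: prod_j K(u_ij) / h_ij
    contrib <- apply(matrix(K(u), n, d) / hx, 1, prod)
    sum(contrib) / n
  }, numeric(1))
}

set.seed(1)
x <- matrix(rnorm(200), ncol = 2)
h <- c(0.4, 0.4)
hx <- matrix(h, nrow(x), ncol(x), byrow = TRUE)  # fixed bandwidth: every row equals h
grid <- as.matrix(expand.grid(seq(-3, 3, by = 0.5), seq(-3, 3, by = 0.5)))
f_hat <- product_kde(x, grid, hx)
```

Passing the bandwidths as a full n x d matrix lets the same function cover both the fixed case (identical rows) and the adaptive case (one row per observation).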

The vectors of bandwidths $h_{i} = (h_{i,1}, \dots, h_{i,d})'$ are defined as follows:

Fixed bandwidth: When bwtype='fixed', $h_{i} = h$; that is, a constant smoothing vector is used for all the observations $x_i$. Default values are set as asymptotically optimal for a multivariate Normal distribution (e.g., Bowman and Azzalini, 1997). See h.norm for further details.
Adaptive bandwidth: When bwtype='adaptive', a vector of bandwidths $h_i$ is specified for each observation $x_i$. Default values are selected according to Silverman (1986, Section 5.3.1). See hprop2f.
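The two bandwidth rules can be sketched in base R as follows. The function names `h_normal_ref` and `h_adaptive` are hypothetical, and the formulas are assumptions about what h.norm and hprop2f compute: the normal-reference rule $h_j = \sigma_j \{4 / ((d+2) n)\}^{1/(d+4)}$ and Silverman's rule $h_i = h \, (\hat{f}_h(x_i) / g)^{-\alpha}$, with $g$ the geometric mean of the pilot estimates. Consult the help pages of those functions for the exact definitions.

```r
# Normal-reference bandwidth, one value per column (assumed form of h.norm):
# sd_j * (4 / ((d + 2) * n))^(1 / (d + 4))
h_normal_ref <- function(x) {
  x <- as.matrix(x)
  n <- nrow(x)
  d <- ncol(x)
  apply(x, 2, sd) * (4 / ((d + 2) * n))^(1 / (d + 4))
}

# Silverman-style adaptive bandwidths (assumed form of hprop2f):
# h_i = h * (f_pilot(x_i) / g)^(-alpha), g = geometric mean of the pilot values
h_adaptive <- function(x, h = h_normal_ref(x), alpha = 1/2) {
  x <- as.matrix(x)
  n <- nrow(x)
  d <- ncol(x)
  # Fixed-bandwidth Gaussian product-kernel pilot estimate at each sample point
  pilot <- vapply(seq_len(n), function(i) {
    u <- sweep(sweep(x, 2, x[i, ]), 2, h, "/")   # (x_kj - x_ij) / h_j
    mean(apply(matrix(dnorm(u), n, d), 1, prod)) / prod(h)
  }, numeric(1))
  g <- exp(mean(log(pilot)))
  lambda <- (pilot / g)^(-alpha)                 # local factor, one per observation
  outer(lambda, h)                               # n x d matrix; row i is h_i
}

set.seed(1)
x <- cbind(rnorm(100), rexp(100))
h <- h_normal_ref(x)
hx <- h_adaptive(x, h)
```

With alpha = 0 every row of the result equals h, which recovers the fixed-bandwidth estimator; larger values of alpha widen the bandwidth more strongly where the pilot density is low.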

Returns

An S4 object of kepdf-class with slots:

call: The matched call.
x: The data input, coerced to be a matrix.
eval.points: The data points at which the density is evaluated.
estimate: The values of the density estimate at the evaluation points.
kernel: The selected kernel.
bwtype: The type of estimator.
par: A list of parameters used to estimate the density, with elements:
- h the smoothing parameters used to estimate either the density or the pilot density;
- hx the matrix of sample smoothing parameters, when bwtype='adaptive';
- alpha sensitivity parameter used if bwtype='adaptive'.
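Because the result is an S4 object, its components are read with `@` rather than `$`. The following sketch assumes the pdfCluster package is installed (the block is skipped otherwise) and uses only the slots listed above; the names `fit` and `dens` are illustrative.

```r
# Assumes pdfCluster is installed; skipped silently if it is not.
if (requireNamespace("pdfCluster", quietly = TRUE)) {
  library(pdfCluster)
  data(wine)
  x <- wine[, 3]
  grid <- seq(0, 7, by = 0.1)
  fit <- kepdf(x, eval.points = grid)

  fit@kernel           # selected kernel
  fit@bwtype           # type of estimator
  fit@par$h            # smoothing parameter(s) used for the estimate
  head(fit@estimate)   # density values at the first evaluation points

  # Pair each evaluation point with its estimate for further processing
  dens <- data.frame(point = grid, density = fit@estimate)
}
```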

References

Bowman, A.W. and Azzalini, A. (1997). Applied smoothing techniques for data analysis: the kernel approach with S-Plus illustrations. Oxford University Press, Oxford.

Silverman, B. (1986). Density estimation for statistics and data analysis. Chapman and Hall, London.

Examples


## A 1-dimensional example
data(wine)
x <- wine[,3] 
pdf <- kepdf(x, eval.points=seq(0,7,by=.1))
plot(pdf, n.grid= 100, main="wine data")

## A 2-dimensional example
x <- wine[,c(2,8)] 
pdf <- kepdf(x)
plot(pdf, main="wine data", props=c(5,50,90), ylim=c(0,4))
plot(pdf, main="wine data", method="perspective", phi=30, theta=60)

## A 3-dimensional example
x <- wine[,c(2,3,8)] 
pdf <- kepdf(x)
plot(pdf, main="wine data", props=c(10,50,70), gap=0.2)
plot(pdf, main="wine data", method="perspective", gap=0.2, phi=30, theta=10)

## A 6-dimensional example
## an adaptive kernel density estimate is preferable in high dimensions
x <- wine[,c(2,3,5,7,8,10)]
pdf <- kepdf(x, bwtype="adaptive")
plot(pdf, main="wine data", props=c(10,50,70), gap=0.2)
plot(pdf, main="wine data", method="perspective", gap=0.2, phi=30, theta=10)

pdfCluster package

Maintainer: Menardi Giovanna
License: GPL-2
Last published: 2022-12-02
