Make a quadratic form matrix for the kernel-based variance estimator of Breidt, Opsomer, and Sanchez-Borrego (2016)
Constructs the quadratic form matrix for the kernel-based variance estimator of Breidt, Opsomer, and Sanchez-Borrego (2016). By default, the bandwidth is chosen automatically to give the smallest possible nonempty kernel window.
Arguments
x: A numeric vector, giving the values of an auxiliary variable.
kernel: The name of a kernel function. Currently only "Epanechnikov" is supported.
bandwidth: The bandwidth to use for the kernel. The default value is "auto", which means that the bandwidth is chosen automatically to produce the smallest window size while ensuring that every unit has a nonempty window, as suggested by Breidt, Opsomer, and Sanchez-Borrego (2016). Alternatively, the user can supply their own value, which must be a single positive number.
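One way to read the automatic rule, sketched under stated assumptions: the kernel is zero outside (-1, 1), as the Epanechnikov kernel is, and a "nonempty window" means that at least one other unit receives positive weight. Under that reading, unit i has a neighbor in its window only when the bandwidth exceeds the distance from its x value to the nearest other x value, so a valid bandwidth must exceed the largest of these nearest-neighbor distances. The helper name below is hypothetical, and this is not necessarily how the function computes its own bandwidth.

```r
# Hypothetical helper (an illustration, not part of any package):
# largest distance from any unit to its nearest other unit.
max_nearest_neighbor_distance <- function(x) {
  stopifnot(is.numeric(x), length(x) >= 2)
  dists <- abs(outer(x, x, "-"))  # all pairwise distances |x_i - x_j|
  diag(dists) <- Inf              # ignore each unit's distance to itself
  max(apply(dists, 1, min))       # worst case over units
}

# Nearest-neighbor distances for c(1, 2, 4) are 1, 1, and 2
max_nearest_neighbor_distance(c(1, 2, 4))
```

When all values of x are equal, this threshold is zero, so any positive bandwidth gives every unit a nonempty window.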
Returns
The quadratic form matrix for the variance estimator: a square matrix whose number of rows and columns equals the length of x. The resulting object has an attribute bandwidth that can be retrieved using attr(Q, 'bandwidth').
Details
This kernel-based variance estimator was proposed by Breidt, Opsomer, and Sanchez-Borrego (2016), for use with samples selected using systematic sampling or where only a single sampling unit is selected from each stratum (sometimes referred to as "fine stratification").
Suppose there are $n$ sampled units, and for each unit $i$ there is a numeric population characteristic $x_i$ and a weighted total $\hat{Y}_i$, where $\hat{Y}_i$ is only observed in the selected sample but $x_i$ is known prior to sampling.
The variance estimator has the following form:
$$\hat{V}_{ker} = \frac{1}{C_d} \sum_{i=1}^{n} \left( \hat{Y}_i - \sum_{j=1}^{n} d_j(i) \hat{Y}_j \right)^2$$
The terms $d_j(i)$ are kernel weights given by
$$d_j(i) = \frac{K\left(\frac{x_i - x_j}{h}\right)}{\sum_{k=1}^{n} K\left(\frac{x_i - x_k}{h}\right)}$$
where $K(\cdot)$ is a symmetric, bounded kernel function and $h$ is a bandwidth parameter. The normalizing constant $C_d$ is computed as:
$$C_d = \frac{1}{n} \sum_{i=1}^{n} \left( 1 - 2 d_i(i) + \sum_{j=1}^{n} d_j^2(i) \right)$$
If n=2, then the estimator is simply the estimator used for simple random sampling without replacement.
If n=1, then the matrix simply has an entry equal to 0.
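To make the link between these formulas and a quadratic form matrix concrete, here is a minimal base R sketch. Writing D for the n-by-n matrix with entry (i, j) equal to the kernel weight of unit j in the window of unit i, and R = I - D, the sum of squared residuals is the squared length of R times the vector of weighted totals, and the normalizing constant is the sum of squared entries of R divided by n. The function names, the bandwidth of 2.5, and the weighted totals are all made up for illustration. The sketch takes a user-supplied bandwidth, does not cover the special cases n = 1 and n = 2, and should not be read as the package's actual implementation.

```r
# Epanechnikov kernel: K(u) = 0.75 * (1 - u^2) for |u| < 1, and 0 otherwise
epanechnikov <- function(u) {
  ifelse(abs(u) < 1, 0.75 * (1 - u^2), 0)
}

# Hypothetical helper: builds Q so that t(y) %*% Q %*% y equals the
# kernel-based variance estimate for a user-supplied bandwidth h.
kernel_quad_form_sketch <- function(x, h) {
  stopifnot(is.numeric(x), length(x) >= 3,
            is.numeric(h), length(h) == 1, h > 0)
  n <- length(x)
  K <- epanechnikov(outer(x, x, "-") / h)
  # Row i holds the kernel weights for unit i's window; each row sums to 1
  D <- K / rowSums(K)
  # (R %*% y)[i] is y_i minus the kernel-weighted average around unit i
  R <- diag(n) - D
  # Normalizing constant C_d: mean over i of (1 - 2 * D[i, i] + sum_j D[i, j]^2)
  C_d <- sum(R^2) / n
  if (C_d == 0) stop("Every window is empty; use a larger bandwidth.")
  crossprod(R) / C_d
}

x <- c(1, 2, 4)          # auxiliary variable
y_hat <- c(10, 12, 19)   # made-up weighted totals, for illustration only
Q <- kernel_quad_form_sketch(x, h = 2.5)

# Variance estimate evaluated as a quadratic form
v_quad <- drop(t(y_hat) %*% Q %*% y_hat)
v_quad
```

Because each row of D sums to 1, the rows of Q sum to zero, so the estimate is zero whenever all weighted totals are equal.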
Examples
# The auxiliary variable has the same value for all units
make_kernel_var_matrix(c(1, 1, 1))

# The auxiliary variable differs across units
make_kernel_var_matrix(c(1, 2, 3))

# View the bandwidth that was automatically selected
Q <- make_kernel_var_matrix(c(1, 2, 4))
attr(Q, 'bandwidth')
References
Breidt, F. J., Opsomer, J. D., & Sanchez-Borrego, I. (2016). "Nonparametric Variance Estimation Under Fine Stratification: An Alternative to Collapsed Strata." Journal of the American Statistical Association, 111(514), 822–833. https://doi.org/10.1080/01621459.2015.1058264