kare function

Kernel Adaptive Regression Estimator

Kernel Adaptive Regression Estimator

Wrapper function which does some preparatory calculations and then calls the actual ``workhorse'' functions which do the main computations for kernel adaptive regression estimation of Eichner & Stute (2012). Finally, it structures and returns the obtained results. Summarizing information and technical details can be found in Eichner (2017).

kare(x.points, data, kernel = c("gaussian", "epanechnikov", "rectangular"), Sigma = seq(0.01, 10, length = 51), h = NULL, theta = NULL)

Arguments

  • x.points: Vector of location(s) at which the regression estimate is to be computed.

  • data: Data frame or list with one component named x which contains the vector of regressor values x1,,xnx_1, \ldots, x_n

    and one named y which holds the vector of pertaining response values y1,,yny_1, \ldots, y_n (in the corresponding order) of the data from which the estimate is to be computed at the values given in x.points. Pairs (xi,yi)(x_i, y_i)

    with NA or an infinite value in a least one of their elements are removed (and a warning is issued).

  • kernel: A character string naming the kernel to be used for the adaptive estimator. This must partially match one of "gaussian", "rectangular" or "epanechnikov", with default "gaussian", and may be abbreviated to a unique prefix. (Currently, this kernel is also used for the initial, non-adaptive Nadaraya-Watson regression estimator which enters into the estimators of bias and variance as described in the references.)

  • Sigma: Vector of value(s) of the scale parameter σ\sigma. If of length 1 no adaptation is performed. Otherwise considered as the grid over which the optimization of the adaptive method will be performed. Defaults to seq(0.01, 10, length = 51).

  • h: Numeric scalar for bandwidth hh. Defaults to NULL and is then internally set to n1/5n^{-1/5}.

  • theta: Numeric scalar for value of location parameter θ\theta. Defaults to NULL and is then internally set to the arithmetic mean of x1,,xnx_1, \ldots, x_n.

Returns

If length(x.points) = 1, a list of eight components with the following names and meanings:

xScalar xx -value in x.points at which the regression estimator was computed.
yEstimated scalar value of m(x)m(x) at point in x.points .
sigma.adapThe found scalar minimizer of the MSE-estimator, i.e., the adaptive smoothing parameter value.
msehat.minThe found scalar minimum of the MSE-estimator.
SigmaVector with the σ\sigma -grid on which the minimization process was performed.
BnVector with the estimator of bias on that σ\sigma -grid.
Vn2Ditto for the variance.
MSEDitto for the MSE.

If length(x.points) > 1, a list with the same component names as above, but then

xVector x.points with x-values at which the regression estimator was computed.
yVector of estimated values of m(x)m(x) at the x-values in x.points .
sigma.adapVector of the found minimizers of the MSE-estimator, i.e., the adaptive smoothing parameter values.
msehat.minVector of the found minima of the MSE-estimator.
SigmaVector with the σ\sigma -grid on which the minimization process was performed.
Bn( length(Sigma) by length(x.points) )-matrix with the estimated values of the bias on the σ\sigma -grid in their columns (which correspond to the x-values).
Vn2Ditto for the variance.
MSEDitto for the MSE.

Examples

require(stats) # Regression function: m <- function(x, x1 = 0, x2 = 8, a = 0.01, b = 0) { a * (x - x1) * (x - x2)^3 + b } # Note: For a few details on m() see examples in ?nadwat. x0 <- 5 # The point x_0 at which the MSE-optimal kernel adjusted # nonparametric estimation of m should take place. (Recall: for m's # default values a minimum is at 2, a point of inflection at 4, and # a saddle point an 8; an "arbitrary" point would, e.g., be at 5.) n <- 100 # Sample size. sdeps <- 1 # Std. dev. of the \epsilon_i: \sqrt(Var(Y|X=x)) # (here: constant in x). design.ctr <- x0 + 0.5 # "centre" and "scale" of the design, i.e., design.scl <- 1 # in the normal scenario below, expected value and # std. dev. of the distribution of the x_i's. set.seed(42) # To guarantee reproducibility. x <- rnorm(n, mean = design.ctr, sd = design.scl) # x_1, ..., x_n Y <- m(x) + rnorm(length(x), sd = sdeps) # Y_1, ..., Y_n data <- data.frame(x = x, y = Y) # Computing the kernel adaptive regression estimator values #********************************************************** x.points <- seq(-3.3 * design.scl, 3.3 * design.scl, length = 101) + design.ctr # x-grid on which to draw and estimate the regr. fct. m. Sigma <- seq(0.01, 10, length = 51) # \sigma-grid for minimization. fit <- kare(x.points = x0, data = data, Sigma = Sigma) ## Not run: # Grafical display for the current data set #****************************************** # Storing the curent settings of the graphics device # and changing its layout for the three plots to come: op <- par(mfrow = c(3, 1), mar = c(3, 3, 2, 0.1)+0.1, mgp = c(1.5, 0.5, 0), tcl = -0.3, cex.main = 2) # The scatter plot of the "raw data": plot(y ~ x, data = data, xlim = range(data$x, x.points), ylim = range(data$y, fit$y, na.rm = TRUE), main = bquote(n == .(n)), xlab = "x", ylab = "y") # The "true" regression function m: lines(x.points, m(x.points), lty = 2) # The MSE-optimal kernel adjusted nonparametric regression estimator # at x_0, i.e., the point (x_0, \hat m_n(x_0)): points(fit$x, fit$y, col = "red", pch = 4, cex = 2) # The legend for the "true" regression function m and for the point # (x_0, \hat m_n(x_0)): legend("topleft", lty = c(2, NA), pch = c(NA, 4), col = c("black", "red"), bty = "n", cex = 1.2, legend = c(as.expression(bquote(paste("m with ", sigma(paste(Y, "|", X == x)) == .(sdeps)))), as.expression(bquote(paste(hat(m)[n](x[0]), " at ", x[0] == .(x0)))))) # Visualizing the estimators of (Bias_n(sigma))^2 and # Var_n(sigma) at x0 on the sigma-grid: with(fit, matplot(Sigma, cbind(Bn*Bn, Vn2), type = "l", lty = 1:2, col = c("black", "red"), xlab = expression(sigma), ylab = "")) # The legend for (Bias_n(sigma))^2 and Var_n(sigma): legend("topleft", lty = 1:2, col = c("black", "red"), bty = "n", legend = c(expression(paste(widehat(plain(Bias))[n]^2, (sigma))), expression(widehat(plain(Var))[n](sigma))), cex = 1.2) # Visualizing the estimator of MSE_n(sigma) at x0 on the sigma-grid # together with the point indicating the detected minimum, and a legend: plot(fit$Sigma, fit$MSE, type = "l", xlab = expression(sigma), ylab = "") points(fit$sigma.adap, fit$msehat.min, pch = 4, col = "red", cex = 2) legend("topleft", lty = c(1, NA), pch = c(NA, 4), col = c("black", "red"), bty = "n", cex = 1.2, legend = c(expression(widehat(plain(MSE))[n](sigma)), substitute(group("(", list(plain(Minimizer), plain(Minimum)), ")") == group("(", list(x, y), ")"), list(x = signif(fit$sigma.adap, 4), y = signif(fit$msehat.min, 4))))) par(op) # Restoring the previous settings of the graphics device. ## End(Not run)

References

Eichner & Stute (2012) and Eichner (2017): see kader.

  • Maintainer: Gerrit Eichner
  • License: GPL-3
  • Last published: 2017-10-04