The detection of discontinuities in a regression curve or surface.
The detection of discontinuities in a regression curve or surface.
This function uses a comparison of left and right handed nonparametric regression curves to assess the evidence for the presence of one or more discontinuities in a regression curve or surface. A hypothesis test is carried out, under the assumption that the errors in the data are approximately normally distributed. A graphical indication of the locations where the evidence for a discontinuity is strongest is also available.
sm.discontinuity(x, y, h, hd,...)
Arguments
x: a vector or two-column matrix of covariate values.
y: a vector of responses observed at the covariate locations.
h: a smoothing parameter to be used in the construction of the nonparametric regression estimates. A normal kernel function is used and h is its standard deviation(s). However, if this argument is omitted h will be selected by an approximate degrees of freedom criterion, controlled by the df parameter. See sm.options for details.
hd: a smoothing parameter to be used in smoothing the differences of the left and right sided nonparametric regression estimates. A normal kernel function is used and hd is its standard deviation(s). However, if this argument is omitted hd will be set to h * sqrt(0.25), and h reset to h * sqrt(0.75), when x is a vector When x is a matrix, hd will be set to h * sqrt(0.5)
and h will be reset to the same value.
...: other optional parameters are passed to the sm.options
function, through a mechanism which limits their effect only to this call of the function; those relevant for this function are add, eval.points, ngrid, se, band, xlab, ylab, xlim, ylim, lty, col; see the documentation of sm.options for their description.
Returns
a list containing the following items - p: the p-value for the test of the null hypothesis that no discontinuities are present.
sigma: the estimated standard deviation of the errors.
eval.points: the evaluation points of the nonparametric regression estimates. When x is a matrix, eval.points is also a matrix whose columns define the evaluation grid of each margin of the evaluation rectangle.
st.diff: a vector or matrix of standardised differences between the left and right sided estimators at the evaluation points.
diffmat: when x is a vector, this contains the locations and standardised differences where the latter are greater than 2.5.
angle: when x is a matrix, this contains the estimated angles at which the standardised differences were constructed.
h: the principal smoothing parameter.
hd: the smoothing parameter used for double-smoothing (see the reference below).
Side Effects
a plot on the current graphical device is produced, unless the option display="none" is set.
Details
The reference below describes the statistical methods used in the function. There are minor differences in some computational details of the implementation.
Currently duplicated rows of x cause a difficulty in the two covariate case. Duplicated rows should be removed.
References
Bowman, A.W., Pope, A. and Ismail, B. (2006). Detecting discontinuities in nonparametric regression curves and surfaces. Statistics & Computing, 16, 377--390.
See Also
sm.regression, sm.options
Examples
par(mfrow = c(3,2))with(nile,{ sm.discontinuity(Year, Volume, hd =0) sm.discontinuity(Year, Volume) ind <-(Year >1898) plot(Year, Volume) h <- h.select(Year, Volume) sm.regression(Year[!ind], Volume[!ind], h, add =TRUE) sm.regression(Year[ ind], Volume[ ind], h, add =TRUE) hvec <-1:15 p <- numeric(0)for(h in hvec){ result <- sm.discontinuity(Year, Volume, h, display ="none", verbose =0) p <- c(p, result$p)} plot(hvec, p, type ="l", ylim = c(0, max(p)), xlab ="h") lines(range(hvec), c(0.05,0.05), lty =2)})with(trawl,{ Position <- cbind(Longitude, Latitude) ind <-(Longitude <143.8)# Remove a repeated point which causes difficulty with sm.discontinuity ind[54]<-FALSE sm.regression(Position[ind,], Score1[ind], theta =35, phi =30) sm.discontinuity(Position[ind,], Score1[ind], col ="blue")})par(mfrow = c(1,1))# The following example takes longer to run.# Alternative values for nside are 32 and 64.# Alternative values of yjump are 1 and 0.5.# nside <- 16# yjump <- 2# x1 <- seq(0, 1, length = nside)# x2 <- seq(0, 1, length = nside)# x <- expand.grid(x1, x2)# x <- cbind(x1 = x[, 1], x2 = x[, 2])# y <- rnorm(nside * nside)# ind <- (sqrt((x[, 1] - 0.5)^2 + (x[, 2] - 0.5)^2) <= 0.25)# y[ind] <- y[ind] + yjump# image(x1, x2, matrix(y, ncol = nside))# sm.discontinuity(x, y, df = 20, add = TRUE)