cplm_ic function

Multiple change-point detection in a continuous piecewise-linear signal via minimising an information criterion

This function performs the Isolate-Detect methodology based on an information criterion approach, in order to detect multiple change-points in a noisy, continuous, piecewise-linear data sequence, with the noise being Gaussian. Details explains how this approach works and gives the relevant literature reference.

cplm_ic(x, th_const = 1.25, Kmax = 200, penalty = c("ssic_pen", "sic_pen"), points = 10)

Arguments

  • x: A numeric vector containing the data in which you would like to find change-points.
  • th_const: A positive real number with default value equal to 1.25. It is used to define the threshold value that will be used at the first step of the model selection based Isolate-Detect method; see Details for more information.
  • Kmax: A positive integer with default value equal to 200. It is the maximum allowed number of estimated change-points in the solution path; see sol_path_cplm for more details.
  • penalty: A character vector with names of penalty functions used.
  • points: A positive integer with default value equal to 10. It defines the distance between two consecutive end- or start-points of the right- or left-expanding intervals, respectively.

Returns

A list with the following components:

  • sol_path: A vector containing the solution path.
  • ic_curve: A list with values of the chosen information criteria.
  • cpt_ic: A list with the change-points detected for each information criterion considered.
  • no_cpt_ic: The number of change-points detected for each information criterion considered.

Details

The approach followed in cplm_ic to detect the change-points is based on identifying the set of change-points that minimises an information criterion. First, we employ sol_path_cplm, which overestimates the number of change-points, using th_const to define the threshold, and then sorts the obtained estimates so that the estimate most likely to be correct appears first and the one least likely to be correct appears last. Let $J$ be the number of estimates that this overestimation approach returns. We obtain a vector $b = (b_1, b_2, \ldots, b_J)$, with the estimates ordered as explained above. We define the collection $\left\{M_j\right\}_{j = 0,1,\ldots,J}$, where $M_0$ is the empty set and $M_j = \left\{b_1,b_2,\ldots,b_j\right\}$. Among the collection of models $M_j$, $j = 0,1,\ldots,J$, we select the one that minimises a predefined information criterion. The obtained set of change-points is therefore a subset of the solution path given by sol_path_cplm. More details can be found in "Detecting multiple generalized change-points by isolating single ones", Anastasiou and Fryzlewicz (2018), preprint.
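The selection step over the nested models can be illustrated with a minimal base-R sketch. It is not the code used inside cplm_ic: the helper name select_by_ic is hypothetical, the continuous piecewise-linear fit uses hinge terms in an ordinary least-squares regression, and the criterion is a generic BIC-type penalty standing in for the package's ssic_pen and sic_pen, whose exact form is not reproduced here.

```r
# Hypothetical helper (not part of IDetect): given data x and an ordered
# vector b of candidate change-points, pick the nested model M_j that
# minimises a generic BIC-type criterion (an assumption, not the package's penalty).
select_by_ic <- function(x, b) {
  n <- length(x)
  tt <- seq_len(n)
  J <- length(b)
  ic <- numeric(J + 1)
  for (j in 0:J) {
    knots <- sort(b[seq_len(j)])          # M_j = first j entries of the path
    # Intercept, slope, and one hinge term per knot keep the fit continuous.
    X <- cbind(1, tt)
    for (k in knots) X <- cbind(X, pmax(tt - k, 0))
    rss <- sum(lm.fit(X, x)$residuals^2)
    # Goodness of fit plus a log(n) penalty per fitted coefficient.
    ic[j + 1] <- n * log(rss / n) + (j + 2) * log(n)
  }
  j_hat <- which.min(ic) - 1              # index 1 corresponds to M_0
  list(ic_curve = ic, cpt = sort(b[seq_len(j_hat)]))
}

# Usage: one true kink at 100, followed by two spurious candidates.
tt <- seq_len(200)
x <- ifelse(tt <= 100, tt, 200 - tt) + rnorm(200)
select_by_ic(x, b = c(100, 50, 150))$cpt
```

Because the models are nested along the solution path, only J + 1 fits are needed rather than a search over all subsets of candidates.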

Examples

single.cpt <- c(seq(0, 999, 1), seq(998.5, 499, -0.5))
single.cpt.noise <- single.cpt + rnorm(2000)
cpt.single.ic <- cplm_ic(single.cpt.noise)

three.cpt <- c(seq(0, 499, 1), seq(498.5, 249, -0.5), seq(250, 1249, 2), seq(1248, 749, -1))
three.cpt.noise <- three.cpt + rnorm(2000)
cpt.three.ic <- cplm_ic(three.cpt.noise)

See Also

ID_cplm and ID, which employ this function. In addition, see pcm_ic for the case of detecting changes in a piecewise-constant signal using the information criterion based approach.

Author(s)

Andreas Anastasiou, a.anastasiou@lse.ac.uk

  • Maintainer: Andreas Anastasiou
  • License: GPL-3
  • Last published: 2018-03-09
