cir_stat function

Statistics for testing circular uniformity

Low-level implementation of several statistics for assessing circular uniformity on [0, 2π) or, equivalently, S^1 := {x ∈ R^2 : ||x|| = 1}.

    cir_stat_Kuiper(Theta, sorted = FALSE, KS = FALSE, Stephens = FALSE)
    cir_stat_Watson(Theta, sorted = FALSE, CvM = FALSE, Stephens = FALSE)
    cir_stat_Watson_1976(Theta, sorted = FALSE, minus = FALSE)
    cir_stat_Range(Theta, sorted = FALSE, gaps_in_Theta = FALSE, max_gap = TRUE)
    cir_stat_Rao(Theta, sorted = FALSE, gaps_in_Theta = FALSE)
    cir_stat_Greenwood(Theta, sorted = FALSE, gaps_in_Theta = FALSE)
    cir_stat_Log_gaps(Theta, sorted = FALSE, gaps_in_Theta = FALSE, abs_val = TRUE)
    cir_stat_Vacancy(Theta, a = 2 * pi, sorted = FALSE, gaps_in_Theta = FALSE)
    cir_stat_Max_uncover(Theta, a = 2 * pi, sorted = FALSE, gaps_in_Theta = FALSE)
    cir_stat_Num_uncover(Theta, a = 2 * pi, sorted = FALSE, gaps_in_Theta = FALSE, minus_val = TRUE)
    cir_stat_Gini(Theta, sorted = FALSE, gaps_in_Theta = FALSE)
    cir_stat_Gini_squared(Theta, sorted = FALSE, gaps_in_Theta = FALSE)
    cir_stat_Ajne(Theta, Psi_in_Theta = FALSE)
    cir_stat_Rothman(Theta, Psi_in_Theta = FALSE, t = 1/3)
    cir_stat_Hodges_Ajne(Theta, asymp_std = FALSE, sorted = FALSE, use_Cressie = TRUE)
    cir_stat_Cressie(Theta, t = 1/3, sorted = FALSE)
    cir_stat_FG01(Theta, sorted = FALSE)
    cir_stat_Rayleigh(Theta, m = 1L)
    cir_stat_Bingham(Theta)
    cir_stat_Hermans_Rasson(Theta, Psi_in_Theta = FALSE)
    cir_stat_Gine_Gn(Theta, Psi_in_Theta = FALSE)
    cir_stat_Gine_Fn(Theta, Psi_in_Theta = FALSE)
    cir_stat_Pycke(Theta, Psi_in_Theta = FALSE)
    cir_stat_Pycke_q(Theta, Psi_in_Theta = FALSE, q = 0.5)
    cir_stat_Bakshaev(Theta, Psi_in_Theta = FALSE)
    cir_stat_Riesz(Theta, Psi_in_Theta = FALSE, s = 1)
    cir_stat_PCvM(Theta, Psi_in_Theta = FALSE)
    cir_stat_PRt(Theta, Psi_in_Theta = FALSE, t = 1/3)
    cir_stat_PAD(Theta, Psi_in_Theta = FALSE, AD = FALSE, sorted = FALSE)
    cir_stat_Poisson(Theta, Psi_in_Theta = FALSE, rho = 0.5)
    cir_stat_Softmax(Theta, Psi_in_Theta = FALSE, kappa = 1)
    cir_stat_CCF09(Theta, dirs, K_CCF09 = 25L, original = FALSE)

Arguments

  • Theta: a matrix of size c(n, M) with M samples of size n of circular data on [0, 2π). Must not contain NAs.

  • sorted: are the columns of Theta sorted increasingly? If TRUE, performance is improved. If FALSE (default), each column of Theta is sorted internally.

  • KS: compute the Kolmogorov-Smirnov statistic (which is not invariant under origin shifts) instead of the Kuiper statistic? Defaults to FALSE.

  • Stephens: compute the Stephens (1970) modification so that the null distribution of the statistic is less dependent on the sample size? The modification does not alter the test decision. Defaults to FALSE.

  • CvM: compute the Cramér-von Mises statistic (which is not invariant under origin shifts) instead of the Watson statistic? Defaults to FALSE.

  • minus: compute the invariant D_n^- instead of D_n^+? Defaults to FALSE.

  • gaps_in_Theta: does Theta contain the matrix of circular gaps that is obtained with cir_gaps(Theta)? If FALSE (default), the circular gaps are computed internally.

  • max_gap: compute the maximum gap for the range statistic? If TRUE (default), rejection happens for large values of the statistic, which is consistent with the rest of the tests. Otherwise, the minimum gap is computed and rejection happens for low values.

  • abs_val: return the absolute value of Darling's log gaps statistic? If TRUE (default), rejection happens for large values of the statistic, which is consistent with the rest of the tests. Otherwise, the signed statistic is computed and rejection happens for large absolute values.

  • a: either:

    • the a_n = a / n parameter used in the length of the arcs of the coverage-based tests. Must be positive. Defaults to 2 * pi.
    • the a parameter for the Stereo test, a real in [-1, 1]. Defaults to 0.

  • minus_val: return the negative value of the (standardized) number of uncovered spacings? If TRUE (default), rejection happens for large values of the statistic, which is consistent with the rest of the tests. Otherwise, rejection happens for low values.

  • Psi_in_Theta: does Theta contain the shortest angles matrix Ψ that is obtained with Psi_mat(array(Theta, dim = c(n, 1, M)))? If FALSE (default), Ψ is computed internally.

  • t: t parameter for the Rothman and Cressie tests, a real in (0, 1). Defaults to 1 / 3.

  • asymp_std: normalize the Hodges-Ajne statistic in terms of its asymptotic distribution? Defaults to FALSE.

  • use_Cressie: compute the Hodges-Ajne statistic as a particular case of the Cressie statistic? Defaults to TRUE as it is more efficient. If FALSE, the geometric construction in Ajne (1968) is employed.

  • m: integer m for the m-modal Rayleigh test. Defaults to m = 1 (the standard Rayleigh test).

  • q: q parameter for the Pycke "q-test", a real in (0, 1). Defaults to 1 / 2.

  • s: s parameter for the s-Riesz test, a real in (0, 2). Defaults to 1.

  • AD: compute the Anderson-Darling statistic (which is not invariant under origin shifts) instead of the Projected Anderson-Darling statistic? Defaults to FALSE.

  • rho: ρ parameter for the Poisson test, a real in [0, 1). Defaults to 0.5.

  • kappa: κ parameter for the Softmax test, a non-negative real. Defaults to 1.

  • dirs: a matrix of size c(n_proj, 2) containing n_proj random directions (in Cartesian coordinates) on S^1 to perform the CCF09 test.

  • K_CCF09: integer giving the truncation of the series present in the asymptotic distribution of the Kolmogorov-Smirnov statistic. Defaults to 25.

  • original: return the CCF09 statistic as originally defined? If FALSE (default), a faster and equivalent statistic is computed, and rejection happens for large values of the statistic, which is consistent with the rest of the tests. Otherwise, rejection happens for low values.
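
The gaps_in_Theta and Psi_in_Theta arguments above allow precomputed circular gaps or shortest angles to be passed instead of the raw angles. As a rough illustration of what these two quantities are for a single sample, the following base R sketch computes them by hand. The helper names gaps_sketch and shortest_angles_sketch are hypothetical and not part of the package, and the ordering and storage layout returned by cir_gaps and Psi_mat may differ from what is shown here.

```r
# Hypothetical helper (not part of the package): circular gaps of one sample.
# Assumes theta is a numeric vector with entries in [0, 2 * pi).
gaps_sketch <- function(theta) {
  theta <- sort(theta)
  # Consecutive differences plus the wrap-around gap from the last point
  # back to the first one
  c(diff(theta), 2 * pi - theta[length(theta)] + theta[1])
}

# Hypothetical helper: shortest angles between all pairs of points, in [0, pi].
# Returns a full symmetric matrix; the package may store this differently.
shortest_angles_sketch <- function(theta) {
  d <- abs(outer(theta, theta, "-")) # Absolute differences in [0, 2 * pi)
  pmin(d, 2 * pi - d)                # Go around the shorter way
}

theta <- c(0.5, 3, 6)
gaps_sketch(theta)            # The gaps add up to 2 * pi
shortest_angles_sketch(theta) # Zero diagonal, symmetric
```

The gaps always sum to 2 * pi and the shortest angle between 0.5 and 6 is 2 * pi - 5.5 rather than 5.5, which is the wrap-around behaviour both quantities are meant to capture.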

Returns

A matrix of size c(M, 1) containing the statistics for each of the M samples.
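
To make the input and output shapes concrete, here is a minimal base R sketch of a Rayleigh-type statistic that follows the same convention: one sample per column of Theta, one statistic per row of the result. The function name rayleigh_sketch is hypothetical, and the normalization 2 * n * Rbar^2 is an assumption made for illustration; cir_stat_Rayleigh may scale the statistic differently.

```r
# Hypothetical base R sketch, not the package implementation.
# Theta: matrix of size c(n, M), one sample per column, angles in [0, 2 * pi).
rayleigh_sketch <- function(Theta, m = 1L) {
  n <- nrow(Theta)
  # Squared norm of the m-th sample trigonometric moment of each column
  Rbar2 <- colMeans(cos(m * Theta))^2 + colMeans(sin(m * Theta))^2
  # Assumed normalization: 2 * n * Rbar^2, returned as a c(M, 1) matrix
  matrix(2 * n * Rbar2, ncol = 1)
}

set.seed(1)
Theta <- matrix(runif(100 * 3, min = 0, max = 2 * pi), nrow = 100, ncol = 3)
stat <- rayleigh_sketch(Theta)
dim(stat) # One row per sample, a single column
```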

Details

Descriptions and references for most of the statistics are available in García-Portugués and Verdebout (2018).

The statistics cir_stat_PCvM and cir_stat_PRt are provided for the sake of completeness, but they equal 2 * cir_stat_Watson and cir_stat_Rothman, respectively, which are implemented more efficiently.

Warning

Avoid the following misuses of the functions, which produce spurious results:

  • The entries of Theta are not in [0, 2π).

  • Theta does not contain the circular gaps when gaps_in_Theta = TRUE.

  • Theta is not sorted increasingly when sorted = TRUE.

  • Theta does not contain Psi_mat(array(Theta, dim = c(n, 1, M))) when Psi_in_Theta = TRUE.

  • The directions in dirs do not have unit norm.
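
Some of the conditions listed above can be checked cheaply before calling the low-level functions. The following base R sketch shows one possible set of pre-flight checks; check_cir_inputs and its tol argument are hypothetical names introduced here and are not part of the package. It does not cover the gaps_in_Theta and Psi_in_Theta cases, which cannot be verified from the matrix alone.

```r
# Hypothetical pre-flight checks (not part of the package).
check_cir_inputs <- function(Theta, sorted = FALSE, dirs = NULL, tol = 1e-8) {
  stopifnot(is.matrix(Theta), is.numeric(Theta), !anyNA(Theta))
  # Entries must lie in [0, 2 * pi)
  if (any(Theta < 0 | Theta >= 2 * pi)) {
    stop("Entries of Theta must lie in [0, 2 * pi).")
  }
  # Columns must be increasing if the caller claims they are sorted
  if (sorted && any(apply(Theta, 2, is.unsorted))) {
    stop("Columns of Theta must be sorted increasingly when sorted = TRUE.")
  }
  # Directions must be rows of unit norm in Cartesian coordinates
  if (!is.null(dirs)) {
    if (!is.matrix(dirs) || ncol(dirs) != 2) {
      stop("dirs must be a matrix with two columns.")
    }
    if (any(abs(sqrt(rowSums(dirs^2)) - 1) > tol)) {
      stop("Rows of dirs must have unit norm.")
    }
  }
  invisible(TRUE)
}

Theta <- matrix(c(0.1, 1.2, 3.4, 2.0, 0.3, 5.9), nrow = 3)
check_cir_inputs(Theta)                                # Passes
check_cir_inputs(apply(Theta, 2, sort), sorted = TRUE) # Passes after sorting
check_cir_inputs((Theta + 7) %% (2 * pi))              # Wrap angles first
```

Wrapping with %% (2 * pi) and sorting each column, as in the usage lines, are the usual ways to bring raw angles into the expected form.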

Examples

    ## Sample uniform circular data
    M <- 2
    n <- 100
    set.seed(987202226)
    Theta <- r_unif_cir(n = n, M = M)

    ## Tests based on the empirical cumulative distribution function

    # Kuiper
    cir_stat_Kuiper(Theta)
    cir_stat_Kuiper(Theta, Stephens = TRUE)

    # Watson
    cir_stat_Watson(Theta)
    cir_stat_Watson(Theta, Stephens = TRUE)

    # Watson (1976)
    cir_stat_Watson_1976(Theta)

    ## Partition-based tests

    # Ajne
    Theta_array <- Theta
    dim(Theta_array) <- c(nrow(Theta), 1, ncol(Theta))
    Psi <- Psi_mat(Theta_array)
    cir_stat_Ajne(Theta)
    cir_stat_Ajne(Psi, Psi_in_Theta = TRUE)

    # Rothman
    cir_stat_Rothman(Theta, t = 0.5)
    cir_stat_Rothman(Theta)
    cir_stat_Rothman(Psi, Psi_in_Theta = TRUE)

    # Hodges-Ajne
    cir_stat_Hodges_Ajne(Theta)
    cir_stat_Hodges_Ajne(Theta, use_Cressie = FALSE)

    # Cressie
    cir_stat_Cressie(Theta, t = 0.5)
    cir_stat_Cressie(Theta)

    # FG01
    cir_stat_FG01(Theta)

    ## Spacings-based tests

    # Range
    cir_stat_Range(Theta)

    # Rao
    cir_stat_Rao(Theta)

    # Greenwood
    cir_stat_Greenwood(Theta)

    # Log gaps
    cir_stat_Log_gaps(Theta)

    # Vacancy
    cir_stat_Vacancy(Theta)

    # Maximum uncovered spacing
    cir_stat_Max_uncover(Theta)

    # Number of uncovered spacings
    cir_stat_Num_uncover(Theta)

    # Gini mean difference
    cir_stat_Gini(Theta)

    # Gini mean squared difference
    cir_stat_Gini_squared(Theta)

    ## Sobolev tests

    # Rayleigh
    cir_stat_Rayleigh(Theta)
    cir_stat_Rayleigh(Theta, m = 2)

    # Bingham
    cir_stat_Bingham(Theta)

    # Hermans-Rasson
    cir_stat_Hermans_Rasson(Theta)
    cir_stat_Hermans_Rasson(Psi, Psi_in_Theta = TRUE)

    # Gine Fn
    cir_stat_Gine_Fn(Theta)
    cir_stat_Gine_Fn(Psi, Psi_in_Theta = TRUE)

    # Gine Gn
    cir_stat_Gine_Gn(Theta)
    cir_stat_Gine_Gn(Psi, Psi_in_Theta = TRUE)

    # Pycke
    cir_stat_Pycke(Theta)
    cir_stat_Pycke(Psi, Psi_in_Theta = TRUE)

    # Pycke q
    cir_stat_Pycke_q(Theta)
    cir_stat_Pycke_q(Psi, Psi_in_Theta = TRUE)

    # Bakshaev
    cir_stat_Bakshaev(Theta)
    cir_stat_Bakshaev(Psi, Psi_in_Theta = TRUE)

    # Riesz
    cir_stat_Riesz(Theta, s = 1)
    cir_stat_Riesz(Psi, Psi_in_Theta = TRUE, s = 1)

    # Projected Cramér-von Mises
    cir_stat_PCvM(Theta)
    cir_stat_PCvM(Psi, Psi_in_Theta = TRUE)

    # Projected Rothman
    cir_stat_PRt(Theta, t = 0.5)
    cir_stat_PRt(Theta)
    cir_stat_PRt(Psi, Psi_in_Theta = TRUE)

    # Projected Anderson-Darling
    cir_stat_PAD(Theta)
    cir_stat_PAD(Psi, Psi_in_Theta = TRUE)

    ## Other tests

    # CCF09
    dirs <- r_unif_sph(n = 3, p = 2, M = 1)[, , 1]
    cir_stat_CCF09(Theta, dirs = dirs)

    ## Connection of Kuiper and Watson statistics with KS and CvM, respectively

    # Rotate sample for KS and CvM
    alpha <- seq(0, 2 * pi, l = 1e4)
    KS_alpha <- sapply(alpha, function(a) {
      cir_stat_Kuiper((Theta[, 2, drop = FALSE] + a) %% (2 * pi), KS = TRUE)
    })
    CvM_alpha <- sapply(alpha, function(a) {
      cir_stat_Watson((Theta[, 2, drop = FALSE] + a) %% (2 * pi), CvM = TRUE)
    })
    AD_alpha <- sapply(alpha, function(a) {
      cir_stat_PAD((Theta[, 2, drop = FALSE] + a) %% (2 * pi), AD = TRUE)
    })

    # Kuiper is the maximum rotated KS
    plot(alpha, KS_alpha, type = "l")
    abline(h = cir_stat_Kuiper(Theta[, 2, drop = FALSE]), col = 2)
    points(alpha[which.max(KS_alpha)], max(KS_alpha), col = 2, pch = 16)

    # Watson is the minimum rotated CvM
    plot(alpha, CvM_alpha, type = "l")
    abline(h = cir_stat_Watson(Theta[, 2, drop = FALSE]), col = 2)
    points(alpha[which.min(CvM_alpha)], min(CvM_alpha), col = 2, pch = 16)

    # Anderson-Darling is the average rotated AD?
    plot(alpha, AD_alpha, type = "l")
    abline(h = cir_stat_PAD(Theta[, 2, drop = FALSE]), col = 2)
    abline(h = mean(AD_alpha), col = 3)
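
The origin-shift behaviour explored in the rotation examples above can also be reproduced without the package. The base R sketch below computes unnormalized one-sided empirical cdf discrepancies for a single sample and combines them into KS-type and Kuiper-type quantities. The function name ks_kuiper_sketch is hypothetical, and any sqrt(n) scaling or Stephens modification applied by cir_stat_Kuiper is deliberately omitted, so the values are not expected to match the package output.

```r
# Hypothetical base R sketch for a single sample; not the package code.
# Compares the empirical cdf with the uniform cdf on [0, 2 * pi).
ks_kuiper_sketch <- function(theta) {
  n <- length(theta)
  u <- sort(theta) / (2 * pi)              # Sorted sample rescaled to [0, 1)
  D_plus <- max(seq_len(n) / n - u)        # Largest excess of the ecdf
  D_minus <- max(u - (seq_len(n) - 1) / n) # Largest deficit of the ecdf
  c(KS = max(D_plus, D_minus), Kuiper = D_plus + D_minus)
}

set.seed(1)
theta <- runif(50, min = 0, max = 2 * pi)
rotated <- (theta + 1) %% (2 * pi) # Shift the origin by one radian

ks_kuiper_sketch(theta)
ks_kuiper_sketch(rotated)
```

The Kuiper component agrees between the two calls up to floating-point error, whereas the KS component generally changes with the rotation and never exceeds the Kuiper component.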

References

García-Portugués, E. and Verdebout, T. (2018) An overview of uniformity tests on the hypersphere. arXiv:1804.00286. doi:10.48550/arXiv.1804.00286.

  • Maintainer: Eduardo García-Portugués
  • License: GPL-3
  • Last published: 2024-05-24