kruskalTest function

Kruskal-Wallis Rank Sum Test

Performs a Kruskal-Wallis rank sum test.

kruskalTest(x, ...)

## Default S3 method:
kruskalTest(x, g, dist = c("Chisquare", "KruskalWallis", "FDist"), ...)

## S3 method for class 'formula'
kruskalTest(
  formula,
  data,
  subset,
  na.action,
  dist = c("Chisquare", "KruskalWallis", "FDist"),
  ...
)

Arguments

  • x: a numeric vector of data values, or a list of numeric data vectors.
  • ...: further arguments to be passed to or from methods.
  • g: a vector or factor object giving the group for the corresponding elements of "x". Ignored with a warning if "x" is a list.
  • dist: the test distribution. Defaults to "Chisquare".
  • formula: a formula of the form response ~ group where response gives the data values and group a vector or factor of the corresponding groups.
  • data: an optional matrix or data frame (or similar: see model.frame) containing the variables in the formula formula. By default the variables are taken from environment(formula).
  • subset: an optional vector specifying a subset of observations to be used.
  • na.action: a function which indicates what should happen when the data contain NAs. Defaults to getOption("na.action").

Returns

A list with class "htest" containing the following components:

  • method: a character string indicating what type of test was performed.
  • data.name: a character string giving the name(s) of the data.
  • statistic: the estimated quantile of the test statistic.
  • p.value: the p-value for the test.
  • parameter: the parameters of the test statistic, if any.
  • alternative: a character string describing the alternative hypothesis.
  • estimates: the estimates, if any.
  • null.value: the estimate under the null hypothesis, if any.
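
As a rough illustration of how such components are read off an object of class "htest", the sketch below uses stats::kruskal.test from base R as a stand-in. It rests on the assumption that a kruskalTest result is accessed in the same way; the base R object carries only a subset of the components listed above.

```r
## Stand-in: stats::kruskal.test also returns an object of class "htest".
## Assumption: a kruskalTest() result can be inspected the same way.
x <- c(2.9, 3.0, 2.5, 2.6, 3.2,   # normal subjects
       3.8, 2.7, 4.0, 2.4,        # obstructive airway disease
       2.8, 3.4, 3.7, 2.2, 2.0)   # asbestosis
g <- factor(rep(c("ns", "oad", "a"), times = c(5, 4, 5)))

res <- kruskal.test(x, g)

class(res)       # "htest"
names(res)       # components present in this particular object
res$statistic    # named numeric: the test statistic
res$parameter    # named numeric: degrees of freedom
res$p.value      # numeric p-value
res$method       # character string describing the test
```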

Details

For one-factorial designs with non-normally distributed residuals the Kruskal-Wallis rank sum test can be performed to test the null hypothesis $\mathrm{H}_0: F_1(x) = F_2(x) = \ldots = F_k(x)$ against the alternative $\mathrm{H}_\mathrm{A}: F_i(x) \ne F_j(x) ~ (i \ne j)$ with at least one strict inequality.

Let $R_{ij}$ be the joint rank of $X_{ij}$, with $R_{(1)(1)} = 1, \ldots, R_{(n)(n)} = N$ and $N = \sum_{i=1}^k n_i$. The test statistic is calculated as

$$H = \sum_{i=1}^k n_i \left(\bar{R}_i - \bar{R}\right)^2 / \sigma_R^2,$$

with the mean rank of the $i$-th group

$$\bar{R}_i = \sum_{j = 1}^{n_{i}} R_{ij} / n_i,$$

the expected value

$$\bar{R} = \left(N + 1\right) / 2$$

and the expected variance as

$$\sigma_R^2 = N \left(N + 1\right) / 12.$$
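
The definitions above can be traced by hand in base R. The following is a minimal sketch, not the package's implementation: it uses the squared deviations of the group mean ranks scaled by $\sigma_R^2$ (the usual form of the Kruskal-Wallis statistic) and cross-checks the result against stats::kruskal.test. The variable names are illustrative only.

```r
## Manual Kruskal-Wallis H for untied data
## (example data from Hollander & Wolfe, as in the Examples section).
x <- c(2.9, 3.0, 2.5, 2.6, 3.2,   # normal subjects
       3.8, 2.7, 4.0, 2.4,        # obstructive airway disease
       2.8, 3.4, 3.7, 2.2, 2.0)   # asbestosis
g <- factor(rep(c("ns", "oad", "a"), times = c(5, 4, 5)))

r      <- rank(x)                  # joint ranks R_ij
N      <- length(x)                # total sample size
n_i    <- tapply(r, g, length)     # group sizes n_i
Rbar_i <- tapply(r, g, mean)       # mean rank of each group
Rbar   <- (N + 1) / 2              # expected mean rank
sigma2 <- N * (N + 1) / 12         # expected variance sigma_R^2

H <- sum(n_i * (Rbar_i - Rbar)^2) / sigma2
H

## Cross-check against base R (valid here because the data have no ties)
all.equal(H, unname(kruskal.test(x, g)$statistic))
```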

In case of ties the statistic $H$ is divided by $1 - \sum_{i=1}^r \left(t_i^3 - t_i\right) / \left(N^3 - N\right)$, where $t_i$ is the number of observations in the $i$-th of $r$ groups of tied values.
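
A small sketch of the tie correction, assuming the correction factor is read as $1 - \sum (t_i^3 - t_i) / (N^3 - N)$, which is the form used by stats::kruskal.test. The data are made up for illustration and the variable names are hypothetical.

```r
## Tie-corrected Kruskal-Wallis H on illustrative data containing ties.
x <- c(1, 2, 2, 3,   2, 3, 4, 4,   3, 5, 5, 6)
g <- factor(rep(c("A", "B", "C"), each = 4))

r <- rank(x)                        # mid-ranks are assigned to tied values
N <- length(x)

## Uncorrected statistic
H_raw <- sum(tapply(r, g, length) *
             (tapply(r, g, mean) - (N + 1) / 2)^2) / (N * (N + 1) / 12)

## Tie correction: t_i counts how often each distinct value occurs
## (untied values have t_i = 1 and contribute 0 to the sum)
t_i      <- table(x)
tie_corr <- 1 - sum(t_i^3 - t_i) / (N^3 - N)

H <- H_raw / tie_corr
H

## Cross-check against base R, which applies the same correction
all.equal(H, unname(kruskal.test(x, g)$statistic))
```

Because the correction factor is below 1 whenever ties are present, the corrected statistic is larger than the uncorrected one.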

According to Conover and Iman (1981), the statistic $H$ is related to the $F$-quantile as

$$F = \frac{H / \left(k - 1\right)}{\left(N - 1 - H\right) / \left(N - k\right)}$$

which is equivalent to a one-way ANOVA F-test using rank transformed data (see examples).
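
The stated equivalence can be checked numerically in base R. This sketch takes $H$ from stats::kruskal.test, converts it with the relation above, and compares the result with the F value of a one-way ANOVA on the ranks; the variable names are illustrative.

```r
## Conover-Iman relation between H and the ANOVA F on ranks.
dat <- data.frame(
  x = c(2.9, 3.0, 2.5, 2.6, 3.2,
        3.8, 2.7, 4.0, 2.4,
        2.8, 3.4, 3.7, 2.2, 2.0),
  g = factor(rep(c("ns", "oad", "a"), times = c(5, 4, 5))))

H <- unname(kruskal.test(x ~ g, data = dat)$statistic)
N <- nrow(dat)          # total sample size
k <- nlevels(dat$g)     # number of groups

## F computed from H
F_from_H <- (H / (k - 1)) / ((N - 1 - H) / (N - k))

## F from a one-way ANOVA on the rank-transformed response
F_anova <- anova(aov(rank(x) ~ g, data = dat))[["F value"]][1]

all.equal(F_from_H, F_anova)
```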

The function provides three different test distributions (argument dist) for $p$-value estimation:

  • Chisquare: $p$-values are computed from the chi-square distribution with $v = k - 1$ degrees of freedom.
  • KruskalWallis: $p$-values are computed with pKruskalWallis from the package SuppDists.
  • FDist: $p$-values are computed from the F-distribution with $v_1 = k - 1, ~ v_2 = N - k$ degrees of freedom.
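
For orientation, the chi-square and F based $p$-values can be reproduced with base R distribution functions. This is a sketch under the assumption that upper-tail probabilities are used, as is usual for this test; the KruskalWallis option is not reproduced here because it depends on the SuppDists package.

```r
## Upper-tail p-values for the "Chisquare" and "FDist" options (sketch).
dat <- data.frame(
  x = c(2.9, 3.0, 2.5, 2.6, 3.2,
        3.8, 2.7, 4.0, 2.4,
        2.8, 3.4, 3.7, 2.2, 2.0),
  g = factor(rep(c("ns", "oad", "a"), times = c(5, 4, 5))))

kw <- kruskal.test(x ~ g, data = dat)
H  <- unname(kw$statistic)
N  <- nrow(dat)
k  <- nlevels(dat$g)

## Chi-square approximation with k - 1 degrees of freedom
p_chisq <- pchisq(H, df = k - 1, lower.tail = FALSE)

## F approximation with (k - 1, N - k) degrees of freedom
F_stat <- (H / (k - 1)) / ((N - 1 - H) / (N - k))
p_f    <- pf(F_stat, df1 = k - 1, df2 = N - k, lower.tail = FALSE)

c(Chisquare = p_chisq, FDist = p_f)
```

The two approximations generally give different $p$-values for the same data, especially in small samples.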

Examples

library(PMCMRplus)  # provides kruskalTest, adKSampleTest, bwsKSampleTest, medianTest

## Hollander & Wolfe (1973), 116.
## Mucociliary efficiency from the rate of removal of dust in normal
## subjects, subjects with obstructive airway disease, and subjects
## with asbestosis.
x <- c(2.9, 3.0, 2.5, 2.6, 3.2) # normal subjects
y <- c(3.8, 2.7, 4.0, 2.4)      # with obstructive airway disease
z <- c(2.8, 3.4, 3.7, 2.2, 2.0) # with asbestosis
g <- factor(x = c(rep(1, length(x)),
                  rep(2, length(y)),
                  rep(3, length(z))),
            labels = c("ns", "oad", "a"))
dat <- data.frame(
  g = g,
  x = c(x, y, z))

## AD-Test
adKSampleTest(x ~ g, data = dat)

## BWS-Test
bwsKSampleTest(x ~ g, data = dat)

## Kruskal-Test
## Using incomplete beta approximation
kruskalTest(x ~ g, dat, dist = "KruskalWallis")
## Using chisquare distribution
kruskalTest(x ~ g, dat, dist = "Chisquare")

## Not run:
## Check with kruskal.test from R stats
kruskal.test(x ~ g, dat)
## End(Not run)

## Using Conover's F
kruskalTest(x ~ g, dat, dist = "FDist")

## Not run:
## Check with aov on ranks
anova(aov(rank(x) ~ g, dat))
## Check with oneway.test
oneway.test(rank(x) ~ g, dat, var.equal = TRUE)
## End(Not run)

## Median Test asymptotic
medianTest(x ~ g, dat)

## Median Test with simulated p-values
set.seed(112)
medianTest(x ~ g, dat, simulate.p.value = TRUE)

References

Conover, W.J., Iman, R.L. (1981) Rank Transformations as a Bridge Between Parametric and Nonparametric Statistics. Am Stat 35, 124–129.

Kruskal, W.H., Wallis, W.A. (1952) Use of Ranks in One-Criterion Variance Analysis. J Am Stat Assoc 47, 583–621.

Sachs, L. (1997) Angewandte Statistik. Berlin: Springer.

See Also

kruskal.test, pKruskalWallis, Chisquare, FDist

  • Maintainer: Thorsten Pohlert
  • License: GPL (>= 3)
  • Last published: 2024-09-08
