dixon2002 function

Dixon (2002) Nearest-neighbor contingency table analysis

dixon2002 is a wrapper around the functions of Dixon (2002). It tests spatial segregation among several species by analyzing the counts in the nearest-neighbor contingency table of a marked point pattern.

dixon2002(datos, nsim = 99)

Arguments

  • datos: data.frame with three columns: x-coordinate, y-coordinate and sp-name. See swamp.
  • nsim: number of simulations for the randomization approximation of the p-values.
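
As an illustration of the expected input, the following base R sketch builds a small data.frame in the three-column layout described above and tabulates nearest-neighbor counts by brute force. The coordinates, the species labels and the helper name nn_table are hypothetical and are not part of ecespa; dixon2002 does its own computation internally.

```r
# Hypothetical toy data in the layout expected for `datos`:
# x-coordinate, y-coordinate, species name.
datos <- data.frame(
  x  = c(0.0, 0.1, 1.0, 1.1, 2.0, 2.1),
  y  = c(0.0, 0.0, 1.0, 1.0, 0.0, 0.0),
  sp = c("A", "A", "B", "B", "A", "B")
)

# nn_table() is an illustrative helper, not an ecespa function.
# It cross-tabulates the species of each point (rows) against
# the species of its nearest neighbor (columns).
nn_table <- function(d) {
  dist_mat <- as.matrix(dist(d[, 1:2]))  # pairwise Euclidean distances
  diag(dist_mat) <- Inf                  # a point is not its own neighbor
  nn <- apply(dist_mat, 1, which.min)    # index of each nearest neighbor
  sp <- factor(d[[3]])
  table(from = sp, to = sp[nn])
}

ON <- nn_table(datos)
print(ON)
```

The cell counts add up to the number of points, because every point contributes exactly one nearest neighbor.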

Returns

A list with the following components:

  • ON: Observed nearest-neighbor counts in table format, from row species to column species.

  • EN: Expected nearest neighbor counts in table format.

  • Z: Z-score for testing whether the observed count equals the expected count.

  • S: Segregation measure.

  • pZas: P-values based on the asymptotic normal distribution of the Z statistic.

  • pNr: If nsim != 0, p-values of the observed counts in each cell based on the randomization distribution.

  • C: Overall test of random labelling.

  • Ci: Species-specific test of random labelling.

  • pCas: P-value of the overall test from the asymptotic chi-square distribution with the appropriate degrees of freedom.

  • pCias: P-values of the species-specific tests from the asymptotic chi-square distribution with the appropriate degrees of freedom.

  • pCr: If nsim != 0, p-value of the overall test from the randomization distribution.

  • pCir: If nsim != 0, p-values of the species-specific tests from the randomization distribution.

  • tablaZ: table with ON, EN, Z, S, pZas and pNr in a printable layout, as in Table II of Dixon (2002).

  • tablaC: table with C, Ci, pCas, pCias, pCr and pCir in a printable layout, as in Table IV of Dixon (2002).

Details

A measure of segregation describes the tendency of one species to be associated with itself or with other species. Dixon (2002) proposed the following measure of the segregation of species i in a multispecies spatial pattern:

S[i] = log{[N[ii]/(N[i] - N[ii])] / [(N[i] - 1)/(N - N[i])]}

where N[i] is the number of individuals of species i, N[ii] is the frequency of species i as nearest neighbor of species i, and N is the total number of locations. Values of S[i] larger than 0 indicate that species i is segregated; the larger the value of S[i], the more extreme the segregation. Values of S[i] less than 0 indicate that species i is found as a neighbor of itself less often than expected under random labelling. Values of S[i] close to 0 are consistent with random labelling of the neighbors of species i.
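
The segregation measure can be evaluated directly from a nearest-neighbor contingency table. The sketch below does so in base R for a hypothetical 2 x 2 table; the counts and variable names are invented for illustration and do not come from the swamp data.

```r
# Hypothetical 2 x 2 nearest-neighbor table: rows are the species of each
# point, columns the species of its nearest neighbor.
ON <- matrix(c(30, 10,
               12, 28),
             nrow = 2, byrow = TRUE,
             dimnames = list(from = c("A", "B"), to = c("A", "B")))

Ni  <- rowSums(ON)   # N[i]: individuals per species
N   <- sum(ON)       # N: total number of locations
Nii <- diag(ON)      # N[ii]: same-species neighbor counts

# S[i] = log{ [N[ii] / (N[i] - N[ii])] / [(N[i] - 1) / (N - N[i])] }
S_i <- log((Nii / (Ni - Nii)) / ((Ni - 1) / (N - Ni)))
print(round(S_i, 3))
```

Both values are positive in this toy table, since each species has more same-species neighbors than random labelling would suggest.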

Dixon (2002) also proposed a pairwise segregation index for the off-diagonal elements of the contingency table:

S[ij] = log{[N[ij]/(N[i] - N[ij])] / [N[j]/(N - N[j] - 1)]}

S[ij] is larger than 0 when N[ij], the frequency of neighbors of species j around points of species i, is larger than expected under random labelling, and less than 0 when N[ij] is smaller than expected under random labelling.
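
The pairwise index can be sketched in the same way. The example below assumes that the expected odds under random labelling are N[j]/(N - N[j] - 1), which follows from an expected count of N[i]N[j]/(N - 1) for an off-diagonal cell. The 3 x 3 table is hypothetical.

```r
# Hypothetical 3 x 3 nearest-neighbor table (rows: species of the point,
# columns: species of its nearest neighbor).
ON <- matrix(c(20,  5,  5,
                6, 18,  6,
                4,  8, 28),
             nrow = 3, byrow = TRUE,
             dimnames = list(from = c("A", "B", "C"), to = c("A", "B", "C")))

Ni <- rowSums(ON)  # N[i]
N  <- sum(ON)      # N
k  <- nrow(ON)

# Observed odds N[ij] / (N[i] - N[ij]); N[i] is recycled down the rows.
obs_odds <- ON / (Ni - ON)

# Assumed expected odds under random labelling: N[j] / (N - N[j] - 1),
# one value per column j.
exp_odds <- matrix(Ni / (N - Ni - 1), nrow = k, ncol = k, byrow = TRUE)

S_ij <- log(obs_odds / exp_odds)
diag(S_ij) <- NA  # diagonal cells use the S[i] formula instead
print(round(S_ij, 3))
```

In this toy table species B appears as the neighbor of species A only 5 times, fewer than the roughly 9 expected, so S_ij["A", "B"] is negative.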

As a species/neighbor-specific test, Dixon (2002) proposed the statistic

Z[ij] = (N[ij] - EN[ij]) / sqrt(Var N[ij])

where j may be the same as i and EN[ij] is the expected count in the contingency table. Z[ij] has an asymptotic normal distribution with mean 0 and variance 1, so its asymptotic p-value can be obtained by numerical evaluation of the cumulative normal distribution. When the sample size is small, a p-value for the observed count in each cell (N[ij]) may be obtained by simulation, i.e., by conducting a randomization test.
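
Both routes to a p-value can be sketched in base R. The example is a minimal illustration under stated assumptions, not the code used by dixon2002: the Z value and the point pattern are hypothetical, the asymptotic p-value is taken as two-sided, and the randomization p-value measures how far each count lies from the mean of its randomization distribution, which is only one possible convention.

```r
# Asymptotic route: two-sided p-value for a hypothetical Z[ij].
z <- 2.1
p_asym <- 2 * pnorm(-abs(z))

# Randomization route on a hypothetical pattern of 40 points, two species.
set.seed(1)
n  <- 40
xy <- cbind(runif(n), runif(n))
sp <- factor(rep(c("A", "B"), each = n / 2))

# Nearest-neighbor index of every point; it does not depend on the labels,
# so it is computed once.
d <- as.matrix(dist(xy))
diag(d) <- Inf
nn <- apply(d, 1, which.min)

# Contingency table for a given labelling.
count_nn <- function(labels) table(labels, labels[nn])

obs  <- count_nn(sp)
nsim <- 99
# Random labelling: permute species labels over the fixed locations.
sims <- replicate(nsim, count_nn(sample(sp)))  # k x k x nsim array

# Stack observed and simulated tables; slice 1 is the observed table.
all_tabs <- array(c(obs, sims), dim = c(dim(obs), nsim + 1))
centre   <- apply(all_tabs, c(1, 2), mean)
dev      <- abs(sweep(all_tabs, c(1, 2), centre))

# Per cell: share of tables at least as far from the centre as the observed.
p_rand <- apply(dev, c(1, 2), function(v) mean(v >= v[1]))
dimnames(p_rand) <- dimnames(obs)
print(p_rand)
```

Because the observed table is counted among the nsim + 1 tables, the smallest attainable randomization p-value is 1/(nsim + 1).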

An overall test of random labelling (i.e., a test that all counts in the k x k nearest-neighbor contingency table are equal to their expected counts) is based on the quadratic form

C = (N - EN)' Sigma^- (N - EN)

where N is the vector of all cell counts in the contingency table, Sigma is the variance-covariance matrix of those counts and Sigma^- is a generalized inverse of Sigma. Under the null hypothesis of random labelling of points, C has an asymptotic chi-square distribution with k(k-1) degrees of freedom (if the sample sizes are small, its distribution should be estimated by Monte Carlo simulation). P-values are computed as the probability of observing values equal to or larger than C. The overall statistic C can be partitioned into k species-specific test statistics C[i]. Each C[i] tests whether the frequencies of the neighbors of species i are similar to those expected if the points were randomly labelled. Because the C[i] are not independent chi-square statistics, they do not sum to the overall C.
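
The two p-values for the overall test can be illustrated as follows. The values of k and C are hypothetical, and C_sim is a placeholder vector drawn from the chi-square distribution only so that the snippet runs; in a real analysis it would hold the C statistics computed from randomly relabelled patterns.

```r
# Hypothetical number of species and overall statistic.
k <- 3
C <- 21.4

# Asymptotic p-value: upper tail of a chi-square with k(k - 1) degrees
# of freedom.
pCas <- pchisq(C, df = k * (k - 1), lower.tail = FALSE)

# Monte Carlo p-value from simulated statistics (placeholder values here).
set.seed(1)
C_sim <- rchisq(99, df = k * (k - 1))
pCr <- (1 + sum(C_sim >= C)) / (1 + length(C_sim))

print(c(pCas = pCas, pCr = pCr))
```

Adding 1 to the numerator and the denominator counts the observed statistic as one of the possible outcomes, so the Monte Carlo p-value is never exactly zero.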

Warning

The S[i] and S[ij] statistics assume that the spatial nearest-neighbor process is stationary, at least to second order, i.e., that they have the same sign in every part of the plot. A biologically heterogeneous process will violate this assumption.

References

Dixon, P.M. 2002. Nearest-neighbor contingency table analysis of spatial segregation for several species. Ecoscience, 9 (2): 142-151. doi:10.1080/11956860.2002.11682700.

Author(s)

Philip M. Dixon. Marcelino de la Cruz wrote the wrapper code for the ecespa version.

See Also

K012 for another segregation test, based on the differences between univariate and bivariate K-functions. A faster version of this function, with code implemented in FORTRAN, is available as function dixon in the dixon package.

Examples

library(ecespa)
data(swamp)
dixon2002(swamp, nsim = 99)

  • Maintainer: Marcelino de la Cruz Rot
  • License: GPL (>= 2)
  • Last published: 2023-01-05
