GiniMd computes Gini's mean difference on a numeric vector. This index is defined as the mean absolute difference taken over all pairs of distinct elements of the vector. For a Bernoulli (binary) variable with proportion of ones equal to p and sample size n, Gini's mean difference is 2np(1-p)/(n-1). For a trinomial variable (e.g., predicted values for a 3-level categorical predictor using two dummy variables) having (predicted) values A, B, C with corresponding proportions a, b, c, Gini's mean difference is 2n[ab|A-B| + ac|A-C| + bc|B-C|]/(n-1).
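The Bernoulli formula follows directly from the definition, as the short derivation below sketches. The symbols G, x_i, n_1 and n_0 are notation introduced here for illustration and do not appear in the original page.

```latex
% Definition: mean absolute difference over all ordered pairs of distinct elements
G = \frac{1}{n(n-1)} \sum_{i \neq j} \lvert x_i - x_j \rvert

% Bernoulli case: n_1 = np ones and n_0 = n(1-p) zeros.
% Only pairs with one 0 and one 1 contribute, each with |x_i - x_j| = 1,
% and there are 2 n_1 n_0 such ordered pairs.
G = \frac{2\, n_1 n_0}{n(n-1)}
  = \frac{2 \cdot np \cdot n(1-p)}{n(n-1)}
  = \frac{2np(1-p)}{n-1}

% Check with 17 zeros and 6 ones (n = 23):
G = \frac{2 \cdot 6 \cdot 17}{23 \cdot 22} = \frac{204}{506} \approx 0.4032
```

The trinomial formula comes from the same counting argument applied to the three pairs of distinct values.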
GiniMd(x, na.rm=FALSE)
Arguments
x: a numeric vector (for GiniMd)
na.rm: set to TRUE to remove NAs from x before computing. If x contains NAs and na.rm is FALSE, an error will result.
Returns
a scalar numeric
References
David HA (1968): Gini's mean difference rediscovered. Biometrika 55:573-575.
Examples
set.seed(1)
x <- rnorm(40)
# Test GiniMd against a brute-force solution
gmd <- function(x) {
  n <- length(x)
  sum(outer(x, x, function(a, b) abs(a - b))) / n / (n - 1)
}
GiniMd(x)
gmd(x)

# Bernoulli case: compare with the closed form 2np(1-p)/(n-1)
z <- c(rep(0, 17), rep(1, 6))
n <- length(z)
GiniMd(z)
2 * mean(z) * (1 - mean(z)) * n / (n - 1)

# Trinomial case: here a, b, c are counts rather than proportions
a <- 12; b <- 13; c <- 7; n <- a + b + c
A <- -.123; B <- -.707; C <- 0.523
xx <- c(rep(A, a), rep(B, b), rep(C, c))
GiniMd(xx)
2 * (a * b * abs(A - B) + a * c * abs(A - C) + b * c * abs(B - C)) / n / (n - 1)