Distance between probability distributions of discrete variables given samples
Hellinger (or Matusita) distance between two multivariate (q>1) or univariate (q=1) discrete probability distributions, estimated from samples.
Usage
ddhellinger(x1, x2)
Arguments
x1, x2: data frames of q columns or vectors (can also be tibbles).
If they are data frames and do not have the same column names, a warning is issued.
Details
Let p1 and p2 denote the probability distributions estimated from the discrete samples x1 and x2. The Matusita distance between p1 and p2 is computed using the ddhellingerpar function.
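To make the computation described above concrete for the univariate case, here is a minimal sketch in base R. It assumes the distributions are estimated by relative frequencies on the union of the observed values, and that the distance takes the unnormalised Matusita form sqrt(sum((sqrt(p1) - sqrt(p2))^2)). The exact scaling used by ddhellingerpar should be checked against its own documentation. The helper name matusita_from_samples is hypothetical and is not part of the package.

```r
# Hypothetical helper, not part of the package: a base-R sketch of the
# Matusita distance between two univariate discrete samples.
matusita_from_samples <- function(x1, x2) {
  # Common support: every value observed in either sample
  support <- union(unique(as.character(x1)), unique(as.character(x2)))

  # Relative frequencies on the common support (zero where a value is absent)
  p1 <- as.numeric(table(factor(as.character(x1), levels = support))) / length(x1)
  p2 <- as.numeric(table(factor(as.character(x2), levels = support))) / length(x2)

  # Assumed form: unnormalised Matusita distance
  sqrt(sum((sqrt(p1) - sqrt(p2))^2))
}

x1 <- c("A", "A", "B", "B")
x2 <- c("A", "A", "A", "B", "B")
d <- matusita_from_samples(x1, x2)
print(d)
```

Under this assumed form the distance is 0 for samples with identical relative frequencies and is at most sqrt(2), which is reached when the two samples share no values. For data frames (q > 1), the same idea applies to the joint distribution of the columns.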
Returns
The distance between the two probability distributions.
Author(s)
Rachid Boumaza, Pierre Santagostini, Smail Yousfi, Sabine Demotes-Mainard
See Also
ddhellingerpar: Hellinger metric (Matusita distance) between two discrete distributions, given the probabilities on their common support.
Other distances: ddchisqsym, ddjeffreys, ddjensen, ddlp.
References
Deza, M.M. and Deza, E. (2013). Encyclopedia of Distances. Springer.
Examples
# Example 1
x1 <- c("A", "A", "B", "B")
x2 <- c("A", "A", "A", "B", "B")
ddhellinger(x1, x2)

# Example 2
x1 <- data.frame(x = factor(c("A", "A", "A", "B", "B", "B")),
                 y = factor(c("a", "a", "a", "b", "b", "b")))
x2 <- data.frame(x = factor(c("A", "A", "A", "B", "B")),
                 y = factor(c("a", "a", "b", "a", "b")))
ddhellinger(x1, x2)