kldstudent function

Kullback-Leibler Divergence between Centered Multivariate t Distributions

Computes the Kullback-Leibler divergence between two random vectors distributed according to multivariate t distributions (MTD) with zero location vector.

kldstudent(nu1, Sigma1, nu2, Sigma2, eps = 1e-06)

Arguments

  • nu1: numeric. The degrees of freedom of the first distribution.
  • Sigma1: symmetric, positive-definite matrix. The scatter matrix of the first distribution.
  • nu2: numeric. The degrees of freedom of the second distribution.
  • Sigma2: symmetric, positive-definite matrix. The scatter matrix of the second distribution.
  • eps: numeric. Precision for the computation of the partial derivative of the Lauricella D-hypergeometric function (see Details). Default: 1e-06.

Returns

A numeric value: the Kullback-Leibler divergence between the two distributions, with two attributes: attr(, "epsilon") (precision of the partial derivative of the Lauricella D-hypergeometric function, see Details) and attr(, "k") (number of iterations).
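As an illustration, these attributes can be inspected directly on the returned object. A minimal sketch, assuming the package providing kldstudent is attached; the identity scatter matrices are arbitrary placeholder inputs:

# Inspect the returned value and its attributes (illustrative inputs only)
kl <- kldstudent(nu1 = 2, Sigma1 = diag(3), nu2 = 4, Sigma2 = diag(3))
as.numeric(kl)      # the divergence itself, stripped of attributes
attr(kl, "epsilon") # precision reached for the partial derivative
attr(kl, "k")       # number of iterations performed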

Details

Given $X_1$, a random vector of $\mathbb{R}^p$ distributed according to the centered MTD with parameters $(\nu_1, 0, \Sigma_1)$, and $X_2$, a random vector of $\mathbb{R}^p$ distributed according to the centered MTD with parameters $(\nu_2, 0, \Sigma_2)$.

Let $\lambda_1, \dots, \lambda_p$ be the eigenvalues of the square matrix $\Sigma_1 \Sigma_2^{-1}$, sorted in increasing order:

\lambda_1 < \dots < \lambda_{p-1} < \lambda_p
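For instance, these eigenvalues can be computed in R as follows; a minimal sketch, reusing the scatter matrices of the Examples section:

# Eigenvalues of Sigma1 %*% solve(Sigma2), sorted in increasing order
Sigma1 <- matrix(c(2, 1.2, 0.4, 1.2, 2, 0.6, 0.4, 0.6, 2), nrow = 3)
Sigma2 <- matrix(c(1, 0.3, 0.1, 0.3, 1, 0.4, 0.1, 0.4, 1), nrow = 3)
lambda <- sort(eigen(Sigma1 %*% solve(Sigma2), only.values = TRUE)$values)
lambda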

The Kullback-Leibler divergence of $X_1$ from $X_2$ is given by:

\displaystyle{ D_{KL}(\mathbf{X}_1\|\mathbf{X}_2) = \ln\left(\frac{\Gamma\left(\frac{\nu_1+p}{2}\right) \Gamma\left(\frac{\nu_2}{2}\right) \nu_2^{\frac{p}{2}}}{\Gamma\left(\frac{\nu_2+p}{2}\right) \Gamma\left(\frac{\nu_1}{2}\right) \nu_1^{\frac{p}{2}}} \right) + \frac{\nu_2-\nu_1}{2} \left[\psi\left(\frac{\nu_1+p}{2} \right) - \psi\left(\frac{\nu_1}{2}\right)\right] - \frac{1}{2} \sum_{i=1}^p{\ln\lambda_i} - \frac{\nu_2+p}{2} \times D }

where $\psi$ is the digamma function (see Special) and $D$ is given by:

  • If $\frac{\nu_1}{\nu_2}\lambda_1 > 1$,
\displaystyle{ D = \prod_{i=1}^p{\left(\frac{\nu_2}{\nu_1}\frac{1}{\lambda_i}\right)^\frac{1}{2}} \frac{\partial}{\partial{a}}{ \bigg\{ F_D^{(p)}\bigg( \frac{\nu_1+p}{2}, \underbrace{\frac{1}{2}, \dots, \frac{1}{2}}_p; a + \frac{\nu_1+p}{2}; 1 - \frac{\nu_2}{\nu_1}\frac{1}{\lambda_1}, \dots, 1 - \frac{\nu_2}{\nu_1}\frac{1}{\lambda_p} \bigg) \bigg\} } \bigg|_{a=0} }
  • If $\frac{\nu_1}{\nu_2}\lambda_p < 1$,
\displaystyle{ D = \frac{\partial}{\partial{a}}{ \bigg\{ F_D^{(p)}\bigg( a, \underbrace{\frac{1}{2}, \dots, \frac{1}{2}}_p; a + \frac{\nu_1+p}{2}; 1 - \frac{\nu_1}{\nu_2}\lambda_1, \dots, 1 - \frac{\nu_1}{\nu_2}\lambda_p \bigg) \bigg\} } \bigg|_{a=0} }
  • If $\frac{\nu_1}{\nu_2}\lambda_1 < 1 < \frac{\nu_1}{\nu_2}\lambda_p$,
\begin{array}{lll} D & = & \displaystyle{ -\ln\left(\frac{\nu_1}{\nu_2}\lambda_p\right) } + \\ && \displaystyle{ \frac{\partial}{\partial{a}}{ \bigg\{ F_D^{(p)}\bigg( a, \underbrace{\frac{1}{2}, \dots, \frac{1}{2}, a + \frac{\nu_1}{2}}_p; a + \frac{\nu_1+p}{2}; 1 - \frac{\lambda_1}{\lambda_p}, \dots, 1 - \frac{\lambda_{p-1}}{\lambda_p}, 1 - \frac{\nu_2}{\nu_1}\frac{1}{\lambda_p} \bigg) \bigg\} } \bigg|_{a=0} } \end{array}
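To make the structure of the expression above concrete, the closed-form part of the divergence (every term except the Lauricella term $D$) maps directly onto lgamma and digamma. A minimal sketch, where kld_closed_part is a hypothetical helper and not part of the package, and lambda is the vector of sorted eigenvalues defined earlier:

# Closed-form terms of D_KL; the Lauricella term D is omitted here,
# since evaluating it is precisely what kldstudent does iteratively
kld_closed_part <- function(nu1, nu2, lambda) {
  p <- length(lambda)
  lgamma((nu1 + p) / 2) + lgamma(nu2 / 2) + (p / 2) * log(nu2) -
    lgamma((nu2 + p) / 2) - lgamma(nu1 / 2) - (p / 2) * log(nu1) +
    (nu2 - nu1) / 2 * (digamma((nu1 + p) / 2) - digamma(nu1 / 2)) -
    sum(log(lambda)) / 2
}
# The full divergence is this value minus (nu2 + p) / 2 * D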

$F_D^{(p)}$ is the Lauricella D-hypergeometric function defined for $p$ variables:

\displaystyle{ F_D^{(p)}\left(a; b_1, \dots, b_p; g; x_1, \dots, x_p\right) = \sum\limits_{m_1 \geq 0} \dots \sum\limits_{m_p \geq 0}{ \frac{ (a)_{m_1+\dots+m_p}(b_1)_{m_1} \dots (b_p)_{m_p} }{ (g)_{m_1+\dots+m_p} } \frac{x_1^{m_1}}{m_1!} \dots \frac{x_p^{m_p}}{m_p!} } }
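For small $p$, the definition can be illustrated by naively truncating the sums. The sketch below (lauricella_FD2 is a hypothetical name chosen here) is an illustration for $p = 2$ only, not the algorithm used by kldstudent, which iterates the series until the eps precision is reached:

# Naive truncation of the Lauricella F_D series for p = 2 (illustration only)
lauricella_FD2 <- function(a, b1, b2, g, x1, x2, mmax = 40) {
  lpoch <- function(q, n) lgamma(q + n) - lgamma(q)  # log Pochhammer (q)_n, q > 0
  s <- 0
  for (m1 in 0:mmax) {
    for (m2 in 0:mmax) {
      lterm <- lpoch(a, m1 + m2) + lpoch(b1, m1) + lpoch(b2, m2) -
        lpoch(g, m1 + m2) - lfactorial(m1) - lfactorial(m2)
      s <- s + exp(lterm) * x1^m1 * x2^m2
    }
  }
  s
}
lauricella_FD2(1, 0.5, 0.5, 2, 0.2, 0.3)  # the series converges for |x1|, |x2| < 1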

Examples

nu1 <- 2
Sigma1 <- matrix(c(2, 1.2, 0.4, 1.2, 2, 0.6, 0.4, 0.6, 2), nrow = 3)
nu2 <- 4
Sigma2 <- matrix(c(1, 0.3, 0.1, 0.3, 1, 0.4, 0.1, 0.4, 1), nrow = 3)
kldstudent(nu1, Sigma1, nu2, Sigma2)
kldstudent(nu2, Sigma2, nu1, Sigma1)

References

N. Bouhlel and D. Rousseau (2023), Exact Rényi and Kullback-Leibler Divergences Between Multivariate t-Distributions. IEEE Signal Processing Letters, vol. 30, pp. 1672-1676, October 2023. doi:10.1109/LSP.2023.3324594

Author(s)

Pierre Santagostini, Nizar Bouhlel

  • Maintainer: Pierre Santagostini
  • License: GPL (>= 3)
  • Last published: 2024-12-20