clr function

Clr and inverse clr transformation

clr computes the clr or inverse clr transformation of a vector f with respect to integration weights w, corresponding to a Bayes Hilbert space $B^2(\mu) = B^2(T, A, \mu)$.

clr(f, w = 1, inverse = FALSE)

Arguments

  • f: a vector containing the function values (evaluated on a grid) of the function $f$ to transform. If inverse = FALSE, f must be a density, i.e., all entries must be positive and usually f integrates to one. If inverse = TRUE, f should integrate to zero, see Details.
  • w: a vector of length one or of the same length as f containing positive integration weights. If w has length one, this weight is used for all function values. The integral of $f$ is approximated via $\sum_{j=1}^m w_j f_j$, where $m$ equals the length of f.
  • inverse: if TRUE, the inverse clr transformation is computed.
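
The weighted sum that approximates the integral can be illustrated in base R. This is a minimal sketch: approx_integral is a hypothetical helper written for illustration and is not part of FDboost.

```r
# Hypothetical helper (not part of FDboost): approximate the integral of f
# by the weighted sum of its grid values, sum_j w_j * f_j.
approx_integral <- function(f, w = 1) {
  # a weight of length one is used for all function values
  if (length(w) == 1) w <- rep(w, length(f))
  stopifnot(length(w) == length(f), all(w > 0))
  sum(w * f)
}

# Midpoint grid on [0, 1] with spacing 0.01, so every weight is 0.01
g <- seq(from = 0.005, to = 0.995, by = 0.01)
f <- dbeta(g, 2, 5)
int_f <- approx_integral(f, w = 0.01)
print(int_f)  # close to 1, since f is a probability density
```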

Returns

A vector of the same length as f containing the (inverse) clr transformation of f.

Details

The clr transformation maps a density $f$ from $B^2(\mu)$ to $L^2_0(\mu) := \{f \in L^2(\mu) \mid \int_T f \, \mathrm{d}\mu = 0\}$

via

$$\mathrm{clr}(f) := \log f - \frac{1}{\mu(T)} \int_T \log f \, \mathrm{d}\mu.$$
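
On a grid, the clr formula reduces to subtracting a weighted mean of the log-values. The following is a minimal sketch under the assumption that $\mu(T)$ is approximated by the sum of the weights; clr_sketch is a hypothetical name used for illustration, and in practice clr itself should be called.

```r
# Hypothetical re-implementation for illustration only; not FDboost's code.
clr_sketch <- function(f, w = 1) {
  stopifnot(all(f > 0))
  if (length(w) == 1) w <- rep(w, length(f))
  log_f <- log(f)
  # mu(T) is approximated by sum(w), the integral of log f by sum(w * log_f)
  log_f - sum(w * log_f) / sum(w)
}

g <- seq(from = 0.005, to = 0.995, by = 0.01)
f <- dbeta(g, 2, 5)
f_clr <- clr_sketch(f, w = 0.01)
# the transformed values integrate to zero (up to floating-point error)
int_clr <- sum(0.01 * f_clr)
```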

The inverse clr transformation maps a function $f$ from $L^2_0(\mu)$ to $B^2(\mu)$ via

$$\mathrm{clr}^{-1}(f) := \frac{\exp f}{\int_T \exp f \, \mathrm{d}\mu}.$$
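
The inverse transformation can be sketched the same way: exponentiate and divide by the weighted sum of the exponentiated values. clr_inv_sketch is a hypothetical name used for illustration; it is not the package's implementation.

```r
# Hypothetical re-implementation for illustration only; not FDboost's code.
clr_inv_sketch <- function(f, w = 1) {
  if (length(w) == 1) w <- rep(w, length(f))
  exp_f <- exp(f)
  # normalize so that the weighted sum (the approximated integral) is one
  exp_f / sum(w * exp_f)
}

# Discrete example on T = {1, 2, 3} with w = 1
h <- c(-1, 0, 1)  # sums to zero, i.e., lies in L^2_0
p <- clr_inv_sketch(h)
# applying the clr formula to p recovers h
h_back <- log(p) - mean(log(p))
```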

Note that in contrast to Maier et al. (2021), this definition of the inverse clr transformation includes normalization, yielding the respective probability density function (representative of the equivalence class of proportional functions in $B^2(\mu)$).
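
The role of the equivalence class can be made concrete with a small discrete example: proportional functions share the same clr transform, and the normalized inverse returns the representative that sums to one. The helper names below are hypothetical and the code is only a sketch of the formulas above, not the package's implementation.

```r
# Hypothetical helpers for illustration only (w = 1, discrete case).
clr_sketch <- function(f) log(f) - mean(log(f))
clr_inv_sketch <- function(f) exp(f) / sum(exp(f))

f <- c(0.2, 0.3, 0.5)  # a probability vector
# f and 10 * f are proportional, so their clr transforms coincide
same_clr <- all.equal(clr_sketch(f), clr_sketch(10 * f))
# the normalized inverse returns the representative summing to one
back <- clr_inv_sketch(clr_sketch(10 * f))
```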

The (inverse) clr transformation depends not only on $f$, but also on the underlying measure space $(T, A, \mu)$, which determines the integral. In clr this is specified via the integration weights w. E.g., for a discrete set $T$ with $A = P(T)$ the power set of $T$ and $\mu = \sum_{t \in T} \delta_t$ the sum of Dirac measures at $t \in T$, the default w = 1 is the correct choice. In this case, integrals are computed exactly, not only approximately.

For an interval $T = [a, b]$ with $A = B$ the Borel $\sigma$-algebra restricted to $T$ and $\mu = \lambda$ the Lebesgue measure, the choice of w depends on the grid on which the function was evaluated: $w_j$ must correspond to the length of the subinterval of $[a, b]$ that $f_j$ represents. E.g., for an equidistant grid with spacing $d$, where the boundary grid values are $a + d/2$ and $b - d/2$ (i.e., the grid points are centers of intervals of size $d$), equal weights $d$ should be chosen for w.
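
For a non-equidistant grid, the same rule applies with one weight per subinterval. The sketch below assumes the function is evaluated at the midpoints of subintervals defined by a vector of breakpoints; the variable names are illustrative and not part of FDboost.

```r
# Breakpoints of a non-equidistant partition of [a, b] = [0, 1]
breaks <- seq(0, 1, length.out = 201)^2
# grid points: midpoints of the subintervals
mid <- (head(breaks, -1) + tail(breaks, -1)) / 2
# weights: lengths of the subintervals represented by each grid point
w <- diff(breaks)

f <- dbeta(mid, 2, 5)
total_length <- sum(w)  # equals b - a
int_f <- sum(w * f)     # approximates the integral of the density
```

The resulting vector w could then be passed as the w argument of clr together with f.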

The clr transformation is crucial for density-on-scalar regression since estimating the clr transformed model in $L^2_0(\mu)$ is equivalent to estimating the original model in $B^2(\mu)$ (as the clr transformation is an isometric isomorphism), see also the vignette "FDboost_density-on-scalar_births" and Maier et al. (2021).
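
One consequence of the isomorphism property can be checked numerically: the clr transformation turns the addition of the Bayes Hilbert space (perturbation, i.e., the renormalized pointwise product of two densities) into ordinary pointwise addition. The sketch below assumes this standard definition of perturbation, which is not spelled out on this page; clr_sketch is a hypothetical stand-in for clr.

```r
# Hypothetical re-implementation for illustration only; not FDboost's code.
clr_sketch <- function(f, w = 1) {
  if (length(w) == 1) w <- rep(w, length(f))
  log(f) - sum(w * log(f)) / sum(w)
}

g <- seq(from = 0.005, to = 0.995, by = 0.01)
w <- 0.01
f1 <- dbeta(g, 2, 5)
f2 <- dbeta(g, 3, 2)

# perturbation (assumed definition): pointwise product, renormalized
f_sum <- f1 * f2 / sum(w * f1 * f2)

# clr of the perturbation equals the sum of the clr transforms
lhs <- clr_sketch(f_sum, w)
rhs <- clr_sketch(f1, w) + clr_sketch(f2, w)
max_diff <- max(abs(lhs - rhs))
```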

Examples

```r
library(FDboost)

### Continuous case (T = [0, 1] with Lebesgue measure):
# evaluate density of a Beta distribution on an equidistant grid
g <- seq(from = 0.005, to = 0.995, by = 0.01)
f <- dbeta(g, 2, 5)
# compute clr transformation with distance of two grid points as integration weight
f_clr <- clr(f, w = 0.01)
# visualize result
plot(g, f_clr, type = "l")
abline(h = 0, col = "grey")
# compute inverse clr transformation (w as above)
f_clr_inv <- clr(f_clr, w = 0.01, inverse = TRUE)
# visualize result
plot(g, f, type = "l")
lines(g, f_clr_inv, lty = 2, col = "red")

### Discrete case (T = {1, ..., 12} with sum of dirac measures at t in T):
data("birthDistribution", package = "FDboost")
# fit density-on-scalar model with effects for sex and year
model <- FDboost(birth_densities_clr ~ 1 + bolsc(sex, df = 1) +
                   bbsc(year, df = 1, differences = 1),
                 # use bbsc() in timeformula to ensure integrate-to-zero constraint
                 timeformula = ~bbsc(month, df = 4,
                                     # December is followed by January of subsequent year
                                     cyclic = TRUE,
                                     # knots = {1, ..., 12} with additional boundary knot
                                     # 0 (coinciding with 12) due to cyclic = TRUE
                                     knots = 1:11, boundary.knots = c(0, 12),
                                     # degree = 1 with these knots yields identity matrix
                                     # as design matrix
                                     degree = 1),
                 data = birthDistribution, offset = 0,
                 control = boost_control(mstop = 1000))
# Extract predictions (clr-transformed!) and transform them to Bayes Hilbert space
predictions_clr <- predict(model)
predictions <- t(apply(predictions_clr, 1, clr, inverse = TRUE))
```

References

Maier, E.-M., Stoecker, A., Fitzenberger, B., Greven, S. (2021): Additive Density-on-Scalar Regression in Bayes Hilbert Spaces with an Application to Gender Economics. arXiv preprint arXiv:2110.11771.

Author(s)

Eva-Maria Maier

  • Maintainer: David Ruegamer
  • License: GPL-2
  • Last published: 2023-08-12