clr computes the clr (centered log-ratio) or inverse clr transformation of a vector f with respect to integration weights w, corresponding to a Bayes Hilbert space $B^2(\mu) = B^2(T, \mathcal{A}, \mu)$.
clr(f, w = 1, inverse = FALSE)
Arguments
f: a vector containing the function values (evaluated on a grid) of the function f to transform. If inverse = FALSE, f must be a density, i.e., all entries must be positive and usually f integrates to one. If inverse = TRUE, f should integrate to zero, see Details.
w: a vector of length one or of the same length as f containing positive integration weights. If w has length one, this weight is used for all function values. The integral of f is approximated via $\sum_{j=1}^{m} w_j f_j$, where m equals the length of f.
inverse: if TRUE, the inverse clr transformation is computed.
Returns
A vector of the same length as f containing the (inverse) clr transformation of f.
Details
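The clr transformation maps a density f from $B^2(\mu)$ to $L^2_0(\mu) := \{ f \in L^2(\mu) \mid \int_T f \, d\mu = 0 \}$ via $\mathrm{clr}(f) := \log f - \frac{1}{\mu(T)} \int_T \log f \, d\mu$. The inverse clr transformation maps a function f from $L^2_0(\mu)$ to $B^2(\mu)$ via $\mathrm{clr}^{-1}(f) := \frac{\exp f}{\int_T \exp f \, d\mu}$.
Note that in contrast to Maier et al. (2021), this definition of the inverse clr transformation includes normalization, yielding the respective probability density function (representative of the equivalence class of proportional functions in $B^2(\mu)$).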
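Both directions can be sketched in a few lines of base R. The sketch assumes the usual definitions (clr: log f minus its mean with respect to μ; inverse clr: exp f divided by its integral), with each integral replaced by the weighted sum sum(w * f). The name clr_sketch is hypothetical and the code is an illustration, not FDboost's implementation.

```r
# Hypothetical helper (not the FDboost implementation): clr and inverse clr
# with integrals approximated by weighted sums sum(w * f).
clr_sketch <- function(f, w = 1, inverse = FALSE) {
  # recycle a length-one weight to the length of f
  if (length(w) == 1) w <- rep(w, length(f))
  stopifnot(length(w) == length(f), all(w > 0))
  if (!inverse) {
    stopifnot(all(f > 0))
    # log f minus its weighted mean, where sum(w) plays the role of mu(T)
    log(f) - sum(w * log(f)) / sum(w)
  } else {
    # exp f divided by its integral, so the result integrates to one
    exp(f) / sum(w * exp(f))
  }
}

g <- seq(from = 0.005, to = 0.995, by = 0.01)
f <- dbeta(g, 2, 5)
f_clr <- clr_sketch(f, w = 0.01)
f_back <- clr_sketch(f_clr, w = 0.01, inverse = TRUE)

sum(0.01 * f_clr)   # zero up to floating-point error
sum(0.01 * f_back)  # one up to floating-point error
```

Because the inverse normalizes, f_back equals f divided by its numerical integral sum(0.01 * f) rather than f itself; the two coincide only if f already sums to one under the chosen weights.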
The (inverse) clr transformation depends not only on f, but also on the underlying measure space $(T, \mathcal{A}, \mu)$, which determines the integral. In clr this is specified via the integration weights w. E.g., for a discrete set T with $\mathcal{A} = \mathcal{P}(T)$ the power set of T and $\mu = \sum_{t \in T} \delta_t$ the sum of Dirac measures at $t \in T$, the default w = 1 is the correct choice. In this case, integrals are computed exactly, not only approximately. For an interval $T = [a, b]$ with $\mathcal{A} = \mathcal{B}$ the Borel σ-algebra restricted to T and $\mu = \lambda$ the Lebesgue measure, the choice of w depends on the grid on which the function was evaluated: $w_j$ must correspond to the length of the subinterval of $[a, b]$ that $f_j$ represents. E.g., for an equidistant grid with spacing d, where the boundary grid values are $a + d/2$ and $b - d/2$ (i.e., the grid points are centers of intervals of size d), equal weights d should be chosen for w.
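The choice of weights for an interval can be illustrated with a small base R sketch. It builds one equidistant and one non-equidistant midpoint grid on [0, 1]; the break points of the second grid are arbitrary values chosen for illustration, and all variable names are hypothetical.

```r
# Interval T = [a, b] split into cells; grid points are the cell midpoints
# and each weight is the length of the cell its grid point represents.
a <- 0
b <- 1

# equidistant cells of length d: boundary grid values a + d/2 and b - d/2
d <- 0.01
breaks_eq <- seq(a, b, by = d)
grid_eq <- head(breaks_eq, -1) + d / 2
w_eq <- rep(d, length(grid_eq))

# non-equidistant cells (breaks chosen arbitrarily for illustration)
breaks_ne <- c(0, 0.05, 0.1, 0.2, 0.4, 0.7, 1)
grid_ne <- (head(breaks_ne, -1) + tail(breaks_ne, -1)) / 2
w_ne <- diff(breaks_ne)

# in both cases the weights sum to mu(T) = b - a
sum(w_eq)
sum(w_ne)

# weighted sums approximate the integral of a density over [a, b];
# the coarse grid gives a visibly rougher approximation
sum(w_eq * dbeta(grid_eq, 2, 5))
sum(w_ne * dbeta(grid_ne, 2, 5))
```

With weights built this way, w_eq or w_ne can be passed as the w argument together with the function values on the matching grid.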
The clr transformation is crucial for density-on-scalar regression since estimating the clr transformed model in $L^2_0(\mu)$ is equivalent to estimating the original model in $B^2(\mu)$ (as the clr transformation is an isometric isomorphism), see also the vignette "FDboost_density-on-scalar_births" and Maier et al. (2021).
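The isometry can be checked numerically on a grid. The sketch below assumes the inner product commonly used for Bayes Hilbert spaces, $\langle f, g \rangle_{B^2(\mu)} = \frac{1}{2\mu(T)} \int_T \int_T \log\frac{f(s)}{f(t)} \log\frac{g(s)}{g(t)} \, d\mu(s) \, d\mu(t)$, which is not stated on this page, and compares it with the $L^2(\mu)$ inner product of the clr transforms. All helper names are hypothetical and not part of FDboost.

```r
# Hypothetical helpers for a numerical check; not part of FDboost.
clr_sketch <- function(f, w) log(f) - sum(w * log(f)) / sum(w)

# Bayes Hilbert space inner product (assumed definition, see lead-in),
# with both integrals replaced by weighted sums
inner_B2 <- function(f, g, w) {
  lf <- outer(log(f), log(f), "-")   # log(f(s) / f(t)) for all pairs (s, t)
  lg <- outer(log(g), log(g), "-")   # log(g(s) / g(t)) for all pairs (s, t)
  sum(outer(w, w) * lf * lg) / (2 * sum(w))
}

# L2 inner product with respect to the same weights
inner_L2 <- function(u, v, w) sum(w * u * v)

grid <- seq(from = 0.005, to = 0.995, by = 0.01)
w <- rep(0.01, length(grid))
f <- dbeta(grid, 2, 5)
g <- dbeta(grid, 3, 2)

lhs <- inner_B2(f, g, w)
rhs <- inner_L2(clr_sketch(f, w), clr_sketch(g, w), w)
all.equal(lhs, rhs)
```

For weighted sums the two expressions are algebraically identical, so lhs and rhs should agree up to floating-point error for any positive f, g and w.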
Examples
### Continuous case (T = [0, 1] with Lebesgue measure):
# evaluate density of a Beta distribution on an equidistant grid
g <- seq(from = 0.005, to = 0.995, by = 0.01)
f <- dbeta(g, 2, 5)
# compute clr transformation with distance of two grid points as integration weight
f_clr <- clr(f, w = 0.01)
# visualize result
plot(g, f_clr, type = "l")
abline(h = 0, col = "grey")
# compute inverse clr transformation (w as above)
f_clr_inv <- clr(f_clr, w = 0.01, inverse = TRUE)
# visualize result
plot(g, f, type = "l")
lines(g, f_clr_inv, lty = 2, col = "red")

### Discrete case (T = {1, ..., 12} with sum of dirac measures at t in T):
data("birthDistribution", package = "FDboost")
# fit density-on-scalar model with effects for sex and year
model <- FDboost(birth_densities_clr ~ 1 + bolsc(sex, df = 1) +
                   bbsc(year, df = 1, differences = 1),
                 # use bbsc() in timeformula to ensure integrate-to-zero constraint
                 timeformula = ~bbsc(month, df = 4,
                                     # December is followed by January of subsequent year
                                     cyclic = TRUE,
                                     # knots = {1, ..., 12} with additional boundary knot
                                     # 0 (coinciding with 12) due to cyclic = TRUE
                                     knots = 1:11, boundary.knots = c(0, 12),
                                     # degree = 1 with these knots yields identity matrix
                                     # as design matrix
                                     degree = 1),
                 data = birthDistribution, offset = 0,
                 control = boost_control(mstop = 1000))
# Extract predictions (clr-transformed!) and transform them to Bayes Hilbert space
predictions_clr <- predict(model)
predictions <- t(apply(predictions_clr, 1, clr, inverse = TRUE))
References
Maier, E.-M., Stoecker, A., Fitzenberger, B., Greven, S. (2021): Additive Density-on-Scalar Regression in Bayes Hilbert Spaces with an Application to Gender Economics. arXiv preprint arXiv:2110.11771.