Density, Distribution and Quantile of Finite Mixture Distribution
Density, Distribution and Quantile of Finite Mixture Distribution
Density function, distribution function, quantile function and random generation for a finite mixture distribution with normal or Tukey g-&-h components.
dfmx( x, dist, distname = dist@distname, K = dim(pars)[1L], pars = dist@pars, w = dist@w,..., log =FALSE)pfmx( q, dist, distname = dist@distname, K = dim(pars)[1L], pars = dist@pars, w = dist@w,..., lower.tail =TRUE, log.p =FALSE)qfmx( p, dist, distname = dist@distname, K = dim(pars)[1L], pars = dist@pars, w = dist@w, interval = qfmx_interval(dist = dist),..., lower.tail =TRUE, log.p =FALSE)rfmx( n, dist, distname = dist@distname, K = dim(pars)[1L], pars = dist@pars, w = dist@w
)
distname, K, pars, w: auxiliary parameters, whose default values are determined by argument dist. The user-specified vector of w does not need to sum up to 1; w/sum(w) will be used internally.
...: additional parameters
log, log.p: logical scalar. If TRUE, probabilities are given as log(p).
lower.tail: logical scalar. If TRUE (default), probabilities are Pr(X≤x), otherwise, Pr(X>x).
p: numeric vector , probabilities.
interval: length two numeric vector , interval for root finding, see vuniroot2 and vuniroot
n: integer scalar, number of observations.
Returns
Function dfmx() returns a numeric vector of probability density values of an fmx object at specified quantiles x.
Function pfmx() returns a numeric vector of cumulative probability values of an fmx object at specified quantiles q.
Function qfmx() returns an unnamed numeric vector of quantiles of an fmx object, based on specified cumulative probabilities p.
Function rfmx() generates random deviates of an fmx object.
Details
A computational challenge in function dfmx() is when mixture density is very close to 0, which happens when the per-component log densities are negative with big absolute values. In such case, we cannot compute the log mixture densities (i.e., -Inf), for the log-likelihood using function logLik.fmx(). Our solution is to replace these -Inf log mixture densities by the weighted average (using the mixing proportions of dist) of the per-component log densities.
Function qfmx() gives the quantile function, by numerically solving pfmx . One major challenge when dealing with the finite mixture of Tukey g-&-h family distribution is that Brent–Dekker's method needs to be performed in both pGH and qfmx functions, i.e. two layers of root-finding algorithm.
Note
Function qnorm returns an unnamed vector of quantiles, although quantile returns a named vector of quantiles.
Examples
library(ggplot2)(e1 = fmx('norm', mean = c(0,3), sd = c(1,1.3), w = c(1,1)))curve(dfmx(x, dist = e1), xlim = c(-3,7))ggplot()+ geom_function(fun = dfmx, args = list(dist = e1))+ xlim(-3,7)ggplot()+ geom_function(fun = pfmx, args = list(dist = e1))+ xlim(-3,7)hist(rfmx(n =1e3L, dist = e1), main ='1000 obs from e1')x =(-3):7round(dfmx(x, dist = e1), digits =3L)round(p1 <- pfmx(x, dist = e1), digits =3L)stopifnot(all.equal.numeric(qfmx(p1, dist = e1), x, tol =1e-4))(e2 = fmx('GH', A = c(0,3), g = c(.2,.3), h = c(.2,.1), w = c(2,3)))ggplot()+ geom_function(fun = dfmx, args = list(dist = e2))+ xlim(-3,7)round(dfmx(x, dist = e2), digits =3L)round(p2 <- pfmx(x, dist = e2), digits =3L)stopifnot(all.equal.numeric(qfmx(p2, dist = e2), x, tol =1e-4))(e3 = fmx('GH', g =.2, h =.01))# one-component Tukeyggplot()+ geom_function(fun = dfmx, args = list(dist = e3))+ xlim(-3,5)set.seed(124); r1 = rfmx(1e3L, dist = e3);set.seed(124); r2 = TukeyGH77::rGH(n =1e3L, g =.2, h =.01)stopifnot(identical(r1, r2))# but ?rfmx has much cleaner coderound(dfmx(x, dist = e3), digits =3L)round(p3 <- pfmx(x, dist = e3), digits =3L)stopifnot(all.equal.numeric(qfmx(p3, dist = e3), x, tol =1e-4))a1 = fmx('GH', A = c(7,9), B = c(.8,1.2), g = c(.3,0), h = c(0,.1), w = c(1,1))a2 = fmx('GH', A = c(6,9), B = c(.8,1.2), g = c(-.3,0), h = c(.2,.1), w = c(4,6))library(ggplot2)(p = ggplot()+ geom_function(fun = pfmx, args = list(dist = a1), mapping = aes(color ='g2=h1=0'))+ geom_function(fun = pfmx, args = list(dist = a2), mapping = aes(color ='g2=0'))+ xlim(3,15)+ scale_y_continuous(labels = scales::percent)+ labs(y =NULL, color ='models')+ coord_flip())p + theme(legend.position ='none')# to use [rfmx] without \pkg{fmx}(d = fmx(distname ='GH', A = c(-1,1), B = c(.9,1.1), g = c(.3,-.2), h = c(.1,.05), w = c(2,3)))d@pars
set.seed(14123); x = rfmx(n =1e3L, dist = d)set.seed(14123); x_raw = rfmx(n =1e3L, distname ='GH', K =2L, pars = rbind( c(A =-1, B =.9, g =.3, h =.1), c(A =1, B =1.1, g =-.2, h =.05)), w = c(.4,.6))stopifnot(identical(x, x_raw))