Kernel Density-based EM-type algorithm for Semiparametric Mixture Regression with Unspecified Error Distributions
kdeem fits a semiparametric mixture regression model using a kernel density-based expectation-maximization (EM)-type algorithm. The error distributions are left unspecified and may be homogeneous or heterogeneous (Ma et al., 2012).
kdeem(x, y, C = 2, ini = NULL, maxiter = 200)
Arguments
x: an n by p data matrix where n is the number of observations and p is the number of explanatory variables (including the intercept).
y: a vector of length n containing the response variable.
C: number of mixture components. Default is 2.
ini: initial values for the parameters. Default is NULL, in which case the initial values are obtained with the kdeem.lse function. If specified, it can be a list of the form list(beta, prop, tau, pi, h), where beta is a p by C matrix of regression coefficients for the C components; prop is an n by C matrix of the probabilities that each observation belongs to each component, calculated from the initial beta and h; tau is a vector of C precision parameters (inverse of the standard deviation); pi is a vector of C mixing proportions; and h is the bandwidth for kernel estimation.
maxiter: maximum number of iterations for the algorithm. Default is 200.
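The arguments above can be illustrated with a small simulated data set. This is a minimal sketch, not an example taken from the package: the sample size, the coefficient values, and the normal errors are all assumptions chosen for illustration, and the call to kdeem only runs if a package providing it is already attached.

```r
# Simulate n observations from a two-component mixture of linear regressions.
# All numeric values below are illustrative assumptions.
set.seed(1)
n <- 200
x <- cbind(1, rnorm(n))                 # n by p design matrix; first column is the intercept
beta_true <- cbind(c(-3, 3), c(3, -3))  # p by C matrix, one column per component
z <- sample(1:2, n, replace = TRUE)     # latent component label of each observation
y <- rowSums(x * t(beta_true[, z])) + rnorm(n)  # response with standard normal errors

# Fit only when kdeem() is available in the current session.
if (exists("kdeem", mode = "function")) {
  fit <- kdeem(x, y, C = 2, ini = NULL, maxiter = 200)
  str(fit)
}
```

Leaving ini = NULL lets the function choose its own starting values, which is usually the simplest way to get a first fit.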
Returns
A list containing the following elements:
posterior: posterior probabilities of each observation belonging to each component.
beta: estimated regression coefficients.
tau: estimated precision parameters, the inverse of standard deviation.
pi: estimated mixing proportions.
h: bandwidth used for the kernel estimation.
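A common use of the returned posterior element is to assign each observation to its most likely component. The sketch below uses a hypothetical stand-in list named fit rather than a real kdeem result, and it assumes that posterior is an n by C matrix with one row per observation.

```r
# Hypothetical stand-in for a kdeem() result; only the element name follows the list above.
fit <- list(
  posterior = rbind(c(0.9, 0.1),
                    c(0.2, 0.8),
                    c(0.6, 0.4))
)

# Hard classification: pick the component with the largest posterior probability.
labels <- max.col(fit$posterior, ties.method = "first")

# Number of observations assigned to each component.
table(labels)
```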
Details
kdeem can be used for a semiparametric mixture of linear regression models with unspecified component error distributions. The errors can be either homogeneous or heterogeneous. The model is as follows:
$$f_{Y \mid X}(y, x, \theta, g) = \sum_{j=1}^{C} \pi_j \tau_j \, g\{(y - x^{\top}\beta_j)\tau_j\}.$$
Here, $\theta = (\pi_1, \ldots, \pi_{C-1}, \beta_1^{\top}, \ldots, \beta_C^{\top}, \tau_1, \ldots, \tau_C)^{\top}$, $g(\cdot)$ is an unspecified density function with mean 0 and variance 1, and $\tau_j$ is a precision parameter. For the calculation of $\beta$ in the M-step, this function employs the universal optimizer function ucminf from the ucminf package.
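To make the model formula concrete, the following sketch evaluates the mixture density for given parameter values. The helper name mix_density is hypothetical and is not part of the package. Because g is unspecified in the model, the standard normal density dnorm is used here purely as a stand-in with mean 0 and variance 1; kdeem itself estimates g with a kernel density estimator.

```r
# Conditional density f(y | x) of the mixture model for given parameters.
# beta: p by C matrix, tau: length-C precisions, pi: length-C mixing proportions.
# g: any density with mean 0 and variance 1 (dnorm is an illustrative assumption).
mix_density <- function(y, x, beta, tau, pi, g = dnorm) {
  mu <- x %*% beta                                   # n by C matrix of component means
  res <- sweep(y - mu, 2, tau, `*`)                  # (y - x'beta_j) * tau_j
  gval <- matrix(g(res), nrow = nrow(res))           # keep the n by C shape for any g
  rowSums(sweep(gval, 2, pi * tau, `*`))             # sum_j pi_j * tau_j * g(.)
}

# Small worked example with two observations and two components.
x <- cbind(1, c(0, 1))
beta <- cbind(c(0, 1), c(2, -1))
f <- mix_density(y = c(0, 1), x = x, beta = beta, tau = c(1, 2), pi = c(0.3, 0.7))
f
```

For the second observation both component means equal the response, so its density reduces to (0.3 * 1 + 0.7 * 2) * g(0).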