semimrFull function

Semiparametric Mixture Regression Models with Single-index Proportion and Fully Iterative Backfitting

semimrFull(x, y, h = NULL, coef = NULL, ini = NULL, grid = NULL, maxiter = 100)

Arguments

  • x: an n by p matrix of observations where n is the number of observations and p is the number of explanatory variables.
  • y: an n-dimensional vector of response values.
  • h: bandwidth for the kernel regression. Default is NULL, and the bandwidth is computed in the function by cross-validation.
  • coef: initial value of $\boldsymbol{\alpha}$ in the model, which plays the role of the regression coefficients in a regression model. Default is NULL, and the value is computed in the function by sliced inverse regression (Li, 1991).
  • ini: initial values for the parameters. Default is NULL, which obtains the initial values, assuming a linear mixture model. If specified, it can be a list with the form of list(pi, mu, var), where pi is a vector of mixing proportions, mu is a vector of component means, and var is a vector of component variances.
  • grid: grid points at which nonparametric functions are estimated. Default is NULL, which uses the estimated mixing proportions, component means, and component variances as the grid points after the algorithm converges.
  • maxiter: maximum number of iterations. Default is 100.
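
To illustrate what the bandwidth h controls, the following is a minimal sketch of kernel regression (a Nadaraya-Watson estimator with a Gaussian kernel) in base R. It is not the estimator used inside semimrFull; the helper name nw_smooth and the simulated data are assumptions made for illustration only.

```r
# Nadaraya-Watson estimator with a Gaussian kernel (illustrative only;
# nw_smooth is a hypothetical helper, not part of the package).
nw_smooth <- function(z, y, grid, h) {
  sapply(grid, function(g) {
    w <- dnorm((z - g) / h)   # kernel weights centred at grid point g
    sum(w * y) / sum(w)       # locally weighted average of the responses
  })
}

set.seed(2)
z <- runif(200, -2, 2)                       # simulated index values
y <- sin(z) + rnorm(200, sd = 0.2)           # simulated responses
grid <- seq(-1.5, 1.5, by = 0.5)

fit_small <- nw_smooth(z, y, grid, h = 0.1)  # small h: follows local structure
fit_large <- nw_smooth(z, y, grid, h = 2)    # large h: heavily smoothed
```

A small h tracks local features of the data at the cost of higher variance, while a large h flattens the fitted curve toward a global average. This trade-off is what the cross-validation default for h is meant to balance.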

Returns

A list containing the following elements:

  • pi: matrix of estimated mixing proportions.

  • mu: estimated component means.

  • var: estimated component variances.

  • coef: estimated regression coefficients.

  • run: total number of iterations performed until convergence.
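
One common use of such output is model-based clustering. The sketch below computes posterior component probabilities from pi, mu, and var. It uses a mock list in place of a fitted object, and it assumes that each element is an n by C matrix evaluated at the observations; that shape is an assumption and should be checked against the actual return value.

```r
# Mock output standing in for a fitted object (est_mock is hypothetical;
# the n x C shapes are an assumption, not confirmed by the documentation).
n <- 5
est_mock <- list(
  pi  = cbind(rep(0.4, n), rep(0.6, n)),   # mixing proportions
  mu  = cbind(rep(-1, n), rep(1, n)),      # component means
  var = cbind(rep(0.5, n), rep(0.5, n))    # component variances
)
y <- c(-1.2, -0.8, 0.1, 0.9, 1.3)

# Posterior probability that observation i belongs to component j:
# pi_j * phi(y_i | mu_j, var_j), normalised across components.
num  <- est_mock$pi * dnorm(y, est_mock$mu, sqrt(est_mock$var))
post <- num / rowSums(num)

# Hard assignment to the most probable component.
cluster <- max.col(post)
```

Each row of post sums to one, and cluster holds the index of the component with the largest posterior probability for each observation.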

Description

Assume that $\boldsymbol{x} = (\boldsymbol{x}_1,\cdots,\boldsymbol{x}_n)$ is an n by p matrix and $Y = (Y_1,\cdots,Y_n)$ is an n-dimensional vector of responses. The conditional distribution of $Y$ given $\boldsymbol{x}$ can be written as:

$$f(y|\boldsymbol{x},\boldsymbol{\alpha},\pi,m,\sigma^2) = \sum_{j=1}^C \pi_j(\boldsymbol{\alpha}^{\top}\boldsymbol{x})\phi(y|m_j(\boldsymbol{\alpha}^{\top}\boldsymbol{x}),\sigma_j^2(\boldsymbol{\alpha}^{\top}\boldsymbol{x})).$$

semimrFull estimates the mixture of single-index models described above, where $\phi(y|m_j(\boldsymbol{\alpha}^{\top}\boldsymbol{x}),\sigma_j^2(\boldsymbol{\alpha}^{\top}\boldsymbol{x}))$ represents the normal density with mean $m_j(\boldsymbol{\alpha}^{\top}\boldsymbol{x})$ and variance $\sigma_j^2(\boldsymbol{\alpha}^{\top}\boldsymbol{x})$, and $\pi_j(\cdot), m_j(\cdot), \sigma_j^2(\cdot)$ are unknown smooth single-index functions capable of handling high-dimensional nonparametric problems. This function employs kernel regression and a fully iterative backfitting (FIB) estimation procedure (Xiang and Yao, 2020).
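
To make the model concrete, here is a minimal sketch that evaluates the mixture density for a two-component case (C = 2) in base R. The functional forms chosen for the mixing proportion, means, and variances, as well as the helper name mix_density, are illustrative assumptions; in practice these functions are unknown and are what semimrFull estimates.

```r
# Hypothetical single-index functions for C = 2 (illustrative assumptions).
pi1 <- function(z) plogis(z)                      # proportion of component 1
m   <- list(function(z) 1 + z,                    # mean of component 1
            function(z) -1 - z)                   # mean of component 2
s2  <- list(function(z) rep(0.25, length(z)),     # variance of component 1
            function(z) rep(0.50, length(z)))     # variance of component 2

# Conditional density f(y | x) of the two-component single-index mixture.
mix_density <- function(y, x, alpha) {
  z  <- as.vector(x %*% alpha)                    # single index alpha' x
  p1 <- pi1(z)
  p1 * dnorm(y, m[[1]](z), sqrt(s2[[1]](z))) +
    (1 - p1) * dnorm(y, m[[2]](z), sqrt(s2[[2]](z)))
}

set.seed(1)
x     <- matrix(rnorm(20), nrow = 10, ncol = 2)   # n = 10, p = 2
alpha <- c(1, 1) / sqrt(2)                        # unit-norm index direction
y     <- rnorm(10)
dens  <- mix_density(y, x, alpha)                 # one density value per row
```

All p predictors enter only through the scalar index, so each unknown function is one-dimensional regardless of p. This is how the single-index structure sidesteps high-dimensional smoothing.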

Examples

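# Predictors (columns 1, 2, 4 of NBA) and response (column 3)
xx = NBA[, c(1, 2, 4)]
yy = NBA[, 3]
# Scale each predictor and the response by its standard deviation
x = xx/t(matrix(rep(sqrt(diag(var(xx))), length(yy)), nrow = 3))
y = yy/sd(yy)
# Initial index direction from sliced inverse regression
ini_bs = sinvreg(x, y)
ini_b = ini_bs$direction[, 1]
# Fit on the first 50 observations with a fixed bandwidth
est = semimrFull(x[1:50, ], y[1:50], h = 0.3442, coef = ini_b)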
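
The scaling step in the example divides each predictor column by its standard deviation. As a sketch, the same result can be obtained with sweep or scale from base R. Synthetic data stand in for the NBA predictors here, so the numbers are not those of the example.

```r
set.seed(3)
# Synthetic stand-ins for the predictors and response (not the NBA data).
xx <- matrix(rnorm(30, sd = 4), nrow = 10, ncol = 3)
yy <- rnorm(10, sd = 2)

# Form used in the example: build an n x 3 matrix whose rows repeat the
# column standard deviations, then divide elementwise.
x_a <- xx / t(matrix(rep(sqrt(diag(var(xx))), length(yy)), nrow = 3))

# Equivalent forms using base R helpers.
x_b <- sweep(xx, 2, apply(xx, 2, sd), "/")
x_c <- scale(xx, center = FALSE, scale = apply(xx, 2, sd))

col_sds <- apply(x_a, 2, sd)   # each column now has unit standard deviation
```

Note that scale attaches a "scaled:scale" attribute to its result, so comparisons against the other two forms should ignore attributes.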

References

Xiang, S. and Yao, W. (2020). Semiparametric mixtures of regressions with single-index for model based clustering. Advances in Data Analysis and Classification, 14(2), 261-292.

Li, K. C. (1991). Sliced inverse regression for dimension reduction. Journal of the American Statistical Association, 86(414), 316-327.

See Also

semimrOne, sinvreg for initial value calculation of $\boldsymbol{\alpha}$.

  • Maintainer: Suyeon Kang
  • License: GPL (>= 2)
  • Last published: 2023-09-20