HDSReg function

Factor analysis with observed regressors for vector time series

HDSReg(Y, Z, D = NULL, lag.k = 5, thresh = FALSE, delta = 2 * sqrt(log(ncol(Y))/nrow(Y)), twostep = FALSE)

Arguments

  • Y: An $n \times p$ data matrix ${\bf Y} = ({\bf y}_1, \dots, {\bf y}_n)'$, where $n$ is the number of observations of the $p \times 1$ time series $\{{\bf y}_t\}_{t=1}^n$.

  • Z: An $n \times m$ data matrix ${\bf Z} = ({\bf z}_1, \dots, {\bf z}_n)'$ consisting of the observed regressors.

  • D: A $p \times m$ regression coefficient matrix $\tilde{\bf D}$. If D = NULL (the default), our procedure will estimate ${\bf D}$ first and let $\tilde{\bf D}$ be the estimate of ${\bf D}$. If D is given by the users, then $\tilde{\bf D} = {\bf D}$.

  • lag.k: The time lag $K$ used to calculate the nonnegative definite matrix $\hat{\mathbf{M}}_{\eta}$:

    $$\hat{\mathbf{M}}_{\eta} = \sum_{k=1}^{K} T_\delta\{\hat{\mathbf{\Sigma}}_{\eta}(k)\} T_\delta\{\hat{\mathbf{\Sigma}}_{\eta}(k)\}',$$

    where $\hat{\mathbf{\Sigma}}_{\eta}(k)$ is the sample autocovariance of ${\boldsymbol{\eta}}_t = {\bf y}_t - \tilde{\bf D}{\bf z}_t$ at lag $k$ and $T_\delta(\cdot)$ is a threshold operator with the threshold level $\delta \geq 0$. See 'Details'. The default is 5.
  • thresh: Logical. If thresh = FALSE (the default), no thresholding will be applied to estimate $\hat{\mathbf{M}}_{\eta}$. If thresh = TRUE, $\delta$ will be set through delta. See 'Details'.
  • delta: The value of the threshold level $\delta$. The default is $\delta = 2 \sqrt{n^{-1}\log p}$.
  • twostep: Logical. The same as the argument twostep in Factors.
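
To make the roles of lag.k, thresh and delta concrete, the following is a minimal sketch of how $\hat{\mathbf{M}}_{\eta}$ could be assembled from the residual series ${\boldsymbol{\eta}}_t$. It is an illustration under assumptions, not the code HDSReg() runs: the helper names `sample_autocov` and `build_M` are hypothetical, and the autocovariance is centred and normalised by $n$, which may differ from the package's convention.

```r
# Sample autocovariance of the rows of `eta` (an n x p matrix) at lag k.
# Assumption: mean-centred and divided by n; the package may normalise differently.
sample_autocov <- function(eta, k) {
  n <- nrow(eta)
  centred <- scale(eta, center = TRUE, scale = FALSE)
  crossprod(centred[(k + 1):n, , drop = FALSE],
            centred[1:(n - k), , drop = FALSE]) / n
}

# Sum over lags 1..lag.k of T_delta(Sigma_k) %*% t(T_delta(Sigma_k)).
# `build_M` is a hypothetical helper that mirrors the documented defaults.
build_M <- function(eta, lag.k = 5, thresh = FALSE,
                    delta = 2 * sqrt(log(ncol(eta)) / nrow(eta))) {
  p <- ncol(eta)
  M <- matrix(0, p, p)
  for (k in seq_len(lag.k)) {
    S <- sample_autocov(eta, k)
    if (thresh) S <- S * (abs(S) >= delta)  # hard thresholding, entry by entry
    M <- M + tcrossprod(S)                  # S %*% t(S), symmetric by construction
  }
  M
}

# Toy residuals standing in for eta_t = y_t - D_tilde z_t
set.seed(1)
eta <- matrix(rnorm(200 * 4), nrow = 200, ncol = 4)
M_hat <- build_M(eta, lag.k = 2)
```

Each summand has the form $SS'$, so the resulting matrix is symmetric and nonnegative definite whether or not thresholding is applied.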

Returns

An object of class "factors", which contains the following components:

  • factor_num: The estimated number of factors $\hat{r}$.

  • reg.coff.mat: The estimated $p \times m$ regression coefficient matrix $\tilde{\bf D}$.

  • loading.mat: The estimated $p \times \hat{r}$ factor loading matrix $\hat{\bf A}$.

  • X: The $n \times \hat{r}$ matrix $\hat{\bf X} = (\hat{\bf x}_1, \dots, \hat{\bf x}_n)'$ with $\hat{\mathbf{x}}_t = \hat{\mathbf{A}}'(\mathbf{y}_t - \tilde{\mathbf{D}} \mathbf{z}_t)$.

  • lag.k: The time lag used in the function.
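
The relation $\hat{\mathbf{x}}_t = \hat{\mathbf{A}}'(\mathbf{y}_t - \tilde{\mathbf{D}} \mathbf{z}_t)$ can be written in matrix form for all $t$ at once. The sketch below uses made-up inputs with the documented shapes; the variable names and the orthonormal toy loading matrix are assumptions for illustration and are not taken from an actual HDSReg() fit.

```r
# Hypothetical inputs with the documented shapes:
# Y (n x p), Z (n x m), D_tilde (p x m), A_hat (p x r_hat).
set.seed(2)
n <- 50; p <- 6; m <- 2; r_hat <- 3
Y <- matrix(rnorm(n * p), n, p)
Z <- matrix(rnorm(n * m), n, m)
D_tilde <- matrix(runif(p * m, -2, 2), p, m)
# Toy loading matrix with orthonormal columns (an assumption of this sketch)
A_hat <- qr.Q(qr(matrix(rnorm(p * r_hat), p, r_hat)))

eta <- Y - Z %*% t(D_tilde)  # row t holds eta_t' = (y_t - D_tilde z_t)'
X_hat <- eta %*% A_hat       # row t holds x_hat_t' = (A_hat' eta_t)'
```

Row $t$ of `X_hat` then matches the per-observation formula, which is the layout described for the X component.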

Description

HDSReg() considers a multivariate time series model which represents a high-dimensional vector process as a sum of three terms: a linear regression of some observed regressors, a linear combination of some latent and serially correlated factors, and a vector white noise:

$${\bf y}_t = {\bf D}{\bf z}_t + {\bf A}{\bf x}_t + {\boldsymbol{\epsilon}}_t,$$

where ${\bf y}_t$ and ${\bf z}_t$ are, respectively, observable $p \times 1$ and $m \times 1$ time series, ${\bf x}_t$ is an $r \times 1$ latent factor process, ${\boldsymbol{\epsilon}}_t$ is a vector white noise process, ${\bf D}$ is an unknown regression coefficient matrix, and ${\bf A}$ is an unknown factor loading matrix. This procedure proposed in Chang, Guo and Yao (2015) aims to estimate the regression coefficient matrix ${\bf D}$, the number of factors $r$ and the factor loading matrix ${\bf A}$.

Details

The threshold operator $T_\delta(\cdot)$ is defined as $T_\delta({\bf W}) = \{w_{i,j} 1(|w_{i,j}| \geq \delta)\}$ for any matrix ${\bf W} = (w_{i,j})$, with the threshold level $\delta \geq 0$ and $1(\cdot)$ representing the indicator function. We recommend choosing $\delta = 0$ when $p$ is fixed and $\delta > 0$ when $p \gg n$.
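
As a small illustration of the definition above, the operator can be written in one line of R. The helper name `threshold_matrix` is hypothetical and is not part of the package interface.

```r
# Hard-threshold operator T_delta(W): keep entries with |w_ij| >= delta, zero the rest.
# `threshold_matrix` is a hypothetical helper, not an HDSReg() internal.
threshold_matrix <- function(W, delta = 0) {
  stopifnot(is.matrix(W), delta >= 0)
  W * (abs(W) >= delta)
}

W <- matrix(c(0.50, -0.05, 0.02, -0.30), nrow = 2)
W_thr <- threshold_matrix(W, delta = 0.1)  # the two small entries are set to zero
W_all <- threshold_matrix(W, delta = 0)    # delta = 0 leaves W unchanged
```

With $\delta = 0$ every entry satisfies $|w_{i,j}| \geq \delta$, so the operator is the identity, which corresponds to the thresh = FALSE case.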

Examples

# Example 1 (Example 1 in Chang, Guo and Yao (2015)).
## Generate xt
n <- 400
p <- 200
m <- 2
r <- 3
X <- mat.or.vec(n, r)
x1 <- arima.sim(model = list(ar = c(0.6)), n = n)
x2 <- arima.sim(model = list(ar = c(-0.5)), n = n)
x3 <- arima.sim(model = list(ar = c(0.3)), n = n)
X <- cbind(x1, x2, x3)
X <- t(X)

## Generate yt
Z <- mat.or.vec(m, n)
S1 <- matrix(c(5/8, 1/8, 1/8, 5/8), 2, 2)
Z[, 1] <- c(rnorm(m))
for (i in c(2:n)) {
  Z[, i] <- S1 %*% Z[, i-1] + c(rnorm(m))
}
D <- matrix(runif(p*m, -2, 2), ncol = m)
A <- matrix(runif(p*r, -2, 2), ncol = r)
eps <- mat.or.vec(n, p)
eps <- matrix(rnorm(n*p), p, n)
Y <- D %*% Z + A %*% X + eps
Y <- t(Y)
Z <- t(Z)

## D is known
res1 <- HDSReg(Y, Z, D, lag.k = 2)
## D is unknown
res2 <- HDSReg(Y, Z, lag.k = 2)

References

Chang, J., Guo, B., & Yao, Q. (2015). High dimensional stochastic regression with latent factors, endogeneity and nonlinearity. Journal of Econometrics, 189, 297--312. doi:10.1016/j.jeconom.2015.03.024.

See Also

Factors.

  • Maintainer: Chen Lin
  • License: GPL-3
  • Last published: 2025-01-28