Factor analysis with observed regressors for vector time series
HDSReg(Y, Z, D = NULL, lag.k = 5, thresh = FALSE, delta = 2 * sqrt(log(ncol(Y)) / nrow(Y)), twostep = FALSE)
Arguments
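Y: An $n \times p$ data matrix ${\bf Y} = ({\bf y}_1, \dots, {\bf y}_n)'$, where $n$ is the number of observations of the $p \times 1$ time series $\{{\bf y}_t\}_{t=1}^n$.
Z: An $n \times m$ data matrix ${\bf Z} = ({\bf z}_1, \dots, {\bf z}_n)'$ consisting of the observed regressors.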
D: A $p \times m$ regression coefficient matrix $\tilde{\bf D}$. If D = NULL (the default), the procedure first estimates ${\bf D}$ and takes $\tilde{\bf D}$ to be that estimate. If D is supplied by the user, then $\tilde{\bf D} = {\bf D}$.
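lag.k: The time lag $K$ used to calculate the nonnegative definite matrix $\hat{\bf M}_\eta$:
$$\hat{\bf M}_\eta = \sum_{k=1}^{K} T_\delta\{\hat{\boldsymbol \Sigma}_\eta(k)\} T_\delta\{\hat{\boldsymbol \Sigma}_\eta(k)\}',$$
where $\hat{\boldsymbol \Sigma}_\eta(k)$ is the sample autocovariance of ${\boldsymbol \eta}_t = {\bf y}_t - \tilde{\bf D}{\bf z}_t$ at lag $k$ and $T_\delta(\cdot)$ is a threshold operator with the threshold level $\delta \geq 0$. See 'Details'. The default is 5.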
thresh: Logical. If thresh = FALSE (the default), no thresholding is applied when estimating $\hat{\bf M}_\eta$. If thresh = TRUE, $\delta$ is set through delta. See 'Details'.
delta: The value of the threshold level $\delta$. The default is $\delta = 2\sqrt{n^{-1}\log p}$.
twostep: Logical. The same as the argument twostep in Factors.
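As a quick check of the default threshold level, the sketch below evaluates the default expression for delta from the usage line and compares it with the closed form $2\sqrt{n^{-1}\log p}$. The dimensions n = 400 and p = 200 and the placeholder matrix are illustrative assumptions; only the shape of Y matters here.

```r
# Hypothetical dimensions; any n x p data matrix Y would do.
n <- 400
p <- 200
Y <- matrix(0, nrow = n, ncol = p)  # placeholder data, only its shape is used

# Default threshold level, copied from the usage line of HDSReg()
delta_default <- 2 * sqrt(log(ncol(Y)) / nrow(Y))

# Equivalent closed form: 2 * sqrt(n^{-1} * log(p))
delta_formula <- 2 * sqrt(log(p) / n)

print(delta_default)
```

For these dimensions the default is roughly 0.23. Note that delta only takes effect when thresh = TRUE.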
Returns
An object of class "factors", which contains the following components:
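factor_num: The estimated number of factors $\hat{r}$.
reg.coff.mat: The estimated $p \times m$ regression coefficient matrix $\tilde{\bf D}$.
loading.mat: The estimated $p \times \hat{r}$ factor loading matrix $\hat{\bf A}$.
X: The $n \times \hat{r}$ matrix $\hat{\bf X} = (\hat{\bf x}_1, \dots, \hat{\bf x}_n)'$ with $\hat{\bf x}_t = \hat{\bf A}'({\bf y}_t - \tilde{\bf D}{\bf z}_t)$.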
lag.k: The time lag used in the function.
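The relation between the returned components can be written in matrix form: stacking the per-observation formula for the recovered factors over t gives an $n \times \hat{r}$ matrix equal to the residual matrix times the loading matrix. The sketch below illustrates this with base R only. It does not call HDSReg(); the matrices D_tilde and A_hat are random stand-ins for reg.coff.mat and loading.mat, and giving A_hat orthonormal columns is an assumption made purely for illustration.

```r
set.seed(1)
n <- 6; p <- 4; m <- 2; r <- 2          # small hypothetical dimensions

Y <- matrix(rnorm(n * p), n, p)         # n x p observations
Z <- matrix(rnorm(n * m), n, m)         # n x m observed regressors
D_tilde <- matrix(rnorm(p * m), p, m)   # stand-in for reg.coff.mat (p x m)
# Stand-in for loading.mat: a p x r matrix with orthonormal columns (assumption)
A_hat <- qr.Q(qr(matrix(rnorm(p * r), p, r)))

# Residual process eta_t = y_t - D~ z_t, stacked as rows (n x p)
eta <- Y - Z %*% t(D_tilde)

# Row t of X_hat is the transpose of A_hat' eta_t, so X_hat = eta %*% A_hat (n x r)
X_hat <- eta %*% A_hat

# Check one time point against the per-observation formula
t0 <- 3
x_t0 <- t(A_hat) %*% (Y[t0, ] - D_tilde %*% Z[t0, ])
```

Here x_t0 agrees with row t0 of X_hat, which is the sense in which the X component is determined by the other two estimates.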
Description
HDSReg() considers a multivariate time series model which represents a high-dimensional vector process as a sum of three terms: a linear regression on some observed regressors, a linear combination of some latent and serially correlated factors, and a vector white noise:
$${\bf y}_t = {\bf D}{\bf z}_t + {\bf A}{\bf x}_t + {\boldsymbol \epsilon}_t,$$
where ${\bf y}_t$ and ${\bf z}_t$ are, respectively, observable $p \times 1$ and $m \times 1$ time series, ${\bf x}_t$ is an $r \times 1$ latent factor process, ${\boldsymbol \epsilon}_t$ is a vector white noise process, ${\bf D}$ is an unknown regression coefficient matrix, and ${\bf A}$ is an unknown factor loading matrix. The procedure, proposed in Chang, Guo and Yao (2015), aims to estimate the regression coefficient matrix ${\bf D}$, the number of factors $r$ and the factor loading matrix ${\bf A}$.
Details
The threshold operator $T_\delta(\cdot)$ is defined as $T_\delta({\bf W}) = \{w_{i,j} 1(|w_{i,j}| \geq \delta)\}$ for any matrix ${\bf W} = (w_{i,j})$, with the threshold level $\delta \geq 0$ and $1(\cdot)$ representing the indicator function. We recommend choosing $\delta = 0$ when $p$ is fixed and $\delta > 0$ when $p \gg n$.
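The threshold operator and the matrix built from thresholded autocovariances can be sketched in a few lines of base R. This is a minimal illustration and not the package's internal code: the helper names threshold_op, sample_autocov and M_eta_hat are hypothetical, and the centring and 1/n normalisation of the sample autocovariance are assumptions that may differ from what HDSReg() does internally.

```r
# T_delta(W): keep entries with |w_ij| >= delta, set the rest to zero
threshold_op <- function(W, delta) {
  W * (abs(W) >= delta)
}

# Sample autocovariance of the rows of eta (n x p) at lag k.
# Assumption: columns are centred and the sum is normalised by n.
sample_autocov <- function(eta, k) {
  n <- nrow(eta)
  eta_c <- sweep(eta, 2, colMeans(eta))  # centre each column
  crossprod(eta_c[(k + 1):n, , drop = FALSE],
            eta_c[1:(n - k), , drop = FALSE]) / n
}

# Sum over lags k = 1, ..., K of T_delta(Sigma(k)) T_delta(Sigma(k))'
M_eta_hat <- function(eta, K, delta = 0) {
  p <- ncol(eta)
  M <- matrix(0, p, p)
  for (k in seq_len(K)) {
    S <- threshold_op(sample_autocov(eta, k), delta)
    M <- M + S %*% t(S)
  }
  M
}

# Small worked case: entries below 0.2 in absolute value are zeroed
W <- matrix(c(0.5, -0.05, 0.1, -0.3), 2, 2)
W_thr <- threshold_op(W, delta = 0.2)

# Thresholded matrix on simulated white noise (illustrative data only)
set.seed(1)
eta <- matrix(rnorm(50 * 3), 50, 3)
M <- M_eta_hat(eta, K = 2, delta = 0.1)
```

With delta = 0 the operator leaves its argument unchanged, which matches the behaviour described for thresh = FALSE. Because each summand has the form S S', the resulting matrix is symmetric and nonnegative definite by construction.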
Examples
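# Example 1 (Example 1 in Chang, Guo and Yao (2015)).
library(HDTSA)  # assumption: HDSReg() is provided by the HDTSA package

## Generate x_t
n <- 400
p <- 200
m <- 2
r <- 3
X <- mat.or.vec(n, r)
x1 <- arima.sim(model = list(ar = c(0.6)), n = n)
x2 <- arima.sim(model = list(ar = c(-0.5)), n = n)
x3 <- arima.sim(model = list(ar = c(0.3)), n = n)
X <- cbind(x1, x2, x3)
X <- t(X)  # r x n matrix of latent factors

## Generate y_t
Z <- mat.or.vec(m, n)
S1 <- matrix(c(5/8, 1/8, 1/8, 5/8), 2, 2)
Z[, 1] <- c(rnorm(m))
for (i in c(2:n)) {
  # z_t follows a VAR(1) process with coefficient matrix S1
  Z[, i] <- S1 %*% Z[, i - 1] + c(rnorm(m))
}
D <- matrix(runif(p * m, -2, 2), ncol = m)  # p x m regression coefficients
A <- matrix(runif(p * r, -2, 2), ncol = r)  # p x r factor loadings
eps <- mat.or.vec(n, p)
eps <- matrix(rnorm(n * p), p, n)           # p x n white noise
Y <- D %*% Z + A %*% X + eps
Y <- t(Y)  # n x p, as expected by HDSReg()
Z <- t(Z)  # n x m, as expected by HDSReg()

## D is known
res1 <- HDSReg(Y, Z, D, lag.k = 2)

## D is unknown
res2 <- HDSReg(Y, Z, lag.k = 2)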
References
Chang, J., Guo, B., & Yao, Q. (2015). High dimensional stochastic regression with latent factors, endogeneity and nonlinearity. Journal of Econometrics, 189, 297–312. doi:10.1016/j.jeconom.2015.03.024.