Factors() R function from [HDTSA]

Factor analysis for vector time series

Factors() deals with factor modeling for high-dimensional time series proposed in Lam and Yao (2012): ${\bf y}_t = {\bf Ax}_t +{\boldsymbol{\epsilon}}_t,$ where ${\bf x}_t$ is an $r \times 1$ latent process with (unknown) $r \leq p$, ${\bf A}$ is a $p \times r$ unknown constant matrix, and ${\boldsymbol{\epsilon}}_t$ is a vector white noise process. The number of factors $r$ and the factor loadings ${\bf A}$ can be estimated through an eigenanalysis of a nonnegative definite matrix, so the method remains applicable when the dimension of ${\bf y}_t$ is on the order of a few thousand. This function estimates the number of factors $r$ and the factor loading matrix ${\bf A}$.


Factors(
  Y,
  lag.k = 5,
  thresh = FALSE,
  delta = 2 * sqrt(log(ncol(Y))/nrow(Y)),
  twostep = FALSE
)

Arguments

Y: An $n \times p$ data matrix ${\bf Y} = ({\bf y}_1, \dots , {\bf y}_n )'$, where $n$ is the number of observations of the $p \times 1$ time series $\{{\bf y}_t\}_{t=1}^n$.
lag.k: The time lag $K$ used to calculate the nonnegative definite matrix $\hat{\mathbf{M}}$:

$$\hat{\mathbf{M}} = \sum_{k=1}^{K} T_\delta\{\hat{\mathbf{\Sigma}}_y(k)\} T_\delta\{\hat{\mathbf{\Sigma}}_y(k)\}',$$

where $\hat{\bf \Sigma}_y(k)$ is the sample autocovariance of ${\bf y}_t$ at lag $k$ and $T_\delta(\cdot)$ is a threshold operator with the threshold level $\delta \geq 0$. See 'Details'. The default is 5.

thresh: Logical. If thresh = FALSE (the default), no thresholding will be applied to estimate $\hat{\mathbf{M}}$ . If thresh = TRUE, $\delta$ will be set through delta.
delta: The value of the threshold level $\delta$ . The default is $\delta = 2 \sqrt{n^{-1}\log p}$ .
twostep: Logical. If twostep = FALSE (the default), the standard procedure [See Section 2.2 in Lam and Yao (2012)] for estimating $r$ and ${\bf A}$ will be implemented. If twostep = TRUE, the two-step estimation procedure [See Section 4 in Lam and Yao (2012)] for estimating $r$ and ${\bf A}$ will be implemented.
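To make the role of lag.k and the eigenanalysis more concrete, the following base-R sketch builds $\hat{\mathbf{M}}$ without thresholding (the thresh = FALSE case) and extracts the leading eigenvectors. It is only an illustration, not the HDTSA implementation: the helper names `sample_autocov` and `estimate_factors_sketch` are hypothetical, the $1/n$ normalization of the autocovariance is an assumption, and the number of factors is picked with the eigenvalue-ratio rule of Lam and Yao (2012), searching over the first $\lfloor p/2 \rfloor$ ratios as an assumed cutoff. The sketch also assumes $2 \leq p < n$.

```r
# Hypothetical helper: sample autocovariance of the columns of Y at lag k
# (1/n normalization is an assumption of this sketch).
sample_autocov <- function(Y, k) {
  n <- nrow(Y)
  Yc <- scale(Y, center = TRUE, scale = FALSE)  # remove column means
  # sum over t of (y_{t+k} - ybar)(y_t - ybar)', divided by n
  crossprod(Yc[(k + 1):n, , drop = FALSE], Yc[1:(n - k), , drop = FALSE]) / n
}

# Hypothetical helper: M-hat, its eigenanalysis, and a ratio-based choice of r.
estimate_factors_sketch <- function(Y, lag.k = 5) {
  p <- ncol(Y)
  stopifnot(p >= 2, nrow(Y) > lag.k)
  M <- matrix(0, p, p)
  for (k in seq_len(lag.k)) {
    S <- sample_autocov(Y, k)
    M <- M + S %*% t(S)                         # accumulate Sigma(k) Sigma(k)'
  }
  eig <- eigen(M, symmetric = TRUE)             # eigenvalues in decreasing order
  lambda <- eig$values
  R <- floor(p / 2)                             # assumed search range for the ratios
  ratios <- lambda[2:(R + 1)] / lambda[1:R]
  r_hat <- which.min(ratios)                    # eigenvalue-ratio estimate of r
  A_hat <- eig$vectors[, seq_len(r_hat), drop = FALSE]
  list(factor_num = r_hat, loading.mat = A_hat, X = Y %*% A_hat, lag.k = lag.k)
}

# Small simulated example with two AR(1) factors.
set.seed(1)
n <- 400; p <- 20; r <- 2
A <- matrix(runif(p * r, -1, 1), ncol = r)
X <- cbind(as.numeric(arima.sim(model = list(ar = 0.6), n = n)),
           as.numeric(arima.sim(model = list(ar = -0.5), n = n)))
Y <- X %*% t(A) + matrix(rnorm(n * p), n, p)
res <- estimate_factors_sketch(Y, lag.k = 2)
res$factor_num
```

The returned list mirrors the component names documented for Factors() only for readability; the columns of `loading.mat` here are orthonormal eigenvectors, so they identify the loading space rather than ${\bf A}$ itself.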

Returns

An object of class "factors", which contains the following components:

factor_num: The estimated number of factors $\hat{r}$.
loading.mat: The estimated $p \times \hat{r}$ factor loading matrix $\hat{\bf A}$.
X: The $n\times \hat{r}$ matrix $\hat{\bf X}=(\hat{\bf x}_1,\dots,\hat{\bf x}_n)'$ with $\hat{\bf x}_t = \hat{\bf A}'\hat{\bf y}_t$.
lag.k: The time lag used in the function.

Details

The threshold operator $T_\delta(\cdot)$ is defined as $T_\delta({\bf W}) = \{w_{i,j}1(|w_{i,j}|\geq \delta)\}$ for any matrix ${\bf W}=(w_{i,j})$, with the threshold level $\delta \geq 0$ and $1(\cdot)$ representing the indicator function. We recommend choosing $\delta=0$ when $p$ is fixed and $\delta>0$ when $p \gg n$.
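The definition above amounts to elementwise hard thresholding. A minimal base-R sketch follows; the function names `threshold_op` and `default_delta` are hypothetical and are not part of HDTSA, and the matrix `W` is made up purely for illustration.

```r
# Hypothetical helper: T_delta(W) keeps entries with |w_ij| >= delta and zeroes the rest.
threshold_op <- function(W, delta) {
  stopifnot(delta >= 0)
  W * (abs(W) >= delta)
}

# Default threshold level from the function signature: 2 * sqrt(log(p) / n).
default_delta <- function(n, p) 2 * sqrt(log(p) / n)

# Toy matrix, for illustration only.
W <- matrix(c(0.50, -0.05,
              0.10, -0.30), nrow = 2, byrow = TRUE)
W_thr <- threshold_op(W, delta = 0.1)   # only the -0.05 entry falls below the threshold
W_same <- threshold_op(W, delta = 0)    # delta = 0 leaves W unchanged
```

With $\delta = 0$ every entry passes the indicator, which is why thresh = FALSE and $\delta = 0$ describe the same estimator of $\hat{\mathbf{M}}$.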

Examples


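# Example 1 (Example in Section 3.3 of Lam and Yao, 2012)
library(HDTSA)

## Generate y_t = A x_t + eps_t with r = 3 AR(1) factors
p <- 200   # dimension of y_t
n <- 400   # number of observations
r <- 3     # true number of factors
A <- matrix(runif(p*r, -1, 1), ncol=r)   # p x r loading matrix
x1 <- arima.sim(model=list(ar=c(0.6)), n=n)
x2 <- arima.sim(model=list(ar=c(-0.5)), n=n)
x3 <- arima.sim(model=list(ar=c(0.3)), n=n)
eps <- matrix(rnorm(n*p), p, n)          # p x n white noise
X <- t(cbind(x1, x2, x3))                # r x n factor matrix
Y <- A %*% X + eps
Y <- t(Y)                                # n x p data matrix, as expected by Factors()

## Estimate the number of factors and the loading matrix
fac <- Factors(Y,lag.k=2)
r_hat <- fac$factor_num
loading_Mat <- fac$loading.mat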

References

Lam, C., & Yao, Q. (2012). Factor modelling for high-dimensional time series: Inference for the number of factors. The Annals of Statistics, 40, 694--726. doi:10.1214/12-AOS970.

Maintainer: Chen Lin
License: GPL-3
Last published: 2025-01-28
