sketch_leverage function

Sketch using leverage score type sampling

Provides a subsample of data using sketches

sketch_leverage(data, m, method = "leverage")

Arguments

  • data: (n times d)-dimensional matrix of data. The first column must contain the dependent variable (Y).
  • m: subsample size; must be less than n.
  • method: sketching method: "leverage" for leverage score sampling using X (default), or "root_leverage" for square-root leverage score sampling using X.
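
The two method options differ only in how row scores are turned into sampling probabilities. The sketch below illustrates this in base R, assuming X has full column rank. The helper leverage_probs is a hypothetical name introduced for illustration and is not part of the package; the package's internal computation may differ.

```r
# Hypothetical helper (not part of the package): sampling probabilities
# derived from the leverage scores of a design matrix X.
leverage_probs <- function(X, method = c("leverage", "root_leverage")) {
  method <- match.arg(method)
  # Leverage scores are the squared row norms of Q in the thin QR
  # factorization X = QR (the diagonal of the hat matrix).
  Q <- qr.Q(qr(X))
  h <- rowSums(Q^2)
  # The square-root variant flattens the distribution toward uniform.
  score <- if (method == "leverage") h else sqrt(h)
  score / sum(score)
}

set.seed(1)
X <- matrix(stats::rnorm(200 * 3), nrow = 200, ncol = 3)
p_lev  <- leverage_probs(X, "leverage")
p_root <- leverage_probs(X, "root_leverage")

# Both vectors sum to one; the largest root-leverage probability is
# never larger than the largest leverage probability.
c(sum(p_lev), sum(p_root))
c(max(p_lev), max(p_root))
```

Rows with unusual covariate values get the highest probabilities under "leverage"; "root_leverage" keeps the same ordering but assigns them less extreme weight.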

Returns

An S3 object with the following elements:

  • subsample: (m times d)-dimensional matrix of data
  • prob: m-dimensional vector of probabilities
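
To make the shape of the return value concrete, here is a rough stand-in written in base R. It is a sketch under stated assumptions, not the package's implementation: the name sketch_leverage_standin is hypothetical, rows are assumed to be drawn with replacement, prob is assumed to hold the sampling probabilities of the selected rows, and a plain list is returned because the S3 class name is not given above.

```r
# Hypothetical stand-in (not the package function). Assumptions:
# sampling with replacement, and `prob` holds the probabilities of
# the rows that were actually selected.
sketch_leverage_standin <- function(data, m,
                                    method = c("leverage", "root_leverage")) {
  method <- match.arg(method)
  stopifnot(is.matrix(data), m < nrow(data))
  # The first column is Y; leverage scores use the remaining columns (X).
  X <- data[, -1, drop = FALSE]
  h <- rowSums(qr.Q(qr(X))^2)
  score <- if (method == "leverage") h else sqrt(h)
  p <- score / sum(score)
  idx <- sample.int(nrow(data), size = m, replace = TRUE, prob = p)
  list(subsample = data[idx, , drop = FALSE], prob = p[idx])
}

set.seed(42)
n <- 500
X <- matrix(stats::rnorm(n * 3), nrow = n, ncol = 3)
Y <- X %*% matrix(1, nrow = 3, ncol = 1) + stats::rnorm(n)
s <- sketch_leverage_standin(cbind(Y, X), m = 50)

dim(s$subsample)  # m rows, same number of columns as the input
length(s$prob)    # one probability per sampled row
```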

Examples

## Least squares: sketch and solve
# setup
n <- 1e+6 # full sample size
d <- 5    # dimension of covariates
m <- 1e+3 # sketch size
# generate pseudo-data
X <- matrix(stats::rnorm(n*d), nrow = n, ncol = d)
beta <- matrix(rep(1,d), nrow = d, ncol = 1)
eps <- matrix(stats::rnorm(n), nrow = n, ncol = 1)
Y <- X %*% beta + eps
intercept <- matrix(rep(1,n), nrow = n, ncol = 1)
# full sample including the intercept term
fullsample <- cbind(Y,intercept,X)
# generate a sketch using leverage score sampling
s_lev <- sketch_leverage(fullsample, m, "leverage")
# solve without the intercept with weighting
ls_lev <- lm(s_lev$subsample[,1] ~ s_lev$subsample[,2] - 1, weights = s_lev$prob)

References

Ma, P., Zhang, X., Xing, X., Ma, J. and Mahoney, M. (2020). Asymptotic Analysis of Sampling Estimators for Randomized Numerical Linear Algebra Algorithms. Proceedings of the Twenty Third International Conference on Artificial Intelligence and Statistics, PMLR 108:1026-1035.

  • Maintainer: Sokbae Lee
  • License: GPL-3
  • Last published: 2022-09-07