Tools for Tall Distributed Matrices
Arithmetic Operators
subsetting
cbind
Column Operations
collapse
Covariance and Correlation
Matrix Multiplication
expand
getters
Generalized Linear Model Fitters
is.shaq
Tall Matrices
Linear Model Coefficients
Matrix Multiplication
norm
Principal Components Analysis
QR Decomposition Methods
ranshaq
Scale
setters
Class shaq
shaq
svd
Many data science problems reduce to operations on very tall, skinny matrices. These matrices can be so tall that they are difficult to work with, or do not fit into main memory at all. One strategy for dealing with such objects is to distribute their rows across several processors. To this end, we offer an 'S4' class for tall, skinny, distributed matrices, called the 'shaq'. We also provide numerical methods and statistical operations for working with these distributed objects. The naming is tongue-in-cheek: the class name plays on the fact that Shaquille O'Neal ('Shaq') is very tall, and he starred in the film 'Kazaam'.
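To make the row-distribution idea concrete, here is a minimal single-process sketch in base R. It does not use this package's API or MPI; the four "processes" are simulated with a list of row blocks, and the names `nprocs`, `owner`, `blocks`, and `local_cp` are illustrative assumptions. It shows why tall, skinny matrices suit this layout: a quantity such as the cross product can be built from small per-block pieces that are then summed.

```r
# Single-process illustration only: base R, no MPI, no package calls.
set.seed(1234)
x <- matrix(rnorm(1000 * 3), nrow = 1000, ncol = 3)  # tall and skinny: 1000 x 3

# Pretend there are 4 "processes", each owning a contiguous block of rows.
nprocs <- 4
owner <- cut(seq_len(nrow(x)), breaks = nprocs, labels = FALSE)
blocks <- lapply(seq_len(nprocs), function(p) x[owner == p, , drop = FALSE])

# Each "process" computes a small 3 x 3 cross product from its own rows only.
local_cp <- lapply(blocks, crossprod)

# Summing the pieces plays the role a reduction would play in a real
# distributed run.
cp <- Reduce(`+`, local_cp)

# The combined result matches the cross product of the full matrix.
stopifnot(isTRUE(all.equal(cp, crossprod(x))))
```

Each block contributes only a 3 x 3 matrix, so the amount of data that has to be combined depends on the number of columns, not on the number of rows.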