lolR 2.1 package

Linear Optimal Low-Rank Projection

lol.project.pls - Partial Least-Squares (PLS)
lol.project.rp - Random Projections (RP)
lol.sims.cigar - Stacked Cigar
lol.sims.cross - Cross
lol.sims.fat_tails - Fat Tails Simulation
lol.sims.mean_diff - Mean Difference Simulation
lol.classify.nearestCentroid - Nearest Centroid Classifier Training
lol.classify.rand - Random Classifier Utility
lol.classify.randomChance - Random Chance Classifier Training
lol.classify.randomGuess - Random Guess Classifier Training
lol.embed - Embedding
lol.project.bayes_optimal - Bayes Optimal
lol.project.dp - Data Piling
lol.project.lol - Linear Optimal Low-Rank Projection (LOL)
lol.project.lrcca - Low-Rank Canonical Correlation Analysis (LR-CCA)
lol.project.lrlda - Low-Rank Linear Discriminant Analysis (LRLDA)
lol.project.pca - Principal Component Analysis (PCA)
lol.sims.qdtoep - Quadratic Discriminant Toeplitz Simulation
lol.sims.random_rotate - Random Rotation
lol.sims.rev_rtrunk - Reverse Random Trunk
lol.sims.rotation - Sample Random Rotation
lol.sims.rtrunk - Random Trunk
lol.sims.sim_gmm - GMM Simulation
lol.sims.toep - Toeplitz Simulation
lol.sims.xor2 - XOR Problem
lol.utils.decomp - A utility to use irlba when necessary
lol.utils.deltas - A function that performs a utility computation of information about th...
lol.utils.info - A function that computes basic summary information about the data
lol.utils.ohe - A function for one-hot encoding categorical response vectors
lol.xval.eval - Embedding Cross-Validation
lol.xval.optimal_dimselect - Optimal Cross-Validated Number of Embedding Dimensions
lol.xval.split - Cross-Validation Data Splitter
predict.nearestCentroid - Nearest Centroid Classifier Prediction
predict.randomChance - Random Chance Classifier Prediction
predict.randomGuess - Random Guess Classifier Prediction

Supervised learning techniques tend to overfit when the dimensionality of the data exceeds the sample size. To remedy this high-dimensionality, low sample size (HDLSS) situation, we first learn a lower-dimensional representation of the data before training a classifier. That is, we project the data into a space of more manageable dimensionality, where standard classification or clustering techniques are less prone to overfitting because there are fewer dimensions to overfit. A number of previous works have focused on how to strategically reduce dimensionality in the unsupervised case, yet in the supervised HDLSS regime, few works have attempted to devise dimensionality reduction techniques that leverage the labels associated with the data. In this package and the associated manuscript Vogelstein et al. (2017) <arXiv:1709.01233>, we provide several methods for feature extraction, some utilizing labels and some not, along with easily extensible utilities to simplify cross-validation efforts to identify the best feature extraction method. Additionally, we include a series of adaptable benchmark simulations to serve as a standard for future investigative efforts into supervised HDLSS. Finally, we produce a comprehensive comparison of the included algorithms across a range of benchmark simulations and real data applications.
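As a sketch of the intended simulate-project-embed workflow, using the simulation, projection, and embedding functions listed above (argument names and returned fields follow the package's documented conventions, but may vary across versions, so treat this as illustrative rather than definitive):

```r
# Install from CRAN if needed: install.packages("lolR")
library(lolR)

# Simulate HDLSS-style data: n = 400 samples in d = 30 dimensions,
# with class signal concentrated in a few directions ("random trunk").
data <- lol.sims.rtrunk(n = 400, d = 30)
X <- data$X  # n x d feature matrix
Y <- data$Y  # length-n vector of class labels

# Learn a supervised low-rank projection (LOL) to r = 5 dimensions,
# then embed the data into the learned subspace for downstream
# classification or clustering.
result <- lol.project.lol(X = X, Y = Y, r = 5)
Xr <- lol.embed(X, result$A)  # projected n x r data matrix
```

The same pattern applies to the other `lol.project.*` methods (e.g. `lol.project.pca`, `lol.project.pls`): each learns a projection matrix `A`, and `lol.embed` applies it, so projection methods can be swapped and compared via the `lol.xval.*` cross-validation utilities.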