Function to perform sparse Partial Least Squares (sPLS). The sPLS approach carries out data integration and variable selection on two data sets simultaneously, in a one-step strategy.
sPLS(X, Y, ncomp, mode = "regression", max.iter = 500, tol = 1e-06, keepX = rep(ncol(X), ncomp), keepY = rep(ncol(Y), ncomp), scale = TRUE)
Arguments
X: Numeric matrix of predictors.
Y: Numeric vector or matrix of responses (for multi-response models).
ncomp: The number of components to include in the model (see Details).
mode: Character string giving the type of algorithm to use; (partially) matching one of "regression" or "canonical". See Details.
max.iter: Integer, the maximum number of iterations.
tol: A positive real, the tolerance used in the iterative algorithm.
keepX: Numeric vector of length ncomp, the number of variables to keep in X-loadings. By default all variables are kept in the model.
keepY: Numeric vector of length ncomp, the number of variables to keep in Y-loadings. By default all variables are kept in the model.
scale: Logical indicating whether the original data sets need to be scaled. By default scale = TRUE.
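As an illustration of how keepX and keepY relate to ncomp, the following base R sketch builds the default vectors from the usage line and one sparse alternative. The data are simulated and the sparse values (5 and 3) are arbitrary choices made for the example, not recommendations.

```r
set.seed(1)
n <- 40; p <- 20; q <- 5
X <- matrix(rnorm(n * p), nrow = n, ncol = p)  # simulated predictors
Y <- matrix(rnorm(n * q), nrow = n, ncol = q)  # simulated responses
ncomp <- 2

# Defaults from the usage line: keep every variable on every component.
keepX_default <- rep(ncol(X), ncomp)
keepY_default <- rep(ncol(Y), ncomp)

# A sparse choice (illustrative values only): keep 5 X variables on
# component 1 and 3 X variables on component 2.
keepX_sparse <- c(5, 3)

# Each vector needs one entry per component, and no entry can exceed
# the number of available variables.
stopifnot(length(keepX_sparse) == ncomp, all(keepX_sparse <= ncol(X)))
```

Either vector could then be passed as the keepX argument; smaller entries request sparser X-loadings on the corresponding component.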
Details
The sPLS function fits sPLS models with 1, …, ncomp components. Multi-response models are fully supported.
The type of algorithm to use is specified with the mode argument. Two sPLS algorithms are available: sPLS regression ("regression") and sPLS canonical analysis ("canonical") (see References).
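A minimal sketch of fitting both modes on simulated data is given below. It assumes that sPLS is provided by the sgPLS package (an assumption, since this page does not name the package), so the call is guarded and skipped when that package is not installed. The data-generating model, in which only the first five X variables drive Y, is made up for the example.

```r
set.seed(42)
n <- 50; p <- 30; q <- 4
X <- matrix(rnorm(n * p), nrow = n, ncol = p)

# Hypothetical signal: only the first 5 predictors influence the responses.
B <- matrix(0, nrow = p, ncol = q)
B[1:5, ] <- 1
Y <- X %*% B + matrix(rnorm(n * q, sd = 0.5), nrow = n, ncol = q)

fits <- NULL
# Assumption: sPLS is exported by the sgPLS package.
if (requireNamespace("sgPLS", quietly = TRUE)) {
  modes <- c(regression = "regression", canonical = "canonical")
  fits <- lapply(modes, function(m) {
    # Keep 5 X variables and all 4 Y variables on each of 2 components.
    sgPLS::sPLS(X, Y, ncomp = 2, mode = m,
                keepX = c(5, 5), keepY = c(4, 4))
  })
}
```

Both fits use the same keepX and keepY, so any difference between them comes from the mode argument alone.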
Returns
sPLS returns an object of class "sPLS", a list that contains the following components:
X: The centered and standardized original predictor matrix.
Y: The centered and standardized original response vector or matrix.
ncomp: The number of components included in the model.
mode: The algorithm used to fit the model.
keepX: Number of X variables kept in the model on each component.
keepY: Number of Y variables kept in the model on each component.
mat.c: Matrix of coefficients to be used internally by predict.
variates: List containing the variates.
loadings: List containing the estimated loadings for the X and Y variates.
names: List containing the names to be used for individuals and variables.
tol: The tolerance used in the iterative algorithm, stored for use by subsequent S3 methods.
max.iter: The maximum number of iterations, stored for use by subsequent S3 methods.
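The components listed above can be inspected directly on a fitted object. The sketch below again assumes that sPLS comes from the sgPLS package and guards the call accordingly. It also assumes that each element of loadings is a matrix with one column per component, which this page does not state explicitly.

```r
set.seed(7)
n <- 40; p <- 15; q <- 3
X <- matrix(rnorm(n * p), nrow = n, ncol = p)
# Simulated responses driven by the first three predictors plus noise.
Y <- X[, 1:3] + matrix(rnorm(n * q, sd = 0.3), nrow = n, ncol = q)

fit <- NULL
# Assumption: sPLS is exported by the sgPLS package.
if (requireNamespace("sgPLS", quietly = TRUE)) {
  fit <- sgPLS::sPLS(X, Y, ncomp = 2, keepX = c(4, 2))

  print(class(fit))   # class vector of the returned object
  print(fit$ncomp)    # number of components in the model
  print(fit$keepX)    # X variables kept on each component

  # Assumption: every element of `loadings` is a matrix with one column
  # per component; count the non-zero loadings in each column.
  nonzero <- lapply(fit$loadings, function(L) colSums(L != 0))
  print(nonzero)
}
```

Comparing the non-zero counts with keepX and keepY is a quick way to check that the requested sparsity was applied.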
References
Liquet, B., Lafaye de Micheaux, P., Hejblum, B. and Thiebaut, R. A group and Sparse Group Partial Least Square approach applied in Genomics context. Submitted.
Le Cao, K.-A., Martin, P.G.P., Robert-Granié, C. and Besse, P. (2009). Sparse canonical methods for biological data integration: application to a cross-platform study. BMC Bioinformatics 10:34.
Le Cao, K.-A., Rossouw, D., Robert-Granié, C. and Besse, P. (2008). A sparse PLS for variable selection when integrating Omics data. Statistical Applications in Genetics and Molecular Biology 7, article 35.
Shen, H. and Huang, J. Z. (2008). Sparse principal component analysis via regularized low rank matrix approximation. Journal of Multivariate Analysis 99, 1015-1034.
Tenenhaus, M. (1998). La régression PLS: théorie et pratique. Paris: Editions Technic.
Wold, H. (1966). Estimation of principal components and related models by iterative least squares. In: Krishnaiah, P. R. (editor), Multivariate Analysis. Academic Press, N.Y., 391-420.
Author(s)
Benoit Liquet and Pierre Lafaye de Micheaux.
See Also
gPLS, sgPLS, predict, perf and functions from mixOmics package: summary, plotIndiv, plotVar, plot3dIndiv, plot3dVar.