plsda function

Partial Least Squares Discriminant Analysis

plsda is used to calibrate, validate and use a partial least squares discriminant analysis (PLS-DA) model.

plsda(
  x,
  c,
  ncomp = min(nrow(x) - 1, ncol(x), 20),
  center = TRUE,
  scale = FALSE,
  cv = NULL,
  exclcols = NULL,
  exclrows = NULL,
  x.test = NULL,
  c.test = NULL,
  method = "simpls",
  lim.type = "ddmoments",
  alpha = 0.05,
  gamma = 0.01,
  info = "",
  ncomp.selcrit = "min",
  classname = NULL,
  cv.scope = "local"
)

Arguments

  • x: matrix with predictors.
  • c: vector with class membership (should be either a factor with class names/numbers in case of multiple classes or a vector with logical values in case of one class model).
  • ncomp: maximum number of components to calculate.
  • center: logical, whether to center predictors and response values.
  • scale: logical, whether to scale (standardize) predictors and response values.
  • cv: cross-validation settings (see details).
  • exclcols: columns of x to be excluded from calculations (numbers, names or vector with logical values).
  • exclrows: rows to be excluded from calculations (numbers, names or vector with logical values).
  • x.test: matrix with predictors for test set.
  • c.test: vector with reference class values for test set (same format as calibration values).
  • method: method for calculating PLS model.
  • lim.type: which method to use for calculation of critical limits for residual distances (see details).
  • alpha: significance level for extreme limits for T2 and Q distances.
  • gamma: significance level for outlier limits for T2 and Q distances.
  • info: short text with information about the model.
  • ncomp.selcrit: criterion for selecting the optimal number of components ('min' for the first local minimum of RMSECV, 'wold' for Wold's rule).
  • classname: name (label) of the class when PLS-DA is used as a one-class discrimination model. In this case parameter c is expected to be a vector with logical values.
  • cv.scope: scope for center/scale operations inside the CV loop: 'global' to use the globally computed mean and standard deviation, or 'local' to recompute them for each local calibration set.
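As a rough illustration of how some of these arguments combine, the sketch below builds one multi-class model with a separate test set and one one-class model. The odd/even split of the iris data and the variable names (ind.cal, m.all, m.vir) are assumptions made for this example only.

```r
library(mdatools)

# odd rows for calibration, even rows for testing (an arbitrary split for illustration)
ind.cal = seq(1, nrow(iris), 2)
ind.test = seq(2, nrow(iris), 2)
x.cal = iris[ind.cal, 1:4]
c.cal = iris[ind.cal, 5]
x.test = iris[ind.test, 1:4]
c.test = iris[ind.test, 5]

# multi-class model: c is a factor, validation by full cross-validation and a test set
m.all = plsda(x.cal, c.cal, ncomp = 3, cv = 1, x.test = x.test, c.test = c.test)

# one-class model: c is a logical vector and classname provides the class label
m.vir = plsda(x.cal, c.cal == "virginica", ncomp = 3, cv = 1, classname = "virginica")
```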

Returns

Returns an object of plsda class with the following fields (most inherited from class pls):

  • ncomp: number of components included in the model.

  • ncomp.selected: selected (optimal) number of components.

  • xloadings: matrix with loading values for x decomposition.

  • yloadings: matrix with loading values for y (c) decomposition.

  • weights: matrix with PLS weights.

  • coeffs: matrix with regression coefficients calculated for each component.

  • info: information about the model, provided by the user when building the model.

  • calres: an object of class plsdares with PLS-DA results for calibration data.

  • testres: an object of class plsdares with PLS-DA results for test data, if it was provided.

  • cvres: an object of class plsdares with PLS-DA results for cross-validation, if this option was chosen.
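The fields listed above can be read with the usual $ operator. The following is a minimal sketch, assuming a model calibrated on every second row of iris with full cross-validation; the data split is chosen only for illustration.

```r
library(mdatools)

# calibration data: every second row of iris (illustrative choice)
x.cal = iris[seq(1, nrow(iris), 2), 1:4]
c.cal = iris[seq(1, nrow(iris), 2), 5]
model = plsda(x.cal, c.cal, ncomp = 3, cv = 1)

model$ncomp            # number of components in the model
model$ncomp.selected   # selected (optimal) number of components
dim(model$xloadings)   # size of the x loadings matrix
class(model$calres)    # results object for the calibration data
class(model$cvres)     # results object for cross-validation
```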

Details

The plsda class is based on pls with extra functions and plots covering classification functionality. All plots for pls can be used. For example, if you want to see the actual predicted values (y in PLS) instead of classes, use plotPredictions.pls(model) instead of plotPredictions(model).

Cross-validation settings, cv, can be a number or a list. If cv is a number, it will be used as the number of segments for random cross-validation (if cv = 1, full cross-validation will be performed). If it is a list, the following syntax can be used: cv = list('rand', nseg, nrep) for random repeated cross-validation with nseg segments and nrep repetitions, or cv = list('ven', nseg) for systematic splits into nseg segments ('venetian blinds').
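The cross-validation variants described above could be written as follows. This is a sketch only: the choice of 4 segments and 4 repetitions, the seed, and the model names are arbitrary assumptions for the example.

```r
library(mdatools)

x.cal = iris[seq(1, nrow(iris), 2), 1:4]
c.cal = iris[seq(1, nrow(iris), 2), 5]

# fix the seed so that the random splits are reproducible
set.seed(42)

# full (leave-one-out) cross-validation
m.full = plsda(x.cal, c.cal, ncomp = 3, cv = 1)

# random cross-validation with 4 segments
m.rand = plsda(x.cal, c.cal, ncomp = 3, cv = 4)

# random cross-validation with 4 segments, repeated 4 times
m.rep = plsda(x.cal, c.cal, ncomp = 3, cv = list("rand", 4, 4))

# systematic splits ("venetian blinds") into 4 segments
m.ven = plsda(x.cal, c.cal, ncomp = 3, cv = list("ven", 4))
```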

Calculation of confidence intervals and p-values for regression coefficients is available only by jack-knifing so far. See help for regcoeffs objects for details.

Examples

### Examples for PLS-DA model class

library(mdatools)

## 1. Make a PLS-DA model with full cross-validation and show model overview

# make a calibration set from iris data (3 classes)
# use names of classes as class vector
x.cal = iris[seq(1, nrow(iris), 2), 1:4]
c.cal = iris[seq(1, nrow(iris), 2), 5]

model = plsda(x.cal, c.cal, ncomp = 3, cv = 1, info = 'IRIS data example')
model = selectCompNum(model, 1)

# show summary and basic model plots
# misclassification will be shown only for first class
summary(model)
plot(model)

# summary and model plots for second class
summary(model, nc = 2)
plot(model, nc = 2)

# summary and model plot for specific class and number of components
summary(model, nc = 3, ncomp = 3)
plot(model, nc = 3, ncomp = 3)

## 2. Show performance plots for a model
par(mfrow = c(2, 2))
plotSpecificity(model)
plotSensitivity(model)
plotMisclassified(model)
plotMisclassified(model, nc = 2)
par(mfrow = c(1, 1))

## 3. Show both class and y values predictions
par(mfrow = c(2, 2))
plotPredictions(model)
plotPredictions(model, res = "cal", ncomp = 2, nc = 2)
plotPredictions(structure(model, class = "regmodel"))
plotPredictions(structure(model, class = "regmodel"), ncomp = 2, ny = 2)
par(mfrow = c(1, 1))

## 4. All plots from ordinary PLS can be used, e.g.:
par(mfrow = c(2, 2))
plotXYScores(model)
plotYVariance(model)
plotXResiduals(model)
plotRegcoeffs(model, ny = 2)
par(mfrow = c(1, 1))

See Also

Specific methods for plsda class:

print.plsda: prints information about a pls object.
summary.plsda: shows performance statistics for the model.
plot.plsda: shows plot overview of the model.
predict.plsda: applies the PLS-DA model to new data.
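A hedged sketch of applying a calibrated model to new data is given below. It assumes that predict takes the new predictors followed by optional reference class values, and that the returned results object has its own summary method; check the help for predict.plsda before relying on this. The even rows of iris serve as stand-in "new" data.

```r
library(mdatools)

# calibrate on odd rows of iris
x.cal = iris[seq(1, nrow(iris), 2), 1:4]
c.cal = iris[seq(1, nrow(iris), 2), 5]
model = plsda(x.cal, c.cal, ncomp = 3, cv = 1)

# even rows act as new data; reference classes are optional
x.new = iris[seq(2, nrow(iris), 2), 1:4]
c.new = iris[seq(2, nrow(iris), 2), 5]

# apply the model (argument order is an assumption, see lead-in)
res = predict(model, x.new, c.new)
summary(res)
```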

Methods, inherited from classmodel class:

plotPredictions.classmodel: shows plot with predicted values.
plotSensitivity.classmodel: shows sensitivity plot.
plotSpecificity.classmodel: shows specificity plot.
plotMisclassified.classmodel: shows misclassified ratio plot.

See also methods for class pls.

Author(s)

Sergey Kucheryavskiy (svkucheryavski@gmail.com)

  • Maintainer: Sergey Kucheryavskiy
  • License: MIT + file LICENSE
  • Last published: 2024-08-19