simca function

SIMCA one-class classification

simca builds a SIMCA (Soft Independent Modelling of Class Analogies) model for one-class classification.

simca(x, classname, ncomp = min(nrow(x) - 1, ncol(x) - 1, 20), x.test = NULL, c.test = NULL, cv = NULL, ...)

Arguments

  • x: a numeric matrix with data values.
  • classname: short text (up to 20 characters) with the class name.
  • ncomp: maximum number of components to calculate.
  • x.test: a numeric matrix with test data.
  • c.test: a vector with reference classes for the test data objects (either text with class names or logical values).
  • cv: cross-validation settings (see details).
  • ...: any other parameters suitable for the pca method.
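
The arguments above can be combined as in the following minimal sketch, which calibrates a model on part of the setosa class and validates it on the remaining iris objects. The split (first 25 rows for calibration) and the names x.cal, x.test and c.test are choices made for this illustration only.

```r
library(mdatools)

# calibration set: first 25 objects (rows 1-50 of iris are setosa)
x.cal <- iris[1:25, 1:4]

# test set: all remaining objects, with reference classes given as text
x.test <- iris[-(1:25), 1:4]
c.test <- as.character(iris[-(1:25), 5])

# build a model with up to 3 components and apply it to the test set
m <- simca(x.cal, "setosa", ncomp = 3, x.test = x.test, c.test = c.test)
summary(m)
```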

Returns

Returns an object of simca class with the following fields:

  • classname: a short text with the class name.
  • calres: an object of class simcares with classification results for the calibration data.
  • testres: an object of class simcares with classification results for the test data, if it was provided.
  • cvres: an object of class simcares with classification results for cross-validation, if this option was chosen.

Fields inherited from pca class:

  • ncomp: number of components included in the model.
  • ncomp.selected: selected (optimal) number of components.
  • loadings: matrix with loading values (nvar x ncomp).
  • eigenvals: vector with eigenvalues for all existing components.
  • expvar: vector with explained variance for each component (in percent).
  • cumexpvar: vector with cumulative explained variance for each component (in percent).
  • T2lim: statistical limit for T2 distance.
  • Qlim: statistical limit for Q residuals.
  • info: information about the model, provided by the user when building the model.
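
A short sketch of how the returned object might be inspected. It assumes the model was built on the first 25 iris rows, as an illustration only. Because the exact set of fields can differ between package versions, names() is used first to see what is actually present.

```r
library(mdatools)

# small model used only to illustrate field access
m <- simca(iris[1:25, 1:4], "setosa", ncomp = 3)

# list the fields available in this version of the package
names(m)

# a few of the fields described above
m$classname
m$ncomp.selected
dim(m$loadings)   # rows correspond to variables, columns to components
```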

Details

SIMCA is in fact a PCA model with additional functionality, so the simca class inherits most of the functionality of the pca class. It uses the critical limits for the Q residuals and T2 distances of the PCA model to make the classification decision.
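
The idea of a limit-based decision can be illustrated with a simplified sketch in base R. All numbers below are made up, and the rule shown (an object is accepted only if both values are within their limits) is an assumption for illustration; the package may combine the two distances differently depending on how the limits are computed.

```r
# hypothetical Q residuals and T2 distances for five objects
q  <- c(0.8, 2.5, 0.3, 1.1, 4.0)
t2 <- c(1.2, 0.9, 6.5, 2.0, 7.1)

# hypothetical critical limits
q.lim  <- 2.0
t2.lim <- 5.0

# accept an object as a class member only if both values are within the limits
is.member <- (q <= q.lim) & (t2 <= t2.lim)
is.member
```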

Cross-validation settings, cv, can be a number or a list. If cv is a number, it is used as the number of segments for random cross-validation (if cv = 1, full cross-validation is performed). If it is a list, the following syntax can be used: cv = list('rand', nseg, nrep) for random repeated cross-validation with nseg segments and nrep repetitions, or cv = list('ven', nseg) for systematic splits into nseg segments ('venetian blinds').
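
The difference between the two split types can be sketched in base R. The helper functions ven.segments and rand.segments are hypothetical names written for this illustration and are not part of the package; the package's own splitting code may differ in detail.

```r
# venetian blinds: objects get segment numbers in a repeating 1, 2, ..., nseg pattern
ven.segments <- function(nobj, nseg) {
  rep_len(seq_len(nseg), nobj)
}

# random: same segment sizes, but the assignment of objects is shuffled
rand.segments <- function(nobj, nseg) {
  sample(ven.segments(nobj, nseg))
}

# systematic split of 10 objects into 4 segments
ven.segments(10, 4)

# random split of the same objects; segment sizes stay the same
set.seed(42)
seg <- rand.segments(10, 4)
table(seg)
```

With nseg equal to the number of objects, every segment holds a single object, which corresponds to full (leave-one-out) cross-validation.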

Examples

    ## make a SIMCA model for Iris setosa class with full cross-validation
    library(mdatools)

    data = iris[, 1:4]
    class = iris[, 5]

    # take first 20 objects of setosa as calibration set
    se = data[1:20, ]

    # make SIMCA model and select one component as optimal
    model = simca(se, "setosa", cv = 1)
    model = selectCompNum(model, 1)

    # show information, summary and plot overview
    print(model)
    summary(model)
    plot(model)

    # show predictions
    par(mfrow = c(2, 1))
    plotPredictions(model, show.labels = TRUE)
    plotPredictions(model, res = "cal", ncomp = 2, show.labels = TRUE)
    par(mfrow = c(1, 1))

    # show sensitivity, misclassification ratio, loadings and residuals for ncomp = 2
    par(mfrow = c(2, 2))
    plotSensitivity(model)
    plotMisclassified(model)
    plotLoadings(model, comp = c(1, 2), show.labels = TRUE)
    plotResiduals(model, ncomp = 2)
    par(mfrow = c(1, 1))

References

S. Wold, M. Sjostrom. "SIMCA: A method for analyzing chemical data in terms of similarity and analogy" in B.R. Kowalski (ed.), Chemometrics Theory and Application, American Chemical Society Symposium Series 52, Washington, D.C., American Chemical Society, 1977, p. 243-282.

See Also

Methods for simca objects:

  • print.simca: shows information about the object.
  • summary.simca: shows summary statistics for the model.
  • plot.simca: makes an overview of the SIMCA model with four plots.
  • predict.simca: applies the SIMCA model to new data.
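
As a brief illustration of predict.simca, the sketch below applies an existing model to objects that were not used for calibration. The data split and the names x.new and c.ref are assumptions made for this example.

```r
library(mdatools)

# model calibrated on the first 25 setosa objects
m <- simca(iris[1:25, 1:4], "setosa", ncomp = 2)

# new objects and their reference classes (as text)
x.new <- iris[26:150, 1:4]
c.ref <- as.character(iris[26:150, 5])

# apply the model; the result holds classification results for the new data
res <- predict(m, x.new, c.ref)
summary(res)
```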

Methods, inherited from classmodel class:

  • plotPredictions.classmodel: shows plot with predicted values.
  • plotSensitivity.classmodel: shows sensitivity plot.
  • plotSpecificity.classmodel: shows specificity plot.
  • plotMisclassified.classmodel: shows misclassified ratio plot.

Methods, inherited from pca class:

  • selectCompNum.pca: sets the number of optimal components in the model.
  • plotScores.pca: shows scores plot.
  • plotLoadings.pca: shows loadings plot.
  • plotVariance.pca: shows explained variance plot.
  • plotCumVariance.pca: shows cumulative explained variance plot.
  • plotResiduals.pca: shows Q vs. T2 residuals plot.

Author(s)

Sergey Kucheryavskiy (svkucheryavski@gmail.com)

  • Maintainer: Sergey Kucheryavskiy
  • License: MIT + file LICENSE
  • Last published: 2024-08-19