simca() R function from [mdatools]

SIMCA one-class classification

simca builds a SIMCA (Soft Independent Modelling of Class Analogies) model for one-class classification.


simca(
  x,
  classname,
  ncomp = min(nrow(x) - 1, ncol(x) - 1, 20),
  x.test = NULL,
  c.test = NULL,
  cv = NULL,
  ...
)

Arguments

x: a numerical matrix with data values.
classname: short text (up to 20 symbols) with class name.
ncomp: maximum number of components to calculate.
x.test: a numerical matrix with test data.
c.test: a vector with classes of test data objects (can be text with names of classes or logical).
cv: cross-validation settings (see details).
...: any other parameters suitable for pca method.
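
The arguments x.test and c.test can be combined to validate the model on a separate test set at build time. The following is a minimal sketch, assuming the built-in iris data and an arbitrary split (first 25 setosa rows for calibration, all remaining rows for testing); the split and the choice of ncomp = 3 are illustrative only.

```r
library(mdatools)

# calibration set: first 25 setosa flowers (rows 1-25 of iris)
x.cal <- as.matrix(iris[1:25, 1:4])

# test set: remaining setosa flowers plus all objects of the other species
x.test <- as.matrix(iris[26:150, 1:4])
c.test <- as.character(iris[26:150, 5])   # class names as text

# build the model and apply it to the test set in one call
model <- simca(x.cal, "setosa", ncomp = 3, x.test = x.test, c.test = c.test)
summary(model)

# classification results for the test set are stored in the testres field
summary(model$testres)
```

Because c.test holds reference class names, the summary can report performance statistics for the test set in addition to the calibration results.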

Returns

Returns an object of simca class with the following fields:

classname: a short text with class name.
calres: an object of class simcares with classification results for the calibration data.
testres: an object of class simcares with classification results for the test data, if it was provided.
cvres: an object of class simcares with classification results for cross-validation, if this option was chosen.

Fields inherited from pca class:

ncomp: number of components included in the model.
ncomp.selected: selected (optimal) number of components.
loadings: matrix with loading values (nvar x ncomp).
eigenvals: vector with eigenvalues for all existing components.
expvar: vector with explained variance for each component (in percent).
cumexpvar: vector with cumulative explained variance for each component (in percent).
T2lim: statistical limit for T2 distance.
Qlim: statistical limit for Q residuals.
info: information about the model, provided by the user when building the model.

Details

SIMCA is in fact a PCA model with additional functionality, so the simca class inherits most of the functionality of the pca class. It makes the classification decision using critical limits calculated for the Q residuals and T2 distances of the PCA model.
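
The idea of classifying by distance limits can be sketched in base R without mdatools. The code below is an illustration under stated assumptions, not the package's implementation: it uses prcomp for the PCA, autoscaled data, and plain 95% quantiles of the calibration distances as a crude stand-in for the statistical limits that simca computes. The helper names distances and is_member are hypothetical.

```r
# calibration data: first 25 setosa flowers from the built-in iris data
x.cal <- as.matrix(iris[1:25, 1:4])
means <- colMeans(x.cal)
sds <- apply(x.cal, 2, sd)

# PCA on autoscaled data, keeping ncomp components
ncomp <- 2
pca <- prcomp(x.cal, center = TRUE, scale. = TRUE)
P <- pca$rotation[, 1:ncomp, drop = FALSE]   # loadings (nvar x ncomp)
eigenvals <- pca$sdev[1:ncomp]^2             # variance of each score vector

# T2 and Q distances of objects to the PCA model (hypothetical helper)
distances <- function(x) {
  xs <- scale(as.matrix(x), center = means, scale = sds)
  scores <- xs %*% P
  resid <- xs - scores %*% t(P)
  list(
    T2 = rowSums(sweep(scores^2, 2, eigenvals, "/")),
    Q = rowSums(resid^2)
  )
}

# assumption: 95% quantiles of calibration distances stand in for the limits
d.cal <- distances(x.cal)
T2lim <- quantile(d.cal$T2, 0.95)
Qlim <- quantile(d.cal$Q, 0.95)

# an object is accepted as a class member if both distances are within limits
is_member <- function(x) {
  d <- distances(x)
  unname(d$T2 <= T2lim & d$Q <= Qlim)
}

is_member(iris[26:30, 1:4])   # remaining setosa objects
is_member(iris[51:55, 1:4])   # versicolor objects
```

Objects from another class lie far from the calibration data in the score space, in the residual space, or both, so at least one of the two distances exceeds its limit.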

Cross-validation settings, cv, can be a number or a list. If cv is a number, it is used as the number of segments for random cross-validation (if cv = 1, full cross-validation is performed). If it is a list, the following syntax can be used: cv = list('rand', nseg, nrep) for random repeated cross-validation with nseg segments and nrep repetitions, or cv = list('ven', nseg) for systematic splits into nseg segments ('venetian blinds').
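
To make the difference between the splitting schemes concrete, here is a small base R sketch of how objects could be assigned to segments. The helpers venetian_blinds and random_segments are hypothetical and written for illustration; they are not mdatools functions, and the package may assign segments differently in detail.

```r
# systematic, interleaved assignment of n objects to nseg segments,
# the scheme requested with cv = list("ven", nseg)
venetian_blinds <- function(n, nseg) {
  ((seq_len(n) - 1) %% nseg) + 1
}

# random assignment of n objects to nseg segments of (almost) equal size,
# the scheme requested with a numeric cv or cv = list("rand", nseg, nrep)
random_segments <- function(n, nseg) {
  sample(rep_len(seq_len(nseg), n))
}

n <- 20
venetian_blinds(n, 4)   # segment index of each object, repeating 1, 2, 3, 4
random_segments(n, 4)   # same segment sizes, random order

# full cross-validation (cv = 1): every object forms its own segment
venetian_blinds(n, n)
```

With repeated random cross-validation, the random assignment is simply drawn nrep times and the results are averaged over the repetitions.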

Examples


## make a SIMCA model for Iris setosa class with full cross-validation
library(mdatools)

data = iris[, 1:4]
class = iris[, 5]

# take first 20 objects of setosa as calibration set
se = data[1:20, ]

# make SIMCA model with full cross-validation and select one component
model = simca(se, "setosa", cv = 1)
model = selectCompNum(model, 1)

# show information, summary and plot overview
print(model)
summary(model)
plot(model)

# show predictions
par(mfrow = c(2, 1))
plotPredictions(model, show.labels = TRUE)
plotPredictions(model, res = "cal", ncomp = 2, show.labels = TRUE)
par(mfrow = c(1, 1))

# show sensitivity, misclassified ratio, loadings and residuals (ncomp = 2)
par(mfrow = c(2, 2))
plotSensitivity(model)
plotMisclassified(model)
plotLoadings(model, comp = c(1, 2), show.labels = TRUE)
plotResiduals(model, ncomp = 2)
par(mfrow = c(1, 1))

References

S. Wold, M. Sjostrom. "SIMCA: A method for analyzing chemical data in terms of similarity and analogy" in B.R. Kowalski (ed.), Chemometrics Theory and Application, American Chemical Society Symposium Series 52, Wash., D.C., American Chemical Society, p. 243-282.


See also

Methods for simca objects:

`print.simca`	shows information about the object.
`summary.simca`	shows summary statistics for the model.
`plot.simca`	makes an overview of SIMCA model with four plots.
`predict.simca`	applies SIMCA model to new data.

Methods inherited from classmodel class:

`plotPredictions.classmodel`	shows plot with predicted values.
`plotSensitivity.classmodel`	shows sensitivity plot.
`plotSpecificity.classmodel`	shows specificity plot.
`plotMisclassified.classmodel`	shows misclassified ratio plot.

Methods inherited from pca class:

`selectCompNum.pca`	sets the number of optimal components in the model.
`plotScores.pca`	shows scores plot.
`plotLoadings.pca`	shows loadings plot.
`plotVariance.pca`	shows explained variance plot.
`plotCumVariance.pca`	shows cumulative explained variance plot.
`plotResiduals.pca`	shows Q vs. T2 residuals plot.
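
As an illustration of predict.simca, the sketch below applies a calibrated model to new objects. It assumes the built-in iris data and an arbitrary split; passing reference class names as the third argument is optional and is done here only so that the summary can report performance statistics.

```r
library(mdatools)

# calibrate a setosa model on 25 objects and select one component
x.cal <- as.matrix(iris[1:25, 1:4])
model <- simca(x.cal, "setosa", ncomp = 3)
model <- selectCompNum(model, 1)

# new objects with known reference classes
x.new <- as.matrix(iris[26:150, 1:4])
c.ref <- as.character(iris[26:150, 5])

# apply the model; the result is an object of class simcares
res <- predict(model, x.new, c.ref)
summary(res)
```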

Author(s)

Sergey Kucheryavskiy (svkucheryavski@gmail.com)

mdatools package

Maintainer: Sergey Kucheryavskiy
License: MIT + file LICENSE
Last published: 2024-08-19
