csi function

Cholesky decomposition with Side Information

The csi function in kernlab implements an incomplete Cholesky decomposition algorithm that exploits side information (e.g., classification labels, regression responses) to compute a low-rank decomposition of a kernel matrix from the data.

## S4 method for signature 'matrix'
csi(x, y, kernel = "rbfdot", kpar = list(sigma = 0.1), rank, centering = TRUE, kappa = 0.99, delta = 40, tol = 1e-5)

Arguments

  • x: The data matrix indexed by row

  • y: the classification labels or regression responses. In classification, y is an $m \times n$ matrix, where $m$ is the number of data points and $n$ is the number of classes, and the entry $y_i$ of a row is 1 if the corresponding x belongs to class i (and 0 otherwise).

  • kernel: the kernel function used in training and predicting. This parameter can be set to any function, of class kernel, which computes the inner product in feature space between two vector arguments. kernlab provides the most popular kernel functions which can be used by setting the kernel parameter to the following strings:

    • rbfdot Radial Basis kernel function "Gaussian"
    • polydot Polynomial kernel function
    • vanilladot Linear kernel function
    • tanhdot Hyperbolic tangent kernel function
    • laplacedot Laplacian kernel function
    • besseldot Bessel kernel function
    • anovadot ANOVA RBF kernel function
    • splinedot Spline kernel
    • stringdot String kernel

    The kernel parameter can also be set to a user defined function of class kernel by passing the function name as an argument.

  • kpar: the list of hyper-parameters (kernel parameters). This is a list which contains the parameters to be used with the kernel function. Valid parameters for existing kernels are:

    • sigma inverse kernel width for the Radial Basis kernel function "rbfdot" and the Laplacian kernel "laplacedot".
    • degree, scale, offset for the Polynomial kernel "polydot"
    • scale, offset for the Hyperbolic tangent kernel function "tanhdot"
    • sigma, order, degree for the Bessel kernel "besseldot".
    • sigma, degree for the ANOVA kernel "anovadot".

    Hyper-parameters for user defined kernels can be passed through the kpar parameter as well.

  • rank: maximal rank of the computed kernel matrix

  • centering: if TRUE centering is performed (default: TRUE)

  • kappa: trade-off between approximation of K and prediction of Y (default: 0.99)

  • delta: number of columns of cholesky performed in advance (default: 40)

  • tol: minimum gain at each iteration (default: 1e-5)
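As an illustration of the y argument in the classification case, the following sketch builds the $m \times n$ indicator matrix from a factor of class labels using only base R. The helper name `make_indicator` is hypothetical and not part of kernlab.

```r
# Hypothetical helper (not part of kernlab): turn a vector of class labels
# into an m x n 0/1 indicator matrix with one column per class.
make_indicator <- function(labels) {
  labels <- as.factor(labels)
  ymat <- matrix(0, nrow = length(labels), ncol = nlevels(labels))
  # Matrix indexing: set entry (row, class index) to 1 for every observation.
  ymat[cbind(seq_along(labels), as.integer(labels))] <- 1
  ymat
}

data(iris)
ymat <- make_indicator(iris$Species)
dim(ymat)
```

A matrix built this way could then be supplied as y, for example `csi(x, ymat, rank = 30)`; note that rank has no default in the usage above and has to be given.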

Details

An incomplete Cholesky decomposition calculates $Z$ where $K = ZZ'$, $K$ being the kernel matrix. Since the rank of a kernel matrix is usually low, $Z$ tends to be smaller than the complete kernel matrix. The decomposed matrix can be used to create memory-efficient kernel-based algorithms without the need to compute and store a complete kernel matrix in memory.
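To make the relation $K \approx ZZ'$ concrete, here is a minimal base R sketch of a plain pivoted incomplete Cholesky decomposition. It is not the csi algorithm (it ignores the side information y) and not kernlab's implementation; the function names `rbf_kernel_matrix` and `incomplete_cholesky` are hypothetical. The full kernel matrix is formed here only so that the approximation error can be measured.

```r
# Assumed Gaussian kernel k(a, b) = exp(-sigma * ||a - b||^2), computed densely.
rbf_kernel_matrix <- function(X, sigma) {
  exp(-sigma * as.matrix(dist(X))^2)
}

# Hypothetical sketch: greedy pivoted incomplete Cholesky of a PSD matrix K.
# Returns Z (m x r, r <= rank) such that tcrossprod(Z) approximates K.
incomplete_cholesky <- function(K, rank, tol = 1e-8) {
  m <- nrow(K)
  Z <- matrix(0, m, rank)
  d <- unname(diag(K))          # residual diagonal of K - Z Z'
  pivots <- integer(0)
  for (j in seq_len(rank)) {
    p <- which.max(d)           # pivot on the largest remaining residual
    if (d[p] < tol) {           # nothing left to explain: stop early
      Z <- Z[, seq_len(j - 1), drop = FALSE]
      break
    }
    pivots <- c(pivots, p)
    col <- K[, p]
    if (j > 1) {
      prev <- seq_len(j - 1)
      # remove the part of column p already explained by earlier columns
      col <- col - drop(Z[, prev, drop = FALSE] %*% Z[p, prev])
    }
    Z[, j] <- col / sqrt(d[p])
    d <- pmax(d - Z[, j]^2, 0)  # update residual diagonal, guard round-off
  }
  list(Z = Z, pivots = pivots)
}

data(iris)
X <- as.matrix(iris[, 1:4])
K <- rbf_kernel_matrix(X, sigma = 0.1)

fit5  <- incomplete_cholesky(K, rank = 5)
fit30 <- incomplete_cholesky(K, rank = 30)

# Largest absolute entry of the approximation error for each rank.
err5  <- max(abs(K - tcrossprod(fit5$Z)))
err30 <- max(abs(K - tcrossprod(fit30$Z)))
c(rank5 = err5, rank30 = err30)
```

Storing Z takes m × r numbers instead of the m × m needed for K, which is where the memory saving comes from when r is much smaller than m.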

csi uses the class labels or regression responses to compute an approximation that is better suited to the problem at hand, taking into account the additional information from the response variable.

Returns

An S4 object of class "csi", which is an extension of the class "matrix". The object is the decomposed kernel matrix along with the slots:

  • pivots: Indices on which pivots were done

  • diagresidues: Residuals left on the diagonal

  • maxresiduals: Residuals picked for pivoting

  • predgain: predicted gain before adding each column

  • truegain: actual gain after adding each column

  • Q: the Q factor of the QR decomposition of the kernel matrix

  • R: the R factor of the QR decomposition of the kernel matrix

Slots can be accessed either by object@slot or by accessor functions with the same name (e.g., pivots(object)).
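The two access styles can be compared in a short sketch, assuming kernlab is installed; the call mirrors the usage shown earlier, and the variable names are only illustrative. The snippet is guarded so that it does nothing when the package is unavailable.

```r
# Guarded so the snippet is a no-op when kernlab is not installed.
if (requireNamespace("kernlab", quietly = TRUE)) {
  library(kernlab)
  data(iris)
  x <- as.matrix(iris[, -5])

  # 150 x 3 indicator matrix of class membership
  y <- matrix(0, nrow(x), 3)
  y[cbind(seq_len(nrow(x)), as.integer(iris$Species))] <- 1

  Z <- csi(x, y, kernel = "rbfdot", kpar = list(sigma = 0.1), rank = 30)

  piv_slot <- Z@pivots   # direct slot access
  piv_fun  <- pivots(Z)  # accessor function with the same name
  print(identical(piv_slot, piv_fun))
}
```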

References

Francis R. Bach, Michael I. Jordan

Predictive low-rank decomposition for kernel methods.

Proceedings of the Twenty-second International Conference on Machine Learning (ICML) 2005

http://www.di.ens.fr/~fbach/bach_jordan_csi.pdf

Author(s)

Alexandros Karatzoglou (based on Matlab code by Francis Bach)

alexandros.karatzoglou@ci.tuwien.ac.at

See Also

inchol, chol, csi-class

Examples

    library(kernlab)
    data(iris)

    ## create multidimensional y matrix
    yind <- t(matrix(1:3,3,150))
    ymat <- matrix(0, 150, 3)
    ymat[yind==as.integer(iris[,5])] <- 1

    datamatrix <- as.matrix(iris[,-5])
    # initialize kernel function
    rbf <- rbfdot(sigma=0.1)
    rbf
    Z <- csi(datamatrix,ymat, kernel=rbf, rank = 30)
    dim(Z)
    pivots(Z)
    # calculate kernel matrix
    K <- crossprod(t(Z))
    # difference between approximated and real kernel matrix
    (K - kernelMatrix(kernel=rbf, datamatrix))[6,]

  • Maintainer: Alexandros Karatzoglou
  • License: GPL-2
  • Last published: 2024-08-13
