cv.irsvm_fit() R function from [mpath]

Internal function of cross-validation for irsvm

Internal function to conduct k-fold cross-validation for irsvm


Usage

cv.irsvm_fit(x, y, weights, cfun="ccave", s=c(1, 5), type=NULL, 
             kernel="radial", gamma=2^(-4:10), cost=2^(-4:4), 
             epsilon=0.1, balance=TRUE, nfolds=10, foldid, 
             trim_ratio=0.9, n.cores=2, ...)

Arguments

x: a data matrix, a vector, or a sparse 'design matrix' (object of class Matrix provided by the Matrix package, or of class matrix.csr provided by the SparseM package, or of class simple_triplet_matrix provided by the slam package).
y: a response vector with one label for each row/component of x. Can be either a factor (for classification tasks) or a numeric vector (for regression).
weights: the weight of each subject. It should have the same length as y.
cfun: character, type of convex cap (concave) function.

Valid options are:
- "hcave"
- "acave"
- "bcave"
- "ccave"
- "dcave"
- "ecave"
- "gcave"
- "tcave"
s: tuning parameter of cfun. s > 0, except that s can equal 0 for cfun="tcave". If s is too close to 0 for cfun="acave", "bcave", or "ccave", the calculated weights can become 0 for all observations, which crashes the program.
type: irsvm can be used as a classification machine or as a regression machine. Depending on whether y is a factor or not, the default setting for type is C-classification or eps-regression, respectively, but it may be overridden by setting an explicit value.

Valid options are:
- C-classification
- nu-classification
- eps-regression
- nu-regression
kernel, gamma: the kernel used in training and predicting. You might consider changing some of the following parameters, depending on the kernel type.
- linear: $u^\top v$
- polynomial: $(\gamma\, u^\top v + \mathrm{coef0})^{\mathrm{degree}}$
- radial basis: $\exp(-\gamma \|u - v\|^2)$
- sigmoid: $\tanh(\gamma\, u^\top v + \mathrm{coef0})$
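As an illustration of the four formulas, here is a minimal base R sketch that evaluates each kernel for a single pair of numeric vectors. The helper names (`linear_kernel` and so on) and the values chosen for `gamma`, `coef0` and `degree` are hypothetical and are not part of mpath.

```r
# Base R sketch of the kernel formulas for two numeric vectors u and v.
# Helper names and parameter values are illustrative assumptions only.
linear_kernel  <- function(u, v) sum(u * v)
poly_kernel    <- function(u, v, gamma, coef0 = 0, degree = 3) {
  (gamma * sum(u * v) + coef0)^degree
}
radial_kernel  <- function(u, v, gamma) exp(-gamma * sum((u - v)^2))
sigmoid_kernel <- function(u, v, gamma, coef0 = 0) {
  tanh(gamma * sum(u * v) + coef0)
}

u <- c(1, 2)
v <- c(3, 4)

k_lin  <- linear_kernel(u, v)                                   # 1*3 + 2*4
k_poly <- poly_kernel(u, v, gamma = 1, coef0 = 1, degree = 2)   # (11 + 1)^2
k_rbf  <- radial_kernel(u, v, gamma = 2^-4)                     # exp(-8 / 16)
k_sig  <- sigmoid_kernel(u, v, gamma = 0.1)                     # tanh(1.1)
```

Only the radial, polynomial and sigmoid kernels involve gamma, which is why gamma is tuned only when the kernel is nonlinear.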
cost: cost of constraints violation. It is the C-constant of the regularization term in the Lagrange formulation, and it is proportional to the inverse of lambda in irglmreg. A vector of candidate values can be supplied (default in the usage above: 2^(-4:4)).
epsilon: epsilon in the insensitive-loss function (default: 0.1)
balance: for type="C-classification", "nu-classification" only
nfolds: number of folds >=3, default is 10
foldid: an optional vector of values between 1 and nfolds identifying what fold each observation is in. If supplied, nfolds can be missing and will be ignored.
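A fold assignment of this shape can be built in base R. The sketch below is one common way to do it, assuming a sample size `n` of 25 chosen only for illustration; it does not claim to reproduce how mpath generates folds when foldid is missing.

```r
# Build a roughly balanced random fold assignment for n observations.
# n and the seed are illustrative assumptions.
set.seed(1)
n      <- 25
nfolds <- 10

# Repeat the labels 1..nfolds until n labels exist, then shuffle them.
foldid <- sample(rep(seq_len(nfolds), length.out = n))

# Fold sizes differ by at most one observation.
fold_sizes <- table(foldid)
```

Supplying the same foldid to repeated runs keeps the folds fixed, so differences in cross-validated error reflect the tuning parameters rather than the random split.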
trim_ratio: a number between 0 and 1 for trimmed least squares, useful if type="eps-regression" or "nu-regression".
n.cores: The number of CPU cores to use. The cross-validation loop will attempt to send different CV folds off to different cores.
...: Other arguments that can be passed to irsvm.

Details

This function is the driving force behind cv.irsvm. It performs K-fold cross-validation to determine the optimal tuning parameters of the SVM: cost, and also gamma if the kernel is nonlinear. It can also choose the value of s used in cfun.
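To give a sense of the size of the search, the sketch below enumerates the default candidate values of s, gamma and cost from the usage line. Treating the search as a full Cartesian grid is an assumption made for illustration; the function's internal enumeration may differ.

```r
# Default candidate values taken from the usage line.
s     <- c(1, 5)
gamma <- 2^(-4:10)
cost  <- 2^(-4:4)

# Assumed full grid for a nonlinear kernel: every (s, gamma, cost) combination.
grid_nonlinear <- expand.grid(s = s, gamma = gamma, cost = cost)

# For kernel = "linear" gamma is not tuned, so only s and cost vary.
grid_linear <- expand.grid(s = s, cost = cost)

n_nonlinear <- nrow(grid_nonlinear)  # 2 * 15 * 9
n_linear    <- nrow(grid_linear)     # 2 * 9
```

Under this assumption each combination is fitted once per fold, so the default grid with nfolds=10 implies on the order of 2700 fits for a nonlinear kernel. Narrowing gamma or cost shrinks that count proportionally.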

Returns

An object of class "cv.irsvm" is returned, which is a list with the ingredients of the cross-validation fit.

residmat: matrix with row values for kernel="linear" are s, cost, error, k, where k is the number of cross-validation fold. For nonlinear kernels, row values are s, gamma, cost, error, k.
cost: a value of cost that gives minimum cross-validated value in irsvm.
gamma: a value of gamma that gives minimum cross-validated value in irsvm.
s: value of s for cfun that gives minimum cross-validated value in irsvm.
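The returned cost, gamma and s are the values that minimize the cross-validated error. The sketch below shows that selection step on made-up numbers. The data frame `cv_err`, its long layout (one row per parameter combination and fold, with k read as a fold index), and averaging over folds are all assumptions for illustration; the actual layout of residmat and the aggregation used by mpath may differ.

```r
# Hypothetical per-fold errors: one row per (s, gamma, cost, fold k).
cv_err <- expand.grid(s = c(1, 5), gamma = c(0.5, 2), cost = c(1, 4), k = 1:3)
set.seed(42)
cv_err$error <- runif(nrow(cv_err))  # random stand-in for real CV errors

# Average the error over folds for each parameter combination.
mean_err <- aggregate(error ~ s + gamma + cost, data = cv_err, FUN = mean)

# Keep the combination with the smallest mean cross-validated error.
best <- mean_err[which.min(mean_err$error), ]
```

Here `best$s`, `best$gamma` and `best$cost` play the role of the s, gamma and cost components described above.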

References

Zhu Wang (2024) Unified Robust Estimation, Australian & New Zealand Journal of Statistics. 66(1):77-102.

Author(s)

Zhu Wang zwang145@uthsc.edu
