Synthetic Minority Oversampling Technique (SMOTE) algorithm for imbalanced classification data.
smote(y, x, k = 5, over = NULL, yminor = NULL)
Arguments
y: Vector of response outcome as a factor
x: Matrix of predictors
k: Number of nearest neighbours (KNN) to consider when generating new data
over: Amount of oversampling of the minority class. If set to NULL, then all classes will be oversampled up to the number of samples in the majority class.
yminor: Optional character value specifying the level in y which is to be oversampled. If NULL, this is set automatically to the class with the smallest sample size.
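The defaults described for over and yminor come down to simple class counts. The following base R sketch works through that arithmetic on a toy response vector. The data and the names counts, yminor_default and n_synthetic are hypothetical and used only for illustration; this is not the function's internal code.

```r
# Hypothetical toy response with three imbalanced classes
y <- factor(c(rep("a", 10), rep("b", 4), rep("c", 6)))

# Number of samples in each class
counts <- table(y)

# Default described for yminor = NULL: the class with the smallest sample size
yminor_default <- names(counts)[which.min(counts)]

# With over = NULL every class is raised to the size of the majority class,
# so the number of synthetic samples needed per class is:
n_synthetic <- max(counts) - counts

yminor_default
n_synthetic
```

Here class "a" is the majority class and needs no new samples, while "b" and "c" need 6 and 4 synthetic samples respectively to match it.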
Returns
List containing extended matrix x of synthesised data and extended response vector y
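As a rough illustration of how synthetic rows can be generated and appended, the sketch below follows the interpolation idea in Chawla et al. (2002): each new point lies on the line segment between a minority sample and one of its k nearest minority neighbours. The function smote_sketch, its argument n_new, the toy data and the list element names x and y are assumptions made for this example; it is not the implementation behind smote().

```r
# Minimal SMOTE-style interpolation for a single minority class.
# smote_sketch and n_new are hypothetical names, not part of the documented API.
smote_sketch <- function(x_minor, k = 5, n_new = 10) {
  n <- nrow(x_minor)
  k <- min(k, n - 1)                       # at most n - 1 other samples exist
  d <- as.matrix(dist(x_minor))            # pairwise Euclidean distances
  diag(d) <- Inf                           # a point is not its own neighbour
  # Indices of the k nearest neighbours, one column per minority sample
  nn <- apply(d, 1, function(r) order(r)[seq_len(k)])
  nn <- matrix(nn, nrow = k)               # keep matrix shape when k = 1
  base <- sample(n, n_new, replace = TRUE) # seed sample for each new point
  pick <- sample(k, n_new, replace = TRUE) # which neighbour of the seed to use
  nbr <- nn[cbind(pick, base)]
  gap <- runif(n_new)                      # position along the segment, in [0, 1]
  x_minor[base, , drop = FALSE] +
    gap * (x_minor[nbr, , drop = FALSE] - x_minor[base, , drop = FALSE])
}

# Toy data: 24 majority and 6 minority samples with 2 predictors
set.seed(1)
x <- matrix(rnorm(60), ncol = 2)
y <- factor(c(rep("major", 24), rep("minor", 6)))

# Generate 18 synthetic minority rows so both classes reach 24 samples
x_minor <- x[y == "minor", , drop = FALSE]
x_new <- smote_sketch(x_minor, k = 5, n_new = 18)

# Extended predictor matrix and response vector (element names assumed)
out <- list(
  x = rbind(x, x_new),
  y = factor(c(as.character(y), rep("minor", nrow(x_new))), levels = levels(y))
)
table(out$y)
```

Because every synthetic row is a convex combination of two existing minority rows, the new points stay within the range of the minority class in each predictor.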
References
Chawla, N. V., Bowyer, K. W., Hall, L. O., and Kegelmeyer, W. P. (2002). SMOTE: Synthetic minority over-sampling technique. Journal of Artificial Intelligence Research, 16:321-357.