neater() R function from [imbalance]

Filtering of oversampled data based on non-cooperative game theory

Filters oversampled examples from a binary class dataset, using game theory to decide whether each example is worth keeping.


neater(
  dataset,
  newSamples,
  k = 3,
  iterations = 100,
  smoothFactor = 1,
  classAttr = "Class"
)

Arguments

dataset: The original data.frame. All columns except the classAttr column must be numeric or coercible to numeric.
newSamples: A data.frame containing the samples to be filtered. Must have the same structure as dataset.
k: Integer. Number of nearest neighbours used by the KNN algorithm to rule out samples. By default, 3.
iterations: Integer. Number of iterations for the algorithm. By default, 100.
smoothFactor: A positive numeric. By default, 1.
classAttr: Character. Name of the class attribute in dataset and newSamples. It must exist in both.
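
The requirements on dataset, newSamples and classAttr can be checked before calling neater(). The following is a minimal sketch in base R; check_neater_inputs is a hypothetical helper, not part of the imbalance package. It reads "same structure" as "same column names in the same order", and it is stricter than the documentation because it only accepts columns that are already numeric.

```r
# Hypothetical helper, not part of the imbalance package.
check_neater_inputs <- function(dataset, newSamples, classAttr = "Class") {
  stopifnot(is.data.frame(dataset), is.data.frame(newSamples))
  # classAttr must name a column in both data frames
  stopifnot(classAttr %in% names(dataset), classAttr %in% names(newSamples))
  # Assumption: "same structure" means identical column names and order
  stopifnot(identical(names(dataset), names(newSamples)))
  # Assumption: require numeric columns outright (stricter than
  # "coercible to numeric")
  features <- setdiff(names(dataset), classAttr)
  numeric_ok <- vapply(features, function(col) {
    is.numeric(dataset[[col]]) && is.numeric(newSamples[[col]])
  }, logical(1))
  stopifnot(all(numeric_ok))
  invisible(TRUE)
}

# Toy data, made up for illustration
toy <- data.frame(x = c(1, 2, 3), y = c(0.5, 1.5, 2.5),
                  Class = factor(c("positive", "negative", "negative")))
synthetic <- data.frame(x = 1.2, y = 0.7,
                        Class = factor("positive", levels = levels(toy$Class)))
check_neater_inputs(toy, synthetic)
```

A mismatch in the class column name, such as "class" instead of "Class", makes the helper stop with an error.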

Returns

Filtered samples as a data.frame with same structure as newSamples.
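
Since the result has the same structure as newSamples, and newSamples has the same structure as dataset, the filtered samples can be appended to the original data with rbind(). The sketch below uses made-up toy data frames: original stands in for dataset and filtered stands in for the value returned by neater().

```r
# Toy stand-ins, made up for illustration: `original` plays the role of
# dataset, `filtered` the role of the data.frame returned by neater().
class_levels <- c("negative", "positive")
original <- data.frame(
  x = c(1, 2, 3, 4),
  y = c(1, 1, 2, 2),
  Class = factor(c("positive", "negative", "negative", "negative"),
                 levels = class_levels)
)
filtered <- data.frame(
  x = c(1.1, 0.9),
  y = c(1.2, 0.8),
  Class = factor(c("positive", "positive"), levels = class_levels)
)

# Same columns in both data frames, so the rows can be stacked directly
augmented <- rbind(original, filtered)

# Class counts after adding the filtered synthetic examples
table(augmented$Class)
```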

Details

Uses game theory and Nash equilibria to calculate the probability that each minority example truly belongs to the minority class. It discards the examples that, at the final stage of the algorithm, are more likely to be majority examples than minority ones.
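
The final discard rule can be illustrated in isolation. This sketch does not reproduce the game-theoretic computation or the package's internals. It assumes the final probabilities are already available, with hypothetical values in p_minority, and it applies the rule as stated above: an example is dropped only when majority is strictly more probable, so ties are kept.

```r
# Hypothetical final probabilities of belonging to the minority class,
# one per synthetic example (values made up for illustration).
p_minority <- c(0.92, 0.35, 0.50, 0.08)
p_majority <- 1 - p_minority

# Keep an example unless majority is strictly more probable than minority
keep <- p_minority >= p_majority

# Hypothetical synthetic examples with a single feature x
candidates <- data.frame(x = c(1.1, 3.9, 2.5, 4.2), p_minority = p_minority)
kept <- candidates[keep, , drop = FALSE]
kept
```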

Examples


library(imbalance)
data(iris0)

newSamples <- smotefamily::SMOTE(iris0[,-5], iris0[,5])$syn_data
# SMOTE renames the Class attribute to class, but
# newSamples must have the same class attribute as dataset
names(newSamples) <- c(names(newSamples)[-5], "Class")

neater(iris0, newSamples, k = 5, iterations = 100,
       smoothFactor = 1, classAttr = "Class")

References

Almogahed, B.A.; Kakadiaris, I.A. Neater: Filtering of Over-Sampled Data Using Non-Cooperative Game Theory. Soft Computing 19 (2014), Nr. 11, p. 3301–3322.

imbalance package

Maintainer: Ignacio Cordón
License: GPL (>= 2) | file LICENSE
Last published: 2020-04-07