racog function

Rapidly converging Gibbs algorithm.

Treats imbalanced discrete numeric datasets by generating synthetic minority examples that approximate the probability distribution of the minority class.

racog(dataset, numInstances, burnin = 100, lag = 20, classAttr = "Class")

Arguments

  • dataset: data.frame to treat. All columns except the classAttr one must be numeric or coercible to numeric.
  • numInstances: Integer. Number of new minority examples to generate.
  • burnin: Integer. Number of initial iterations of each chain whose generated examples are discarded. By default, 100.
  • lag: Integer. Number of iterations between consecutive examples kept from the chain started at a given minority example. By default, 20.
  • classAttr: character. Indicates the class attribute from dataset. Must exist in it.
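The requirement that every column except classAttr be numeric and discretized can be met in several ways. The following is a minimal base R sketch, assuming equal-width binning is acceptable for the data at hand. The helper discretize_numeric, its bins parameter, and the toy data are hypothetical and not part of the package.

```r
# Hypothetical helper (not part of the package): bin every non-class column
# into `bins` equal-width intervals and store the bin index as an integer.
discretize_numeric <- function(dataset, classAttr = "Class", bins = 5) {
  stopifnot(classAttr %in% names(dataset))
  features <- setdiff(names(dataset), classAttr)
  for (col in features) {
    values <- as.numeric(dataset[[col]])
    # labels = FALSE makes cut() return integer bin codes instead of a factor
    dataset[[col]] <- as.integer(cut(values, breaks = bins, labels = FALSE))
  }
  dataset
}

# Toy data invented for illustration
toy <- data.frame(
  x = c(0.1, 0.4, 2.3, 5.8, 9.9, 7.2),
  y = c(10, 12, 35, 40, 18, 27),
  Class = factor(c("positive", "negative", "negative",
                   "negative", "positive", "negative"))
)

toy_discrete <- discretize_numeric(toy, classAttr = "Class", bins = 3)
str(toy_discrete)
```

The class column is left untouched, so the result can be passed on with the same classAttr value.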

Returns

A data.frame with the same structure as dataset, containing the generated synthetic examples.
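Because the result has the same structure as dataset, it can be appended to the original data with rbind. The sketch below uses a hand-written stand-in for the function's output so that it runs without the package; it assumes the returned rows carry the minority label in the class column. All values are invented for illustration.

```r
# Toy imbalanced data, invented for illustration: 6 negative, 2 positive
dataset <- data.frame(
  x = c(1, 2, 2, 3, 1, 3, 2, 1),
  y = c(1, 1, 2, 3, 3, 2, 2, 1),
  Class = factor(c(rep("negative", 6), rep("positive", 2)),
                 levels = c("negative", "positive"))
)

# Stand-in for the output of racog(dataset, numInstances = 4);
# assumed to have the same columns and class levels as `dataset`
synthetic <- data.frame(
  x = c(2, 1, 1, 2),
  y = c(1, 1, 2, 1),
  Class = factor(rep("positive", 4), levels = levels(dataset$Class))
)

# Append the synthetic minority examples and inspect the class counts
balanced <- rbind(dataset, synthetic)
table(balanced$Class)
```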

Details

Approximates the minority class distribution using a Gibbs sampler. The dataset must be discretized and numeric. In each iteration, it builds a new sample using a Markov chain. It discards the first burnin iterations and, from then on, accepts the current sample as a new minority example every lag iterations. It generates d (iterations - burnin) / lag examples, where d is the number of minority examples.
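The effect of burnin and lag on the number of generated examples can be illustrated with a small sketch. It only models the bookkeeping, not the Gibbs sampler itself. The helper kept_iterations is hypothetical, and it assumes the first accepted sample is the one at iteration burnin + lag; the package may count slightly differently.

```r
# Hypothetical helper: indices of the iterations whose sample is kept,
# assuming the first burnin iterations are discarded and every lag-th
# iteration after that is accepted.
kept_iterations <- function(iterations, burnin = 100, lag = 20) {
  stopifnot(iterations >= burnin, lag >= 1)
  idx <- seq_len(iterations)
  idx[idx > burnin & (idx - burnin) %% lag == 0]
}

kept <- kept_iterations(iterations = 200, burnin = 100, lag = 20)
kept          # 120 140 160 180 200

# With d minority examples, each starting its own chain, the total is
# d * (iterations - burnin) / lag
d <- 10
d * length(kept)   # 50
```

A larger lag reduces the correlation between consecutive kept samples at the cost of more iterations per generated example.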

Examples

data(iris0)

# Generates new minority examples
newSamples <- racog(iris0, numInstances = 40, burnin = 20, lag = 10, classAttr = "Class")
newSamples <- racog(iris0, numInstances = 100)

References

Das, Barnan; Krishnan, Narayanan C.; Cook, Diane J. RACOG and wRACOG: Two Probabilistic Oversampling Techniques. IEEE Transactions on Knowledge and Data Engineering 27 (2015), No. 1, pp. 222–234.

  • Maintainer: Ignacio Cordón
  • License: GPL (>= 2) | file LICENSE
  • Last published: 2020-04-07