Detects training cases that differ from all other cases, based on the random forest instance proximity measure.
rfOutliers(model, dataset)
Arguments
model: a random forest model returned by CoreModel
dataset: a training set used to generate the model
Returns
For each instance in the dataset, the function returns a numeric score of its strangeness relative to the other cases.
Details
Strangeness is defined using the random forest model via a proximity matrix (see rfProximity). If the score of a case is greater than 10, the case can be considered an outlier (Breiman, 2001).
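To make the link between proximities and the score concrete, the following base R sketch computes an outlier measure in the spirit of Breiman's description: the raw value is the number of cases divided by the sum of squared proximities to cases of the same class, which is then centered and scaled within the class. This is an illustration under stated assumptions, not necessarily the exact computation performed by rfOutliers. The helper name breimanOutlier, the toy proximity matrix, the inclusion of the diagonal, and the use of the unscaled median absolute deviation as the spread are all assumptions made for this example.

```r
# Hypothetical helper (not part of CORElearn): outlier measure from a
# square proximity matrix `prox` and class labels `cls`.
breimanOutlier <- function(prox, cls) {
  stopifnot(is.matrix(prox), nrow(prox) == ncol(prox),
            length(cls) == nrow(prox))
  n <- nrow(prox)
  cls <- factor(cls)
  score <- rep(NA_real_, n)
  for (lvl in levels(cls)) {
    inClass <- cls == lvl
    # sum of squared proximities to cases of the same class
    sq <- rowSums(prox[inClass, inClass, drop = FALSE]^2)
    # raw measure: large when a case is far from its own class
    raw <- n / ifelse(sq == 0, 1, sq)
    # assumption: spread is the unscaled median absolute deviation
    dev <- mad(raw, constant = 1)
    score[inClass] <- (raw - median(raw)) / ifelse(dev == 0, 1, dev)
  }
  score
}

# Toy data: cases 1 to 5 are close to each other, case 6 is isolated
idx <- 1:5
prox <- matrix(0.05, 6, 6)
prox[idx, idx] <- 0.5 + 0.05 * outer(idx, idx, "+")
diag(prox) <- 1
cls <- rep("a", 6)

scores <- breimanOutlier(prox, cls)
print(round(scores, 2))
```

In this toy matrix the isolated sixth case receives the largest score, because its squared proximities to the rest of its class are the smallest.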
Examples
# first create a random forest tree using CORElearn
dataset <- iris
md <- CoreModel(Species ~ ., dataset, model="rf", rfNoTrees=30, maxThreads=1)
outliers <- rfOutliers(md, dataset)
plot(abs(outliers))
# for a nicer display try
plot(md, dataset, rfGraphType="outliers")
destroyModels(md) # clean up
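Once the scores are available, the threshold mentioned in Details can be applied to list the suspicious cases. The sketch below is one possible way to do this. The helper flagOutliers and the score vector are hypothetical and stand in for the result of rfOutliers, and comparing absolute values follows the use of abs() in the example above, on the assumption that only the magnitude of the score matters.

```r
# Hypothetical helper (not part of CORElearn): indices of cases whose
# absolute score exceeds the threshold.
flagOutliers <- function(scores, threshold = 10) {
  stopifnot(is.numeric(scores), length(threshold) == 1)
  which(abs(scores) > threshold)
}

# Made-up scores standing in for the result of rfOutliers(md, dataset)
exampleScores <- c(0.4, -1.2, 12.7, 3.1, -15.0, 9.9)

flagged <- flagOutliers(exampleScores)
print(flagged)  # cases 3 and 5 exceed the threshold of 10
```

With real data the same call would be flagOutliers(outliers), and dataset[flagged, ] would show the flagged rows for inspection.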
Author(s)
John Adeyanju Alao (as a part of his BSc thesis) and Marko Robnik-Sikonja (thesis supervisor)
See Also
CoreModel, rfProximity, rfClustering.
References
Leo Breiman: Random Forests. Machine Learning Journal, 45:5-32, 2001