ggplotFeatureImportance(featureList, control = list(), ...)

plotFeatureImportance(featureList, control = list(), ...)
Arguments
featureList: [list]
List of vectors of features. One list element is expected to belong to one resampling iteration / fold.
control: [list]
A list that stores additional configuration parameters (see the sketch after this argument list):
featimp.col_{high/medium/low}: Colors for features that are used often, sometimes or only a few times.
featimp.perc_{high/low}: Percentages of the total number of folds that define whether a feature is used often, sometimes or only a few times.
featimp.las: Alignment of axis labels.
featimp.lab_{feat/resample}: Axis labels (features and resample iterations).
featimp.string_angle: Angle for the features on the x-axis.
featimp.pch_{active/inactive}: Plot symbol of the active and inactive points.
featimp.col_inactive: Color of the inactive points.
featimp.col_vertical: Color of the vertical lines.
featimp.lab_{title/strip}: Label used for the title and/or strip label. These parameters are only relevant for ggplotFeatureImportance.
featimp.legend_position: Location of the legend. This parameter is only relevant for ggplotFeatureImportance.
featimp.flip_axes: Should the axes be flipped? This parameter is only relevant for ggplotFeatureImportance.
featimp.plot_tiles: Visualize (non-)selected features with tiles? This parameter is only relevant for ggplotFeatureImportance.
...: [any]
Further arguments to be passed to plot.
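For instance, the colors and thresholds that mark frequently selected features can be adjusted via control. The following is a minimal sketch, assuming the {high/medium/low} placeholders above expand to parameter names such as featimp.col_high and featimp.perc_high; the chosen values are illustrative only:

# Hedged sketch: parameter names expanded from the placeholders above,
# values chosen purely for illustration.
plotFeatureImportance(featureList,
  control = list(
    featimp.col_high = "darkgreen",  # features selected in most folds
    featimp.col_medium = "orange",   # features selected occasionally
    featimp.col_low = "grey",        # features selected only rarely
    featimp.perc_high = 0.8,         # "often" = selected in at least 80% of the folds
    featimp.perc_low = 0.2,          # "rarely" = selected in at most 20% of the folds
    featimp.string_angle = 45        # rotate the feature labels on the x-axis
  ))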
Returns
[plot].
A feature importance plot, indicating which features were used during which resampling iteration.
Examples
## Not run:
# At the beginning, one needs a list of features, e.g. derived during a
# nested feature selection within mlr (see the following 8 steps):
library(mlr)
library(mlbench)
data(Glass)

# (1) Create a classification task:
classifTask = makeClassifTask(data = Glass, target = "Type")

# (2) Define the model (here, a classification tree):
lrn = makeLearner(cl = "classif.rpart")

# (3) Define the resampling strategy, which is supposed to be used within
# each inner loop of the nested feature selection:
innerResampling = makeResampleDesc("Holdout")

# (4) What kind of feature selection approach should be used? Here, we use a
# sequential backward strategy, i.e. starting from a model with all features,
# in each step the feature decreasing the performance measure the least is
# removed from the model:
ctrl = makeFeatSelControlSequential(method = "sbs")

# (5) Wrap the original model (see (2)) in order to allow feature selection:
wrappedLearner = makeFeatSelWrapper(learner = lrn,
  resampling = innerResampling, control = ctrl)

# (6) Define a resampling strategy for the outer loop. This is necessary in
# order to assess whether the selected features depend on the underlying
# fold:
outerResampling = makeResampleDesc(method = "CV", iters = 10L)

# (7) Perform the feature selection:
featselResult = resample(learner = wrappedLearner, task = classifTask,
  resampling = outerResampling, models = TRUE)

# (8) Extract the features, which were selected during each iteration of the
# outer loop (i.e. during each of the 10 folds of the cross-validation):
featureList = lapply(featselResult$models,
  function(mod) getFeatSelResult(mod)$x)
## End(Not run)

########################################################################

# Now, one could inspect the features manually:
featureList
# Alternatively, one might use visual means such as the feature
# importance plot. There exist two versions of the feature importance
# plot: one based on the classical R figures
plotFeatureImportance(featureList)

# and one using ggplot
ggplotFeatureImportance(featureList)
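The ggplot version additionally accepts the ggplot-specific control parameters listed above. A minimal sketch, assuming the placeholder names expand as indicated (e.g. featimp.lab_title) and using illustrative values:

# Hedged sketch of the ggplot-specific options; parameter names are
# assumed from the descriptions in the Arguments section.
ggplotFeatureImportance(featureList,
  control = list(
    featimp.lab_title = "Selected features per CV fold",  # plot title
    featimp.legend_position = "bottom",                   # place the legend below the plot
    featimp.flip_axes = TRUE,                             # swap the feature and fold axes
    featimp.plot_tiles = TRUE                             # show (non-)selected features as tiles
  ))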