Interaction estimates the feature interactions in a prediction model.
Interactions between features are measured via the decomposition of the prediction function: If a feature j has no interaction with any other feature, the prediction function can be expressed as the sum of the partial function that depends only on j and the partial function that only depends on features other than j. If the variance of the full function is completely explained by the sum of the partial functions, there is no interaction between feature j and the other features. Any variance that is not explained can be attributed to the interaction and is used as a measure of interaction strength.
The interaction strength between two features is the proportion of the variance of the 2-dimensional partial dependence function that is not explained by the sum of the two 1-dimensional partial dependence functions.
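For reference, the two statistics described above can be written out as in Friedman and Popescu (2008). The notation here is an assumption of this sketch and is not taken from the package itself: PD denotes a partial dependence function centered to have mean zero, f the centered prediction function, x_i the i-th of n observations, and -j the set of all features except j.

```latex
% Two-way interaction between features j and k
H^2_{jk} = \frac{\sum_{i=1}^{n} \left[ PD_{jk}(x_{ij}, x_{ik}) - PD_j(x_{ij}) - PD_k(x_{ik}) \right]^2}
                {\sum_{i=1}^{n} PD_{jk}^2(x_{ij}, x_{ik})}

% Overall interaction between feature j and all other features
H^2_{j} = \frac{\sum_{i=1}^{n} \left[ f(x_i) - PD_j(x_{ij}) - PD_{-j}(x_{i,-j}) \right]^2}
               {\sum_{i=1}^{n} f^2(x_i)}
```

In both cases the numerator is the variance left unexplained by the additive decomposition, and the denominator is the total variance of the function being decomposed.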
The interaction is measured by Friedman's H-statistic (the square root of the H-squared test statistic) and takes on values between 0 (no interaction) and 1 (100% of the standard deviation of f(x) is due to interaction).
Parallelization is supported via the future package. To initialize future-based parallelization, select an appropriate backend and specify the number of workers. For example, to use a PSOCK-based cluster backend, run:
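To make the statistic concrete, the following is a minimal base R sketch of the two-way H-statistic, computed by brute force at the observed data points. It is not the iml implementation, which evaluates predictions on a grid controlled by grid.size. The toy data, the lm model, and the helper names pd_at_points and h_statistic are all hypothetical and exist only for this illustration.

```r
# Toy data with a known interaction between x1 and x2 (hypothetical example)
set.seed(1)
n <- 100
X <- data.frame(x1 = runif(n), x2 = runif(n), x3 = runif(n))
y <- X$x1 * X$x2 + X$x3 + rnorm(n, sd = 0.01)
fit <- lm(y ~ x1 * x2 + x3, data = cbind(X, y = y))
f <- function(newdata) predict(fit, newdata = newdata)

# Centered partial dependence of the features in `feats`,
# evaluated at each observed row of X
pd_at_points <- function(feats, X, f) {
  pd <- vapply(seq_len(nrow(X)), function(i) {
    Xi <- X
    # Fix the selected features at the values of row i for all rows
    for (v in feats) Xi[[v]] <- X[[v]][i]
    # Average the predictions over the remaining features
    mean(f(Xi))
  }, numeric(1))
  pd - mean(pd)
}

# Two-way H-statistic: share of the standard deviation of the 2-dimensional
# partial dependence not explained by the two 1-dimensional ones
h_statistic <- function(j, k, X, f) {
  pd_jk <- pd_at_points(c(j, k), X, f)
  pd_j <- pd_at_points(j, X, f)
  pd_k <- pd_at_points(k, X, f)
  sqrt(sum((pd_jk - pd_j - pd_k)^2) / sum(pd_jk^2))
}

h12 <- h_statistic("x1", "x2", X, f)  # x1 and x2 interact in the model
h13 <- h_statistic("x1", "x3", X, f)  # x1 and x3 enter additively
```

Because the fitted model contains an x1:x2 term but no x1:x3 term, h12 should be clearly above zero while h13 should be zero up to floating point error.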
future::plan(multisession, workers = 2)
<iml function here>
Consult the resources of the future package for more parallel backend options.
Examples
## Not run:
library("iml")
library("rpart")
set.seed(42)

# Fit a CART on the Boston housing data set
data("Boston", package = "MASS")
rf <- rpart(medv ~ ., data = Boston)

# Create a model object
mod <- Predictor$new(rf, data = Boston[-which(names(Boston) == "medv")])

# Measure the interaction strength
ia <- Interaction$new(mod)

# Plot the interaction strengths
plot(ia)

# Extract the results
dat <- ia$results
head(dat)

# Interaction also works with multiclass classification
rf <- rpart(Species ~ ., data = iris)
# For some models we have to specify additional arguments for the
# predict function
mod <- Predictor$new(rf, data = iris, type = "prob")
ia <- Interaction$new(mod)
ia$plot()

# For multiclass classification models, you can choose to only show one class:
mod <- Predictor$new(rf, data = iris, type = "prob", class = "virginica")
plot(Interaction$new(mod))
## End(Not run)
References
Friedman, Jerome H., and Bogdan E. Popescu. "Predictive learning via rule ensembles." The Annals of Applied Statistics 2.3 (2008): 916-954.
Super class
iml::InterpretationMethod -> Interaction
Public fields
grid.size: (numeric(1))
The number of values per feature that should be used to estimate the interaction strength.
feature: The feature name or index for which to compute the effects.
grid.size: (numeric(1) | numeric(2))
The size of the grid for evaluating the predictions.
Returns
data.frame with the interaction strength (column .interaction) per feature, calculated as Friedman's H-statistic, and, in the case of a multi-dimensional outcome, per class.
Method clone()
The objects of this class are cloneable with this method.