model_parts function

Dataset Level Variable Importance as Change in Loss Function after Variable Permutations

From DALEX version 1.0, this function calls feature_importance from the ingredients package.

For details on how to use this function, see https://ema.drwhy.ai/featureImportance.html.
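The idea named in the title, importance measured as the change in loss after a variable is permuted, can be illustrated with a minimal base-R sketch. This is not the DALEX or ingredients implementation; the toy data and the helper names loss_rmse and permutation_importance are assumptions made purely for illustration.

```r
# Minimal base-R sketch of permutation-based variable importance.
# loss_rmse and permutation_importance are hypothetical helpers,
# not functions from DALEX or ingredients.
set.seed(1)

# Toy data: y depends strongly on x1, weakly on x2, and not at all on x3
n  <- 500
df <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
y  <- 3 * df$x1 + 0.5 * df$x2 + rnorm(n, sd = 0.1)

model <- lm(y ~ x1 + x2 + x3, data = cbind(df, y = y))

# Loss function with the (observed, predicted) signature
loss_rmse <- function(observed, predicted) {
  sqrt(mean((observed - predicted)^2))
}

permutation_importance <- function(model, data, y, loss, B = 10) {
  # Loss of the unmodified model, used as the reference point
  full_loss <- loss(y, predict(model, newdata = data))
  # For each variable: shuffle its column, recompute the loss,
  # and average over B permutations
  drop_loss <- sapply(names(data), function(v) {
    mean(replicate(B, {
      permuted <- data
      permuted[[v]] <- sample(permuted[[v]])
      loss(y, predict(model, newdata = permuted))
    }))
  })
  c(full_model = full_loss, drop_loss)
}

importance <- permutation_importance(model, df, y, loss_rmse)
print(round(importance, 3))
```

A variable whose permutation raises the loss far above the full-model loss (x1 here) matters to the model; one whose permutation leaves the loss nearly unchanged (x3) does not.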

model_parts(
  explainer,
  loss_function = loss_default(explainer$model_info$type),
  ...,
  type = "variable_importance",
  N = n_sample,
  n_sample = 1000
)

Arguments

  • explainer: a model to be explained, preprocessed by the explain function
  • loss_function: a function that will be used to assess variable importance. By default it is 1-AUC for classification, cross entropy for multilabel classification and RMSE for regression. A custom, user-made loss function should accept two obligatory parameters (observed, predicted), where observed stands for the actual values of the target and predicted for the predicted values. If the attribute "loss_name" is associated with the function object, it will be plotted as the name of the loss function.
  • ...: other parameters
  • type: character, type of transformation that should be applied to the dropout loss. variable_importance and raw return raw drop losses, ratio returns drop_loss/drop_loss_full_model, and difference returns drop_loss - drop_loss_full_model.
  • N: number of observations that should be sampled for calculation of variable importance. If NULL, variable importance will be calculated on the whole dataset (no sampling).
  • n_sample: alias for N, kept for backwards compatibility. Number of observations that should be sampled for calculation of variable importance.
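As a rough illustration of the loss_function contract and the type transformations described above, the following base-R sketch defines a custom loss and applies the ratio and difference formulas by hand. The loss loss_mae and all numeric drop-loss values are hypothetical, and the attribute name "loss_name" follows the Examples section of this page.

```r
# A custom loss takes (observed, predicted) and returns a single number.
# loss_mae is a hypothetical example, not a function shipped with DALEX.
loss_mae <- function(observed, predicted) {
  mean(abs(observed - predicted))
}
# Optional label for the loss (attribute name as used in the Examples section)
attr(loss_mae, "loss_name") <- "Mean absolute error"

# Made-up raw drop losses, for illustration only
drop_loss_full_model <- 2.0
drop_loss <- c(surface = 5.0, floor = 3.0, district = 2.5)

# The transformations described for the type argument, computed by hand
raw        <- drop_loss                          # type = "raw"
ratio      <- drop_loss / drop_loss_full_model   # type = "ratio"
difference <- drop_loss - drop_loss_full_model   # type = "difference"

print(rbind(raw, ratio, difference))
```

With ratio, a value of 1 means that permuting the variable did not change the loss; with difference, the corresponding neutral value is 0.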

Returns

An object of the class feature_importance. It is a data frame with the calculated dropout loss for each variable.

Examples

library("DALEX")

# regression
library("ranger")
apartments_ranger_model <- ranger(m2.price~., data = apartments, num.trees = 50)
explainer_ranger <- explain(apartments_ranger_model, data = apartments[,-1],
                            y = apartments$m2.price, label = "Ranger Apartments")
model_parts_ranger_aps <- model_parts(explainer_ranger, type = "raw")
head(model_parts_ranger_aps, 8)
plot(model_parts_ranger_aps)

# binary classification
titanic_glm_model <- glm(survived~., data = titanic_imputed, family = "binomial")
explainer_glm_titanic <- explain(titanic_glm_model, data = titanic_imputed[,-8],
                                 y = titanic_imputed$survived)
logit <- function(x) exp(x)/(1+exp(x))
custom_loss <- function(observed, predicted){
  sum((observed - logit(predicted))^2)
}
attr(custom_loss, "loss_name") <- "Logit residuals"
model_parts_glm_titanic <- model_parts(explainer_glm_titanic, type = "raw",
                                       loss_function = custom_loss)
head(model_parts_glm_titanic, 8)
plot(model_parts_glm_titanic)

# multilabel classification
HR_ranger_model_HR <- ranger(status~., data = HR, num.trees = 50,
                             probability = TRUE)
explainer_ranger_HR <- explain(HR_ranger_model_HR, data = HR[,-6],
                               y = HR$status, label = "Ranger HR")
model_parts_ranger_HR <- model_parts(explainer_ranger_HR, type = "raw")
head(model_parts_ranger_HR, 8)
plot(model_parts_ranger_HR)

References

Explanatory Model Analysis. Explore, Explain and Examine Predictive Models. https://ema.drwhy.ai/