h2o.varimp_heatmap function

Variable Importance Heatmap across multiple models


The variable importance heatmap shows variable importance across multiple models. Some models in H2O (e.g. Deep Learning, XGBoost) return variable importance for one-hot (binary indicator) encoded versions of categorical columns. So that the variable importance of categorical columns can be compared across all model types, the function summarizes the variable importance across all one-hot encoded features and returns a single variable importance for the original categorical feature. By default, the models and variables are ordered by their similarity.
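The consolidation of one-hot encoded importances can be illustrated with a minimal base R sketch. It is not the package's internal code: the importance values are made up, the `<column>.<level>` naming of encoded features is an assumption, and summing is only one plausible way to summarize (the text above does not say which summary is used).

```r
# Hypothetical importances, for illustration only. "type.red" and
# "type.white" stand for one-hot encoded levels of a categorical column "type".
varimp <- c(alcohol = 0.40, type.red = 0.15, type.white = 0.10, sulphates = 0.35)

# Assumption: encoded features are named <column>.<level>, so stripping
# everything from the first dot onward recovers the original column name.
original <- sub("\\..*$", "", names(varimp))

# Assumption: the summary is a plain sum over all encoded features
# that belong to the same original column.
consolidated <- vapply(split(varimp, original), sum, numeric(1))

print(consolidated)
```

After this step each original column, categorical or numeric, has exactly one importance value, which is what makes models with and without one-hot encoding comparable in a single heatmap.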

h2o.varimp_heatmap(object, top_n = 20, num_of_features = 20)

Arguments

  • object: A list of H2O models, an H2O AutoML instance, or an H2OFrame with a 'model_id' column (e.g. H2OAutoML leaderboard).
  • top_n: Integer specifying the number of models shown in the heatmap (based on leaderboard ranking). Defaults to 20.
  • num_of_features: Integer specifying the number of features shown in the heatmap based on the maximum variable importance across the models. Use NULL for unlimited. Defaults to 20.
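
How num_of_features limits the heatmap can be sketched in base R. This is an illustration under assumptions, not the package's implementation: the matrix `vi`, its values, and the model names are hypothetical, and the selection simply ranks features by their maximum importance across models, as the argument description states.

```r
# Hypothetical importance matrix: rows are features, columns are models.
vi <- matrix(
  c(0.9, 0.2, 0.1,
    0.3, 0.8, 0.4,
    0.1, 0.1, 0.2,
    0.5, 0.6, 0.7),
  nrow = 4, byrow = TRUE,
  dimnames = list(c("alcohol", "density", "pH", "sulphates"),
                  c("gbm", "glm", "drf"))
)

num_of_features <- 2  # NULL would mean "keep every feature"

# Rank features by their maximum importance across all models.
max_per_feature <- apply(vi, 1, max)
n_keep <- if (is.null(num_of_features)) nrow(vi) else min(num_of_features, nrow(vi))
keep <- names(sort(max_per_feature, decreasing = TRUE))[seq_len(n_keep)]

# Rows that would appear in the heatmap under these assumptions.
vi_top <- vi[keep, , drop = FALSE]
print(vi_top)
```

Ranking by the maximum rather than the mean keeps a feature that matters a lot to a single model even if the other models ignore it.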

Returns

A ggplot2 object.

Examples

## Not run:
library(h2o)
h2o.init()

# Import the wine dataset into H2O:
f <- "https://h2o-public-test-data.s3.amazonaws.com/smalldata/wine/winequality-redwhite-no-BOM.csv"
df <- h2o.importFile(f)

# Set the response
response <- "quality"

# Split the dataset into a train and test set:
splits <- h2o.splitFrame(df, ratios = 0.8, seed = 1)
train <- splits[[1]]
test <- splits[[2]]

# Build and train the model:
aml <- h2o.automl(y = response,
                  training_frame = train,
                  max_models = 10,
                  seed = 1)

# Create the variable importance heatmap
varimp_heatmap <- h2o.varimp_heatmap(aml)
print(varimp_heatmap)
## End(Not run)

  • Maintainer: Tomas Fryda
  • License: Apache License (== 2.0)
  • Last published: 2024-01-11