h2o.pd_plot function

Plot partial dependence for a variable

A partial dependence plot (PDP) gives a graphical depiction of the marginal effect of a variable on the response. The effect of a variable is measured as the change in the mean response. PDP assumes independence between the feature for which the PDP is computed and the remaining features.
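To make the idea of "change in the mean response" concrete, here is a minimal base R sketch of a partial dependence calculation. It is not h2o's implementation: the helper partial_dependence and its grid_size argument are hypothetical names introduced for illustration, and a plain lm fit on the built-in mtcars data stands in for an H2O model.

```r
# Hypothetical helper (not part of h2o): manual partial dependence for one
# numeric feature of any model that supports predict(model, newdata = ...).
partial_dependence <- function(model, data, column, grid_size = 20) {
  # Evenly spaced grid over the observed range of the feature
  grid <- seq(min(data[[column]]), max(data[[column]]), length.out = grid_size)
  mean_response <- vapply(grid, function(value) {
    modified <- data
    # Set the feature to the same value in every row, leaving the other
    # features untouched; this is where the independence assumption enters.
    modified[[column]] <- value
    # Average the predictions over all rows
    mean(predict(model, newdata = modified))
  }, numeric(1))
  data.frame(value = grid, mean_response = mean_response)
}

# Stand-in model on a built-in dataset
fit <- lm(mpg ~ wt + hp, data = mtcars)
pd <- partial_dependence(fit, mtcars, "wt")

# Plot the mean prediction against the feature value
plot(pd$value, pd$mean_response, type = "l",
     xlab = "wt", ylab = "mean predicted mpg")
```

Because this stand-in model is linear with no interaction terms, the resulting curve is a straight line whose slope equals the fitted coefficient of wt. For a non-linear model such as a GBM, the curve can take any shape.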

h2o.pd_plot(
  object,
  newdata,
  column,
  target = NULL,
  row_index = NULL,
  max_levels = 30,
  binary_response_scale = c("response", "logodds"),
  grouping_column = NULL,
  nbins = 100,
  show_rug = TRUE
)

Arguments

  • object: An H2O model.
  • newdata: An H2OFrame. Used to generate the predictions for the partial dependence calculations.
  • column: A feature column name to inspect. Character string.
  • target: For multinomial models, plot the PDP only for this target category. Character string.
  • row_index: Optional. Calculate the Individual Conditional Expectation (ICE) for the row given by row_index. Integer.
  • max_levels: An integer specifying the maximum number of factor levels to show. Defaults to 30.
  • binary_response_scale: For binary models, whether to display the log-odds on the y-axis instead of the actual score. Can be one of: "response", "logodds". Defaults to "response".
  • grouping_column: A feature column name used to group the data; a separate set of plots is provided for each value of the grouping feature.
  • nbins: Number of bins used. Defaults to 100.
  • show_rug: Show rug to visualize the density of the column. Defaults to TRUE.
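As a hedged illustration of how the optional arguments above are passed, the sketch below wraps h2o.pd_plot in a small function that overrides two defaults. The wrapper name pd_plot_coarse is hypothetical and not part of h2o. Defining it needs nothing beyond base R, but calling it requires the h2o package, a running H2O cluster, a trained model, and an H2OFrame.

```r
# Hypothetical wrapper (not part of h2o): a PDP with a coarser grid and no rug.
pd_plot_coarse <- function(model, frame, column) {
  h2o::h2o.pd_plot(
    model,
    frame,
    column = column,
    nbins = 20,       # fewer bins than the default of 100
    show_rug = FALSE  # do not draw the rug
  )
}

# Usage, assuming `gbm` is a trained H2O model and `test` is an H2OFrame:
# pdp <- pd_plot_coarse(gbm, test, "alcohol")
# print(pdp)
```

The remaining arguments (target, row_index, max_levels, binary_response_scale, grouping_column) can be passed the same way, by name.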

Returns

A ggplot2 object

Examples

## Not run:
library(h2o)
h2o.init()

# Import the wine dataset into H2O:
f <- "https://h2o-public-test-data.s3.amazonaws.com/smalldata/wine/winequality-redwhite-no-BOM.csv"
df <- h2o.importFile(f)

# Set the response
response <- "quality"

# Split the dataset into a train and test set:
splits <- h2o.splitFrame(df, ratios = 0.8, seed = 1)
train <- splits[[1]]
test <- splits[[2]]

# Build and train the model:
gbm <- h2o.gbm(y = response, training_frame = train)

# Create the partial dependence plot
pdp <- h2o.pd_plot(gbm, test, column = "alcohol")
print(pdp)
## End(Not run)

  • Maintainer: Tomas Fryda
  • License: Apache License (== 2.0)
  • Last published: 2024-01-11