Generate predictions for input data with specified model
Performs prediction of the response for models fitted with stats::lm(), stats::glm(), ranger::ranger(), mgcv::gam(), workflows::workflow() (i.e., tidymodels models), and xgboost::xgb.train() with a binary or continuous response. See Details for more information.
predict_model(x, newdata, ...)

## Default S3 method:
predict_model(x, newdata, ...)

## S3 method for class 'ar'
predict_model(x, newdata, newreg, horizon, ...)

## S3 method for class 'Arima'
predict_model(
  x,
  newdata,
  newreg,
  horizon,
  explain_idx,
  explain_lags,
  y,
  xreg,
  ...
)

## S3 method for class 'forecast_ARIMA'
predict_model(x, newdata, newreg, horizon, ...)

## S3 method for class 'glm'
predict_model(x, newdata, ...)

## S3 method for class 'lm'
predict_model(x, newdata, ...)

## S3 method for class 'gam'
predict_model(x, newdata, ...)

## S3 method for class 'ranger'
predict_model(x, newdata, ...)

## S3 method for class 'workflow'
predict_model(x, newdata, ...)

## S3 method for class 'xgb.Booster'
predict_model(x, newdata, ...)
Arguments
x: Model object for the model to be explained.
newdata: A data.frame/data.table with the features to predict from.
...: The newreg and horizon parameters used by models passed to explain_forecast().
horizon: Numeric. The forecast horizon to explain. Passed to the predict_model function.
explain_idx: Numeric vector. The row indices in data and reg denoting points in time to explain.
y: Matrix, data.frame/data.table or a numeric vector. Contains the endogenous variables used to estimate the (conditional) distributions needed to properly estimate the conditional expectations in the Shapley formula including the observations to be explained.
xreg: Matrix, data.frame/data.table or a numeric vector. Contains the exogenous variables used to estimate the (conditional) distributions needed to properly estimate the conditional expectations in the Shapley formula including the observations to be explained. As exogenous variables are used contemporaneously when producing a forecast, this item should contain nrow(y) + horizon rows.
Returns
Numeric. Vector of size equal to the number of rows in newdata.
Details
The following models are currently supported:
stats::lm()
stats::glm()
ranger::ranger()
mgcv::gam()
workflows::workflow()
xgboost::xgb.train()
If you have a binary classification model we'll always return the probability prediction for a single class.
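To illustrate what a probability prediction for a single class looks like, here is a minimal sketch using only base R. It calls stats::predict() with type = "response" directly, not predict_model(); that this matches what predict_model() does internally for a binomial glm is an assumption. The high_ozone column and the cutoff of 40 are hypothetical, introduced only for this example.

```r
# Load example data and drop incomplete rows
data("airquality")
airquality <- airquality[complete.cases(airquality), ]

# Hypothetical binary response: 1 if Ozone exceeds 40, else 0
airquality$high_ozone <- as.integer(airquality$Ozone > 40)

# Split into training data and observations to predict
x_train <- head(airquality, -3)
x_explain <- tail(airquality, 3)

# Fit a logistic regression
model <- glm(
  high_ozone ~ Solar.R + Wind + Temp,
  family = binomial(),
  data = x_train
)

# One probability per row: P(high_ozone = 1), not a class label
prob <- predict(model, newdata = x_explain, type = "response")
print(prob)
```

The result is a numeric vector with one value in [0, 1] per row of x_explain, which is the shape described under Returns.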
If you are explaining a model that is not supported natively, you need to create the predict_model() function yourself and pass it as an argument to explain().
For more details on how to explain such non-supported models (i.e. custom models), see the Advanced usage section of the general usage vignette:
From R: vignette("general_usage", package = "shapr")
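As a rough sketch of what such a custom prediction function could look like, the example below uses stats::loess(), which is not in the list of supported models. The name my_predict_model is hypothetical. The only contract assumed here is the one stated on this page: the function takes the model and newdata and returns a numeric vector with one entry per row of newdata. How the function is handed to explain() is covered in the vignette and is not shown here.

```r
# Load example data and drop incomplete rows
data("airquality")
airquality <- airquality[complete.cases(airquality), ]

# Split into training data and observations to explain
x_train <- head(airquality, -3)
x_explain <- tail(airquality, 3)

# Fit a model type that is not supported natively
model <- loess(Ozone ~ Wind + Temp, data = x_train)

# Hypothetical custom prediction function: takes the model and the new
# data, returns a plain numeric vector with one prediction per row
my_predict_model <- function(x, newdata) {
  as.numeric(predict(x, newdata = as.data.frame(newdata)))
}

# Check the function on its own before using it for explanation
preds <- my_predict_model(model, x_explain)
print(preds)
```

Calling the function directly like this is a cheap way to confirm the return type and length before passing it on.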
# Load example data
data("airquality")
airquality <- airquality[complete.cases(airquality), ]

# Split data into test and training data
x_train <- head(airquality, -3)
x_explain <- tail(airquality, 3)

# Fit a linear model
model <- lm(Ozone ~ Solar.R + Wind + Temp + Month, data = x_train)

# Predicting for a model with a standardized format
predict_model(x = model, newdata = x_explain)