visualize_relationship function

Visualizing the relationship between y and x in a partition model

Attempts to show how a partition or random forest model is modeling the relationship between y and x.

Usage

visualize_relationship(TREE, interest, on, smooth = TRUE, marginal = TRUE, nplots = 5, seed = NA, pos = "topright", ...)

Arguments

  • TREE: A partition or random forest model (though it works with many regression models as well)
  • interest: The name of the predictor variable for which the plot of y vs. x is to be made.
  • on: A dataframe giving the values of the other predictor variables for which the relationship is to be visualized. Typically this is the dataframe on which the partition model was built.
  • smooth: If TRUE, the modeled relationship is drawn as a loess-smoothed curve.
  • marginal: If TRUE, the modeled value of y at a particular value of x is the average of the predicted values of y over all rows which have that common value of x. If FALSE, then nplots rows from on will be selected and all other predictors will be fixed, showing the relationship between y and x for that particular set of characteristics.
  • nplots: The number of rows of on for which the relationship is plotted (if marginal is set to FALSE)
  • seed: The seed for the random number generator, if reproducibility is required.
  • pos: The location of the legend.
  • ...: Additional arguments passed to plot, namely xlim and ylim.
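
The averaging described for marginal = TRUE can be sketched in a few lines of base R. This is only an illustration of the description above, not the package's internal code. It assumes an lm fit on the built-in mtcars data, and the names fit, pred, and marginal_means are chosen for the example.

```r
# Illustrative sketch only: not the regclass implementation.
# Fit a simple regression model on a built-in dataset.
fit <- lm(mpg ~ wt + hp + cyl, data = mtcars)
on <- mtcars          # data frame supplying the other predictors
interest <- "cyl"     # predictor whose relationship with y is inspected

# Predicted y for every row of `on`
pred <- predict(fit, newdata = on)

# Average the predictions over all rows sharing the same value of x
marginal_means <- tapply(pred, on[[interest]], mean)
print(marginal_means)
```

Each entry of marginal_means is the modeled value of y at one observed value of the predictor, averaged over whatever values the other predictors take in those rows.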

Details

The function shows a scatterplot of y vs. x in the on dataframe, then shows how TREE models the relationship between y and x, using the predicted value of y for each row in the data together with a curve illustrating the relationship. It is useful for seeing what the relationship between y and x as modeled by TREE "looks like", both as a whole and for particular combinations of the other variables.

If marginal is FALSE, one curve is drawn for each selected row. Differences in the shapes of these curves (beyond a simple vertical shift) indicate the presence of some interaction between x and another variable.
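
The marginal = FALSE behavior can be approximated by hand as follows. This is a rough sketch, not the package's code: it assumes an lm fit with a deliberate wt:hp interaction on the built-in mtcars data, and the names x_grid, rows, and curves are made up for the example. The seed handling mirrors the documented default of seed = NA, but how the package itself applies the seed is an assumption here.

```r
# Illustrative sketch only: not the regclass implementation.
# The model includes a wt:hp interaction, so curves need not be parallel.
fit <- lm(mpg ~ wt * hp, data = mtcars)
on <- mtcars
interest <- "wt"
nplots <- 3
seed <- 42

# Fix the seed only when one is supplied (assumed behavior)
if (!is.na(seed)) set.seed(seed)
rows <- sample(nrow(on), nplots)

# Grid of x values spanning the observed range
x_grid <- seq(min(on[[interest]]), max(on[[interest]]), length.out = 25)

# For each selected row, hold the other predictors fixed and vary x
curves <- sapply(rows, function(i) {
  newdata <- on[rep(i, length(x_grid)), , drop = FALSE]
  newdata[[interest]] <- x_grid
  predict(fit, newdata = newdata)
})

# One curve per selected row
matplot(x_grid, curves, type = "l", lty = 1,
        xlab = interest, ylab = "Predicted mpg")
```

With a purely additive model (for example mpg ~ wt + hp) the curves would differ only by a vertical shift. Here the slope of each curve depends on that row's hp, which is the kind of difference that signals an interaction.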

References

Introduction to Regression and Modeling

Author(s)

Adam Petrie

See Also

loess, lm, glm

Examples

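# Load the packages that provide the data sets and model functions
library(regclass)
library(randomForest)
library(rpart)

# Random forest: marginal relationship between Salary and each predictor
data(SALARY)
FOREST <- randomForest(Salary ~ ., data = SALARY)
visualize_relationship(FOREST, interest = "Experience", on = SALARY)
visualize_relationship(FOREST, interest = "Months", on = SALARY, xlim = c(1, 15), ylim = c(2500, 4500))

# Partition model: marginal relationship, then curves for 7 individual rows
data(WINE)
TREE <- rpart(Quality ~ ., data = WINE)
visualize_relationship(TREE, interest = "alcohol", on = WINE, smooth = FALSE)
visualize_relationship(TREE, interest = "alcohol", on = WINE, marginal = FALSE, nplots = 7, smooth = FALSE)
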
  • Maintainer: Adam Petrie
  • License: GPL (>= 2)
  • Last published: 2020-02-21
