Filter using coefficients from partial least squares (PLS) regression to select optimal predictors.
pls_filter(y, x, force_vars = NULL, nfilter, ncomp = 5, scale_x = TRUE, type = c("index", "names", "full"), ...)
Arguments
y: Response vector
x: Matrix of predictors
force_vars: Vector of column names within x which are always retained in the model (i.e. not filtered). Default NULL means all predictors will be filtered.
nfilter: Either a single value giving the total number of predictors to return, or a vector of length ncomp giving the number of predictors to return from each PLS component.
ncomp: Number of components to include in the PLS model.
scale_x: Logical; whether to scale predictors before fitting the PLS model. Scaling is recommended.
type: Type of vector returned. Default "index" returns indices, "names" returns predictor names, "full" returns a named vector of variable importance.
...: Other arguments passed to pls::plsr()
Returns
Integer vector of indices (type = "index") or character vector of names (type = "names") of the selected predictors. If type is "full", the full set of coefficients from plsr is returned as a list with one element per model component, each ordered by decreasing absolute coefficient.
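The ranking described above can be sketched directly with pls::plsr(). This is an illustration under assumptions, not the function's own code: the simulated data and the object names (x, y, fit, cf, ranked) are made up for the example, and a single response is assumed.

```r
# Sketch only: rank predictors by absolute PLS coefficient with pls::plsr().
# Simulated data; all object names here are illustrative.
set.seed(1)
x <- matrix(rnorm(50 * 20), nrow = 50,
            dimnames = list(NULL, paste0("V", 1:20)))
y <- 2 * x[, 1] - 1.5 * x[, 2] + rnorm(50)

fit <- pls::plsr(y ~ x, ncomp = 5, scale = TRUE)

# fit$coefficients is a predictors x responses x components array
cf <- fit$coefficients

# For each component, order predictor indices by decreasing |coefficient|
ranked <- lapply(seq_len(dim(cf)[3]), function(i) {
  order(abs(cf[, 1, i]), decreasing = TRUE)
})

# Indices and names of the top 3 predictors for the first component
top_index <- ranked[[1]][1:3]
top_names <- colnames(x)[top_index]
```

Here top_index corresponds to what type = "index" would return and top_names to type = "names", while the per-component ordering in ranked mirrors the structure described for type = "full".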
Details
The best predictors may overlap between components, so if nfilter is specified as a vector, the total number of unique predictors returned can vary and may be fewer than the sum of nfilter.
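A small base R illustration of this overlap follows. The ranked predictor names are hypothetical values chosen for the example, not output from a fitted model.

```r
# Hypothetical ranked predictor names for three components (made-up values)
ranked <- list(
  comp1 = c("V1", "V2", "V3", "V4"),
  comp2 = c("V2", "V5", "V1", "V6"),
  comp3 = c("V7", "V1", "V8", "V2")
)
nfilter <- c(2, 2, 2)  # request two predictors from each component

# Take the top nfilter[i] names from component i, then drop duplicates
picked <- Map(function(vars, n) head(vars, n), ranked, nfilter)
selected <- unique(unlist(picked, use.names = FALSE))

selected          # "V1" "V2" "V5" "V7"
length(selected)  # 4, although sum(nfilter) is 6
```

Because V1 and V2 rank highly in more than one component, six requested slots yield only four unique predictors.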