pls function

Partial Least Squares

A simple partial least squares procedure.

Usage

pls(x, y, K=1, scale=TRUE, verb=TRUE)

## S3 method for class 'pls'
predict(object, newdata, type="response", ...)

## S3 method for class 'pls'
summary(object, ...)

## S3 method for class 'pls'
print(x, ...)

## S3 method for class 'pls'
plot(x, K=NULL, xlab="response", ylab=NULL, ...)

Arguments

  • x: The covariate matrix, in either dgCMatrix or matrix format. For plot and print: a pls output object.
  • y: The response vector.
  • K: The number of desired PLS directions. In plotting, this can be a vector of directions to draw; if NULL, directions 1:fit$K are plotted.
  • scale: An indicator for whether to scale x; usually a good idea. If scale=TRUE, model is fit with x scaled to have variance-one columns.
  • verb: Whether or not to print a short progress message.
  • object: For predict and summary: a pls output object.
  • newdata: For predict, an ncol(x)-column matrix of new observations. Can be either a simple matrix or a simple_triplet_matrix.
  • type: For predict, a choice between output types: predictions scaled to the original response for "response", fitted partial least squares directions for "reduction".
  • xlab: For plot, the x-axis label.
  • ylab: For plot, the y-axis label. If NULL, will be set to "pls(k) fitted values" for each k.
  • ...: Additional arguments.
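
To illustrate what scale=TRUE means by variance-one columns, here is a minimal base R sketch. It assumes the sample standard deviation with the n-1 divisor; the package may use a different convention, and the names sdx and xs are made up for this illustration.

```r
set.seed(3)
# Two toy covariates on very different scales.
x <- cbind(rnorm(20, sd = 1), rnorm(20, sd = 50))

# Assumption: sample standard deviation (n-1 divisor) for each column.
sdx <- apply(x, 2, sd)

# Divide every column by its standard deviation.
xs <- sweep(x, 2, sdx, "/")

apply(xs, 2, var)   # both columns now have variance one
```

Without scaling, the projection x%*%loadings would be dominated by the second column simply because of its units.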

Returns

Output from pls is a list with the following entries:

  • y: The response vector.

  • x: The unchanged covariate matrix.

  • directions: The pls directions: x%*%loadings - shift.

  • loadings: The pls loadings.

  • shift: Shift applied after projection to center the PLS directions.

  • fitted: K columns of fitted y values for each number of directions.

  • fwdmod: The lm object from forward regression lm(as.numeric(y)~directions).

predict.pls outputs either a vector of predicted responses or an nrow(newdata) by ncol(object$loadings) matrix of pls directions for each new observation. summary and plot return nothing.

Details

pls fits the Partial Least Squares algorithm described in Taddy (2013; Appendix A.1). In particular, we obtain loadings loadings[,k] as the correlation between x and factors factors[,k], where factors[,1] is initialized at scale(as.numeric(y)) and each subsequent factor is orthogonal to the k'th pls direction, an orthonormal transformation of x%*%loadings[,k].
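
The loop described above can be sketched in base R roughly as follows. This is an illustration under stated assumptions, not the package source: pls_sketch and raw_loadings are hypothetical names, the orthonormal transformation is assumed to be centering followed by Gram-Schmidt, and the scaling, shift, and loadings bookkeeping of pls are not reproduced.

```r
# Hypothetical helper for illustration; not the textir implementation.
pls_sketch <- function(x, y, K = 1) {
  x <- as.matrix(x)
  y <- as.numeric(y)
  raw_loadings <- matrix(0, ncol(x), K)
  z <- matrix(0, nrow(x), K)
  v <- as.numeric(scale(y))            # first factor: standardized response
  for (k in seq_len(K)) {
    # Loading j is the correlation between column j of x and the current factor.
    raw_loadings[, k] <- as.numeric(cor(x, v))
    zk <- as.numeric(x %*% raw_loadings[, k])
    zk <- zk - mean(zk)                # center the projection
    # Assumed orthonormalization: Gram-Schmidt against earlier directions.
    for (j in seq_len(k - 1)) zk <- zk - z[, j] * sum(z[, j] * zk)
    z[, k] <- zk / sqrt(sum(zk^2))
    # Next factor: remove the part of v explained by direction k.
    v <- v - z[, k] * sum(z[, k] * v)
  }
  # Forward regression of the response on the K directions.
  list(raw_loadings = raw_loadings, directions = z, fwdmod = lm(y ~ z))
}

set.seed(1)
x <- matrix(rnorm(200), 40, 5)
y <- x[, 1] - x[, 2] + rnorm(40)
fit <- pls_sketch(x, y, K = 2)
round(crossprod(fit$directions), 10)   # approximately the identity matrix
```

Because each direction is centered and orthonormal to the earlier ones, the forward regression coefficients for the first k directions do not change when direction k+1 is added.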

predict.pls returns predictions from the object$fwdmod forward regression α + β*z for projections z = x%*%loadings - shift derived from new covariates, or, if type="reduction", it just returns these projections. summary.pls prints dimension details and a quick summary of the corresponding forward regression. plot.pls draws response versus fitted values for the least-squares fit onto the K pls directions.
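
The two prediction types can be mimicked in base R as below. This is a sketch only: loadings, shift, directions and fwdmod here are toy stand-ins built by hand with the shapes described under Returns, not output from pls, and any column scaling done by the package is ignored.

```r
set.seed(2)
x <- matrix(rnorm(150), 30, 5)
y <- x[, 1] + 0.5 * x[, 3] + rnorm(30)

# Toy stand-ins for the fitted object's pieces (assumed shapes, not textir output).
loadings <- cor(x, y)                          # 5 x 1 matrix of correlations
shift <- colMeans(x %*% loadings)              # centers the projection
directions <- sweep(x %*% loadings, 2, shift)  # x %*% loadings - shift
fwdmod <- lm(y ~ directions)

# type = "reduction": project new rows with the same loadings and shift.
newdata <- x[c(3, 17), , drop = FALSE]
z <- sweep(newdata %*% loadings, 2, shift)

# type = "response": apply the forward regression alpha + beta * z.
b <- coef(fwdmod)
yhat <- as.numeric(b[1] + z %*% b[-1])
```

Since the new rows are taken from the training data here, yhat agrees with the corresponding fitted values of fwdmod.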

References

Taddy (2013), Multinomial Inverse Regression for Text Analysis. Journal of the American Statistical Association 108.

Wold, H. (1975), Soft modeling by latent variables: The nonlinear iterative partial least squares approach. In Perspectives in Probability and Statistics, Papers in Honour of M.S. Bartlett.

Author(s)

Matt Taddy taddy@chicagobooth.edu

See Also

normalize, sdev, corr, congress109

Examples

data(congress109)
x <- t( t(congress109Counts)/rowSums(congress109Counts) )
summary( fit <- pls(x, congress109Ideology$repshare, K=3) )
plot(fit, pch=21, bg=c(4,3,2)[congress109Ideology$party])
predict(fit, newdata=x[c(68,388),])