PLS_beta_wvc function

Light version of PLS_beta for cross validation purposes

Light version of PLS_beta for cross validation purposes either on complete or incomplete datasets.

PLS_beta_wvc(
  dataY,
  dataX,
  nt = 2,
  dataPredictY = dataX,
  modele = "pls",
  family = NULL,
  scaleX = TRUE,
  scaleY = NULL,
  keepcoeffs = FALSE,
  keepstd.coeffs = FALSE,
  tol_Xi = 10^(-12),
  weights,
  method = "logistic",
  link = NULL,
  link.phi = NULL,
  type = "ML",
  verbose = TRUE
)

Arguments

  • dataY: response (training) dataset
  • dataX: predictor(s) (training) dataset
  • nt: number of components to be extracted
  • dataPredictY: predictor(s) (testing) dataset
  • modele: name of the PLS glm or PLS beta model to be fitted ("pls", "pls-glm-Gamma", "pls-glm-gaussian", "pls-glm-inverse.gaussian", "pls-glm-logistic", "pls-glm-poisson", "pls-glm-polr", "pls-beta"). Use modele="pls-glm-family" to enable the family option.
  • family: a description of the error distribution and link function to be used in the model. This can be a character string naming a family function, a family function or the result of a call to a family function. (See family for details of family functions.) To use the family option, please set modele="pls-glm-family". User-defined families can also be used. See Details.
  • scaleX: scale the predictor(s): must be set to TRUE for modele="pls" and should be TRUE for PLS glms.
  • scaleY: scale the response: Yes/No. Ignored, since scaling is not always possible for glm responses.
  • keepcoeffs: whether the coefficients of the linear fit on link scale of unstandardized eXplanatory variables should be returned or not.
  • keepstd.coeffs: whether the coefficients of the linear fit on link scale of standardized eXplanatory variables should be returned or not.
  • tol_Xi: minimal value for Norm2(Xi) and det(pp' * pp) if there is any missing value in dataX. It defaults to 10^(-12).
  • weights: an optional vector of 'prior weights' to be used in the fitting process. Should be NULL or a numeric vector.
  • method: logistic, probit, complementary log-log or cauchit (corresponding to a Cauchy latent variable).
  • link: character specification of the link function in the mean model (mu). Currently, "logit", "probit", "cloglog", "cauchit", "log", "loglog" are supported. Alternatively, an object of class "link-glm" can be supplied.
  • link.phi: character specification of the link function in the precision model (phi). Currently, "identity", "log", "sqrt" are supported. The default is "log" unless formula is of type y~x where the default is "identity" (for backward compatibility). Alternatively, an object of class "link-glm" can be supplied.
  • type: character specification of the type of estimator. Currently, maximum likelihood ("ML"), ML with bias correction ("BC"), and ML with bias reduction ("BR") are supported.
  • verbose: should info messages be displayed?

Returns

  • valsPredict: nrow(dataPredictY) * nt matrix of the predicted values.
  • coeffs: if the coefficients of the eXplanatory variables were requested, i.e. keepcoeffs=TRUE: ncol(dataX) * 1 matrix of the coefficients of the eXplanatory variables.
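
A minimal sketch of inspecting the returned list, assuming the plsRbeta and betareg packages are installed. It reuses the GasolineYield data from the Examples section; the guard simply skips the call when either package is missing, and no particular output is claimed here.

```r
# Sketch only: the fit runs when plsRbeta and betareg are both available.
fit <- NULL
if (requireNamespace("plsRbeta", quietly = TRUE) &&
    requireNamespace("betareg", quietly = TRUE)) {
  library(plsRbeta)
  data("GasolineYield", package = "betareg")
  y <- GasolineYield$yield
  X <- GasolineYield[, 2:5]

  # keepcoeffs = TRUE asks for the coefficients in addition to the predictions.
  fit <- PLS_beta_wvc(y, X, nt = 3, modele = "pls-beta",
                      keepcoeffs = TRUE, verbose = FALSE)

  print(names(fit))            # components of the returned list
  print(dim(fit$valsPredict))  # documented as nrow(dataPredictY) * nt
  print(fit$coeffs)            # requested through keepcoeffs = TRUE
}
```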

Details

This function is called by PLS_glm_kfoldcv_formula in order to perform cross validation either on complete or incomplete datasets.
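
To illustrate why a light version with a dataPredictY argument is convenient for cross validation, here is a hand-rolled k-fold loop. It is a sketch under stated assumptions, not the package's own cross validation code: the fold count k, the seed and the object names are hypothetical, the plsRbeta and betareg packages are assumed to be installed, and each column of valsPredict is assumed to line up with one of the nt components as the Returns section suggests.

```r
# Sketch only: manual k-fold loop around PLS_beta_wvc.
cv_pred <- NULL
if (requireNamespace("plsRbeta", quietly = TRUE) &&
    requireNamespace("betareg", quietly = TRUE)) {
  library(plsRbeta)
  data("GasolineYield", package = "betareg")
  y <- GasolineYield$yield
  X <- GasolineYield[, 2:5]

  set.seed(1)
  k  <- 4                                   # hypothetical number of folds
  nt <- 2                                   # components to extract
  fold <- sample(rep(seq_len(k), length.out = length(y)))
  cv_pred <- matrix(NA_real_, nrow = length(y), ncol = nt)

  for (f in seq_len(k)) {
    test <- fold == f
    # Fit on the training rows, predict the held-out rows via dataPredictY.
    fit <- PLS_beta_wvc(y[!test], X[!test, ], nt = nt,
                        dataPredictY = X[test, ],
                        modele = "pls-beta", verbose = FALSE)
    cv_pred[test, ] <- fit$valsPredict
  }

  print(head(cv_pred))                      # held-out predictions, one row per observation
}
```

Only the predictions are stored, because this page does not state on which scale valsPredict is expressed; any error summary built on top of them would need that checked first.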

There are seven different predefined models with predefined link functions available:

  • "pls": ordinary pls models
  • "pls-glm-Gamma": glm Gamma with inverse link pls models
  • "pls-glm-gaussian": glm gaussian with identity link pls models
  • "pls-glm-inverse.gaussian": glm inverse gaussian with square inverse link pls models
  • "pls-glm-logistic": glm binomial with logit link pls models
  • "pls-glm-poisson": glm poisson with log link pls models
  • "pls-glm-polr": glm polr with logit link pls models

Using the family option and setting modele="pls-glm-family" allows changing the family and link function in the same way as for the glm function. As a consequence, user-specified families can also be used.
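
The three ways of describing a family mentioned for the family argument can be sketched with base R alone. The helper resolve_family below is hypothetical and only mimics the way stats::glm turns its family argument into a family object; it is not part of plsRbeta.

```r
# Three equivalent descriptions of the same family (base R, stats package).
fam_string   <- "poisson"              # character string naming a family function
fam_function <- poisson                # the family function itself
fam_object   <- poisson(link = "log")  # result of a call to a family function

# Hypothetical helper mimicking how glm() resolves its family argument.
resolve_family <- function(family) {
  if (is.character(family)) family <- get(family, mode = "function")
  if (is.function(family)) family <- family()
  stopifnot(inherits(family, "family"))
  family
}

fams <- lapply(list(fam_string, fam_function, fam_object), resolve_family)
sapply(fams, function(f) paste(f$family, f$link))

# A non-default link is chosen in the call to the family function, for
# example Gamma(link = "log"); per the text above, such an object would be
# supplied as family together with modele = "pls-glm-family".
gamma_log <- Gamma(link = "log")
gamma_log$link
```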

  • The gaussian family accepts the links (as names) identity, log and inverse.
  • The binomial family accepts the links logit, probit, cauchit (corresponding to logistic, normal and Cauchy CDFs respectively), log and cloglog (complementary log-log).
  • The Gamma family accepts the links inverse, identity and log.
  • The poisson family accepts the links log, identity and sqrt.
  • The inverse.gaussian family accepts the links 1/mu^2, inverse, identity and log.
  • The quasi family accepts the links logit, probit, cloglog, identity, inverse, log, 1/mu^2 and sqrt.
  • The function power can be used to create a power link function.
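
The families and links listed above are those of base R's stats package, so they can be explored without plsRbeta. The short sketch below shows the default link of each family, a non-default link, and a power link; the object names are illustrative.

```r
# Default link of each family (base R, stats package).
default_links <- sapply(
  list(gaussian(), binomial(), Gamma(), poisson(), inverse.gaussian(), quasi()),
  function(f) f$link
)
default_links

# Selecting one of the other supported links by name.
binomial(link = "cloglog")$link

# power() builds a "link-glm" object; power(0.5) is a square-root link.
sqrt_link <- power(0.5)
class(sqrt_link)
sqrt_link$linkfun(4)   # 4^0.5
sqrt_link$linkinv(2)   # back-transform: 2^(1/0.5)

# A power link can be handed to a family that accepts link objects.
quasi_cuberoot <- quasi(link = power(1/3))
quasi_cuberoot$link
```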

Non-NULL weights can be used to indicate that different observations have different dispersions (with the values in weights being inversely proportional to the dispersions); or equivalently, when the elements of weights are positive integers w_i, that each response y_i is the mean of w_i unit-weight observations.
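
The second interpretation of weights can be checked with a small base R sketch on simulated data (the seed, sample sizes and variable names are arbitrary). Fitting a gaussian glm to group means with weights w_i gives the same coefficients as fitting it to the underlying unit-weight observations.

```r
set.seed(42)

# 21 unit-weight observations: x takes the value i exactly i times.
x_unit <- rep(1:6, times = 1:6)
y_unit <- 1 + 0.5 * x_unit + rnorm(length(x_unit))

# One row per distinct x: y is the mean of w unit-weight observations.
agg <- data.frame(
  x = sort(unique(x_unit)),
  y = as.numeric(tapply(y_unit, x_unit, mean)),
  w = as.numeric(table(x_unit))
)

fit_unit <- glm(y_unit ~ x_unit, family = gaussian())
fit_wtd  <- glm(y ~ x, family = gaussian(), data = agg, weights = w)

# The two sets of coefficients agree up to numerical tolerance.
all.equal(unname(coef(fit_unit)), unname(coef(fit_wtd)))
```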

Examples

data("GasolineYield", package = "betareg")
yGasolineYield <- GasolineYield$yield
XGasolineYield <- GasolineYield[, 2:5]
modpls <- PLS_beta_wvc(yGasolineYield, XGasolineYield, nt = 3, modele = "pls-beta")
modpls
rm("modpls")

References

Frédéric Bertrand, Nicolas Meyer, Michèle Beau-Faller, Karim El Bayed, Izzie-Jacques Namer, Myriam Maumy-Bertrand (2013). Régression Bêta PLS. Journal de la Société Française de Statistique, 154 (3):143-159. http://publications-sfds.math.cnrs.fr/index.php/J-SFdS/article/view/215

See Also

PLS_beta for more detailed results, PLS_beta_kfoldcv for cross validating models, and PLS_lm_wvc for the same function dedicated to plsR models.

Author(s)

Frédéric Bertrand

frederic.bertrand@utt.fr

https://fbertran.github.io/homepage/