locpvs function

Pairwise variable selection for classification in local models

Performs pairwise variable selection on subclasses.

locpvs(x, subclasses, subclass.labels, prior=NULL, method="lda", vs.method = c("ks.test", "stepclass", "greedy.wilks"), niveau=0.05, fold=10, impr=0.1, direct="backward", out=FALSE, ...)

Arguments

  • x: matrix or data frame containing the explanatory variables. x must consist of numerical data only.
  • subclasses: vector indicating the subclasses (a factor)
  • subclass.labels: must be a matrix with 2 columns, where the first column specifies the subclass and the second column the corresponding upper class
  • prior: prior probabilities for the classes. If not specified, the prior probabilities will be set according to the proportions in subclasses. If specified, the order of prior probabilities must be the same as in subclasses.
  • method: character, name of classification function (e.g. ‘lda’ (default)).
  • vs.method: character, name of variable selection method. Must be one of ‘ks.test’ (default), ‘stepclass’ or ‘greedy.wilks’.
  • niveau: significance level used for ‘ks.test’
  • fold: parameter for cross-validation, if ‘stepclass’ is chosen as ‘vs.method’
  • impr: least improvement of performance measure desired to include or exclude any variable (<=1), if ‘stepclass’ is chosen as ‘vs.method’
  • direct: direction of variable selection, if ‘stepclass’ is chosen as ‘vs.method’. Must be one of ‘forward’, ‘backward’ (default) or ‘both’.
  • out: indicator (logical) for text output during computation (slows down computation!), if ‘stepclass’ is chosen as ‘vs.method’
  • ...: further parameters passed to the classification function (‘method’) or the variable selection method (‘vs.method’)
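
As a sketch of how the subclasses and subclass.labels arguments fit together, the base R snippet below builds a label matrix for four hypothetical subclasses and computes the subclass proportions that would play the role of default priors when prior is left NULL. The subclass names and the toy vector are assumptions for illustration, and prop.table(table(...)) is simply one way to obtain class proportions, not necessarily how locpvs computes them internally.

```r
# Toy subclass vector (hypothetical data, for illustration only)
subclasses <- factor(c("a1", "a1", "a2", "b1", "b1", "b1", "b2", "b2"))

# Two-column matrix: column 1 = subclass, column 2 = corresponding upper class
# (matrix() fills column by column, so the first four entries form column 1)
subclass.labels <- matrix(
  c("a1", "a2", "b1", "b2",
    "A",  "A",  "B",  "B"),
  ncol = 2
)

# Sanity checks: two columns, and every subclass level appears exactly once
stopifnot(ncol(subclass.labels) == 2)
stopifnot(setequal(levels(subclasses), subclass.labels[, 1]))
stopifnot(!anyDuplicated(subclass.labels[, 1]))

# Proportions of the subclasses, in the order of levels(subclasses)
default.prior <- prop.table(table(subclasses))
print(default.prior)
```

If a prior is supplied explicitly, it would have to follow the same order as levels(subclasses) in this toy setup.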

Details

locpvs calls pvs with subclasses as the grouping variable. See pvs for further details.
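
To give a feel for what a pairwise selection step can look like when vs.method is ‘ks.test’, the base R sketch below compares one pair of groups variable by variable with stats::ks.test and keeps the variables whose p-value does not exceed niveau. The simulated data, the variable names and the selection rule are assumptions made for illustration; see pvs for the procedure it actually applies.

```r
set.seed(1)

# Simulated data for one pair of subclasses (hypothetical, 50 cases each)
n <- 50
x <- data.frame(
  v1 = c(rnorm(n, mean = 0), rnorm(n, mean = 3)),  # clearly separates the pair
  v2 = c(rnorm(n, mean = 0), rnorm(n, mean = 0))   # same distribution in both
)
grp <- factor(rep(c("s1", "s2"), each = n))

niveau <- 0.05

# Two-sample Kolmogorov-Smirnov test per variable for this pair
p.values <- sapply(x, function(v) {
  ks.test(v[grp == "s1"], v[grp == "s2"])$p.value
})

# Keep variables whose p-value does not exceed niveau (assumed rule)
selected <- names(p.values)[p.values <= niveau]
print(p.values)
print(selected)
```

In a setting with more than two subclasses, the same comparison would be repeated for every pair, so different pairs can end up with different variable subsets.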

Returns

An object of class ‘locpvs’ containing the following components:

  • pvs.result: the complete output of the call to pvs (see pvs for further details)
  • subclass.labels: the subclass.labels as specified in the function call
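
The subclass.labels component is what ties subclass-level results back to the upper classes. As an illustration only, the base R sketch below sums subclass posterior probabilities within each upper class using rowsum. The posterior values are made up, the class names are borrowed from the Examples section, and this aggregation is an assumption, not a description of what predict.locpvs does internally.

```r
# Label matrix: column 1 = subclass, column 2 = upper class
subclass.labels <- matrix(
  c("bus", "van", "saab", "opel",
    "big", "big", "small", "small"),
  ncol = 2
)

# Made-up subclass posteriors for two observations (each row sums to 1)
sub.post <- matrix(
  c(0.6, 0.2, 0.1, 0.1,
    0.1, 0.1, 0.3, 0.5),
  nrow = 2, byrow = TRUE,
  dimnames = list(NULL, c("bus", "van", "saab", "opel"))
)

# Look up the upper class of each posterior column
upper <- subclass.labels[match(colnames(sub.post), subclass.labels[, 1]), 2]

# Sum the posterior columns that belong to the same upper class
upper.post <- t(rowsum(t(sub.post), group = upper))
print(upper.post)

# Most probable upper class per observation
pred.class <- colnames(upper.post)[max.col(upper.post, ties.method = "first")]
print(pred.class)
```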

Author(s)

Gero Szepannek, szepannek@statistik.tu-dortmund.de, Christian Neumann

References

Szepannek, G. and Weihs, C. (2006) Local Modelling in Classification on Different Feature Subspaces. In Advances in Data Mining, ed. Perner, P., LNAI 4065, pp. 226-234. Springer, Heidelberg.

See Also

predict.locpvs for predicting with ‘locpvs’ models, and pvs

Examples

## this example might be a bit artificial, but it sufficiently shows how locpvs has to be used

## learn a locpvs-model on the Vehicle dataset
library("klaR")     # provides locpvs
library("mlbench")  # provides the Vehicle dataset
data("Vehicle")

subclass <- Vehicle$Class # use four car-types in dataset as subclasses

## aggregate "bus" and "van" to upper-class "big" and "saab" and "opel" to upper-class "small"
subclass_class <- matrix(c("bus", "van", "saab", "opel", "big", "big", "small", "small"), ncol = 2)

## learn now a locpvs-model for the subclasses:
model <- locpvs(Vehicle[, 1:18], subclass, subclass_class)
model # short summary, showing the class-pairs of the submodels
      # together with the selected variables and the relation of sub- to upperclasses

## predict:
pred <- predict(model, Vehicle[, 1:18])

## now you can look at the predicted classes:
pred$class
## or at the posterior probabilities:
pred$posterior
## or at the posterior probabilities for the subclasses:
pred$subclass.posteriors