Train a k-nearest neighbors (knn) classifier via cross-validation (cv).
Train a k-nearest neighbors (knn) classifier via cross-validation (cv). The number of folds and the candidate numbers of neighbors to consider may be specified.
knn_cv(xy, k.cv = 5, kvec = seq(1, 47, by = 2))
Arguments
xy: Data frame with the data matrix x as the first set of columns and the vector y as the last column.
k.cv: Scalar. Number of folds to use. Default is 5.
kvec: Vector. Candidate numbers of neighbors to consider. Default is the odd integers from 1 to 47 (inclusive).
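As an illustration of the expected inputs, the sketch below builds xy from the built-in iris data (an assumption made for this example; any data frame with the class label in the last column should work the same way) and writes out the default kvec explicitly. The call to knn_cv is left commented out because it requires the package that provides the function.

```r
# Predictors first, class label last, as the xy argument expects.
xy <- data.frame(iris[, 1:4], y = iris$Species)

# The default grid: odd integers from 1 to 47.
kvec <- seq(1, 47, by = 2)

# fit <- knn_cv(xy, k.cv = 5, kvec = kvec)
```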
Returns
kvec: candidate numbers of neighbors considered
error: vector of misclassification error rates corresponding to kvec
k.best: number of neighbors with lowest error rate
k.cv: number of folds used
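To make the returned components concrete, here is a minimal sketch of how such a procedure could be written. It is not the source of knn_cv: the name knn_cv_sketch is hypothetical, the classifier is class::knn() from the recommended class package, predictors are used unscaled, and the error for each k is pooled over all held-out folds. The actual function may differ on any of these points.

```r
# Hypothetical sketch, not the package's implementation.
library(class)

knn_cv_sketch <- function(xy, k.cv = 5, kvec = seq(1, 47, by = 2)) {
  p <- ncol(xy) - 1                       # all columns but the last are predictors
  x <- xy[, seq_len(p), drop = FALSE]
  y <- factor(xy[[p + 1]])                # last column holds the class labels
  n <- nrow(xy)

  # Randomly assign each row to one of k.cv folds of nearly equal size.
  fold <- sample(rep(seq_len(k.cv), length.out = n))

  # For each candidate k, count misclassifications over all held-out folds.
  error <- sapply(kvec, function(k) {
    wrong <- 0
    for (f in seq_len(k.cv)) {
      test <- fold == f
      pred <- knn(train = x[!test, , drop = FALSE],
                  test  = x[test, , drop = FALSE],
                  cl    = y[!test], k = k)
      wrong <- wrong + sum(as.character(pred) != as.character(y[test]))
    }
    wrong / n
  })

  # Same component names as documented above; ties go to the smallest k.
  list(kvec = kvec, error = error,
       k.best = kvec[which.min(error)], k.cv = k.cv)
}

set.seed(1)                               # fold assignment is random
fit <- knn_cv_sketch(iris, k.cv = 5)
fit$k.best
```

Because the folds are drawn at random, error and k.best can change from run to run unless a seed is set. Every value in kvec must also be no larger than the size of a training fold.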
References
Hastie, T., Tibshirani, R., and Friedman, J. (2017), The Elements of Statistical Learning: Data Mining, Inference, and Prediction, Second Edition, New York: Springer.
James, G., Witten, D., Hastie, T., and Tibshirani, R. (2013), An Introduction to Statistical Learning with Applications in R, New York: Springer.
Venables, W. N., and Ripley, B. D. (2002), Modern Applied Statistics with S, Fourth Edition, New York: Springer.