Updist function

Educated distances for semi-supervised clustering

Updates distance matrix to help link or unlink objects

Updist(dst, link=NULL, unlink=NULL, dmax=max(dst), dmin=min(dst))

Arguments

  • dst: a dist object, for example as returned by dist()
  • link: a one-level list with any number of components; each component is a numeric vector of row numbers of objects that you prefer to be linked
  • unlink: a one-level list with any number of components; each component is a numeric vector of row numbers (or, as in the last example, object names) of objects that you prefer not to be linked
  • dmax: distance to set between objects that should not be linked; defaults to the largest distance in dst
  • dmin: distance to set between objects that should be linked; defaults to the smallest distance in dst
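
As a rough illustration of the argument shapes described above, the following base R sketch builds a small dist object and two constraint lists. The toy data, the object names x, d, link and unlink, and the final range check are assumptions made for illustration only; they are not part of the package.

```r
## Toy data (assumed): six objects on a line, row numbers 1..6
x <- matrix(c(0, 1, 2, 10, 11, 12), ncol = 1)
d <- dist(x)

## Each list component is one group of row numbers
link <- list(c(1, 4), c(2, 5))  # two separate "prefer linked" groups
unlink <- list(c(1, 2))         # one "prefer not linked" group

## Default targets taken from the usage line: dmax=max(dst), dmin=min(dst)
c(dmin = min(d), dmax = max(d))

## Hypothetical sanity check: every index must be a valid row number
stopifnot(all(unlist(c(link, unlink)) %in% seq_len(attr(d, "Size"))))
```

Lists built this way have the shape that the link and unlink arguments describe.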

Details

This function borrows the idea of MPCK-Means semi-supervised k-means (Bilenko et al., 2004), but instead of updating distances on the fly during clustering, it simply updates the distance object beforehand in accordance with the 'link' and 'unlink' constraints.

Amazingly, it works as expected :) Please see the examples below.
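
The idea of updating distances beforehand can be sketched in base R as follows. This is a minimal sketch under stated assumptions, not the package's actual code: the helper names set_pairs() and update_dist_sketch() are hypothetical, and the sketch assumes that every pair inside a 'link' group is set to dmin and every pair inside an 'unlink' group is set to dmax.

```r
## Hypothetical helper: set the distance for every pair within each group
set_pairs <- function(m, groups, value) {
  for (g in groups) {
    if (length(g) < 2) next              # a single object has no pairs
    pairs <- utils::combn(g, 2)          # all pairs within the group
    m[cbind(pairs[1, ], pairs[2, ])] <- value
    m[cbind(pairs[2, ], pairs[1, ])] <- value  # keep the matrix symmetric
  }
  m
}

## Hypothetical stand-in for the technique, not the real Updist()
update_dist_sketch <- function(dst, link = NULL, unlink = NULL,
                               dmax = max(dst), dmin = min(dst)) {
  m <- as.matrix(dst)
  m <- set_pairs(m, link, dmin)    # pull linked objects together
  m <- set_pairs(m, unlink, dmax)  # push unlinked objects apart
  as.dist(m)
}

## Toy data (assumed): four points on a line
toy <- dist(matrix(c(0, 1, 5, 6), ncol = 1))
upd <- update_dist_sketch(toy, link = list(c(1, 4)), unlink = list(c(1, 2)))
as.matrix(upd)
```

In this toy case the pair (1, 4) drops from the largest distance, 6, to the smallest, 1, and the pair (1, 2) rises from 1 to 6, while all other distances stay unchanged. Any clustering method that accepts a dist object can then be run on the result without modification.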

References

Bilenko M., Basu S., Mooney R.J. 2004. Integrating constraints and metric learning in semi-supervised clustering. In: Proceedings of the twenty-first international conference on Machine learning. P. 11. ACM.

See Also

dist

Examples

## Assumes the package that provides Updist(), Misclass() and the
## atmospheres data is attached

## Baseline: cluster on the original distances
iris.d <- dist(iris[, -5])
iris.km <- kmeans(iris.d, 3)
iris.h <- cutree(hclust(iris.d, method="ward.D"), k=3)
Misclass(iris.km$cluster, iris$Species, best=TRUE)
Misclass(iris.h, iris$Species, best=TRUE)

## Constraints: link 25 random virginica and 25 random versicolor rows,
## unlink two versicolor/virginica pairs
i.vv <- cbind(which(iris$Species == "versicolor"), which(iris$Species == "virginica"))
i.link <- list(sample(i.vv[, 2], 25), sample(i.vv[, 1], 25))
i.unlink <- list(i.vv[1, ], i.vv[2, ])

## Repeat the clustering on the updated distances
iris.upd <- Updist(iris.d, link=i.link, unlink=i.unlink)
iris.ukm <- kmeans(iris.upd, 3)
iris.uh <- cutree(hclust(iris.upd, method="ward.D"), k=3)
Misclass(iris.ukm$cluster, iris$Species, best=TRUE)
Misclass(iris.uh, iris$Species, best=TRUE)

## ===

## Unlink two objects by name and compare the dendrograms
aad <- dist(t(atmospheres))
plot(hclust(aad))
aadu <- Updist(aad, unlink=list(c("Earth", "Mercury")))
plot(hclust(aadu))

  • Maintainer: ORPHANED
  • License: GPL (>= 2)
  • Last published: 2023-02-05
