ClusterChallenge function

Generates a Fundamental Clustering Challenge based on specific artificial datasets.

The Lsun3D and FCPS datasets were introduced in various publications, each with a fixed sample size. This function generalizes them to an arbitrary sample size.

ClusterChallenge(Name, SampleSize, PlotIt = FALSE, PointSize = 1, Plotter3D = "rgl", ...)

Arguments

  • Name: string, one of 'Atom', 'Chainlink', 'EngyTime', 'GolfBall', 'Hepta', 'Lsun3D', 'Target', 'Tetra', 'TwoDiamonds', or 'WingNut'
  • SampleSize: size of the sample; must be higher than 300, preferably above 500
  • PlotIt: if TRUE, plots the challenge with ClusterPlotMDS (default: FALSE)
  • PointSize: If PlotIt=TRUE: see ClusterPlotMDS
  • Plotter3D: If PlotIt=TRUE: see ClusterPlotMDS
  • ...: If PlotIt=TRUE: further arguments for ClusterPlotMDS

Details

A detailed description of the datasets can be found in [Thrun/Ultsch, 2020]. New samples are drawn by combining Pareto Density Estimation with rejection sampling.
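The general idea of pairing a density estimate with rejection sampling can be illustrated in base R. This is only a minimal one-dimensional sketch, not the package's implementation: stats::density() is used as a stand-in for Pareto Density Estimation, the "original" data are simulated, the helper rejection_sample() is hypothetical, and no class labels are assigned.

```r
set.seed(1)

# Stand-in for an original fixed-size dataset: a 1-D sample with two clusters.
original <- c(rnorm(100, mean = -2, sd = 0.5), rnorm(100, mean = 2, sd = 0.5))

# Density estimate of the original data.
# Assumption: a Gaussian kernel estimate replaces Pareto Density Estimation.
est <- density(original)
dens_at <- approxfun(est$x, est$y, yleft = 0, yright = 0)
max_dens <- max(est$y)

# Hypothetical helper: draw n points whose distribution follows the estimate.
rejection_sample <- function(n) {
  accepted <- numeric(0)
  while (length(accepted) < n) {
    # Propose candidates uniformly over the range of the density estimate.
    candidates <- runif(n, min(est$x), max(est$x))
    # Accept a candidate when a uniform height falls under the density curve.
    u <- runif(n, 0, max_dens)
    accepted <- c(accepted, candidates[u <= dens_at(candidates)])
  }
  accepted[seq_len(n)]
}

new_sample <- rejection_sample(1000)
length(new_sample)
```

Because candidates are accepted in proportion to the estimated density, the requested sample size is independent of the size of the original data, which is the property that lets a fixed-size dataset be generalized to any SampleSize.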

Returns

LIST, with

  • Name: [1:SampleSize,1:d] data matrix
  • Cls: [1:SampleSize] numerical vector of classification
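A result of this shape can be unpacked as sketched below. The helper split_challenge() is hypothetical and not part of the package, and the list it is applied to is a small stand-in filled with random values rather than real output. The sketch assumes only what is documented above: one element holds the data matrix and the element Cls holds one class label per row.

```r
# Hypothetical helper (not part of FCPS): separates a list shaped like the
# documented return value into its data matrix and class vector.
split_challenge <- function(out) {
  stopifnot(is.list(out), "Cls" %in% names(out))
  # Assumption: the data matrix is the element that is not 'Cls'.
  data_name <- setdiff(names(out), "Cls")[1]
  data <- out[[data_name]]
  cls <- out$Cls
  # The matrix must have exactly one row per class label.
  stopifnot(is.matrix(data), nrow(data) == length(cls))
  list(data = data, cls = cls, class_sizes = table(cls))
}

# Stand-in list with the documented shape (random values, not real output).
set.seed(1)
mock <- list(Name = matrix(rnorm(30), nrow = 10, ncol = 3),
             Cls = rep(1:2, each = 5))

parts <- split_challenge(mock)
dim(parts$data)
parts$class_sizes
```

With the package installed, the same helper could presumably be applied to the result of a call such as ClusterChallenge("Chainlink", 2000) to check the dimensions and class sizes before clustering.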

References

[Thrun/Ultsch, 2020] Thrun, M. C., & Ultsch, A.: Clustering Benchmark Datasets Exploiting the Fundamental Clustering Problems, Data in Brief, Vol. in press, pp. 105501, doi:10.1016/j.dib.2020.105501, 2020.

Author(s)

Michael Thrun

Examples

## Not run:
ClusterChallenge("Chainlink", 2000, PlotIt = TRUE)
## End(Not run)

See Also

ClusterPlotMDS

  • Maintainer: Michael Thrun
  • License: GPL-3
  • Last published: 2023-10-19