data.simulation function

Simulates subspace clustering data

Generates simulated data with a low-rank subspace structure: variables are grouped into clusters, and each cluster has a low-rank representation. The factors that span the subspaces are not shared between clusters.

data.simulation(n = 100, SNR = 1, K = 10, numb.vars = 30, max.dim = 2, min.dim = 1, equal.dims = TRUE)

Arguments

  • n: An integer, number of individuals.
  • SNR: A numeric, signal-to-noise ratio, measured as the ratio of the variance of a variable (an element of a subspace) to the variance of the noise.
  • K: An integer, number of subspaces.
  • numb.vars: An integer, number of variables in each subspace.
  • max.dim: An integer; if equal.dims is TRUE, max.dim is the dimension of every subspace. If equal.dims is FALSE, subspace dimensions are drawn from a uniform distribution on [min.dim, max.dim].
  • min.dim: An integer, minimal dimension of a subspace.
  • equal.dims: A boolean; if TRUE (the default), all clusters have the same dimension.
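
The arguments above can be read as a recipe for the data. The base-R sketch below is one plausible way to generate data with this structure; it is an illustration written for this page, not the package's source. The helper name simulate_subspaces_sketch, the standard-normal factors, the uniform coefficients, and the reading of SNR as a variance ratio against unit-variance signals are all assumptions.

```r
# Hypothetical helper: NOT the varclust implementation, only an illustration.
simulate_subspaces_sketch <- function(n = 100, SNR = 1, K = 10, numb.vars = 30,
                                      max.dim = 2, min.dim = 1, equal.dims = TRUE) {
  # Subspace dimensions: all equal to max.dim, or drawn uniformly
  # from the integers min.dim..max.dim.
  dims <- if (equal.dims) {
    rep(max.dim, K)
  } else {
    min.dim + sample.int(max.dim - min.dim + 1, K, replace = TRUE) - 1
  }
  noise.sd <- sqrt(1 / SNR)  # assumes each signal column has variance 1
  X <- signals <- factors <- NULL
  s <- integer(0)
  for (k in seq_len(K)) {
    # Factors specific to cluster k (not shared with other clusters)
    Z <- matrix(rnorm(n * dims[k]), nrow = n)
    # Each variable is a random linear combination of the cluster's factors
    coeff <- matrix(runif(dims[k] * numb.vars, 0.1, 1), nrow = dims[k])
    signal <- scale(Z %*% coeff)  # centre and scale each column to variance 1
    noise <- matrix(rnorm(n * numb.vars, sd = noise.sd), nrow = n)
    X <- cbind(X, signal + noise)
    signals <- cbind(signals, signal)
    factors <- cbind(factors, Z)
    s <- c(s, rep(k, numb.vars))
  }
  list(X = X, signals = signals, dims = dims, factors = factors, s = s)
}

set.seed(1)
sketch <- simulate_subspaces_sketch(n = 50, K = 3, numb.vars = 4)
dim(sketch$X)    # 50 rows, 3 * 4 = 12 columns
table(sketch$s)  # 4 variables in each of the 3 clusters
# Empirical signal-to-noise ratio; should be roughly the requested SNR
var(as.vector(sketch$signals)) / var(as.vector(sketch$X - sketch$signals))
```

In this sketch the noise standard deviation is sqrt(1 / SNR), so a larger SNR gives data closer to the noiseless signals.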

Returns

A list consisting of:

  • X: matrix, generated data
  • signals: matrix, data without noise
  • dims: vector, dimensions of subspaces
  • factors: matrix, columns of which span subspaces
  • s: vector, true partition of variables
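
A quick way to get familiar with the returned list is to print its top-level structure and compare the sizes with the arguments. The snippet below is a sketch that assumes the varclust package is installed and that the variables are laid out as K blocks of numb.vars columns with one row per individual; the argument values are chosen only for illustration.

```r
# Argument values chosen only for illustration
n <- 30
K <- 5
numb.vars <- 20
expected.vars <- K * numb.vars  # total number of simulated variables

# Run only if varclust is available, so the snippet never fails on its absence
if (requireNamespace("varclust", quietly = TRUE)) {
  sim <- varclust::data.simulation(n = n, K = K, numb.vars = numb.vars, max.dim = 3)
  str(sim, max.level = 1)  # top-level components of the returned list
  # Sizes to compare against n and expected.vars (assumed layout, see above)
  print(c(rows = nrow(sim$X), cols = ncol(sim$X)))
  print(c(length.s = length(sim$s), length.dims = length(sim$dims)))
}
```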

Examples

sim.data <- data.simulation()
sim.data2 <- data.simulation(n = 30, SNR = 2, K = 5, numb.vars = 20, max.dim = 3, equal.dims = FALSE)

  • Maintainer: Piotr Sobczyk
  • License: GPL-3
  • Last published: 2019-06-26
