The generator produces regression data with 4 discrete and 7 numeric attributes.
regDataGen(noInst, t1=0.8, t2=0.5, noise=0.1)
Arguments
noInst: Number of instances to generate.
t1, t2: Parameters controlling the shape of the distribution.
noise: Parameter controlling the amount of noise. If noise=0, there is no noise. If noise = 1, then the level of the signal and noise are the same.
Returns
Returns a data.frame with noInst rows and 11 columns. The ranges of values of the attributes and the response are:
a1: 0,1
a2: a,b,c,d
a3: 0,1 (irrelevant)
a4: a,b,c,d (irrelevant)
x1: numeric (gaussian with different sd for each class)
x2: numeric (gaussian with different sd for each class)
x3: numeric (gaussian, irrelevant)
x4: numeric from [0,1]
x5: numeric from [0,1]
x6: numeric from [0,1]
response: numeric
Details
The response variable is derived from x4, x5, x6 using two different functions. The choice depends on a hidden variable, which determines whether the response value follows a linear dependency f=x4−2x5+3x6, or a nonlinear one f=cos(4πx4)(2x5−3x6).
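The two dependencies above can be sketched in base R as follows. This is only an illustration of the formulas, not the generator's actual code: the function names linResponse and nonlinResponse are hypothetical, and drawing the hidden variable as a fair coin flip and x4, x5, x6 uniformly from [0,1] are assumptions made for the example.

```r
# Linear dependency from the text: f = x4 - 2*x5 + 3*x6
# (linResponse is a hypothetical name, not part of the package)
linResponse <- function(x4, x5, x6) {
  x4 - 2 * x5 + 3 * x6
}

# Nonlinear dependency from the text: f = cos(4*pi*x4) * (2*x5 - 3*x6)
# (nonlinResponse is a hypothetical name, not part of the package)
nonlinResponse <- function(x4, x5, x6) {
  cos(4 * pi * x4) * (2 * x5 - 3 * x6)
}

set.seed(1)
n <- 5

# Assumption: noise-free inputs drawn uniformly from [0,1]
x4 <- runif(n)
x5 <- runif(n)
x6 <- runif(n)

# Assumption: the hidden variable is a fair coin flip; how the
# generator actually draws it is not stated in this help page
hidden <- sample(c(TRUE, FALSE), n, replace = TRUE)

# Pick the linear or the nonlinear function per instance
f <- ifelse(hidden,
            linResponse(x4, x5, x6),
            nonlinResponse(x4, x5, x6))
print(data.frame(x4, x5, x6, hidden, f))
```

Because the sign and magnitude of the nonlinear branch oscillate with x4, a single global linear model is expected to fit such data poorly unless it can separate the two regimes.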
Attributes a1, a2, x1, x2 carry some information on the hidden variable, depending on parameters t1 and t2. At the extreme values t1=0.5 and t2=1, they carry no information. On the other hand, if t1=0 or t1=1, then each of the attributes a1, a2 carries full information on the hidden variable. If t2=0, then each of x1, x2 carries full information on the hidden variable.
The attributes x4, x5, x6 are available with a noise level depending on parameter noise. If noise=0, there is no noise. If noise=1, then the level of the signal and noise are the same.
Author(s)
Petr Savicky
See Also
classDataGen, ordDataGen, CoreModel
Examples
# prepare a regression data set
regData <- regDataGen(noInst=200)

# build regression tree similar to CART
modelRT <- CoreModel(response ~ ., regData, model="regTree", modelTypeReg=1)
print(modelRT)

destroyModels(modelRT) # clean up