cols: The number of columns of data to generate. Excludes the response column if has_response = TRUE.
randomize: A logical value indicating whether data values should be randomly generated. This must be TRUE if either categorical_fraction or integer_fraction is non-zero.
value: If randomize = FALSE, then all real-valued entries will be set to this value.
real_range: The range of randomly generated real values.
categorical_fraction: The fraction of total columns that are categorical.
factors: The number of (unique) factor levels in each categorical column.
integer_fraction: The fraction of total columns that are integer-valued.
integer_range: The range of randomly generated integer values.
binary_fraction: The fraction of total columns that are binary-valued.
binary_ones_fraction: The fraction of values in a binary column that are set to 1.
time_fraction: The fraction of total columns that are date/time columns.
string_fraction: The fraction of total columns that are string columns.
missing_fraction: The fraction of total entries in the data frame that are set to NA.
response_factors: If has_response = TRUE, then this is the number of factor levels in the response column.
has_response: A logical value indicating whether an additional response column should be prepended to the final H2O data frame. If set to TRUE, the total number of columns will be cols + 1.
seed: A seed used to generate random values when randomize = TRUE.
seed_for_column_types: A seed used to generate random column types when randomize = TRUE.
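
The fraction parameters above each describe a share of cols, so a rough picture of the resulting column mix can be worked out by hand. The following is a minimal sketch of that arithmetic, not H2O's own code. The helper column_counts is hypothetical, and two of its rules are assumptions rather than documented behavior: fractional counts are rounded down, and any columns not claimed by a fraction are treated as real-valued.

```python
import math

# Column types that have a "<name>_fraction" parameter in the list above.
FRACTION_PARAMS = ("categorical", "integer", "binary", "time", "string")


def column_counts(cols, has_response=False, **fractions):
    """Estimate how many columns of each type a frame would contain.

    Hypothetical helper for illustration only. Assumptions: counts are
    floored, and the leftover columns are real-valued.
    """
    allowed = {f"{name}_fraction" for name in FRACTION_PARAMS}
    unknown = set(fractions) - allowed
    if unknown:
        raise TypeError(f"unknown parameters: {sorted(unknown)}")
    if any(frac < 0 for frac in fractions.values()):
        raise ValueError("fractions must be non-negative")
    # Each value is a share of the same cols, so together they cannot exceed 1.
    if sum(fractions.values()) > 1.0 + 1e-9:
        raise ValueError("column-type fractions must sum to at most 1")

    counts = {}
    for name in FRACTION_PARAMS:
        frac = fractions.get(f"{name}_fraction", 0.0)
        # The small epsilon guards against float error such as 100 * 0.29.
        counts[name] = math.floor(cols * frac + 1e-9)

    # Assumption: whatever is left over is real-valued.
    counts["real"] = cols - sum(counts.values())
    # The response column is extra: it is not counted in cols.
    counts["total"] = cols + (1 if has_response else 0)
    return counts


counts = column_counts(
    10,
    has_response=True,
    categorical_fraction=0.2,
    integer_fraction=0.2,
    binary_fraction=0.1,
)
print(counts)
```

With cols = 10 and these fractions, the sketch reports 2 categorical, 2 integer, 1 binary and 5 real-valued columns, and a total of 11 columns once the response column is included.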