Reading Machine Learning Benchmark Data Sets in Different Formats
Determine the type of values in each column of a data frame.
Checks consistency of the data frame `dsList`.
Compares the type of columns stored in `dsList` and in a data set itse...
Run an external tool to download a data set.
Loading machine learning data from a directory tree using a unified in...
Search a dataset by string matching against the names stored in `dsLis...
Sort the rows of a data frame.
Checks consistency of the data frame `dsList`.
Prints the information on the fields in the data frame `dsList` describ...
Determine the path to package example directories.
Determines the type vector for an input data set.
Prepares a data frame `dsList`, which describes the data contained in ...
Reading data from different sources in their original format.
Handling XML files.
Functions are provided for reading data sets in different formats for testing machine learning tools. This makes it possible to run a loop over several data sets in their original form, for example when they are downloaded from the UCI Machine Learning Repository. The data are not part of the package and have to be downloaded separately.
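The idea of looping over data sets in their original form can be illustrated with a minimal sketch in base R. It does not use the package's own functions. The description table `descr`, its column names, the helper `readOne`, and the two toy files are all assumptions made up for this illustration; they only mimic the role that a description table such as `dsList` plays.

```r
# Hypothetical stand-in for a description table such as dsList:
# one row per data set, recording how its original file must be parsed.
tmp <- tempfile("mldata")
dir.create(tmp)

# Two toy files in different "original" formats (invented for illustration).
writeLines(c("5.1,3.5,a", "4.9,3.0,b"), file.path(tmp, "first.data"))
writeLines(c("x y class", "1 2 p", "3 4 q"), file.path(tmp, "second.txt"))

descr <- data.frame(
  identification = c("first", "second"),
  fileName       = c("first.data", "second.txt"),
  sep            = c(",", ""),       # "" means any whitespace in read.table
  header         = c(FALSE, TRUE),
  stringsAsFactors = FALSE
)

# Read one data set according to its row in the description table.
readOne <- function(descr, id, path) {
  row <- descr[descr$identification == id, ]
  read.table(file.path(path, row$fileName),
             sep = row$sep, header = row$header,
             stringsAsFactors = TRUE)
}

# Loop over all described data sets; report size and record column types.
results <- list()
for (id in descr$identification) {
  dat <- readOne(descr, id, tmp)
  results[[id]] <- sapply(dat, class)
  cat(id, ":", nrow(dat), "rows,", ncol(dat), "columns\n")
}
```

Keeping the parsing details in a table rather than in the loop body means that a further data set is added by appending a row, without changing the code that runs the experiment.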