split_data function

Splits a data frame into a training set and a test set.
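
The signature and splitting strategy of split_data are not shown here, so the following is only a minimal sketch of one common approach: a seeded random shuffle followed by a proportional split. It assumes the data frame can be treated as a list of row dictionaries. The parameter names test_fraction and seed are hypothetical and chosen for illustration; they are not taken from the original function.

```python
import random


def split_data(rows, test_fraction=0.2, seed=None):
    """Illustrative sketch only: randomly split rows into (train, test).

    `rows` stands in for a data frame as a list of row dicts (an assumption).
    `test_fraction` and `seed` are hypothetical parameters.
    """
    if not 0.0 <= test_fraction <= 1.0:
        raise ValueError("test_fraction must be between 0 and 1")

    # Shuffle row indices with a local generator so the split is reproducible
    # for a given seed and does not disturb the global random state.
    rng = random.Random(seed)
    indices = list(range(len(rows)))
    rng.shuffle(indices)

    # The first n_test shuffled indices form the test set, the rest the training set.
    n_test = round(len(rows) * test_fraction)
    test_idx = sorted(indices[:n_test])
    train_idx = sorted(indices[n_test:])

    # Sorting the indices keeps the original row order within each subset.
    train = [rows[i] for i in train_idx]
    test = [rows[i] for i in test_idx]
    return train, test


rows = [{"x": i, "y": i % 2} for i in range(10)]
train, test = split_data(rows, test_fraction=0.3, seed=42)
print(len(train), len(test))  # prints "7 3"
```

Fixing the seed makes the split repeatable across runs, which helps when comparing models on the same training and test rows.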