data_preprocess function

Generates a synthetic survey from a simple model and returns it.
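
The description does not say what the "simple model" is or what the function's signature looks like, so the following is only a minimal sketch of the general idea, not the actual data_preprocess implementation. It assumes, purely for illustration, that each survey answer is drawn independently from a fixed categorical distribution. The name synthetic_survey, its parameters, and the example questions are all hypothetical.

```python
import random


def synthetic_survey(n_respondents, questions, seed=None):
    """Hypothetical stand-in for illustration; NOT the real data_preprocess.

    Assumed model: every answer is an independent draw from a fixed
    categorical distribution given per question.
    """
    rng = random.Random(seed)  # local generator so runs are reproducible
    survey = []
    for respondent_id in range(n_respondents):
        row = {"respondent_id": respondent_id}
        for name, (options, weights) in questions.items():
            # Draw one answer according to the assumed option weights.
            row[name] = rng.choices(options, weights=weights, k=1)[0]
        survey.append(row)
    return survey


# Hypothetical questions: name -> (answer options, assumed probabilities).
questions = {
    "age_group": (["18-34", "35-64", "65+"], [0.3, 0.5, 0.2]),
    "satisfied": (["yes", "no"], [0.7, 0.3]),
}

survey = synthetic_survey(5, questions, seed=0)
for row in survey:
    print(row)
```

Passing a seed makes the output repeatable, which is usually convenient when synthetic data feeds tests or examples.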