prepare_data function

Prepare data for corpus exploration