rl_generate_policy function

Performs reinforcement learning on past data to generate a model policy.
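
The description above does not give the function's signature, its input format, or the learning algorithm it uses, so the following is only an illustrative sketch of the general idea of learning a policy from past data. It assumes the past data is a list of (state, action, reward, next_state) tuples and uses tabular Q-learning replayed over that log, which is one common technique and not necessarily what rl_generate_policy does. The name generate_policy_sketch and the parameters alpha, gamma, and epochs are hypothetical.

```python
from collections import defaultdict


def generate_policy_sketch(transitions, alpha=0.1, gamma=0.9, epochs=50):
    """Hypothetical stand-in, NOT the real rl_generate_policy.

    transitions: iterable of (state, action, reward, next_state) tuples
    recorded in the past (assumed format).
    Returns a dict mapping each observed state to its greedy action.
    """
    transitions = list(transitions)
    q = defaultdict(float)       # Q-values, 0.0 for unseen (state, action) pairs
    actions = defaultdict(set)   # actions observed in each state
    for s, a, _, _ in transitions:
        actions[s].add(a)

    # Replay the logged transitions several times (batch Q-learning).
    for _ in range(epochs):
        for s, a, r, s_next in transitions:
            # Best known value of the next state; 0.0 if no action was logged there.
            best_next = max((q[(s_next, b)] for b in actions.get(s_next, ())),
                            default=0.0)
            q[(s, a)] += alpha * (r + gamma * best_next - q[(s, a)])

    # Greedy policy: highest-valued action per state.
    # sorted() makes tie-breaking deterministic; actions must be sortable.
    return {s: max(sorted(acts), key=lambda b: q[(s, b)])
            for s, acts in actions.items()}


# Made-up log: moving "right" always earns more reward than moving "left".
log = [
    ("s0", "left", 0.0, "s0"),
    ("s0", "right", 1.0, "s1"),
    ("s1", "left", 0.0, "s0"),
    ("s1", "right", 2.0, "s1"),
]
policy = generate_policy_sketch(log)
print(policy)  # {'s0': 'right', 's1': 'right'}
```

The sketch never interacts with an environment: the policy comes entirely from the recorded transitions, which matches the "past data" wording of the description.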