rcv1sub5 function

Dataset from the Reuters Corpus Volume 1 (RCV1), subset 5.