rcv1sub4 function

Dataset from the Reuters Corpus Volume 1 (RCV1), subset 4