tokenize_spaces_punct function

Tokenise text into a sequence of words, splitting on spaces and punctuation.
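
The description above does not spell out the exact splitting rules, so the following is only a minimal Python sketch of what such a tokeniser might look like, not the actual implementation. It assumes that whitespace and ASCII punctuation both act as separators and that the separators themselves are discarded. The signature and return type are assumptions made for illustration.

```python
import re
import string

# Assumption: separators are any run of whitespace and/or ASCII punctuation.
# The real function may treat punctuation differently (e.g. keep it as tokens).
_SEPARATORS = re.compile(r"[\s" + re.escape(string.punctuation) + r"]+")


def tokenize_spaces_punct(text: str) -> list[str]:
    """Split text into words on whitespace and punctuation (illustrative sketch)."""
    # re.split yields empty strings when the text starts or ends with a
    # separator, so those are filtered out.
    return [token for token in _SEPARATORS.split(text) if token]


if __name__ == "__main__":
    print(tokenize_spaces_punct("Hello, world! How are you?"))
    # ['Hello', 'world', 'How', 'are', 'you']
```

Under these assumptions an apostrophe also counts as a separator, so "don't" would come out as two tokens, "don" and "t". A tokeniser that needs to keep contractions intact would need a narrower separator set.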