default_tokenize function

Default function to tokenize

This tokenizer uses stringi::stri_split_boundaries() to tokenize a character vector. It is intended for use with explain.character().

default_tokenize(text)

Arguments

  • text: text to tokenize as a character vector

Returns

a character vector.

Examples

data('train_sentences')
default_tokenize(train_sentences$text[1])
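
The description says the tokenizer is built on stringi::stri_split_boundaries(), so a rough sketch of that kind of tokenizer may help show what to expect. The helper name sketch_tokenize is hypothetical, and the choice of type = "word" with skip_word_none = TRUE is an assumption rather than something this page confirms about default_tokenize().

```r
# Hypothetical stand-in for illustration only; not the package's source code.
sketch_tokenize <- function(text) {
  # Assumption: split at word boundaries and skip pieces that contain
  # no word characters (whitespace, punctuation).
  pieces <- stringi::stri_split_boundaries(
    text,
    type = "word",
    skip_word_none = TRUE
  )
  # stri_split_boundaries() returns a list with one element per input string;
  # flatten it into a single character vector.
  unlist(pieces, use.names = FALSE)
}

tokens <- sketch_tokenize("The quick brown fox.")
print(tokens)
```

Flattening with unlist() matches the documented return type (a character vector), but it also means tokens from several input strings are merged into one vector.
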
  • Maintainer: Emil Hvitfeldt
  • License: MIT + file LICENSE
  • Last published: 2022-08-19