This method takes either a character vector or an object inheriting from class kRp.text
(i.e., text tokenized by koRpus) and jumbles the words. With the default settings, the first and last letter of each word are left intact, while all characters in between are randomized.
Methods
jumbleWords(words, ...)

## S4 method for signature 'kRp.text'
jumbleWords(words, min.length = 3, intact = c(start = 1, end = 1))

## S4 method for signature 'character'
jumbleWords(words, min.length = 3, intact = c(start = 1, end = 1))
Arguments
words: Either a character vector or an object inheriting from class kRp.text.
...: Additional options, currently unused.
min.length: An integer value defining the minimum word length. Words with fewer characters will not be changed. Each grapheme cluster is counted as one character.
intact: A named vector with two integer values named start and end. These define how many characters of each relevant word will be left unchanged at its start and its end, respectively.
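The interplay of min.length and intact can be illustrated with a minimal base R sketch. This is not the koRpus implementation: jumble_sketch is a hypothetical helper written only for illustration, and it splits words into single characters rather than grapheme clusters, so it will differ from jumbleWords for text with combining marks.

```r
# Hypothetical helper, NOT part of koRpus: jumbles a single word.
# Assumption: characters are split with strsplit(), not by grapheme cluster.
jumble_sketch <- function(word, min.length = 3, intact = c(start = 1, end = 1)) {
  chars <- strsplit(word, "")[[1]]
  n <- length(chars)
  # first and last position of the part that may be shuffled
  first <- intact[["start"]] + 1
  last <- n - intact[["end"]]
  # words that are too short, or have fewer than two characters
  # between the protected start and end, are returned unchanged
  if (n < min.length || last - first < 1) {
    return(word)
  }
  middle <- first:last
  # shuffle by index to avoid sample()'s special case for length-one input
  chars[middle] <- chars[middle][sample.int(length(middle))]
  paste(chars, collapse = "")
}

set.seed(1)
vapply(
  c("an", "the", "reading", "jumbled"),
  jumble_sketch,
  character(1),
  USE.NAMES = FALSE
)
```

In this sketch, "an" is shorter than min.length and "the" has only one character between its protected ends, so both come back unchanged. The longer words keep their first and last letter and have the remaining letters permuted; a random permutation can occasionally reproduce the original order.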
Returns
Depending on the class of words, either a character vector or an object of class kRp.text with the added feature diff.
Examples
# code is only run when the English language package can be loaded
if(require("koRpus.lang.en", quietly = TRUE)){
  sample_file <- file.path(
    path.package("koRpus"), "examples", "corpus", "Reality_Winner.txt"
  )
  tokenized.obj <- tokenize(
    txt = sample_file,
    lang = "en"
  )
  tokenized.obj <- jumbleWords(tokenized.obj)
  pasteText(tokenized.obj)

  # diff stats are now part of the object
  hasFeature(tokenized.obj)
  diffText(tokenized.obj)
} else {}