corp.rm.class: A character vector with word classes which should be removed. The default value "nonpunct" has special meaning and will cause the result of kRp.POS.tags(lang, tags=c("punct","sentc"), list.classes=TRUE) to be used. Another valid value is "stopword" to remove all detected stopwords.
corp.rm.tag: A character vector with valid POS tags which should be removed.
as.vector: Logical. If TRUE, results will be returned as a character vector containing only the text parts which survived the filtering.
update.desc: Logical. If TRUE, the desc slot of the tagged object will be fully recalculated using the filtered text. If FALSE, the desc slot will be copied from the original object. Finally, if NULL, the desc slot remains empty.
Returns
An object of the same class as the input. If as.vector=TRUE, only a character vector is returned.
Examples
# code is only run when the english language package can be loaded
if(require("koRpus.lang.en", quietly = TRUE)){
  sample_file <- file.path(
    path.package("koRpus"), "examples", "corpus", "Reality_Winner.txt"
  )
  tokenized.obj <- tokenize(
    txt=sample_file,
    lang="en"
  )
  filterByClass(tokenized.obj)
} else {}
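The as.vector and update.desc arguments described above can be tried with the same sample file. This is a minimal sketch, not part of the original help page: it assumes that koRpus and koRpus.lang.en are installed, and the names filtered.vec and filtered.obj are illustrative only.

```r
# sketch: only runs when the english language package can be loaded
if(require("koRpus.lang.en", quietly = TRUE)){
  sample_file <- file.path(
    path.package("koRpus"), "examples", "corpus", "Reality_Winner.txt"
  )
  tokenized.obj <- tokenize(
    txt=sample_file,
    lang="en"
  )
  # default corp.rm.class="nonpunct", but return only the surviving
  # text parts as a character vector
  filtered.vec <- filterByClass(tokenized.obj, as.vector=TRUE)
  # keep the object class, and copy the desc slot from the original
  # object instead of recalculating it from the filtered text
  filtered.obj <- filterByClass(tokenized.obj, update.desc=FALSE)
} else {}
```

With as.vector=TRUE the result should be a plain character vector, while the second call should return an object of the same class as tokenized.obj.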