Find the most frequently occurring terms in a text vector.
freq_terms(text.var, top = 20, at.least = 1, stopwords = NULL, extend = TRUE, ...)
Arguments
text.var: The text variable.
top: The number of top terms to show.
at.least: An integer giving the minimum number of letters a word must have to be included in the output.
stopwords: A character vector of words to remove from the text. qdap has a number of data sets that can be used as stop words including: Top200Words, Top100Words, Top25Words. For the tm package's traditional English stop words use tm::stopwords("english").
extend: logical. If TRUE, the top argument is extended to any word that has the same frequency as the top word.
...: Other arguments passed to all_words.
Returns
Returns a data frame of the most frequently occurring words.
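To make the interaction of top, at.least, stopwords and extend concrete, here is a minimal base-R sketch. It is not qdap's implementation: the helper name top_terms_sketch, the naive tokenisation, and the WORD/FREQ column names are assumptions made for illustration. In particular, it assumes that extend keeps every word tied in frequency with the last word that top would retain.

```r
# Hypothetical helper, NOT part of qdap: a base-R approximation of the idea.
top_terms_sketch <- function(text, top = 20, at.least = 1,
                             stopwords = NULL, extend = TRUE) {
  # Naive tokenisation: lower-case, split on anything but letters/apostrophes
  words <- unlist(strsplit(tolower(text), "[^a-z']+"))
  words <- words[nzchar(words)]                # drop empty tokens
  words <- words[nchar(words) >= at.least]     # minimum word length
  if (!is.null(stopwords)) {
    words <- words[!words %in% tolower(stopwords)]
  }
  if (length(words) == 0) {
    return(data.frame(WORD = character(0), FREQ = integer(0)))
  }
  # Count and order by descending frequency
  counts <- sort(table(words), decreasing = TRUE)
  out <- data.frame(WORD = names(counts), FREQ = as.integer(counts),
                    stringsAsFactors = FALSE)
  n <- min(top, nrow(out))
  if (extend) {
    # Assumed reading of `extend`: keep all words tied with the n-th word
    keep <- out$FREQ >= out$FREQ[n]
  } else {
    keep <- seq_len(nrow(out)) <= n
  }
  out[keep, , drop = FALSE]
}

txt <- c("the cat sat", "the cat ran", "a dog sat")
top_terms_sketch(txt, top = 2)                  # ties at the cutoff are kept
top_terms_sketch(txt, top = 2, extend = FALSE)  # exactly two rows
```

With extend = TRUE the result can contain more than top rows whenever several words share the cutoff frequency; setting extend = FALSE gives a hard cutoff.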
Examples
## Not run:
freq_terms(DATA$state, 5)
freq_terms(DATA$state)
freq_terms(DATA$state, extend = FALSE)
freq_terms(DATA$state, at.least = 4)
(out <- freq_terms(pres_debates2012$dialogue, stopwords = Top200Words))
plot(out)

## All words by sentence (row)
library(qdapTools)
x <- raj$dialogue
list_df2df(setNames(lapply(x, freq_terms, top = Inf), seq_along(x)), "row")
list_df2df(setNames(lapply(x, freq_terms, top = 10, stopwords = Dolch), seq_along(x)), "Title")

## All words by person
FUN <- function(x, n = Inf) freq_terms(paste(x, collapse = " "), top = n)
list_df2df(lapply(split(x, raj$person), FUN), "person")

## Plot it
out <- lapply(split(x, raj$person), FUN, n = 10)
pdf("Freq Terms by Person.pdf", width = 13)
lapply(seq_along(out), function(i) {
    ## dev.new()
    plot(out[[i]], plot = FALSE) + ggtitle(names(out)[i])
})
dev.off()

## Keep spaces
freq_terms(space_fill(DATA$state, "are you"), 500, char.keep = "~~")
## End(Not run)