textual function

Text mining

Calculates the number of occurrences of each word and a contingency table.

textual(tab, num.text, contingence.by=1:ncol(tab), maj.in.min = TRUE, sep.word=NULL)

Arguments

  • tab: a data frame with one textual variable
  • num.text: index of the textual variable
  • contingence.by: a list with the indices of the variables for which a contingency table is calculated. By default, a contingency table is calculated for every variable (except the textual one). A contingency table can also be calculated for a pair of variables. If contingence.by is equal to num.text, the contingency table is calculated for each row of the data table
  • maj.in.min: boolean; if TRUE, uppercase letters are converted to lowercase
  • sep.word: a string containing all the characters that act as word separators
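
To make the roles of maj.in.min and sep.word concrete, here is a minimal base-R sketch of how a text could be split into words under those two settings. It is an illustration only, not the code used by textual: the helper name split_words and its default separator string are assumptions introduced here.

```r
# Hypothetical helper (not part of the package): split one text into words.
# The default separator string is an illustrative assumption.
split_words <- function(text, sep.word = " ,;.!?", maj.in.min = TRUE) {
  # Optionally convert uppercase letters to lowercase
  if (maj.in.min) text <- tolower(text)
  # Every single character of sep.word is treated as a separator
  sep_chars <- strsplit(sep.word, "")[[1]]
  for (ch in sep_chars) {
    text <- gsub(ch, " ", text, fixed = TRUE)
  }
  # Split on spaces and drop the empty strings left by consecutive separators
  words <- strsplit(text, " ", fixed = TRUE)[[1]]
  words[nzchar(words)]
}

split_words("Fish, Cheese; fish!")
split_words("Fish, Cheese; fish!", maj.in.min = FALSE)
```

With maj.in.min = TRUE, "Fish" and "fish" are counted as the same word; with FALSE they stay distinct.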

Returns

Returns a list including:

  • cont.table: the contingency table, with the categories of the categorical variables (or the pairs of categories) in rows, the words in columns, and the number of occurrences in each cell
  • nb.words: a data.frame with all the words and, for each word, the number of lists in which it is present and its number of occurrences
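
The shape of these two components can be sketched in base R on a toy data set. This is an illustration under stated assumptions, not the function's implementation: the data are invented, the column names nb.texts and nb.occurrences are hypothetical, words are split on spaces only, and "lists" is interpreted here as individual texts (rows).

```r
# Toy data invented for illustration: one categorical variable, one textual variable
toy <- data.frame(
  sick = c("yes", "yes", "no"),
  text = c("fish cheese", "fish", "cheese cheese bread"),
  stringsAsFactors = FALSE
)

# Split each text on spaces (a stand-in for the sep.word handling)
words_by_row <- strsplit(toy$text, " ", fixed = TRUE)

# Long format: one row per (category, word) occurrence
long <- data.frame(
  sick = rep(toy$sick, lengths(words_by_row)),
  word = unlist(words_by_row),
  stringsAsFactors = FALSE
)

# Categories in rows, words in columns, occurrence counts in cells
cont_table <- table(long$sick, long$word)

# Per word: total number of occurrences and number of texts containing it
nb_occ <- colSums(cont_table)
nb_texts <- table(unlist(lapply(words_by_row, unique)))
nb_words <- data.frame(
  word = names(nb_occ),
  nb.texts = as.vector(nb_texts[names(nb_occ)]),   # hypothetical column name
  nb.occurrences = as.vector(nb_occ),              # hypothetical column name
  stringsAsFactors = FALSE
)

cont_table
nb_words
```

Here "cheese" occurs three times but appears in only two texts, which is the distinction the two counts in nb.words capture.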

Author(s)

Francois Husson francois.husson@institut-agro.fr

See Also

CA, descfreq

Examples

library(FactoMineR)
data(poison.text)
## Contingency table for the first variable only
res.text <- textual(poison.text, num.text = 3, contingence.by = 1)
descfreq(res.text$cont.table)
## Contingency table for the pair of variables sick-sex
res.text2 <- textual(poison.text, num.text = 3, contingence.by = list(c(1,2)))
descfreq(res.text2$cont.table)
## Contingency table for sex, sick and the pair of variables sick-sex
res.text2 <- textual(poison.text, num.text = 3, contingence.by = list(1,2,c(1,2)))

  • Maintainer: Francois Husson
  • License: GPL (>= 2)
  • Last published: 2024-04-20