An R Wrapper for the Java Mallet Topic Modeling Toolkit
Load a Mallet state into Mallet
Retrieve a matrix of topic weights for every document
Import text documents into Mallet format
Import documents from a directory into Mallet format
Estimate topic-word distributions from a sub-corpus
Get the most probable words and their probabilities for one topic
Return a hierarchical clustering of topics
Get strings containing the most probable words for each topic
Load (read) and save (write) a topic model from/to a file
Retrieve a matrix of word weights for topics
Descriptive statistics of word frequencies
Return the mallet jar filename(s)
Return the file path to the mallet stoplists
Mallet supported stoplists
Create a Mallet topic model trainer
Load and save mallet instances from/to file
Save a Mallet state to file
An R interface to the Java Machine Learning for Language Toolkit (Mallet) <http://mallet.cs.umass.edu/> for estimating probabilistic topic models such as Latent Dirichlet Allocation. The package can read textual data into Mallet from R objects, run the Java implementation of Mallet directly from R, and extract the results as R objects. The Mallet toolkit has many functions; this wrapper focuses on the topic modeling sub-package written by David Mimno. The package uses the rJava package to connect to a JVM.
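
The workflow described above (import text, train a model, extract results) might look roughly like the sketch below. It assumes a working Java installation and the rJava and mallet packages. The toy corpus, the choice of 2 topics, and the 200 iterations are illustrative assumptions, not recommended settings. The stoplist is passed positionally because the name of that argument may differ between package versions.

```r
# Illustrative sketch only; requires Java plus the rJava and mallet packages.
library(mallet)

# Toy corpus (hypothetical data, for illustration only).
docs <- data.frame(
  id = c("d1", "d2", "d3", "d4"),
  text = c(
    "The cat sat on the mat and watched the birds",
    "Dogs and cats are popular household pets",
    "Stock markets fell as investors sold shares",
    "The central bank raised interest rates again"
  ),
  stringsAsFactors = FALSE
)

# Path to the English stoplist bundled with the package.
stoplist <- mallet_stoplist_file_path("en")

# Import the R character vectors into a Mallet instance list.
# Third positional argument: the stoplist file.
instances <- mallet.import(docs$id, docs$text, stoplist)

# Create a topic model trainer with 2 topics and load the documents.
topic.model <- MalletLDA(num.topics = 2)
topic.model$loadDocuments(instances)

# Run 200 Gibbs sampling iterations (an arbitrary choice for this sketch).
topic.model$train(200)

# Extract the results as ordinary R matrices:
# one row per document, one column per topic.
doc.topics <- mallet.doc.topics(topic.model, smoothed = TRUE, normalized = TRUE)
# one row per topic, one column per vocabulary word.
topic.words <- mallet.topic.words(topic.model, smoothed = TRUE, normalized = TRUE)

# Most probable words and their probabilities for the first topic.
mallet.top.words(topic.model, topic.words[1, ], num.top.words = 5)
```

With a corpus this small the topics will not be meaningful; the point is only to show how data moves from R objects into Mallet and back. Because sampling is stochastic, results will vary between runs.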