tidytext 0.4.2 package

Text Mining using 'dplyr', 'ggplot2', and Other Tidy Tools

bind_tf_idf

Bind the term frequency and inverse document frequency of a tidy text ...

cast_sparse

Create a sparse matrix from row names, column names, and values in a t...

corpus_tidiers

Tidiers for a corpus object from the quanteda package

dictionary_tidiers

Tidy dictionary objects from the quanteda package

document_term_casters

Casting a data frame to a DocumentTermMatrix, TermDocumentMatrix, or d...

get_sentiments

Get a tidy data frame of a single sentiment lexicon

get_stopwords

Get a tidy data frame of a single stopword lexicon

lda_tidiers

Tidiers for LDA and CTM objects from the topicmodels package

mallet_tidiers

Tidiers for Latent Dirichlet Allocation models from the mallet package

reexports

Objects exported from other packages

reorder_within

Reorder an x or y axis within facets

stm_tidiers

Tidiers for Structural Topic Models from the stm package

tdm_tidiers

Tidy DocumentTermMatrix, TermDocumentMatrix, and related objects from ...

tidy.Corpus

Tidy a Corpus object from the tm package

tidy_triplet

Utility function to tidy a simple triplet matrix

tidytext-package

tidytext: Text Mining using 'dplyr', 'ggplot2', and Other Tidy Tools

unnest_character

Wrapper around unnest_tokens for characters and character shingles

unnest_ngrams

Wrapper around unnest_tokens for n-grams

unnest_ptb

Wrapper around unnest_tokens for Penn Treebank Tokenizer

unnest_regex

Wrapper around unnest_tokens for regular expressions

unnest_sentences

Wrapper around unnest_tokens for sentences, lines, and paragraphs

unnest_tokens

Split a column into tokens

unnest_tweets

Wrapper around unnest_tokens for tweets

Using tidy data principles can make many text mining tasks easier, more effective, and consistent with tools already in wide use. Much of the infrastructure needed for text mining with tidy data frames already exists in packages like 'dplyr', 'broom', 'tidyr', and 'ggplot2'. In this package, we provide functions and supporting data sets to allow conversion of text to and from tidy formats, and to switch seamlessly between tidy tools and existing text mining packages.
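As a rough illustration of the round trip described above, the sketch below tokenizes a small data frame with unnest_tokens, attaches tf-idf with bind_tf_idf, and casts the result to a sparse matrix with cast_sparse. The two-document corpus and the column names (doc, text, word) are invented for this example and do not come from the package documentation. It assumes the 'dplyr', 'tidytext', and 'Matrix' packages are installed.

```r
library(dplyr)
library(tidytext)

# Hypothetical two-document corpus, made up for illustration only
docs <- data.frame(
  doc  = c("a", "b"),
  text = c("Tidy data makes text mining easier.",
           "Text mining with tidy tools."),
  stringsAsFactors = FALSE
)

# Text to tidy format: one row per word token.
# By default unnest_tokens lowercases tokens and drops punctuation.
tokens <- docs %>%
  unnest_tokens(word, text)

# Count each word per document, then add tf, idf, and tf_idf columns
word_counts <- tokens %>%
  count(doc, word, sort = TRUE) %>%
  bind_tf_idf(word, doc, n)

# Tidy format back to a matrix: rows are documents, columns are words
dtm <- word_counts %>%
  cast_sparse(doc, word, n)
```

The tidy intermediate (word_counts) stays an ordinary data frame, so it can be filtered, joined, or plotted with the usual tools before being cast for a modeling package that expects a matrix.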

  • Maintainer: Julia Silge
  • License: MIT + file LICENSE
  • Last published: 2024-04-10