Quantitative Analysis of Textual Data
Modify only documents matching a logical condition
Coercion and checking methods for corpus objects
Convert a dfm to a data.frame
Coercion and checking functions for dfm objects
Coercion and checking functions for dictionary objects
Coercion and checking functions for fcm objects
Coerce a dfm to a matrix or data.frame
Coercion, checking, and combining functions for tokens objects
Convert quanteda dictionary objects to the YAML format
Function extending base::attributes()
Bootstrap a dfm
Combine dfm objects by rows or columns
Select or remove elements from a character vector
Convert the case of character objects
Check object class for functions
Check arguments passed to other functions via ...
Validate input vectors
Return the concatenator character from an object
Convenience wrappers for dfm convert
Convert quanteda objects to non-quanteda formats
Combine documents in a corpus by a grouping variable
Recast the document units of a corpus
Randomly sample documents from a corpus
Segment texts on a pattern match
Extract a subset of a corpus
Remove sentences based on their token lengths or a pattern match
Base method extensions for corpus objects
Construct a corpus object
Internal data sets
Formerly included data objects
Recombine a dfm or fcm by combining identical dimension elements
Combine documents in a dfm by a grouping variable
Apply a dictionary to a dfm
Match the feature set of a dfm to given feature names
Replace features in a dfm
Randomly sample documents from a dfm
Select features from a dfm or fcm
Sort a dfm by frequency of one or more margins
Extract a subset of a dfm
Weight a dfm by tf-idf
Convert the case of the features of a dfm and combine
Trim a dfm using frequency threshold-based feature selection
Weight the feature frequencies in a dfm
Virtual class "dfm" for a document-feature matrix
Internal functions for dfm objects
Create a document-feature matrix
Convert a dfm to an lsa "textmatrix"
dictionary class objects and functions
Create a dictionary
Compute the (weighted) document frequency of a feature
Get or set document names
Get or set document-level variables
Internal function for select_types() to escape regular expressions
Simpler and faster version of expand.grid() in base package
Sort an fcm in alphabetical order of the features
Virtual class "fcm" for a feature co-occurrence matrix
Create a feature co-occurrence matrix
Compute the frequencies of features
Get the feature labels from a dfm
Shortcut functions to access or assign metadata
Flatten a hierarchical dictionary into a list of character vectors
Internal function to flatten a nested list
Format a sparsity value for printing
Internal function to extract docvars
Get the package version that created an object
Grouping variable(s) for various functions
Return the first or last part of a dfm
Locate a pattern in a tokens object
Get information on TBB library
Check if patterns contain glob wildcards
Check if a glob pattern is indexed by index_types
Check if a string is a regular expression
Check if an object is collocations
Locate keywords-in-context
Internal function to convert a list to a dictionary
Internal function to lowercase dictionary values
Internal function to make new system-level docvars
Internal functions to create a list of the meta fields
Convert a Matrix to a dfm
Convert a Matrix to an fcm
Internal function to merge values of duplicated keys
Print messages in dfm methods
Return an error message
Print messages in tokens methods
Message parameter documentation
Internal function to get, set or initialize system metadata
Get or set object metadata
Conditionally format messages
Special handling for names of quanteda objects
Count the number of documents or features
Utility function to generate a nested list
Count the number of sentences
Count the number of tokens or types
Object builders
Match quanteda objects against token types
Pattern for feature, token and keyword matching
Match patterns against token types
Declare a pattern to be a sequence of separate patterns
Pipe operator
Print methods for quanteda core objects
Print a phrase object
Get or set package options for quanteda
An R package for the quantitative analysis of textual data
Internal functions to import dictionary files
Objects exported from other packages
Utility function to remove empty keys
Internal function to replace dictionary values
Sample a vector
Internal function to subset or duplicate docvar rows
Select types without performing slow regex search
Internal function for select_types to search the index using fastmat...
Function to serialize list-of-character tokens
Internal functions to set dimnames
Extensions for and from spacy_parse objects
Compute the sparsity of a document-feature matrix
Internal function for special handling of multi-word dictionary values
Functions to add or retrieve corpus summary metadata
Summarize a corpus
Models for scaling and classification of textual data
Plots for textual data
Get or assign corpus texts [deprecated]
Statistics for textual data
Customizable tokenizer
quanteda tokenizers
Segment tokens object by chunks of a given size
Convert token sequences into compound tokens
Combine documents in a tokens object by a grouping variable
Apply a dictionary to a tokens object
Create n-grams and skip-grams from tokens
Recompile a serialized tokens object
Replace tokens in a tokens object
Restore special tokens
Randomly sample documents from a tokens object
Segment tokens object by patterns
Select or remove tokens from a tokens object
Split tokens by a separator pattern
Extract a subset of a tokens object
Convert the case of tokens
Trim tokens using frequency threshold-based feature selection
Stem the terms in an object
Methods for tokens_xptr objects
Base method extensions for tokens objects
Construct a tokens object
Identify the most frequent features in a dfm
Get word types from a tokens object
Unlist a list of character vectors safely
Unlist a list of integer vectors safely
Pattern matching using valuetype
A fast, flexible, and comprehensive framework for quantitative text analysis in R. Provides functionality for corpus management, creating and manipulating tokens and n-grams, exploring keywords in context, forming and manipulating sparse matrices of documents by features and feature co-occurrences, analyzing keywords, computing feature similarities and distances, applying content dictionaries, applying supervised and unsupervised machine learning, visually representing text and text analyses, and more.
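As a rough illustration of how several of the topics listed above fit together, the following minimal sketch builds a corpus, tokenizes it, forms a document-feature matrix and a feature co-occurrence matrix, and looks up keywords in context. The two example texts, the object names (`txt`, `corp`, `toks`, `dfmat`, `fcmat`), and the small list of removed words are made up for illustration; the sketch assumes the quanteda package is installed.

```r
library(quanteda)

# Two toy documents (hypothetical data, for illustration only)
txt <- c(doc1 = "The quick brown fox jumps over the lazy dog.",
         doc2 = "A lazy dog sleeps while the fox runs.")

# Construct a corpus object from the named character vector
corp <- corpus(txt)

# Construct a tokens object, dropping punctuation and lowercasing
toks <- tokens(corp, remove_punct = TRUE)
toks <- tokens_tolower(toks)

# Remove a few function words (an illustrative list, not a stopword lexicon)
toks <- tokens_remove(toks, pattern = c("the", "a", "over", "while"))

# Create a document-feature matrix and inspect the most frequent features
dfmat <- dfm(toks)
topfeatures(dfmat, n = 5)

# Create a feature co-occurrence matrix at the document level
fcmat <- fcm(toks, context = "document")

# Locate keywords-in-context for "fox", two tokens either side
kwic(toks, pattern = "fox", window = 2)
```

Each step returns a standard quanteda object, so the `dfm_*` and `tokens_*` functions named in the index can be applied to `dfmat` and `toks` in the same way.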