tm 0.7-14 package

Text Mining Package

DataframeSource

Data Frame Source

getTokenizers

Tokenizers

getTransformations

Transformations

hpc

Parallelized lapply

inspect

Inspect Objects

matrix

Term-Document Matrix

meta

Metadata Management

removeWords

Remove Words from a Text Document

stripWhitespace

Strip Whitespace from a Text Document

combine

Combine Corpora, Documents, Term-Document Matrices, and Term Frequency...

content_transformer

Content Transformers

Corpus

Corpora

DirSource

Directory Source

Docs

Access Document IDs and Terms

findAssocs

Find Associations in a Term-Document Matrix

findFreqTerms

Find Frequent Terms

findMostFreqTerms

Find Most Frequent Terms

foreign

Read Document-Term Matrices

PCorpus

Permanent Corpora

PlainTextDocument

Plain Text Documents

plot

Visualize a Term-Document Matrix

readDataframe

Read In a Text Document from a Data Frame

readDOC

Read In a MS Word Document

Reader

Readers

readPDF

Read In a PDF Document

readPlain

Read In a Text Document

readRCV1

Read In a Reuters Corpus Volume 1 Document

readReut21578XML

Read In a Reuters-21578 XML Document

readTagged

Read In a POS-Tagged Word Text Document

readXML

Read In an XML Document

removeNumbers

Remove Numbers from a Text Document

removePunctuation

Remove Punctuation Marks from a Text Document

removeSparseTerms

Remove Sparse Terms from a Term-Document Matrix

SimpleCorpus

Simple Corpora

Source

Sources

stemCompletion

Complete Stems

stemDocument

Stem Words

stopwords

Stopwords

termFreq

Term Frequency Vector

TextDocument

Text Documents

tm_filter

Filter and Index Functions on Corpora

tm_map

Transformations on Corpora

tm_reduce

Combine Transformations

tm_term_score

Compute Score for Matching Terms

tokenizer

Tokenizers

URISource

Uniform Resource Identifier Source

VCorpus

Volatile Corpora

VectorSource

Vector Source

weightBin

Weight Binary

WeightFunction

Weighting Function

Zipf_n_Heaps

Explore Corpus Term Frequency Characteristics

weightSMART

SMART Weightings

weightTf

Weight by Term Frequency

weightTfIdf

Weight by Term Frequency - Inverse Document Frequency

writeCorpus

Write a Corpus to Disk

XMLSource

XML Source

XMLTextDocument

XML Text Documents

ZipSource

ZIP File Source

A framework for text mining applications within R.
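
As a rough illustration of how several of the topics listed above fit together, the following minimal sketch builds a small in-memory corpus, applies a few transformations, and finds frequent terms. The sample sentences and the variable names (`docs`, `corpus`, `dtm`, `frequent`) are made up for this example, and it assumes the tm package is installed; it is one possible workflow, not a prescribed one.

```r
library(tm)

# Hypothetical sample documents, invented for this sketch
docs <- c("Text mining is fun.",
          "R makes text mining   easier!",
          "Mining text with the tm package.")

# Build a volatile (in-memory) corpus from a character vector
corpus <- VCorpus(VectorSource(docs))

# Apply transformations; base functions such as tolower are wrapped
# with content_transformer() so they operate on document content
corpus <- tm_map(corpus, content_transformer(tolower))
corpus <- tm_map(corpus, removePunctuation)
corpus <- tm_map(corpus, removeWords, stopwords("english"))
corpus <- tm_map(corpus, stripWhitespace)

# Build a document-term matrix and list terms occurring at least twice
dtm <- DocumentTermMatrix(corpus)
frequent <- findFreqTerms(dtm, lowfreq = 2)

# Show a summary of the matrix
inspect(dtm)
```

`VCorpus` keeps the documents in memory, which is convenient for small experiments; `PCorpus` and `SimpleCorpus` are the listed alternatives when persistence or a lighter-weight structure is needed.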