inpdfr 0.1.12 package

Analyse Text Documents Using Ecological Tools

getListFiles

List files in a specified directory sorted by extension.

getMostFreqWord

Returns most frequent words.

getMostFreqWordCor

Test for correlation between the most frequent words.

mergeWordFreq

Merge word-occurrence data.frames into a single data.frame.

getPDF

Extract text from PDF files and return a word-occurrence data.frame.

getStopWords

Load a list of stopwords.

getSummaryStatsBARPLOT

Draw a bar plot of the number of unique words per document.

getSummaryStatsHISTO

Plot a histogram of the number of words, excluding stop words.

getSummaryStatsOCCUR

Plot a scatter plot with the proportion of documents using similar wor...

getTXT

Extract text from TXT files and return a word-occurrence data.frame.

getwordOccuDF

A quick way to obtain the word-occurrence data.frame from a set of doc...

getXFreqWord

Returns most frequent words.

IdentifyStructure

Copy of the identifyStructure function from Tad Dallas metacom package...

inpdfr

inpdfr: A package to analyse PDF Files Using Ecological Tools.

makeWordcloud

Word cloud based on the word-occurrence data.frame.

postProcTxt

Process vectors containing words into a data.frame of word occurrence...

preProcTxt

Extract text from txt files and pre-process content.

quitSpaceFromChars

Delete spaces in file names.

truncNumWords

Truncate the word-occurrence data.frame.

doCluster

Performs a cluster analysis on the basis of the word-occurrence data.f...

doKmeansClust

Performs a k-means cluster analysis on the basis of the word-occurrenc...

doMetacomEntropart

Performs an analysis of ecological diversity and structure.

doMetacomMetacom

Performs a metacommunity analysis.

excludeStopWords

Exclude stop words from the word-occurrence data.frame.

getAllAnalysis

A quick way to compute a set of analyses from the word-occurrence data...

A set of functions to analyse and compare texts, using classical text mining functions, as well as those from theoretical ecology.

  • Maintainer: Rebaudo Francois
  • License: GPL-2
  • Last published: 2023-08-24
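
Many of the functions listed above operate on a word-occurrence data.frame. The sketch below uses base R only and does not call any inpdfr function. It assumes a layout with one row per word and one column per document, which may differ from what the package actually produces. The toy documents, the object names (docs, wordOccu, shannon) and the choice of the Shannon index as an ecological-style diversity measure are all illustrative assumptions.

```r
# Hypothetical toy documents (not shipped with the package).
docs <- c(
  doc1 = "the cat sat on the mat",
  doc2 = "the dog sat on the log"
)

# Split each document into lower-case words.
words <- lapply(docs, function(x) strsplit(tolower(x), "[^a-z]+")[[1]])

# Count occurrences of every word in every document over a shared vocabulary.
vocab <- sort(unique(unlist(words)))
counts <- sapply(words, function(w) {
  as.integer(table(factor(w, levels = vocab)))
})

# Assumed layout: one row per word, one count column per document.
wordOccu <- data.frame(word = vocab, counts, stringsAsFactors = FALSE)

# Shannon diversity index, treating words as "species" and
# documents as "sites" (an illustrative choice of index).
shannon <- function(n) {
  p <- n[n > 0] / sum(n)
  -sum(p * log(p))
}
diversity <- sapply(wordOccu[, -1], shannon)

print(wordOccu)
print(diversity)
```

Treating documents as sites and words as species is what makes it possible to apply measures from community ecology to text. The package's own analysis functions are not reproduced here.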