Phrase Mining
Convert a phraseDoc Object to a Matrix
Find Informative Documents in a Corpus
Calculate Canberra Distance
Create a DFSource object from a data frame
Calculate a Distance Matrix
Display Frequent Principal Phrases
Display Frequency Matrix for Phrases
Obtain the current row of the content of a DFSource
Display Frequency Matrix for Documents
Create a data table from a text file in PubMed format
phraseDoc Creation
Print a phraseDoc Object
Print a textCluster Object
Create a PlainTextDocument from a row in a data frame
Remove Phrases from phraseDoc Object
Show Cluster Contents
Words that Principal Phrases do not End with
Phrases that are not Principal Phrases
Words that Principal Phrases do not Start with
Cluster a Term-Document Matrix
Calculate Text Distance (sparse version)
Calculate Text Distance (dense version)
Calculate a Text Distance Matrix
Functions to extract and handle commonly occurring principal phrases obtained from collections of texts. Major speed improvements: core functions have been rewritten in C++ for faster phrase-document parsing, clustering, and text distance computations. Based on Small, E., & Cabrera, J. (2025). Principal phrase mining, an automated method for extracting meaningful phrases from text. International Journal of Computers and Applications, 47(1), 84–92.
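
The topic list above includes a Canberra distance calculation. As a rough illustration only, the sketch below computes the textbook Canberra distance between two phrase-frequency vectors in Python. It is not the package's implementation: the function name `canberra_distance` is hypothetical, and the choice to count a term as zero when both entries are zero is an assumption (other implementations drop or rescale such terms).

```python
from typing import Sequence


def canberra_distance(x: Sequence[float], y: Sequence[float]) -> float:
    """Textbook Canberra distance: sum of |x_i - y_i| / (|x_i| + |y_i|).

    Hypothetical helper for illustration; not taken from the package.
    Assumption: terms where both entries are zero contribute nothing.
    """
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    total = 0.0
    for a, b in zip(x, y):
        denom = abs(a) + abs(b)
        if denom > 0:  # skip 0/0 terms (assumed convention)
            total += abs(a - b) / denom
    return total


# Made-up frequencies of three phrases in two documents.
doc_a = [3, 0, 1]
doc_b = [1, 0, 1]
distance = canberra_distance(doc_a, doc_b)
```

Because each term is scaled by the combined magnitude of the two entries, a phrase that appears in only one of the two documents contributes the maximum of 1 regardless of how often it occurs there.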