Miscellaneous Tools for Chinese Text Mining and More
Convenient Tool to Segment Chinese Texts
Remove Words through Part-of-Speech Tagging
An Enhanced Version of as.character
An Enhanced Version of as.numeric
Create Corpus or Document Term Matrix with 1 Line
Create Term-Term Matrix (Term-Cooccurrence Matrix)
Write Texts in CSV into Many TXT/RTF Files
A Default Value for corp_or_dtm 1
A Default Value for corp_or_dtm 2
A Default Cutter
Making DTM/TDM for Groups of Words
Collect Full Filenames from a Mix of Directories and Files
Extract Words with Certain Tags through POS Tagging
Check Which Locale the Functions Assume
A Convenient Version of is.character
A Convenient Version of is.integer
Rewrite Terms and Frequencies into Many Files
Convert Objects among matrix, dgCMatrix, simple_triplet_matrix, Docume...
Input a Filename and Return a Vector of Stop Words
Extract Strings by Regular Expression Quickly
Convert or Write DTM/TDM Object Quickly
Read a Text File by Auto-Detecting Encoding
Find High Frequency Terms
Check How Many Words Are Left under Certain Sparse Values
Transform Terms and Frequencies into a Text
Simple Rise or Fall Trend of Several Years
Write Many Separated Files into a CSV
Copy and Paste from Excel-Like Files
Word Correlation in DTM/TDM
This package aims to make Chinese text mining easier, faster, and more robust to errors. A document term matrix can be generated with a single line of code; encoding detection, word segmentation, and stop word removal are handled automatically. Several other convenience tools are also provided.
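
To make the idea of a document term matrix concrete, here is a minimal base R sketch of the steps such a one-line call would automate. It does not use this package's functions or reproduce its implementation. It assumes the texts are already segmented (words separated by spaces), and the names `docs`, `stop_words`, `tokens`, `vocab`, and `dtm` are hypothetical, chosen only for this illustration.

```r
# Toy documents, assumed to be already segmented (words separated by spaces).
docs <- c(d1 = "text mining is fun", d2 = "mining text is hard work")

# Hypothetical stop word list, for this illustration only.
stop_words <- c("is")

# Split each document into tokens and drop the stop words.
tokens <- lapply(
  strsplit(docs, " ", fixed = TRUE),
  function(w) w[!w %in% stop_words]
)

# Sorted vocabulary over all documents.
vocab <- sort(unique(unlist(tokens)))

# Count every vocabulary term in every document.
# vapply() returns one column per document, so transpose to get
# documents in rows and terms in columns.
dtm <- t(vapply(
  tokens,
  function(w) as.integer(table(factor(w, levels = vocab))),
  integer(length(vocab))
))
colnames(dtm) <- vocab

print(dtm)
```

Each row of `dtm` is a document and each column is a term. With real Chinese text, the segmentation step would need a word segmenter rather than a split on spaces.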