Analyses of Text using Transformers Models from HuggingFace, Natural Language Processing and Machine Learning
Find the environment containing the Python packages required by text
Semantic similarity score between single words' and an aggregated word...
Plots words from textCentrality()
Cleans text from standard personal information
Clean non-ASCII characters
Compute descriptive statistics of character variables.
Run diagnostics for the text package
Change dimension names
Semantic distance
Semantic distance across multiple word embeddings
Semantic distance between a text variable and a word norm
Compare two language domains
textEmbed() extracts layers and aggregates them to word embeddings, for...
Aggregate layers
Extract layers of hidden states
Pre-trained dimension reduction (experimental)
Apply static word embeddings
Identify language examples.
Detect non-ASCII characters
Domain Adapted Pre-Training (EXPERIMENTAL - under development)
Task Adapted Pre-Training (EXPERIMENTAL - under development)
Text generation
The LBAM library
Number of layers
Check downloaded, available models.
Delete a specified model
Named Entity Recognition. (experimental)
textPCA()
textPCAPlot
Plot words
textPredict, textAssess and textClassify
Predict from several models, selecting the correct input
Significance testing for model prediction performance
Supervised Dimension Projection
Plot Supervised Dimension Projection
Question Answering. (experimental)
Initialize the Python packages required by text
Install text required python packages in conda or virtualenv environme...
Uninstall textrpp conda environment
Semantic Similarity
Semantic similarity across multiple word embeddings
Semantic similarity between a text variable and a word norm
Summarize texts. (experimental)
Tokenize text-variables
Tokenize and count
BERTopics
textTopicsReduce (EXPERIMENTAL)
Wrapper for topicsTest function from the topics package
textTopicsTest (EXPERIMENTAL) to get the hierarchical topic tree
Plot word clouds
Trains word embeddings
Train lists of word embeddings
Cross-validated accuracies across sample-sizes
Plot cross-validated accuracies across sample sizes
Trains word embeddings using random forest
Train word embeddings to a numeric variable.
Translation. (experimental)
Zero Shot Classification (Experimental)
Link R with Transformers from Hugging Face to transform text variables to word embeddings. The word embeddings are then used to statistically test the mean difference between sets of texts, compute semantic similarity scores between texts, predict numerical variables, and visualize statistically significant words along various dimensions. For more information see <https://www.r-text.org>.
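As a rough illustration of what a semantic similarity score between two texts can look like once each text has been turned into a word embedding, the base R sketch below computes cosine similarity between small numeric vectors. It assumes cosine similarity as the measure. The vectors and the helper cosine_similarity() are hypothetical and made up for this example; they are not taken from the package, and real embeddings have far more dimensions.

```r
# Toy 4-dimensional "embeddings" (hypothetical values, for illustration only).
# In practice these would come from a language model.
emb_happy  <- c(0.8, 0.1, 0.3, 0.5)
emb_joyful <- c(0.7, 0.2, 0.4, 0.5)
emb_tax    <- c(-0.2, 0.9, -0.5, 0.1)

# Hypothetical helper: cosine similarity is the dot product of two vectors
# divided by the product of their Euclidean lengths.
cosine_similarity <- function(x, y) {
  stopifnot(length(x) == length(y))
  sum(x * y) / (sqrt(sum(x^2)) * sqrt(sum(y^2)))
}

# Vectors pointing in similar directions score closer to 1.
sim_close <- cosine_similarity(emb_happy, emb_joyful)
sim_far   <- cosine_similarity(emb_happy, emb_tax)

print(c(close = sim_close, far = sim_far))
```

With these toy values the first pair scores higher than the second, which is the pattern one would hope for with embeddings of related versus unrelated texts.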
Useful links