Word Embedding Research Framework for Psychological Science
Word vectors data class: wordvec and embed.
Cosine similarity/distance between two vectors.
Transform plain text of word vectors into wordvec (data.table) or `e...
Load word vectors data (wordvec or embed) from ".RData" file.
[S3 method] Extract a subset of word vectors data.
Expand a dictionary from the most similar words.
Reliability analysis and PCA of a dictionary.
Extract word vector(s).
Find the Top-N most similar words.
Normalize all word vectors to unit length.
Orthogonal Procrustes rotation for matrix alignment.
Compute a matrix of cosine similarity/distance of word pairs.
Visualize a (partial correlation) network graph of words.
Visualize cosine similarity of word pairs.
Visualize word vectors with dimensionality reduced using t-SNE.
Visualize word vectors.
PsychWordVec: Word Embedding Research Framework for Psychological Scie...
Objects exported from other packages
Calculate the sum vector of multiple words.
Tabulate cosine similarity/distance of word pairs.
Relative Norm Distance (RND) analysis.
Word Embedding Association Test (WEAT) and Single-Category WEAT.
Tokenize raw text for training word embeddings.
Train static word embeddings using the Word2Vec, GloVe, or FastText al...
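
Several of the entries above (cosine similarity/distance, normalization to unit length, Top-N most similar words) rest on the same small piece of arithmetic. The following is a minimal, language-neutral sketch in plain Python. It is not the package's R interface: the function names, the `toy_vectors` dictionary, and its three-dimensional values are all hypothetical and chosen only for illustration. Cosine distance is taken here as one minus cosine similarity, which is a common convention and an assumption about what the entries above mean.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def cosine_distance(u, v):
    """Assumed convention: distance = 1 - similarity."""
    return 1.0 - cosine_similarity(u, v)


def normalize(v):
    """Rescale a vector so that its Euclidean length is 1."""
    norm = math.sqrt(sum(a * a for a in v))
    return [a / norm for a in v]


def most_similar(vectors, word, topn=3):
    """Rank every other word by cosine similarity to `word`."""
    target = vectors[word]
    scores = [
        (other, cosine_similarity(target, vec))
        for other, vec in vectors.items()
        if other != word
    ]
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return scores[:topn]


# Hypothetical toy vectors, not taken from any pre-trained model.
toy_vectors = {
    "king": [0.9, 0.1, 0.3],
    "queen": [0.8, 0.2, 0.4],
    "apple": [0.1, 0.9, 0.2],
}

if __name__ == "__main__":
    print(most_similar(toy_vectors, "king", topn=2))
```

Once vectors have been normalized to unit length, the cosine similarity of two vectors reduces to their dot product, which is one reason a normalization step is often applied before similarity searches.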
An integrative toolbox for word embedding research that provides: (1) a collection of 'pre-trained' static word vectors in the '.RData' compressed format <https://psychbruce.github.io/WordVector_RData.pdf>; (2) a group of functions to process, analyze, and visualize word vectors; (3) a range of tests to examine conceptual associations, including the Word Embedding Association Test <doi:10.1126/science.aal4230> and the Relative Norm Distance <doi:10.1073/pnas.1720347115>, with permutation tests of significance; and (4) a set of training methods to locally train (static) word vectors from text corpora, including 'Word2Vec' <doi:10.48550/arXiv.1301.3781>, 'GloVe' <doi:10.3115/v1/D14-1162>, and 'FastText' <doi:10.48550/arXiv.1607.04606>.
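
To make the Word Embedding Association Test mentioned in the description concrete, here is a minimal sketch in plain Python of a WEAT-style effect size and an exact permutation test. It is an illustration under stated assumptions, not the package's implementation: the function names and the two-dimensional toy vectors are hypothetical, the standard deviation is taken as the sample standard deviation over all target words, and the p-value counts re-partitions whose statistic is at least as large as the observed one (one-sided). The package's own defaults may differ on each of these points.

```python
import itertools
import math
import statistics


def cos(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))


def association(w, A, B, vec):
    """Mean similarity of word w to attribute set A minus that to attribute set B."""
    to_a = statistics.mean([cos(vec[w], vec[a]) for a in A])
    to_b = statistics.mean([cos(vec[w], vec[b]) for b in B])
    return to_a - to_b


def weat_effect_size(X, Y, A, B, vec):
    """Standardized difference in association between target sets X and Y."""
    sx = [association(x, A, B, vec) for x in X]
    sy = [association(y, A, B, vec) for y in Y]
    # Assumption: sample standard deviation over all target words.
    return (statistics.mean(sx) - statistics.mean(sy)) / statistics.stdev(sx + sy)


def weat_permutation_p(X, Y, A, B, vec):
    """Exact one-sided p-value over all equal-size re-partitions of X and Y."""
    s = {w: association(w, A, B, vec) for w in X + Y}
    total = sum(s.values())

    def statistic(group):
        in_group = sum(s[w] for w in group)
        return in_group - (total - in_group)

    observed = statistic(tuple(X))
    stats = [statistic(g) for g in itertools.combinations(X + Y, len(X))]
    # Assumption: ">=" so that the observed partition counts toward the p-value.
    return sum(1 for value in stats if value >= observed) / len(stats)


# Hypothetical toy vectors; real analyses would use pre-trained embeddings.
vec = {
    "rose": [1.0, 0.1], "tulip": [0.9, 0.2],
    "wasp": [0.1, 1.0], "moth": [0.2, 0.9],
    "love": [1.0, 0.0], "joy": [0.9, 0.1],
    "pain": [0.0, 1.0], "fear": [0.1, 0.9],
}
X, Y = ["rose", "tulip"], ["wasp", "moth"]   # target word sets
A, B = ["love", "joy"], ["pain", "fear"]     # attribute word sets

if __name__ == "__main__":
    print(weat_effect_size(X, Y, A, B, vec))
    print(weat_permutation_p(X, Y, A, B, vec))
```

Exhaustive enumeration is only feasible for small word sets, because the number of re-partitions grows combinatorially. For larger sets, a random sample of permutations is the usual substitute.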