Compute Semantic Distance Between Text Constituents
clean_dialogue
clean_monologue_or_list
clean_paired_cols
A Typical Dialogue Transcript
dist_anchor
dist_dialogue
dist_ngram2ngram
dist_ngram2word
dist_paired_cols
GloVe Semantic Embeddings
The Grandfather Passage: A Standardized Reading Passage
Load all .rda files from a GitHub data folder into the package environ...
A Typical Monologue Transcript
SD15_2025_complete Experiential Semantic Distance Values
Stopword List
Unordered_List
Word Pairs in Columns
wordlist_to_network
Cleans and formats language transcripts according to user-selected transformation options (e.g., lemmatize words, omit stopwords, split strings across rows). 'SemanticDistance' computes two distinct metrics of cosine semantic distance (experiential and embedding). Each value is the pairwise cosine distance between two elements or chunks of a language sample. 'SemanticDistance' can process monologues (e.g., stories, ordered text), dialogues (e.g., conversation transcripts), word pairs arrayed in columns, and unordered word lists. Users specify how distance calculations are chunked. These options include: rolling ngram-to-word distance (a window of n words compared to each new word), ngram-to-ngram distance (a 2-word chunk compared to the next 2-word chunk), pairwise distance between words arrayed in columns, matrix comparisons (i.e., all possible pairwise distances between words in an unordered list), and turn-by-turn distance (talker to talker in a dialogue transcript). 'SemanticDistance' includes visualization options for analyzing distances as time series data and for exploring simple semantic network dynamics (e.g., clustering, undirected graph networks).
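To make the chunking options concrete, here is a minimal standalone sketch in Python of two of them: rolling ngram-to-word distance and the all-pairs matrix for an unordered list. It does not call 'SemanticDistance' and is not the package's implementation. The three-dimensional vectors are invented toy values, the function names are hypothetical, cosine distance is taken as 1 minus cosine similarity, and representing an ngram by the mean of its word vectors is an assumption made only for illustration.

```python
import math

# Toy 3-dimensional "embeddings", invented for illustration only.
# Real embedding or experiential norms have many more dimensions.
TOY_VECTORS = {
    "dog": [1.0, 0.0, 0.0],
    "cat": [0.8, 0.6, 0.0],
    "car": [0.0, 1.0, 0.0],
    "road": [0.0, 0.6, 0.8],
}


def cosine_distance(u, v):
    """Return 1 - cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def mean_vector(words):
    """Average the vectors of several words (assumed ngram representation)."""
    vectors = [TOY_VECTORS[w] for w in words]
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def ngram_to_word(words, n=2):
    """Distance from the n words preceding each position to the word at it."""
    return [
        cosine_distance(mean_vector(words[i - n:i]), TOY_VECTORS[words[i]])
        for i in range(n, len(words))
    ]


def distance_matrix(words):
    """All pairwise distances between words in an unordered list."""
    return [
        [cosine_distance(TOY_VECTORS[a], TOY_VECTORS[b]) for b in words]
        for a in words
    ]


if __name__ == "__main__":
    sample = ["dog", "cat", "car", "road"]
    print(ngram_to_word(sample, n=2))
    for row in distance_matrix(sample):
        print([round(d, 3) for d in row])
```

With a window of n = 2 and a four-word sample, the rolling function yields two distances, one for each word that has two predecessors. The matrix is symmetric with a zero diagonal, which is why it lends itself to clustering and undirected graph views.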
Useful links