tlda R package [Documentation]

disp_DA_tdm

Calculate the dispersion measure $D_{A}$ for a term-document matrix

disp_DA

Calculate the dispersion measure $D_{A}$

disp_DKL_tdm

Calculate the dispersion measure $D_{KL}$ for a term-document matrix

disp_DKL

Calculate the dispersion measure $D_{KL}$

disp_DP_tdm

Calculate Gries's deviation of proportions for a term-document mat...

disp_DP

Calculate Gries's deviation of proportions

disp_R_tdm

Calculate the dispersion measure 'range' for a term-document matrix

disp_R

Calculate the dispersion measure 'range'

disp_S_tdm

Calculate the dispersion measure $S$ for a term-document matrix

disp_S

Calculate the dispersion measure $S$

disp_tdm

Calculate parts-based dispersion measures for a term-document matrix

disp

Calculate parts-based dispersion measures

find_max_disp_tdm

Find the maximally dispersed distribution of each item in a term-docum...

find_max_disp

Find the maximally dispersed distribution of an item across corpus par...

find_min_disp_tdm

Find the minimally dispersed distribution of each item in a term-docum...

find_min_disp

Find the minimally dispersed distribution of an item across corpus par...


Support functions and datasets to facilitate the analysis of linguistic data. The current focus is on the calculation of corpus-linguistic dispersion measures as described in Gries (2021) <doi:10.1007/978-3-030-46216-1_5> and Soenning (2025) <doi:10.3366/cor.2025.0326>. The most commonly used parts-based indices are implemented, including different formulas and modifications that are found in the literature, with the additional option to obtain frequency-adjusted scores. Dispersion scores can be computed based on individual count variables or a term-document matrix.
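To make "parts-based" concrete, here is a minimal sketch of two of the indices listed above, range and Gries's deviation of proportions (DP), computed from individual count variables. It is written in plain Python and is not the package's R interface: the function names `range_dispersion` and `deviation_of_proportions` and the arguments `subfreq` and `partsize` are hypothetical, chosen for illustration only. The sketch uses the basic textbook formulation, in which DP is 0 for a perfectly even distribution and approaches 1 for a maximally uneven one. The formula variants, rescalings and frequency adjustments mentioned in the description are not reproduced.

```python
def range_dispersion(subfreq):
    """Range: the number of corpus parts in which the item occurs at least once."""
    return sum(1 for count in subfreq if count > 0)


def deviation_of_proportions(subfreq, partsize):
    """Basic DP: half the summed absolute difference between the observed
    share of the item's occurrences in each part and the share expected
    from the relative size of that part."""
    if len(subfreq) != len(partsize):
        raise ValueError("subfreq and partsize must have the same length")
    total_freq = sum(subfreq)
    total_size = sum(partsize)
    if total_freq == 0 or total_size == 0:
        raise ValueError("item frequency and corpus size must be positive")
    observed = [f / total_freq for f in subfreq]   # share of occurrences per part
    expected = [s / total_size for s in partsize]  # share of corpus per part
    return 0.5 * sum(abs(o - e) for o, e in zip(observed, expected))


# Hypothetical data: an item with 4 occurrences in a corpus of 4 equal parts.
subfreq = [3, 0, 1, 0]           # occurrences of the item in each part
partsize = [100, 100, 100, 100]  # number of word tokens in each part

print(range_dispersion(subfreq))                    # 2
print(deviation_of_proportions(subfreq, partsize))  # 0.5
```

For a term-document matrix, the same calculation would be repeated for each item, with the part sizes shared across all items.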

Maintainer: Lukas Soenning
License: MIT + file LICENSE
Last published: 2025-04-25

Version: 0.1.0
