pangoling 1.0.3 package

Access to Large Language Model Predictions

causal_preload

Preloads a causal language model
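
A minimal sketch of a preload call; "gpt2" is the package's default causal model, and any Hugging Face causal checkpoint name should work in its place:

  # Download/load GPT-2 once so later calls reuse the cached model
  causal_preload(model = "gpt2")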

causal_tokens_lp_tbl-defunct

Get the log probability of each token in a sentence (or group of sentences) using a causal transformer model

causal_config

Returns the configuration of a causal model
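
For example (assuming the same "gpt2" checkpoint as above), the returned configuration exposes model metadata such as vocabulary size and number of layers:

  # Inspect architecture metadata of the causal model
  causal_config(model = "gpt2")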

causal_lp_mats-defunct

Get a list of matrices with the log probabilities of possible words given their previous context using a causal transformer model

causal_lp-defunct

Get the log probability of each element of a vector of words (or phrases) using a causal transformer model

causal_next_tokens_pred_tbl

Generate next tokens after a context and their predictability using a causal transformer model
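
A sketch of the documented usage, assuming predictability is reported as natural-log probabilities by default:

  # Rank candidate continuations of an incomplete sentence
  causal_next_tokens_pred_tbl(
    context = "The apple doesn't fall far from the",
    model   = "gpt2"
  )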

causal_next_tokens_tbl-defunct

Get the possible next tokens and their log probabilities for its previous context using a causal transformer model

causal_pred_mats

Generate a list of predictability matrices using a causal transformer model
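
A sketch assuming the grouped-vector interface shared with the other causal_* functions (x holds words, by marks which sentence each word belongs to); each returned matrix has one row per vocabulary entry and one column per token:

  mats <- causal_pred_mats(
    x     = c("The", "apple", "falls"),
    by    = c(1, 1, 1),
    model = "gpt2"
  )
  dim(mats[[1]])  # vocabulary size x number of tokens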

causal_predictability

Compute predictability using a causal transformer model
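
This help page covers the workhorse functions such as causal_words_pred(); a minimal sketch, assuming a word-per-row layout with a sentence identifier:

  df_sent <- data.frame(
    sent_n = c(1, 1, 1),
    word   = c("The", "apple", "falls")
  )
  # Log probability of each word given its preceding context
  causal_words_pred(x = df_sent$word, by = df_sent$sent_n, model = "gpt2")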

install_py_pangoling

Install the Python packages needed for pangoling
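
Typically run once after installing the R package; by default it provisions a Python environment with 'transformers' and its backend through reticulate:

  # One-time setup of the Python side
  install_py_pangoling()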

installed_py_pangoling

Check if the required Python dependencies for pangoling are installed
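
Useful as a guard in scripts and vignettes, e.g.:

  # Only touch the Python side if the dependencies are available
  if (installed_py_pangoling()) {
    causal_preload(model = "gpt2")
  }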

masked_config

Returns the configuration of a masked model
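
As with its causal counterpart, for example (assuming the widely used "bert-base-uncased" checkpoint):

  masked_config(model = "bert-base-uncased")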

masked_lp-defunct

Get the log probability of a target word (or phrase) given a left and right context

masked_preload

Preloads a masked language model
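
For example:

  # Load BERT once before batched masked-token queries
  masked_preload(model = "bert-base-uncased")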

masked_targets_pred

Get the predictability of a target word (or phrase) given a left and right context
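
A sketch assuming the vectorized left/right-context interface (argument names as recalled from the package documentation):

  masked_targets_pred(
    prev_contexts  = "The apple doesn't fall far from the",
    targets        = "tree",
    after_contexts = ".",
    model          = "bert-base-uncased"
  )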

masked_tokens_pred_tbl

Get the possible tokens and their log probabilities for each mask in a sentence
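
A sketch using BERT's [MASK] convention; other masked models may use a different mask token:

  # One row per candidate token for each mask in the sentence
  masked_tokens_pred_tbl(
    masked_sentences = "The apple doesn't fall far from the [MASK].",
    model            = "bert-base-uncased"
  )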

masked_tokens_tbl-defunct

Get the possible tokens and their log probabilities for each mask in a sentence

ntokens

The number of tokens in a string or vector of strings
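
Because counts depend on the model's subword tokenizer rather than on whitespace, the same string can yield different values under different models:

  ntokens(x = "The apple doesn't fall far from the tree.", model = "gpt2")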

pangoling-defunct

Defunct functions in package pangoling.

pangoling-package

pangoling: Access to Large Language Model Predictions

perplexity_calc

Calculates perplexity
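
A minimal sketch, assuming the input is a vector of natural-log probabilities, in which case perplexity is exp(-mean(x)):

  lps <- c(-2.3, -0.7, -1.5)
  perplexity_calc(lps)
  # equivalently: exp(-mean(lps))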

set_cache_folder

Set cache folder for HuggingFace transformers
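
For example, to keep multi-gigabyte checkpoints off the default cache location (the folder below is an arbitrary illustrative path):

  set_cache_folder("~/hf_cache")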

tokenize_lst

Tokenize an input
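
For example, returning one vector of subword tokens per input string:

  tokenize_lst(x = "The apple doesn't fall far from the tree.", model = "gpt2")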

transformer_vocab

Returns the vocabulary of a model
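
For example:

  # First few entries of GPT-2's vocabulary
  head(transformer_vocab(model = "gpt2"))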

Provides access to word predictability estimates using large language models (LLMs) based on 'transformer' architectures via integration with the 'Hugging Face' ecosystem <https://huggingface.co/>. The package interfaces with pre-trained neural networks and supports both causal/auto-regressive LLMs (e.g., 'GPT-2') and masked/bidirectional LLMs (e.g., 'BERT') to compute the probability of words, phrases, or tokens given their linguistic context. For details on GPT-2 and causal models, see Radford et al. (2019) <https://storage.prod.researchhub.com/uploads/papers/2020/06/01/language-models.pdf>; for details on BERT and masked models, see Devlin et al. (2019) <doi:10.48550/arXiv.1810.04805>. By enabling straightforward estimation of word predictability, the package facilitates research in psycholinguistics, computational linguistics, and natural language processing (NLP).
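
A hedged end-to-end sketch tying these pieces together (function interfaces as recalled from the package documentation; the toy sentence and model choice are purely illustrative):

  library(pangoling)

  # Word-by-word log probabilities for one sentence under GPT-2
  df <- data.frame(
    sent_n = c(1, 1, 1, 1),
    word   = c("The", "apple", "falls", "down")
  )
  df$lp <- causal_words_pred(x = df$word, by = df$sent_n, model = "gpt2")

  # Summarize as perplexity (the first word has no preceding
  # context, so its entry is assumed here to be NA)
  perplexity_calc(df$lp, na.rm = TRUE)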

  • Maintainer: Bruno Nicenboim
  • License: MIT + file LICENSE
  • Last published: 2025-04-07