fuzzylink 0.2.5 package

Probabilistic Record Linkage Using Pretrained Text Embeddings

Links datasets through fuzzy string matching using pretrained text embeddings. Produces more accurate record linkage when lexical string distance metrics are a poor guide to match quality (e.g., "Patricia" is more lexically similar to "Patrick" than it is to "Trish"). Capable of performing multilingual record linkage. Methods are described in Ornstein (2025) <doi:10.1017/pan.2025.10016>.
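The "Patricia" example above can be checked with a plain edit-distance calculation. The sketch below is a standalone illustration of why lexical metrics mislead here. It does not use the fuzzylink package or its embedding-based method, and the `levenshtein` helper is a hypothetical name introduced only for this example. Names are lowercased before comparison, which is an assumption of the sketch.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]  # distance from a[:i] to the empty string
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute ca -> cb (or match)
            ))
        prev = curr
    return prev[-1]


name = "patricia"
for candidate in ("patrick", "trish"):
    # A smaller distance means the strings are more lexically similar.
    print(candidate, levenshtein(name, candidate))
```

Under this metric "patrick" is closer to "patricia" than "trish" is, even though "Trish" is the likelier match for the same person. That gap is the case the package description says embeddings are meant to handle.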

  • Maintainer: Joe Ornstein
  • License: MIT + file LICENSE
  • Last published: 2025-08-29