Record Linkage Based on an Entropy-Maximizing Classifier
Absolute Distance Comparison Function
Create Comparison Vectors for Record Linkage
Controls for the kliep Function
Create a Custom Record Linkage Model
Jaro-Winkler Distance Complement
Unsupervised Maximum Entropy Classifier for Record Linkage
Predict Matches Based on a Given Record Linkage Model
Train a Record Linkage Model
The goal of 'automatedRecLin' is to perform record linkage (also known as entity resolution) in unsupervised or supervised settings. It compares pairs of records from two datasets using selected comparison functions and uses the results to estimate the ratio of probabilities or densities between matched and non-matched record pairs. Based on these estimates, it predicts the set of matches that maximizes entropy. For details see: Lee et al. (2022) <https://www150.statcan.gc.ca/n1/pub/12-001-x/2022001/article/00007-eng.htm>, Vo et al. (2023) <https://ideas.repec.org/a/eee/csdana/v179y2023ics0167947322002365.html>, Sugiyama et al. (2008) <doi:10.1007/s10463-008-0197-x>.
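The pairwise comparison step described above can be illustrated with a minimal, language-neutral sketch. It is written in Python and does not call 'automatedRecLin'. The names `abs_distance`, `exact_disagreement`, and `comparison_vectors`, and the toy records, are assumptions made for illustration and are not the package's API. The sketch only builds comparison vectors; estimating the ratio and selecting matches are not shown.

```python
from itertools import product


def abs_distance(x, y):
    """Absolute distance between two numeric values (hypothetical helper)."""
    return abs(x - y)


def exact_disagreement(x, y):
    """Return 0 when two values are equal and 1 otherwise (hypothetical helper)."""
    return 0 if x == y else 1


def comparison_vectors(records_a, records_b, comparators):
    """Compare every record in records_a with every record in records_b.

    comparators maps a field name to a comparison function.
    Returns a list of (index_a, index_b, vector) tuples, where vector
    holds one comparison result per field, in the order of comparators.
    """
    result = []
    for (i, a), (j, b) in product(enumerate(records_a), enumerate(records_b)):
        vector = tuple(func(a[field], b[field]) for field, func in comparators.items())
        result.append((i, j, vector))
    return result


# Toy data, invented for this example only.
A = [{"name": "Anna", "year": 1990}, {"name": "Jan", "year": 1985}]
B = [{"name": "Anna", "year": 1991}, {"name": "Piotr", "year": 1970}]

vectors = comparison_vectors(
    A, B, {"name": exact_disagreement, "year": abs_distance}
)
for i, j, vec in vectors:
    print(i, j, vec)
```

Pairs whose vectors are close to zero in every field (here the first record of each dataset) are the natural match candidates; a linkage model would turn such vectors into match/non-match ratio estimates.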