Record Linkage Functions for Linking and Deduplicating Data Sets
Mean Residual Life Plot
Generalized Log-Linear Fitting
Check for FALSE
Create record pairs from blocks of ids.
Concatenate comparison patterns or classification results
Supervised Classification
Unsupervised Classification
Serialization of record linkage object.
Compare Records
Remove NULL Values
Edit Matching Status
Weight-based Classification of Data Pairs
Calculate weights
Classify record pairs with EpiLink weights
Calculate EpiLink weights
Class "ff_vector"
Class "ffdf"
Generate Training Set
Calculate Error Measures
Estimate number of record pairs.
Get attribute frequencies
Create a minimal training set
Extract Record Pairs
Backend function for getPairs
Estimate Threshold from Pareto Distribution
Build contingency table
Estimate Threshold from Pareto Distribution
Internal functions and methods
Optimal Threshold for Record Linkage
Phonetic Code
Class "RecLinkClassif"
Class "RecLinkData"
Record Linkage Data Object
Class "RecLinkResult"
Record Linkage Result Object
Safe Sampling
Class "RLBigData"
Constructors for big data objects.
Class "RLBigDataDedup"
Class "RLBigDataLinkage"
Test data for Record Linkage
Class "RLResult"
Show an RLBigData object
Split Data
Stochastic record linkage.
String Metrics
Subset operator for record linkage objects
Print Summary of Record Linkage Data
Summary methods for "RLBigData" objects.
Summary method for "RLResult" objects.
LaTeX Summary of linkage results
Train a Classifier
Create Unordered Pairs
Provides functions for linking and deduplicating data sets. Methods based on a stochastic approach are implemented as well as classification algorithms from the machine learning domain. For details, see our paper "The RecordLinkage Package: Detecting Errors in Data" Sariyar M / Borg A (2010) <doi:10.32614/RJ-2010-017>.
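
The stochastic approach mentioned above can be illustrated with a minimal sketch. It is written in Python rather than R and does not use or reproduce the package's API. The toy records, the m- and u-probabilities, the two thresholds, and the function names compare, weight and classify are all assumptions chosen for illustration. The sketch builds unordered record pairs, derives a binary agreement pattern for each pair, sums Fellegi-Sunter style log2 weights, and classifies each pair by threshold.

```python
import math
from itertools import combinations

# Hypothetical toy records: (first name, last name, birth year).
records = [
    ("ANNA", "MUELLER", 1980),
    ("ANNA", "MUELLER", 1980),
    ("ANNA", "MILLER", 1980),
    ("PETER", "SCHMIDT", 1975),
]

# Assumed probabilities per attribute (not estimated from data):
# m = P(attribute agrees | pair is a match)
# u = P(attribute agrees | pair is a non-match)
M_PROBS = [0.95, 0.95, 0.95]
U_PROBS = [0.05, 0.01, 0.02]


def compare(a, b):
    """Return a binary agreement pattern (1 = exact agreement) for two records."""
    return tuple(int(x == y) for x, y in zip(a, b))


def weight(pattern):
    """Sum the log2 likelihood ratios of a comparison pattern."""
    total = 0.0
    for agree, m, u in zip(pattern, M_PROBS, U_PROBS):
        if agree:
            total += math.log2(m / u)
        else:
            total += math.log2((1 - m) / (1 - u))
    return total


def classify(w, upper=10.0, lower=0.0):
    """Map a weight to link, possible link, or non-link (assumed thresholds)."""
    if w >= upper:
        return "link"
    if w <= lower:
        return "non-link"
    return "possible"


# Deduplication setting: compare every unordered pair of records.
results = {}
for i, j in combinations(range(len(records)), 2):
    pattern = compare(records[i], records[j])
    results[(i, j)] = (pattern, classify(weight(pattern)))

for pair, (pattern, label) in results.items():
    print(pair, pattern, label)
```

In practice the m- and u-probabilities are estimated from the data instead of fixed by hand, and exact equality is often replaced by a string metric or a phonetic code so that spelling variants still count as partial agreement.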