Blocking for Record Linkage
Function to convert a record into a bag of tokens with a fieldwise fla...
Function that reduces a bag of words into a signature matrix using mul...
Returns the block ids associated with a blocking method.
Function to calculate the inverse document frequency given a shingled ...
Perform evaluations (recall) for blocking.
Function that reduces a bag of words into a signature matrix using mul...
Returns the reduction ratio associated with a blocking method
Returns the reduction ratio associated with a blocking method
Function that generates unit random vectors and takes (weighted) proje...
Function to convert all records into a bag of tokens
Function to tokenize a string into its k components
An implementation of the KLSH blocking algorithm described in Steorts, Ventura, Sadinle, and Fienberg (2014) <DOI:10.1007/978-3-319-11257-2_20>, a k-means variant of locality-sensitive hashing. The method is illustrated with examples and a vignette.
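
To make the tokenizing and evaluation entries above concrete, here is a minimal Python sketch. It is not the package's code or API: the function names (`shingle`, `reduction_ratio`, `blocking_recall`) are hypothetical, and it assumes blocks are given as one block label per record. It also assumes the usual definitions, which the package may or may not follow exactly: the reduction ratio is one minus the fraction of all record pairs that remain as within-block candidate pairs, and recall is the fraction of true matching pairs that share a block.

```python
from collections import Counter
from itertools import combinations
from math import comb


def shingle(text, k):
    """Return the overlapping length-k substrings (k-shingles) of text."""
    if k <= 0:
        raise ValueError("k must be positive")
    return [text[i:i + k] for i in range(len(text) - k + 1)]


def reduction_ratio(block_ids):
    """1 - (within-block candidate pairs) / (all record pairs).

    block_ids holds one block label per record (an assumption of this
    sketch, not necessarily the package's representation).
    """
    total_pairs = comb(len(block_ids), 2)
    if total_pairs == 0:
        return 0.0
    within_pairs = sum(comb(size, 2) for size in Counter(block_ids).values())
    return 1.0 - within_pairs / total_pairs


def blocking_recall(block_ids, entity_ids):
    """Fraction of true matching pairs whose records share a block."""
    true_pairs = 0
    kept_pairs = 0
    for i, j in combinations(range(len(entity_ids)), 2):
        if entity_ids[i] == entity_ids[j]:
            true_pairs += 1
            if block_ids[i] == block_ids[j]:
                kept_pairs += 1
    # With no true matches there is nothing to miss, so report perfect recall.
    return kept_pairs / true_pairs if true_pairs else 1.0
```

A good blocking scheme keeps both numbers high: a reduction ratio near 1 means few candidate pairs are left to compare, and a recall near 1 means few true matches were split across blocks. The pairwise loop in `blocking_recall` is quadratic in the number of records and is meant only for small illustrative inputs.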