getMatches function

Subset two data frames to the matches returned by fastLink() or matchesLink(). Can also return a single deduped data frame if dfA and dfB are identical and fl.out is of class 'fastLink.dedupe'.

getMatches(dfA, dfB, fl.out, threshold.match = 0.85, combine.dfs = TRUE)

Arguments

  • dfA: Dataset A - matched to Dataset B by fastLink().
  • dfB: Dataset B - matched to Dataset A by fastLink().
  • fl.out: Either the output from fastLink() or matchesLink().
  • threshold.match: A number between 0 and 1 giving the lowest posterior match probability at which a pair is declared a match. For instance, threshold.match = .85 will return all pairs with posterior probability greater than .85 as matches. Default is 0.85.
  • combine.dfs: Whether to combine the two data frames being merged into a single data frame. If FALSE, two data frames are returned in a list. Default is TRUE.
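The roles of threshold.match and combine.dfs can be illustrated with a small base R sketch. This is not the package's implementation: the helper subset_pairs(), the toy data frames, and the pairs table (with columns inds.a, inds.b and posterior) are hypothetical names introduced only for illustration.

```r
# Hypothetical toy inputs; these are not produced by fastLink().
dfA <- data.frame(id = 1:4, name = c("ann", "bob", "cat", "dan"),
                  stringsAsFactors = FALSE)
dfB <- data.frame(id = 101:103, name = c("anne", "bob", "dana"),
                  stringsAsFactors = FALSE)

# Hypothetical candidate pairs: row indices into dfA and dfB,
# each with a posterior match probability.
pairs <- data.frame(
  inds.a    = c(1, 2, 4),
  inds.b    = c(1, 2, 3),
  posterior = c(0.91, 0.99, 0.60)
)

# Illustrative helper (an assumption, not part of fastLink):
# keep pairs above the threshold, then subset both data frames.
subset_pairs <- function(dfA, dfB, pairs,
                         threshold.match = 0.85, combine.dfs = TRUE) {
  keep <- pairs$posterior > threshold.match
  dfA.match <- dfA[pairs$inds.a[keep], , drop = FALSE]
  dfB.match <- dfB[pairs$inds.b[keep], , drop = FALSE]
  rownames(dfA.match) <- NULL
  rownames(dfB.match) <- NULL
  if (combine.dfs) {
    # Suffix the column names so the combined frame has no duplicates.
    names(dfA.match) <- paste0(names(dfA.match), ".A")
    names(dfB.match) <- paste0(names(dfB.match), ".B")
    cbind(dfA.match, dfB.match)
  } else {
    list(dfA.match = dfA.match, dfB.match = dfB.match)
  }
}

combined <- subset_pairs(dfA, dfB, pairs)                       # one data frame
separate <- subset_pairs(dfA, dfB, pairs, combine.dfs = FALSE)  # list of two
```

With the default threshold, only the two pairs whose posterior exceeds 0.85 are kept; lowering threshold.match to 0.5 would also keep the third pair.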

Returns

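If combine.dfs = FALSE, getMatches() returns a list of two data frames:

  • dfA.match: The subset of dfA restricted to the successful matches.
  • dfB.match: The subset of dfB restricted to the successful matches.

If combine.dfs = TRUE (the default), the matched rows are combined into a single data frame.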

Examples

## Not run:
fl.out <- fastLink(dfA, dfB,
                   varnames = c("firstname", "lastname", "streetname", "birthyear"),
                   n.cores = 1)
ret <- getMatches(dfA, dfB, fl.out)
## End(Not run)
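A variant that sets both optional arguments explicitly might look like the sketch below. It assumes the fastLink package is installed and that its bundled samplematch dataset provides dfA and dfB; the threshold of 0.95 is an arbitrary illustrative value.

```r
library(fastLink)

# Sample data shipped with the package; assumed to load dfA and dfB.
data(samplematch)

fl.out <- fastLink(
  dfA, dfB,
  varnames = c("firstname", "lastname", "streetname", "birthyear"),
  n.cores = 1
)

# Stricter threshold, and keep the two matched subsets separate.
ret <- getMatches(dfA, dfB, fl.out,
                  threshold.match = 0.95, combine.dfs = FALSE)

nrow(ret$dfA.match)  # number of matched rows from dfA
head(ret$dfB.match)  # first matched rows from dfB
```

Keeping the subsets separate is convenient when the two data frames are processed differently afterwards; the default, combine.dfs = TRUE, returns a single data frame.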

Author(s)

Ben Fifield benfifield@gmail.com

  • Maintainer: Ted Enamorado
  • License: GPL (>= 3)
  • Last published: 2023-11-17
