deduplicate_equivalence function

Deduplication using equivalence groups

deduplicate_equivalence(pairs, variable, selection, x = attr(pairs, "x"))

Arguments

  • pairs: a pairs object, such as one generated by pair_blocking.
  • variable: name of the variable to create in x that will contain the group labels.
  • selection: a logical vector with as many elements as pairs has rows, or the name of such a variable in pairs. Only pairs for which selection is TRUE are used. When missing, all pairs are assumed to be selected.
  • x: the first data set; when missing attr(pairs, "x") is used.
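A minimal usage sketch of the arguments above, assuming the reclin2 package is installed. The toy data frame, its column names and the same_name vector are made up for illustration. The selection is passed as a logical vector, built from the .x and .y index columns of the pairs object.

```r
library(reclin2)

# Hypothetical toy data: records 1-2 and records 3-4 share a name
x <- data.frame(
  id = 1:5,
  name = c("anna", "anna", "bob", "bob", "carl"),
  postcode = c("1234", "1234", "1234", "1234", "5678")
)

# Generate candidate pairs within the same postcode for deduplication
pairs <- pair_blocking(x, on = "postcode", deduplication = TRUE)

# Logical vector with one element per pair: TRUE when the names agree
same_name <- x$name[pairs$.x] == x$name[pairs$.y]

# Add a variable "group" to x; selected pairs end up in the same group
res <- deduplicate_equivalence(pairs, "group", same_name)
print(res)
```

In this sketch records 1 and 2 should receive one group label and records 3 and 4 another. The label values themselves carry no meaning beyond identifying the group.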

Returns

Returns x with a variable containing the group labels. Records with the same group label (should) correspond to the same entity.
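To illustrate what grouping by equivalence means, the following base R sketch labels records so that any two records connected through a chain of selected pairs share a label. It is not the package's implementation; equivalence_labels and its arguments are hypothetical names used only for this illustration.

```r
# Hypothetical helper, not part of reclin2: assign group labels to n records
# given selected pairs (from[k], to[k]), using a simple union-find.
equivalence_labels <- function(n, from, to) {
  from <- as.integer(from)
  to <- as.integer(to)
  label <- seq_len(n)
  # Follow labels until a record that labels itself (the group root)
  find <- function(i) {
    while (label[i] != i) i <- label[i]
    i
  }
  for (k in seq_along(from)) {
    a <- find(from[k])
    b <- find(to[k])
    # Merge the two groups; the smaller root index becomes the label
    if (a != b) label[max(a, b)] <- min(a, b)
  }
  vapply(seq_len(n), find, integer(1))
}

# Records 1-2 and 2-3 are selected pairs, so 1, 2 and 3 form one group
groups <- equivalence_labels(5, from = c(1, 2), to = c(2, 3))
print(groups)
```

Because the grouping is transitive, records 1 and 3 end up in the same group even though they were never selected as a pair. This is why records with the same label only "should" correspond to the same entity: a single wrongly selected pair can merge two otherwise distinct groups.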

  • Maintainer: Jan van der Laan
  • License: GPL-3
  • Last published: 2024-02-09