pair function

Generate all possible pairs

Generates all combinations of records from x and y.

pair(x, y, deduplication = FALSE, add_xy = TRUE)

Arguments

  • x: first data.frame
  • y: second data.frame. Ignored when deduplication = TRUE.
  • deduplication: generate pairs from x only, ignoring y. This is useful for deduplicating x.
  • add_xy: add x and y as attributes to the returned pairs. This makes it easier to call subsequent operations that need x and y (such as compare_pairs).
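
To illustrate what pairing a data set with itself involves, the base R sketch below builds the self-pairs of a toy data.frame and keeps each unordered pair once. The toy data, the variable names, and the .y > .x filter are assumptions made for illustration; this is not the package's implementation, and the exact set of pairs returned with deduplication = TRUE is not stated on this page.

```r
# Hypothetical toy data; not part of the package.
x <- data.frame(name = c("ann", "anne", "bob"))

# Combine every row number of x with every row number of x.
self_pairs <- expand.grid(.x = seq_len(nrow(x)), .y = seq_len(nrow(x)))

# Assumption: drop self-matches and mirrored duplicates by keeping .y > .x,
# so each unordered pair of records appears exactly once.
dedup_pairs <- self_pairs[self_pairs$.y > self_pairs$.x, ]

nrow(self_pairs)   # 9 = 3 * 3
nrow(dedup_pairs)  # 3 = choose(3, 2)
```

With a filter like this, n records yield n * (n - 1) / 2 candidate pairs instead of n * n.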

Returns

A data.table with two columns, .x and .y, is returned. Columns .x and .y contain row numbers from data.frames x and y respectively.
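
The shape of this result can be sketched in base R without the package. The example below is only an approximation under stated assumptions: it uses expand.grid and returns a plain data.frame rather than a data.table, and the toy data and variable names are hypothetical. Row order may also differ from what pair produces.

```r
# Hypothetical toy data; not part of the package.
x <- data.frame(name = c("ann", "bob", "cy"))
y <- data.frame(name = c("ann", "dee"))

# Every row number of x combined with every row number of y.
all_pairs <- expand.grid(.x = seq_len(nrow(x)), .y = seq_len(nrow(y)))

# The row numbers can be used to look up the records behind a pair.
first <- all_pairs[1, ]
x$name[first$.x]  # "ann"
y$name[first$.y]  # "ann"

nrow(all_pairs)   # 6 = 3 * 2
```

Because the number of pairs is nrow(x) * nrow(y), the full set of pairs grows quickly with the size of the inputs.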

Details

Generating all pairs of records from two data sets is usually the first step when linking them.

Examples

data("linkexample1", "linkexample2")
pairs <- pair(linkexample1, linkexample2)

See Also

pair_blocking and pair_minsim are other methods to generate pairs.

  • Maintainer: Jan van der Laan
  • License: GPL-3
  • Last published: 2024-02-09