Generates all combinations of records from x and y.
pair(x, y, deduplication = FALSE, add_xy = TRUE)
Arguments
x: first data.frame
y: second data.frame. Ignored when deduplication = TRUE.
deduplication: generate pairs from x only and ignore y. This is useful for deduplication of x.
add_xy: add x and y as attributes to the returned pairs. This makes it easier to call subsequent operations that need x and y (such as compare_pairs).
Returns
A data.table with two columns, .x and .y, is returned. Columns .x and .y contain row numbers from the data.frames x and y respectively.
Details
Generating (all) pairs of the records of two data sets is usually the first step when linking the two data sets.
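As a rough illustration of what "all pairs" means, the sketch below builds the row-number combinations in base R with expand.grid. It is not the implementation of pair. The example data are hypothetical, and the rule of keeping each unordered pair once in the deduplication case (.x < .y) is an assumption made for the illustration.

```r
# Two small example data sets (hypothetical data, for illustration only).
x <- data.frame(id = 1:3, name = c("anna", "bob", "carl"))
y <- data.frame(id = 1:2, name = c("ana", "bob"))

# All combinations of row numbers from x and y: nrow(x) * nrow(y) pairs.
all_pairs <- expand.grid(.x = seq_len(nrow(x)), .y = seq_len(nrow(y)))

# Deduplication case: pair x with itself and keep each unordered pair once.
# Keeping only .x < .y is an assumption of this sketch.
self_pairs <- expand.grid(.x = seq_len(nrow(x)), .y = seq_len(nrow(x)))
dedup_pairs <- self_pairs[self_pairs$.x < self_pairs$.y, ]

nrow(all_pairs)    # 6
nrow(dedup_pairs)  # 3
```

The number of pairs grows with the product of the two row counts, so generating all pairs is only practical for small data sets.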