predict.problink_em function

Calculate weights and probabilities for pairs

## S3 method for class 'problink_em'
predict(
  object,
  pairs = newdata,
  newdata = NULL,
  type = c("weights", "mpost", "probs", "all"),
  binary = FALSE,
  add = FALSE,
  comparators,
  inplace = FALSE,
  new_name = NULL,
  ...
)

Arguments

  • object: an object of type problink_em as produced by problink_em.
  • pairs: an object with pairs for which to calculate weights.
  • newdata: an alternative name for the pairs argument. Specify newdata or pairs.
  • type: a character vector of length one specifying what to calculate. See Returns for more information.
  • binary: convert comparison vectors to binary vectors using the comparison function in comparators.
  • add: add the predictions to the original pairs object.
  • comparators: a list of comparison functions (see compare_pairs). When missing attr(pairs, 'comparators') is used.
  • inplace: logical indicating whether pairs should be modified in place. When pairs is large this can be more efficient.
  • new_name: name of the new object to assign the pairs to on the cluster nodes (only relevant when pairs is of type cluster_pairs).
  • ...: unused.
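
A minimal usage sketch, assuming the reclin2 package is installed and using its bundled linkexample1 and linkexample2 data sets. The blocking variable and the comparison variables are illustrative choices, not requirements of predict.

```r
library(reclin2)

# Example data sets shipped with reclin2
data("linkexample1", "linkexample2")

# Generate candidate pairs (blocking on postcode) and compare them
pairs <- pair_blocking(linkexample1, linkexample2, "postcode")
pairs <- compare_pairs(pairs, on = c("lastname", "firstname", "address", "sex"))

# Fit the EM model on the comparison vectors
model <- problink_em(~ lastname + firstname + address + sex, data = pairs)

# add = TRUE keeps all columns of pairs and appends the predictions
pairs <- predict(model, pairs = pairs, add = TRUE)
head(pairs)
```

With the default type, the first element of the type vector ("weights") is used, so the result should gain a weights column.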

Returns

When pairs is of type pairs, returns a data.table with either the .x and .y columns from pairs (when add = FALSE) or all columns of pairs. To these columns are added:

  • In case of type = "weights" a column weights with the calculated weights.
  • In case of type = "mpost" a column mpost with the calculated posterior probabilities (the probability that a pair is a match given its comparison vector).
  • In case of type = "probs" the columns mprob and uprob with the m- and u-probabilities and mpost and upost with the posterior m- and u-probabilities.
  • In case of type = "all" all of the above.
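
To make the relation between these columns concrete, here is a base R sketch of the usual Fellegi-Sunter quantities for a single binary comparison vector. The m- and u-probabilities, the prior p, and the variable names are hypothetical values chosen for illustration. The use of the natural logarithm and of conditional independence between variables are assumptions of this sketch, not statements about the package internals.

```r
# Hypothetical m- and u-probabilities for three comparison variables
m <- c(lastname = 0.95, firstname = 0.90, sex = 0.98)
u <- c(lastname = 0.01, firstname = 0.05, sex = 0.50)
p <- 0.1  # assumed prior probability that a pair is a match

# One binary comparison vector: TRUE means the variable agrees
gamma <- c(lastname = TRUE, firstname = FALSE, sex = TRUE)

# Probability of observing gamma among matches (mprob) and
# non-matches (uprob), assuming conditional independence
mprob <- prod(ifelse(gamma, m, 1 - m))
uprob <- prod(ifelse(gamma, u, 1 - u))

# Weight as a log-likelihood ratio (natural log assumed here)
weight <- log(mprob / uprob)

# Posterior probabilities of match and non-match via Bayes' rule
mpost <- p * mprob / (p * mprob + (1 - p) * uprob)
upost <- 1 - mpost

print(c(weight = weight, mprob = mprob, uprob = uprob,
        mpost = mpost, upost = upost))
```

A large positive weight means the comparison vector is far more likely among matches than among non-matches, and mpost folds in the prior p.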

When pairs is of type cluster_pairs, the predictions are calculated on each cluster node and the resulting pairs are assigned to new_name in the environment reclin_env. When new_name is not given (or equal to NULL), the original pairs on the nodes are overwritten.

  • Maintainer: Jan van der Laan
  • License: GPL-3
  • Last published: 2024-02-09