cluster_call function

Call a function on each of the worker nodes and pass it the pairs

cluster_call(pairs, fun, ...)

Arguments

  • pairs: an object of type cluster_pairs, as created for example by cluster_pair.
  • fun: a function to call on each of the worker nodes. See details on the arguments of this function.
  • ...: additional arguments are passed on to fun.
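
As a minimal sketch of how the additional arguments might be used: the function at_least and its argument min_pairs below are hypothetical names chosen for illustration; only cluster_pair and cluster_call come from this page. It assumes the extra argument is forwarded by name, after the three standard arguments described under Details.

```r
library(reclin2)
library(parallel)

data("linkexample1", "linkexample2")
cl <- makeCluster(2)
pairs <- cluster_pair(cl, linkexample1, linkexample2)

# Hypothetical function: `min_pairs` is not one of the standard arguments;
# it is supplied through the `...` of cluster_call.
at_least <- function(pairs, x, y, min_pairs) {
  nrow(pairs) >= min_pairs
}

# One logical value per worker node
enough <- cluster_call(pairs, at_least, min_pairs = 10)

stopCluster(cl)
```

Returning a single logical per node keeps the amount of data sent back to the main node small.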

Returns

Returns a list containing, for each worker node, the result of the function call. When the functions return NULL, the list is returned invisibly. Because the results are sent back to the main node, take care not to return all of the pairs by accident. If you don't want to return anything, end your function with NULL.

Details

The function will have to accept the following arguments as its first three arguments:

  • pairs: the data.table with the pairs of the worker node.
  • x: a data.table with the portion of x present on the worker node.
  • y: a data.table with y.
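
A function with this signature could look like the sketch below. The name describe_node is hypothetical; the example only assumes that the three arguments arrive as data.tables, as described above, and returns a small summary rather than the data itself.

```r
library(reclin2)
library(parallel)

data("linkexample1", "linkexample2")
cl <- makeCluster(2)
pairs <- cluster_pair(cl, linkexample1, linkexample2)

# Hypothetical function: report how many rows each worker node holds.
# `...` absorbs any additional arguments passed on by cluster_call.
describe_node <- function(pairs, x, y, ...) {
  c(n_pairs = nrow(pairs), n_x = nrow(x), n_y = nrow(y))
}

# A list with one named vector per worker node
sizes <- cluster_call(pairs, describe_node)

# Combine into a matrix with one row per node
sizes <- do.call(rbind, sizes)

stopCluster(cl)
```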

Examples

# Generate some pairs
library(reclin2)
library(parallel)
data("linkexample1", "linkexample2")
cl <- makeCluster(2)
pairs <- cluster_pair(cl, linkexample1, linkexample2)
compare_pairs(pairs, c("lastname", "firstname", "address", "sex"))

# Add a new column to pairs
cluster_call(pairs, function(pairs, ...) {
  pairs[, name := firstname & lastname]
  # We don't want to return the pairs, so make sure to return something
  # else
  NULL
})

# Get the number of pairs on each node
lengths <- cluster_call(pairs, function(pairs, ...) {
  nrow(pairs)
})
lengths <- unlist(lengths)
lengths

# Cleanup
stopCluster(cl)

  • Maintainer: Jan van der Laan
  • License: GPL-3
  • Last published: 2024-02-09