Call a function on each of the worker nodes and pass it the pairs
cluster_call(pairs, fun, ...)
Arguments
pairs: an object of type cluster_pairs, as created for example by cluster_pair.
fun: a function to call on each of the worker nodes. See Details for the arguments this function must accept.
...: additional arguments are passed on to fun.
Returns
A list containing, for each worker, the result of the function call. When the function calls return NULL, the result is returned invisibly. Because the results are sent back to the main node, make sure you don't accidentally return all pairs. If you don't want to return anything, end your function with NULL.
Details
The function must accept the following as its first three arguments:
pairs: the data.table with the pairs of the worker node.
x: a data.table with the portion of x present on the worker node.
y: a data.table with y.
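As a minimal sketch of this signature, the call below reports how many rows of pairs, x and y each worker sees. It assumes that the package providing cluster_call is reclin2 (the library(reclin2) line is that assumption) and reuses the linkexample1 and linkexample2 datasets from the Examples. The variable name sizes is only illustrative.

```r
library(parallel)
library(reclin2)  # assumption: the package that provides cluster_call

data("linkexample1", "linkexample2")
cl <- makeCluster(2)
pairs <- cluster_pair(cl, linkexample1, linkexample2)

# fun receives pairs, x and y as its first three arguments; only the
# small summary vector is sent back to the main node, not the data.
sizes <- cluster_call(pairs, function(pairs, x, y, ...) {
  c(pairs = nrow(pairs), x = nrow(x), y = nrow(y))
})
sizes  # a list with one element per worker

stopCluster(cl)
```

Returning a small summary rather than the tables themselves keeps the amount of data transferred to the main node low.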
Examples
# Generate some pairs
library(parallel)
data("linkexample1", "linkexample2")
cl <- makeCluster(2)
pairs <- cluster_pair(cl, linkexample1, linkexample2)
compare_pairs(pairs, c("lastname", "firstname", "address", "sex"))

# Add a new column to pairs
cluster_call(pairs, function(pairs, ...) {
  pairs[, name := firstname & lastname]
  # We don't want to return the pairs, so make sure to return something
  # else
  NULL
})

# Get the number of pairs on each node
lengths <- cluster_call(pairs, function(pairs, ...) {
  nrow(pairs)
})
lengths <- unlist(lengths)
lengths

# Cleanup
stopCluster(cl)