join function

Join methods for Kusto tables

These methods behave like the corresponding dplyr join methods, except for the optional .strategy, .shufflekeys, .num_partitions and .remote arguments. These arguments pass hints to the Kusto engine on how to execute the join, and can sometimes speed up a query. See the Kusto documentation for more details.

## S3 method for class 'tbl_kusto_abstract'
inner_join(
  x, y, by = NULL, copy = NULL, suffix = c(".x", ".y"), ..., keep = NULL,
  .strategy = NULL, .shufflekeys = NULL, .num_partitions = NULL, .remote = NULL
)

## S3 method for class 'tbl_kusto_abstract'
left_join(
  x, y, by = NULL, copy = NULL, suffix = c(".x", ".y"), ..., keep = NULL,
  .strategy = NULL, .shufflekeys = NULL, .num_partitions = NULL, .remote = NULL
)

## S3 method for class 'tbl_kusto_abstract'
right_join(
  x, y, by = NULL, copy = NULL, suffix = c(".x", ".y"), ..., keep = NULL,
  .strategy = NULL, .shufflekeys = NULL, .num_partitions = NULL, .remote = NULL
)

## S3 method for class 'tbl_kusto_abstract'
full_join(
  x, y, by = NULL, copy = NULL, suffix = c(".x", ".y"), ..., keep = NULL,
  .strategy = NULL, .shufflekeys = NULL, .num_partitions = NULL, .remote = NULL
)

## S3 method for class 'tbl_kusto_abstract'
semi_join(
  x, y, by = NULL, copy = NULL, ..., suffix = c(".x", ".y"),
  .strategy = NULL, .shufflekeys = NULL, .num_partitions = NULL, .remote = NULL
)

## S3 method for class 'tbl_kusto_abstract'
anti_join(
  x, y, by = NULL, copy = NULL, suffix = c(".x", ".y"),
  .strategy = NULL, .shufflekeys = NULL, .num_partitions = NULL, .remote = NULL, ...
)

Arguments

  • x, y: Kusto tbls.
  • by: The columns to join on.
  • copy: Needed for agreement with generic. Not otherwise used.
  • suffix: The suffixes to use for deduplicating column names.
  • ...: Other arguments passed to lower-level functions.
  • keep: Needed for agreement with generic. Not otherwise used. Kusto retains keys from both sides of joins.
  • .strategy: A join strategy hint to pass to Kusto. Currently the values supported are "shuffle" and "broadcast".
  • .shufflekeys: A character vector of column names to use as shuffle keys.
  • .num_partitions: The number of partitions for a shuffle query.
  • .remote: A join strategy hint to use for cross-cluster joins. Can be "left", "right", "local" or "auto" (the default).

Examples

## Not run:

tbl1 <- tbl_kusto(db, "table1")
tbl2 <- tbl_kusto(db, "table2")

# standard dplyr syntax:
left_join(tbl1, tbl2)

# Kusto extensions:
left_join(tbl1, tbl2, .strategy = "broadcast") # a broadcast join

left_join(tbl1, tbl2, .shufflekeys = c("var1", "var2")) # shuffle join with shuffle keys

left_join(tbl1, tbl2, .num_partitions = 5) # no. of partitions for a shuffle join

## End(Not run)
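To see how a hint changes the generated query without connecting to a cluster, one option is to build the join on abstract tbls and print the KQL. The sketch below assumes that tbl_kusto_abstract() accepts a data frame plus a table name and that show_query() works on the result. The orders and customers tables and their columns are hypothetical and only stand in for real Kusto tables.

```r
library(dplyr)
library(AzureKusto)

# Hypothetical local data frames that only supply column names and types.
orders <- data.frame(customer_id = c(1L, 2L, 3L), amount = c(10, 25, 40))
customers <- data.frame(customer_id = c(1L, 2L, 4L),
                        region = c("east", "west", "north"))

# Assumption: abstract tbls describe a table by schema and name only,
# so no database connection is needed to build a query on them.
tbl_orders <- tbl_kusto_abstract(orders, "orders")
tbl_customers <- tbl_kusto_abstract(customers, "customers")

# Same join with and without a strategy hint.
plain <- left_join(tbl_orders, tbl_customers, by = "customer_id")
hinted <- left_join(tbl_orders, tbl_customers, by = "customer_id",
                    .strategy = "broadcast")

# Print the KQL that would be sent to Kusto for each version.
show_query(plain)
show_query(hinted)
```

Comparing the two printed queries shows where the hint ends up in the join clause, which can help when checking a query before running it against a live database.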

See Also

dplyr::join

  • Maintainer: Alex Kyllo
  • License: MIT + file LICENSE
  • Last published: 2023-10-12