Subset distinct/unique rows
This is a method for the dplyr distinct() generic. It adds the DISTINCT clause to the SQL query.
## S3 method for class 'tbl_lazy'
distinct(.data, ..., .keep_all = FALSE)
.data
: A lazy data frame backed by a database query.

...
: <data-masking> Variables, or functions of variables. Use desc() to sort a variable in descending order.

.keep_all
: If TRUE, keep all variables in .data. If a combination of ... is not distinct, this keeps the first row of values.

Returns another tbl_lazy. Use show_query() to see the generated query, and use collect() to execute the query and return data to R.
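library(dplyr, warn.conflicts = FALSE)
library(dbplyr)  # provides memdb_frame()

# Small in-memory database table to query lazily
db <- memdb_frame(x = c(1, 1, 2, 2), y = c(1, 2, 1, 1))

db %>% distinct() %>% show_query()   # distinct over all columns
db %>% distinct(x) %>% show_query()  # distinct over x only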
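The calls above only print the generated SQL. As a minimal sketch of the other half of the workflow, the following runs the same queries with collect() and pulls the results into R. It assumes the dbplyr and RSQLite packages are installed; the names unique_rows and unique_x are illustrative only, and the row order of the results is not guaranteed by the database.

```r
library(dplyr, warn.conflicts = FALSE)
library(dbplyr)  # provides memdb_frame(); assumes RSQLite is installed

# Same in-memory table as in the example above
db <- memdb_frame(x = c(1, 1, 2, 2), y = c(1, 2, 1, 1))

# distinct() stays lazy; collect() executes the query and returns a tibble
unique_rows <- db %>% distinct() %>% collect()   # unique (x, y) combinations
unique_x    <- db %>% distinct(x) %>% collect()  # unique values of x only

nrow(unique_rows)  # number of distinct (x, y) pairs
nrow(unique_x)     # number of distinct x values
```

Because the result order depends on the database, add an explicit arrange() before collect() if a stable order matters.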