fcount function

Efficiently Count Observations by Group

A much faster replacement for dplyr::count.

fcount(x, ..., w = NULL, name = "N", add = FALSE, sort = FALSE, decreasing = FALSE)

fcountv(x, cols = NULL, w = NULL, name = "N", add = FALSE, sort = FALSE, ...)

Arguments

  • x: a data frame or list-like object, including 'grouped_df' or 'indexed_frame'. Atomic vectors or matrices can also be passed, but will be sent through qDF.
  • ...: for fcount: names or sequences of columns to count cases by - passed to fselect. For fcountv: further arguments passed to GRP (such as decreasing, na.last, method, effect etc.). Leaving this empty will count on all columns.
  • cols: select columns to count cases by, using column names, indices, a logical vector or a selector function (e.g. is_categorical).
  • w: a numeric vector of weights, may contain missing values. In fcount this can also be the (unquoted) name of a column in the data frame. fcountv also supports a single character name. Note that the corresponding argument in dplyr::count is called wt, but collapse consistently names weights arguments w.
  • name: character. The name of the column containing the count or sum of weights. In dplyr::count it is called "n", but "N" is more consistent with the rest of collapse and data.table.
  • add: TRUE adds the count column to x. Alternatively add = "group_vars" (or add = "gv" for parsimony) can be used to retain only the variables selected for counting in x and the count.
  • sort, decreasing: arguments passed to GRP affecting the order of rows in the output (if add = FALSE), and the algorithm used for counting. In general, sort = FALSE is faster unless data is already sorted by the columns used for counting.
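
As a rough illustration of the w, name and sort arguments described above, the sketch below counts mtcars by cyl using mpg as weights, so the output column holds the sum of weights per group rather than the number of rows. The column name "sum_mpg" and the base R cross-check are illustrative choices, not part of the original documentation.

```r
library(collapse)

# Weighted count: with w supplied, the count column is the sum of weights
# (here mpg) within each group; 'name' sets the name of that column.
# "sum_mpg" is an illustrative name chosen for this sketch.
res_w <- fcount(mtcars, cyl, w = mpg, name = "sum_mpg", sort = TRUE)

# Base R cross-check (assumption: groups sorted by cyl in both results)
check <- as.numeric(tapply(mtcars$mpg, mtcars$cyl, sum))

print(res_w)
print(all.equal(res_w$sum_mpg, check))
```

With sort = TRUE the groups come back ordered by cyl, which makes the comparison with tapply straightforward; with the default sort = FALSE the rows would follow the order in which groups first appear in the data.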

Returns

If x is a list, an object of the same type as x with a column (name) added at the end giving the count. Otherwise, if x is atomic, a data frame returned from qDF(x) with the count column added. By default (add = FALSE) only the unique rows of the columns used for counting are returned, together with the count column.
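
The return shapes described above can be checked with a short sketch. It passes a small character vector (made up for this example) to show the atomic case, then compares the dimensions returned for a data frame under the different add settings.

```r
library(collapse)

# Atomic input: converted with qDF, so the result is a data frame
# with one row per distinct value plus the count column "N".
v <- c("a", "b", "a", "c", "a")   # illustrative data
res_v <- fcount(v, sort = TRUE)

# Data frame input: compare the shape of the result for each 'add' setting
res_default <- fcount(mtcars, cyl)               # unique groups only
res_add     <- fcount(mtcars, cyl, add = TRUE)   # all rows and columns + N
res_gv      <- fcount(mtcars, cyl, add = "gv")   # all rows, cyl + N only

print(res_v)
print(dim(res_default))
print(dim(res_add))
print(dim(res_gv))
```

The default form is convenient for summary tables, while add = TRUE or add = "gv" keeps one row per observation, which is useful when the group size is needed as a variable in later computations.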

See Also

GRPN, fnobs, fndistinct, Fast Grouping and Ordering, Collapse Overview

Examples

fcount(mtcars, cyl, vs, am)
fcountv(mtcars, cols = .c(cyl, vs, am))
fcount(mtcars, cyl, vs, am, sort = TRUE)
fcount(mtcars, cyl, vs, am, add = TRUE)
fcount(mtcars, cyl, vs, am, add = "group_vars")

## With grouped data
mtcars |> fgroup_by(cyl, vs, am) |> fcount()
mtcars |> fgroup_by(cyl, vs, am) |> fcount(add = TRUE)
mtcars |> fgroup_by(cyl, vs, am) |> fcount(add = "group_vars")

## With indexed data: by default counting on the first index variable
wlddev |> findex_by(country, year) |> fcount()
wlddev |> findex_by(country, year) |> fcount(add = TRUE)

# Use fcountv to pass additional arguments to GRP.pdata.frame,
# here using the effect argument to choose a different index variable
wlddev |> findex_by(country, year) |> fcountv(effect = "year")
wlddev |> findex_by(country, year) |> fcountv(add = "group_vars", effect = "year")

  • Maintainer: Sebastian Krantz
  • License: GPL (>= 2) | file LICENSE
  • Last published: 2025-03-10