count() R function from [dplyr]

Count the observations in each group

count() lets you quickly count the unique values of one or more variables: df %>% count(a, b) is roughly equivalent to df %>% group_by(a, b) %>% summarise(n = n()). count() is paired with tally(), a lower-level helper that is equivalent to df %>% summarise(n = n()). Supply wt to perform weighted counts, switching the summary from n = n() to n = sum(wt).

add_count() and add_tally() are equivalents to count() and tally()

but use mutate() instead of summarise() so that they add a new column with group-wise counts.


count(x, ..., wt = NULL, sort = FALSE, name = NULL)

## S3 method for class 'data.frame'
count(
  x,
  ...,
  wt = NULL,
  sort = FALSE,
  name = NULL,
  .drop = group_by_drop_default(x)
)

tally(x, wt = NULL, sort = FALSE, name = NULL)

add_count(x, ..., wt = NULL, sort = FALSE, name = NULL, .drop = deprecated())

add_tally(x, wt = NULL, sort = FALSE, name = NULL)

Arguments

x: A data frame, data frame extension (e.g. a tibble), or a lazy data frame (e.g. from dbplyr or dtplyr).
...: <data-masking> Variables to group by.
wt: <data-masking> Frequency weights. Can be NULL or a variable:
- If NULL (the default), counts the number of rows in each group.
- If a variable, computes sum(wt) for each group.
sort: If TRUE, will show the largest groups at the top.
name: The name of the new column in the output.

If omitted, it will default to n. If there's already a column called n, it will use nn. If there's a column called n and nn, it'll use nnn, and so on, adding ns until it gets a new name.
.drop: Handling of factor levels that don't appear in the data, passed on to group_by().

For count(): if FALSE will include counts for empty groups (i.e. for levels of factors that don't exist in the data).

For add_count(): deprecated since it can't actually affect the output.

Returns

An object of the same type as .data. count() and add_count()

group transiently, so the output has the same groups as the input.

Examples


# count() is a convenient way to get a sense of the distribution of
# values in a dataset
starwars %>% count(species)
starwars %>% count(species, sort = TRUE)
starwars %>% count(sex, gender, sort = TRUE)
starwars %>% count(birth_decade = round(birth_year, -1))

# use the `wt` argument to perform a weighted count. This is useful
# when the data has already been aggregated once
df <- tribble(
  ~name,    ~gender,   ~runs,
  "Max",    "male",       10,
  "Sandra", "female",      1,
  "Susan",  "female",      4
)
# counts rows:
df %>% count(gender)
# counts runs:
df %>% count(gender, wt = runs)

# When factors are involved, `.drop = FALSE` can be used to retain factor
# levels that don't appear in the data
df2 <- tibble(
  id = 1:5,
  type = factor(c("a", "c", "a", NA, "a"), levels = c("a", "b", "c"))
)
df2 %>% count(type)
df2 %>% count(type, .drop = FALSE)

# Or, using `group_by()`:
df2 %>% group_by(type, .drop = FALSE) %>% count()

# tally() is a lower-level function that assumes you've done the grouping
starwars %>% tally()
starwars %>% group_by(species) %>% tally()

# both count() and tally() have add_ variants that work like
# mutate() instead of summarise
df %>% add_count(gender, wt = runs)
df %>% add_tally(wt = runs)

dplyr package Read PDF manual

Maintainer: Hadley Wickham
License: MIT + file LICENSE
Last published: 2023-11-17

Useful links

count function

Count the observations in each group

Arguments

Returns

Examples