If a function, it is used as is. It should have at least 2 formal arguments.
If a formula, e.g. ~ head(.x), it is converted to a function.
In the formula, you can use:
. or .x to refer to the subset of rows of .tbl for the given group
.y to refer to the key, a one-row tibble with one column per grouping variable that identifies the group
...: Additional arguments passed on to .f
.keep: Whether the grouping variables are kept in .x.
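The arguments above can be illustrated with a minimal sketch. It assumes a dplyr version in which the argument is spelled .keep, as in this page; first_rows() and its extra argument n are hypothetical names introduced only for this illustration.

```r
library(dplyr)

# .f as a plain function: the first argument receives the group's rows,
# the second receives the key. `n` is a hypothetical extra argument
# that is supplied through `...`.
first_rows <- function(rows, key, n) {
  head(rows, n)
}

by_cyl <- mtcars %>% group_by(cyl)

# Function form, with n = 2 passed on to .f through `...`
res_fun <- by_cyl %>% group_map(first_rows, n = 2)

# Formula form doing the same work
res_formula <- by_cyl %>% group_map(~ head(.x, 2))

# Column names seen by .f without and with the grouping variable in .x
cols_default <- by_cyl %>% group_map(~ names(.x))
cols_kept <- by_cyl %>% group_map(~ names(.x), .keep = TRUE)
```

Comparing cols_default and cols_kept shows whether cyl is part of .x; the function and formula forms are expected to give the same list of two-row tibbles.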
Returns
group_modify() returns a grouped tibble; for this to work, .f must return a data frame.
group_map() returns a list of results from calling .f on each group.
group_walk() calls .f for side effects and returns the input .tbl, invisibly.
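As a rough sketch of the three return values described above (the object names are illustrative only, and message() stands in for any side effect):

```r
library(dplyr)

by_species <- iris %>% group_by(Species)

# group_map(): a plain list with one element per group
as_list <- by_species %>% group_map(~ nrow(.x))

# group_modify(): a grouped tibble; .f returns a data frame for each group
as_grouped <- by_species %>% group_modify(~ data.frame(n = nrow(.x)))

# group_walk(): .f is called for its side effect (here, a message);
# the input is returned invisibly
walked <- by_species %>%
  group_walk(~ message(.y$Species, ": ", nrow(.x), " rows"))
```

Inspecting class(as_list), class(as_grouped) and identical(walked, by_species) is one way to confirm the behaviour on your own installation.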
Details
Use group_modify() when summarize() is too limited, in terms of what you need to do and return for each group. group_modify() is good for "data frame in, data frame out". If that is too limited, you need to use a nested or split workflow. group_modify() is an evolution of do(), if you have used that before.
Each conceptual group of the data frame is exposed to the function .f with two pieces of information:
The subset of the data for the group, exposed as .x.
The key, a tibble with exactly one row and columns for each grouping variable, exposed as .y.
For completeness, group_modify(), group_map() and group_walk() also work on ungrouped data frames. In that case, the function is applied to the entire data frame (exposed as .x), and .y is a one-row tibble with no columns, consistent with group_keys().
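To make the two pieces of information concrete, the following sketch records what .f sees in .x and .y, first for a data frame grouped by two variables and then for an ungrouped one. The list element names (key_cols, key_rows, key_dim, n) are made up for this illustration.

```r
library(dplyr)

# For each group, record the key's column names, the key's row count,
# and the number of rows in the group's subset
seen <- mtcars %>%
  group_by(cyl, am) %>%
  group_map(~ list(key_cols = names(.y), key_rows = nrow(.y), n = nrow(.x)))

# On an ungrouped data frame, .f is called once with the whole data as .x;
# dim(.y) records the shape of the key it receives
ungrouped <- mtcars %>%
  group_map(~ list(key_dim = dim(.y), n = nrow(.x)))
```

Here seen has one element per combination of cyl and am present in the data, while ungrouped has a single element.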
Examples
# return a list
mtcars %>%
  group_by(cyl) %>%
  group_map(~ head(.x, 2L))

# return a tibble grouped by `cyl` with 2 rows per group
# the grouping data is recalculated
mtcars %>%
  group_by(cyl) %>%
  group_modify(~ head(.x, 2L))

# a list of tibbles
iris %>%
  group_by(Species) %>%
  group_map(~ broom::tidy(lm(Petal.Length ~ Sepal.Length, data = .x)))

# a restructured grouped tibble
iris %>%
  group_by(Species) %>%
  group_modify(~ broom::tidy(lm(Petal.Length ~ Sepal.Length, data = .x)))

# a list of vectors
iris %>%
  group_by(Species) %>%
  group_map(~ quantile(.x$Petal.Length, probs = c(0.25, 0.5, 0.75)))

# to use group_modify() the lambda must return a data frame
iris %>%
  group_by(Species) %>%
  group_modify(~ {
    quantile(.x$Petal.Length, probs = c(0.25, 0.5, 0.75)) %>%
      tibble::enframe(name = "prob", value = "quantile")
  })

iris %>%
  group_by(Species) %>%
  group_modify(~ {
    .x %>%
      purrr::map_dfc(fivenum) %>%
      mutate(nms = c("min", "Q1", "median", "Q3", "max"))
  })

# group_walk() is for side effects
dir.create(temp <- tempfile())
iris %>%
  group_by(Species) %>%
  group_walk(~ write.csv(.x, file = file.path(temp, paste0(.y$Species, ".csv"))))
list.files(temp, pattern = "csv$")
unlink(temp, recursive = TRUE)

# group_modify() and ungrouped data frames
mtcars %>%
  group_modify(~ head(.x, 2L))
See Also
Other grouping functions: group_by(), group_nest(), group_split(), group_trim()