groupid is an enhanced version of data.table::rleid for atomic vectors. It generates a run-length type group-id in which consecutive identical values are assigned the same integer. It generalizes rleid in that it can be applied to unordered vectors (through an ordering vector), generate group ids starting from an arbitrary value, and skip missing values.
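To make the run-length idea concrete, here is a minimal base-R sketch. The helper rle_id_sketch is hypothetical and not part of collapse; it only mimics the default behaviour described above (missing values counted as ordinary values, no ordering vector) and says nothing about how groupid is actually implemented.

```r
# Hypothetical helper for illustration only (not part of collapse):
# the id increases by one each time a value differs from its predecessor.
rle_id_sketch <- function(x, start = 1L) {
  n <- length(x)
  if (n == 0L) return(integer(0))
  a <- x[-1L]  # elements 2..n
  b <- x[-n]   # elements 1..(n-1)
  # An NA next to a non-NA counts as a change; two adjacent NAs do not.
  differs <- (is.na(a) != is.na(b)) | (!is.na(a) & !is.na(b) & a != b)
  as.integer(start) + cumsum(c(FALSE, differs))
}

rle_id_sketch(c("a", "a", "b", "a"))              # 1 1 2 3
rle_id_sketch(c("a", "a", "b", "a"), start = 0L)  # 0 0 1 2
```

Note that the last "a" gets a new id: the id tracks runs, not distinct values.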
groupid(x, o = NULL, start = 1L, na.skip = FALSE, check.o = TRUE)
Arguments
x: an atomic vector of any type. Attributes are not considered.
o: an (optional) integer ordering vector specifying the order by which to pass through x.
start: integer. The starting value of the resulting group-id. The default is 1.
na.skip: logical. Skip missing values, i.e. if TRUE, something like groupid(c("a", NA, "a"), na.skip = TRUE) gives c(1, NA, 1), whereas FALSE (the default) gives c(1, 2, 3).
check.o: logical. Programmer's option: FALSE skips the check that each element of o is in the range [1, length(x)] and only checks the length of o. This gives some extra speed, but will terminate R if any element of o is too large or too small.
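The following short sketch exercises the arguments above on small vectors. It assumes the collapse package is installed; the vectors x and y are made up for illustration, and the ordering case only checks the same relation that the Examples section relies on rather than asserting specific output.

```r
library(collapse)

x <- c("a", "a", "b", "a")
id_default <- groupid(x)              # runs: a a | b | a
id_zero    <- groupid(x, start = 0L)  # same runs, counted from 0

# na.skip: the case described in the argument list
id_skip <- groupid(c("a", NA, "a"), na.skip = TRUE)

# o: pass through an unordered vector in sorted order
y <- c("b", "a", "b", "a")
o <- order(y)
id_ord <- groupid(y, o)

# Reordered by o, the ids should match those of the sorted vector
same <- identical(unattrib(id_ord)[o], unattrib(groupid(y[o])))
```

check.o is left at its default here: switching it off only makes sense when o is known to be a valid ordering, for example because it was just produced by order().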
Returns
An integer vector of class 'qG'. See qG.
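A brief sketch of handling the return value, assuming collapse is installed. The input vector is made up; unattrib is the same function used in the Examples section to strip attributes before comparing with identical.

```r
library(collapse)

g <- groupid(c(3, 3, 7, 7, 3))
is_qG   <- inherits(g, "qG")  # class check, per the Returns section
g_plain <- unattrib(g)        # bare integer vector without class or attributes
```

Stripping attributes this way is useful when the ids are compared against plain integer vectors.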
See Also
seqid, timeid, qG, Fast Grouping and Ordering, Collapse Overview
Examples
groupid(airquality$Month)
groupid(airquality$Month, start = 0)
groupid(wlddev$country)[1:100]

## Same thing since country is alphabetically ordered: (groupid is faster..)
all.equal(groupid(wlddev$country), qG(wlddev$country, na.exclude = FALSE))

## When data is unordered, group-id can be generated through an ordering..
uo <- order(rnorm(fnrow(airquality)))
monthuo <- airquality$Month[uo]
o <- order(monthuo)
groupid(monthuo, o)
identical(groupid(monthuo, o)[o], unattrib(groupid(airquality$Month)))