Fast Data Manipulation
collapse provides the following functions for fast manipulation of (mostly) data frames.
fselect is a much faster alternative to dplyr::select for selecting columns using expressions involving column names. get_vars is a more versatile and programmer-friendly function to efficiently select and replace columns by names, indices, logical vectors, regular expressions, or using functions to identify columns.
num_vars, cat_vars, char_vars, fact_vars, logi_vars and date_vars are convenience functions to efficiently select and replace columns by data type.
add_vars efficiently adds new columns at any position within a data frame (default at the end). This can be done via replacement (i.e. add_vars(data) <- newdata) or by returning the appended data, e.g., add_vars(data, newdata1, newdata2, ...). It is thus also an efficient alternative to cbind.data.frame.
rowbind efficiently combines data frames / lists row-wise. The implementation is derived from data.table::rbindlist; it is also a fast alternative to rbind.data.frame.
join provides fast, class-agnostic, and verbose table joins.
pivot efficiently reshapes data, supporting longer, wider and recast pivoting, as well as multi-column pivots and pivots taking along variable labels.
fsubset is a much faster version of subset to efficiently subset vectors, matrices and data frames. If the non-standard evaluation offered by fsubset is not needed, the function ss is a much faster and more secure alternative to [.data.frame.
fslice(v) is a much faster alternative to dplyr::slice_[head|tail|min|max] for filtering/deduplicating matrix-like objects (by groups).
fsummarise is a much faster version of dplyr::summarise, especially when used together with the Fast Statistical Functions and fgroup_by.
fmutate is a much faster version of dplyr::mutate, especially when used together with the Fast Statistical Functions, the fast Data Transformation Functions, and fgroup_by.
ftransform(v) is a much faster version of transform, which also supports list input and nested pipelines. settransform(v) does all of that by reference, i.e. it assigns to the calling environment. fcompute(v) is similar to ftransform(v) but only returns modified/computed columns.
roworder is a fast substitute for dplyr::arrange, but the syntax is inspired by data.table::setorder.
colorder efficiently reorders columns in a data frame, see also data.table::setcolorder.
frename is a fast substitute for dplyr::rename, to efficiently rename various objects. setrename renames objects by reference. relabel and setrelabel do the same thing for variable labels (see also vlabels).

Function / S3 Generic | Methods | Description
fselect(<-) | No methods, for data frames | Fast select or replace columns (non-standard evaluation)
get_vars(<-), num_vars(<-), cat_vars(<-), char_vars(<-), fact_vars(<-), logi_vars(<-), date_vars(<-) | No methods, for data frames | Fast select or replace columns
add_vars(<-) | No methods, for data frames | Fast add columns
rowbind | No methods, for lists of lists/data frames | Fast row-binding lists
join | No methods, for data frames | Fast table joins
pivot | No methods, for data frames | Fast reshaping
fsubset | default, matrix, data.frame, pseries, pdata.frame | Fast subset data (non-standard evaluation)
ss | No methods, for data frames | Fast subset data frames
fslice(v) | No methods, for matrices and data frames | Fast slicing of rows
fsummarise | No methods, for data frames | Fast data aggregation
fmutate, (f/set)transform(v)(<-) | No methods, for data frames | Compute, modify or delete columns (non-standard evaluation)
fcompute(v) | No methods, for data frames | Compute or modify columns, returned in a new data frame (non-standard evaluation)
roworder(v) | No methods, for data frames incl. pdata.frame | Reorder rows and return data frame (standard and non-standard evaluation)
colorder(v) | No methods, for data frames | Reorder columns and return data frame (standard and non-standard evaluation)
(f/set)rename, (set)relabel | No methods, for all objects with 'names' attribute | Rename and return object / relabel columns in a data frame
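
To make the selection, subsetting, ordering and renaming functions above more concrete, here is a minimal sketch using the built-in mtcars and iris datasets. The object names (sel, small, ord, ...) are illustrative only, and the calls show just one simple usage of each function rather than its full interface.

```r
library(collapse)

# Select columns by name: non-standard evaluation vs. a character vector
sel  <- fselect(mtcars, mpg, cyl, hp)
sel2 <- get_vars(mtcars, c("mpg", "cyl", "hp"))

# Select columns by data type
nums <- num_vars(iris)    # numeric columns
facs <- fact_vars(iris)   # factor columns

# Subset rows with an expression and keep only two columns
small <- fsubset(mtcars, cyl == 4 & mpg > 30, mpg, hp)

# Standard-evaluation subsetting: first five rows, two columns
first5 <- ss(mtcars, 1:5, c("mpg", "cyl"))

# Order rows by cyl (ascending), then by mpg (descending)
ord <- roworder(mtcars, cyl, -mpg)

# Move hp and cyl to the front of the data frame
co <- colorder(mtcars, hp, cyl)

# Rename a column using the old = new convention
rn <- frename(iris, Sepal.Length = SL)
```

When column names are held in a character vector, get_vars and ss avoid non-standard evaluation, which tends to be more convenient inside functions.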
See also: Collapse Overview, Quick Data Conversion, Recode and Replace Values
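
The computing, combining and reshaping functions described on this page can be sketched in a similar way. The example below assumes a version of collapse that includes rowbind, join and pivot; the small data frames x, y and wide are made up for illustration, and the factor 0.425 is an approximate conversion from miles per gallon to kilometres per litre.

```r
library(collapse)

# Grouped aggregation with a Fast Statistical Function
g   <- fgroup_by(mtcars, cyl)
agg <- fsummarise(g, mean_mpg = fmean(mpg), mean_hp = fmean(hp))

# Grouped mutate: attach each group's mean mpg to every row of that group
mut <- fungroup(fmutate(g, cyl_mean_mpg = fmean(mpg)))

# ftransform returns the full data, fcompute only the computed column(s)
tr  <- ftransform(mtcars, kpl = mpg * 0.425)
kpl <- fcompute(mtcars, kpl = mpg * 0.425)

# Append the computed column(s) to the original data
added <- add_vars(mtcars, kpl)

# Hypothetical toy tables for joining and row-binding
x <- data.frame(id = c(1, 2, 3), a = c("u", "v", "w"))
y <- data.frame(id = c(2, 3, 4), b = c(TRUE, FALSE, TRUE))
ij <- join(x, y, on = "id", how = "inner", verbose = 0)
rb <- rowbind(x, x)

# Reshape a toy wide table to long format and back to wide
wide <- data.frame(id = 1:2, x = c(1.5, 2.5), y = c(3.5, 4.5))
long <- pivot(wide, ids = "id")
back <- pivot(long, ids = "id", names = "variable", values = "value",
              how = "wider")
```

Setting verbose = 0 in join only suppresses the printed join summary; leaving it at the default shows how many records were matched.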