Larger-than-RAM Disk-Based Data Manipulation Framework
Add a chunk to the disk.frame
Convert disk.frame to data.frame by collecting all chunks
Convert disk.frame to data.table by collecting all chunks
Make a data.frame into a disk.frame
Bind rows
Apply the same function to all chunks
cmap2: apply a function to two disk.frames
Bring the disk.frame into R
Return the column names of the disk.frame
Force computations. The results are stored in a folder.
Create a function that applies to each chunk of a disk.frame
Convert CSV file(s) to disk.frame format
Delete a disk.frame
Get the size of RAM in gigabytes
Create a disk.frame from a folder
A function to convert a disk.frame to parquet format
The dplyr verbs implemented for disk.frame
Helper function to evalparse some glue::glue string
Find globals in an expression by searching through the chain
Apply data.table's foverlaps to the disk.frame
Generate synthetic dataset for testing
Obtain one chunk by chunk id
Get the chunk IDs and file names
Get the partitioning structure of a folder
A function to parse the summarize function
The shard keys of the disk.frame
Head and tail of the disk.frame
Checks if a folder is a disk.frame
Performs join/merge for disk.frames
Merge function for disk.frames
Move or copy a disk.frame to another location
Returns the number of chunks in a disk.frame
Number of rows or columns
One Stage function
Check if the outdir exists or not
Filter the dataset based on folder partitions
Play the recorded lazy operations
Print disk.frame
Pull a column from a table, similar to dplyr::pull
Used to convert a function to purrr syntax if needed
rbindlist disk.frames together
Increase or decrease the number of chunks in the disk.frame
Recommend number of chunks based on input size
Removes a chunk from the disk.frame
Sample n rows from a disk.frame
Set up disk.frame environment
Shard a data.frame/data.table or disk.frame into chunks and save it in...
Returns the shardkey (not implemented yet)
Compare two disk.frame shardkeys
Show the code to setup disk.frame
Turn a string of the form /partition1=val/partition2=val2 into a data.frame
Keep only the variables from the input listed in selections
[[ interface for disk.frame using fst backend
Column names for RStudio auto-complete
Write disk.frame to disk
zip_to_disk.frame is used to read and convert every CSV file within ...
A disk-based data manipulation tool for working with larger-than-RAM datasets. It aims to lower the barrier to entry for manipulating large datasets by adhering closely to popular and familiar data manipulation paradigms such as 'dplyr' verbs and 'data.table' syntax.
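
As a rough illustration of how several of the functions listed above fit together, here is a minimal sketch. It assumes the disk.frame package is installed. The output folder, the choice of the built-in mtcars dataset, the chunk count, and the filter threshold are all illustrative assumptions, not taken from the package documentation.

```r
# Minimal sketch; assumes the disk.frame package is installed.
# The folder name, dataset, and chunk count below are illustrative only.
library(disk.frame)

# Set up the disk.frame environment with two background workers.
setup_disk.frame(workers = 2)

# Make a data.frame into a disk.frame stored as chunks in a temporary folder.
out_dir <- file.path(tempdir(), "mtcars.df")
cars_df <- as.disk.frame(mtcars, outdir = out_dir, nchunks = 4, overwrite = TRUE)

# Number of chunks and number of rows.
n_chunks <- nchunks(cars_df)
n_rows <- nrow(cars_df)

# Apply the same function to all chunks (recorded lazily),
# then collect the result into memory.
heavy <- cmap(cars_df, function(chunk) chunk[chunk$wt > 3, ])
heavy_dt <- collect(heavy)

# Obtain one chunk by chunk id.
first_chunk <- get_chunk(cars_df, 1)

# Delete the disk.frame from disk once it is no longer needed.
delete(cars_df)
```

The chunk-wise function only ever sees one chunk at a time, so in this sketch the filter is applied per chunk and the pieces are combined only when `collect()` is called.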