ff 4.5.0 package

Memory-Efficient Storage of Large Data on Disk and Fast Access Functions

add.rd

Incrementing an ff or ram object

array2vector.rd

Array: make vector from array

arrayIndex2vectorIndex.rd

Array: make vector positions from array index

as.ff.bit.rd

Conversion between bit and ff boolean

as.ff.rd

Coercing ram to ff and ff to ram objects

as.ffdf.rd

Coercing to ffdf and data.frame

as.hi.rd

Hybrid Index, coercion to

as.integer.hi.rd

Hybrid Index, coercing from

as.vmode.rd

Coercing to virtual mode

bigsample.rd

Sampling from large pools

CFUN.rd

Collapsing functions for batch processing

chunk.ffdf.rd

Chunk ff_vector and ffdf

clone.ff.rd

Cloning ff and ram objects

clone.ffdf.rd

Cloning ffdf objects

close.ff.rd

Closing ff files

delete.rd

Deleting the file behind an ff object

dim.ff.rd

Getting and setting dim and dimorder

dimnames.ff.rd

Getting and setting dimnames

dimnames.ffdf.rd

Getting and setting dimnames of ffdf

dimorderCompatible.rd

Test for dimorder compatibility

dummy.dimnames.rd

Array: make dimnames

Extract.ff.rd

Reading and writing vectors and arrays (high-level)

Extract.ffdf.rd

Reading and writing data.frames (ffdf)

ff.rd

ff classes for representing (large) atomic data

ffapply.rd

Apply for ff objects

ffconform.rd

Get most conforming argument

ffdf.rd

ff class for data.frames

ffdfindexget.rd

Reading and writing ffdf data.frame using ff subscripts

ffdfsort.rd

Sorting: convenience wrappers for data.frames

ffdrop.rd

Delete an ffarchive

ffindexget.rd

Reading and writing ff vectors using ff subscripts

ffindexorder.rd

Sorting: chunked ordering of integer subscript positions

ffinfo.rd

Inspect content of ff saves

ffload.rd

Reload ffSaved Datasets

fforder.rd

Sorting: order from ff vectors

ffreturn.rd

Return suitable ff object

ffsave.rd

Save R and ff objects

ffsort.rd

Sorting of ff vectors

ffsuitable.rd

Test ff object for suitability

ffxtensions.rd

Test for availability of ff extensions

file.resize.rd

Change size of or move an existing file

filename.rd

Get or set filename

finalize.rd

Call finalizer

finalizer.rd

Get and set finalizer (name)

fixdiag.rd

Test for fixed diagonal

Forbidden_ffdf.rd

Forbidden ffdf functions

geterror.ff.rd

Get error and error string

getpagesize.rd

Get page size information

getset.ff.rd

Reading and writing vectors of values (low-level)

hi.rd

Hybrid index class

hiparse.rd

Hybrid Index, parsing

Internal_ffdf.rd

Internal ffdf functions

is.ff.rd

Test for class ff

is.ffdf.rd

Test for class ffdf

is.open.rd

Test if object is opened

is.readonly.rd

Get readonly status

is.sorted.rd

Getting and setting 'is.sorted' physical attribute

length.ff.rd

Getting and setting length

length.ffdf.rd

Getting length of an ffdf data.frame

length.hi.rd

Hybrid Index, querying

levels.ff.rd

Getting and setting factor levels

LimWarn.rd

ff Limitations and Warnings

matcomb.rd

Array: make matrix indices from row and column positions

matprint.rd

Print beginning and end of big matrix

maxffmode.rd

Lossless vmode coercability

maxlength.rd

Get physical length of an ff or ram object

mismatch.rd

Test for recycle mismatch

na.count.rd

Getting and setting 'na.count' physical attribute

names.ff.rd

Getting and setting names

nrowAssign.rd

Assigning the number of rows or columns

open.ff.rd

Opening an ff file

pagesize.rd

Pagesize of ff object

physical.ff.rd

Getting and setting physical and virtual attributes of ff objects

physical.ffdf.rd

Getting physical and virtual attributes of ffdf objects

print.ff.rd

Print and str methods

ram2ffcode.rd

Factor codings

ramattribs.rd

Get ramclass and ramattribs

ramorder.default.rd

Sorting: order R vector in-RAM and in-place

ramsort.default.rd

Sorting: Sort R vector in-RAM and in-place

read.table.ffdf.rd

Importing csv files into ff data.frames

readwrite.ff.rd

Reading and writing vectors (low-level)

regtest.fforder.rd

Sorting: regression tests

repnam.rd

Replicate with names

sortLevels.rd

Factor level manipulation

splitPathFile.rd

Analyze pathfile-strings

swap.rd

Reading and writing in one operation (high-level)

symmetric.rd

Test for symmetric structure

symmIndex2vectorIndex.rd

Array: make vector positions from symmetric array index

unclass_-.rd

Unclassed assignment

undim.rd

Undim

unsort.rd

Hybrid Index, internal utilities

update.ff.rd

Update ff content from another object

vecprint.rd

Print beginning and end of big vector

vector.vmode.rd

Create vector of virtual mode

vector2array.rd

Array: make array from vector

vectorIndex2arrayIndex.rd

Array: make array from index vector positions

vmode.ffdf.rd

Virtual storage mode of ffdf

vmode.rd

Virtual storage mode

vt.rd

Virtual transpose

vw.rd

Getting and setting virtual windows

write.table.ffdf.rd

Exporting csv files from ff data.frames

The ff package provides data structures that are stored on disk but behave (almost) as if they were in RAM by transparently mapping only a section (pagesize) into main memory, which is the effective virtual memory consumption per ff object. ff supports R's standard atomic data types 'double', 'logical', 'raw' and 'integer' and the non-standard atomic types boolean (1 bit), quad (2 bit unsigned), nibble (4 bit unsigned), byte (1 byte signed with NAs), ubyte (1 byte unsigned), short (2 byte signed with NAs), ushort (2 byte unsigned) and single (4 byte float with NAs). For example, 'quad' allows efficient storage of genomic data as an 'A','T','G','C' factor. The unsigned types support 'circular' arithmetic. There is also support for the close-to-atomic types 'factor', 'ordered', 'POSIXct' and 'Date' and for custom close-to-atomic types.

ff has native C support for vectors, matrices and arrays with flexible dimorder (column-major order, row-major order and generalizations for arrays). It also provides an ffdf class not unlike data.frames, together with import/export filters for csv files. ff objects store raw data in binary flat files in native encoding and complement this with metadata stored in R as physical and virtual attributes. ff objects have well-defined hybrid copying semantics, which give rise to certain performance improvements through virtualization.

ff objects can be stored and reopened across R sessions. ff files can be shared by multiple ff R objects (using different data en/de-coding schemes) in the same process or from multiple R processes to exploit parallelism. A wide choice of finalizer options allows working with 'permanent' files as well as creating and removing 'temporary' ff files completely transparently to the user. On certain OS/filesystem combinations, creating the ff files works without notable delay thanks to sparse file allocation.
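As a minimal sketch of the storage model described above, the following R snippet creates two temporary ff vectors, one with the standard vmode 'double' and one with the 2-bit vmode 'quad', and reads and writes a few elements. The object names, lengths and values are illustrative assumptions, not taken from the package documentation.

```r
library(ff)

# Disk-backed vector of one million doubles (temporary file, chosen by ff).
x <- ff(vmode = "double", length = 1000000L)

# Subscript assignment writes to the file; subscripting reads only the
# requested elements into RAM.
x[1:5] <- c(1.5, 2.5, 3.5, 4.5, 5.5)
head_vals <- x[1:5]

# A 2-bit unsigned vector: each element can hold one of the values 0..3.
q <- ff(vmode = "quad", length = 8L)
q[1:4] <- 0:3
quad_vals <- q[1:4]
quad_mode <- vmode(q)

# Path of the flat file behind the ff object.
backing_file <- filename(x)

# Remove the backing files and drop the R-side objects.
delete(x)
delete(q)
rm(x, q)
```

Both vectors here are temporary; keeping a file across sessions would require choosing a filename and finalizer accordingly.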
Several access optimization techniques such as Hybrid Index Preprocessing and Virtualization are implemented to achieve good performance even with large datasets, for example a virtual matrix transpose that does not touch a single byte on disk. Further, to reduce disk I/O, 'logicals' and non-standard data types are stored natively and compactly in binary flat files, i.e. logicals take up exactly 2 bits to represent TRUE, FALSE and NA. Beyond basic access functions, the ff package also provides compatibility functions that facilitate writing code for both ff and ram objects, and support for batch processing on ff objects (e.g. as.ram, as.ff, ffapply). ff interfaces closely with functionality from package 'bit': chunked looping, fast bit operations and coercions between different objects that can store subscript information ('bit', 'bitwhich', ff 'boolean', ri range index, hi hybrid index). This allows working interactively with selections of large datasets and quickly modifying selection criteria. Further high-performance enhancements can be made available upon request.
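The chunked looping and virtual transpose mentioned above might be used roughly as follows. This is a hedged sketch: the data, the chunk size (by = 25000) and all variable names are assumptions made for illustration only.

```r
library(ff)

# Disk-backed vector of doubles, initialised from RAM data for illustration.
x <- ff(as.double(1:100000))

# Process the vector chunk by chunk. Each 'i' is a range index (ri) from
# package 'bit', so only one chunk of x is read into RAM at a time.
total <- 0
for (i in chunk(x, by = 25000)) {
  total <- total + sum(x[i])
}

# Virtual transpose: vt() modifies virtual attributes only; the data file
# is shared with 'm' and is not rewritten.
m  <- ff(1:12, dim = c(3, 4))
tm <- vt(m)
dim_tm <- dim(tm)
same_as_ram_transpose <- all(tm[, ] == t(m[, ]))

# Remove the backing files ('tm' shares its file with 'm').
delete(x)
delete(m)
rm(x, m, tm)
```

In real use the chunk size would normally be left to the default, which derives it from the 'ffbatchbytes' option instead of a hand-picked value.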