Relational Query Generator for Data Manipulation at Scale
Execute an ordered sequence of left joins.
Implement an affine transformation
Execute pipeline treating pipe_left_arg as local data to be copied int...
Apply pipeline to a database.
S4 dispatch method for apply_right.
S4 dispatch method for apply_right.
Data arrow
Assign a value to a slice of data (set of rows meeting a condition, an...
Build a join plan.
Return column names
Return columns used
Hyperdrive (science fiction show) synonym for execute
Complete an experimental design.
Convert a series of simple objects (from YAML deserialization) to an rq...
Count NULLs per row for given column set.
Construct a table description from a database source.
Build a nice description of a table.
Make a drop columns node (not a relational operation).
Execute a wrapped execution pipeline.
Build some example tables (requires DBI).
Execute an operator tree, bringing back the result to memory.
Cross product vectors in database.
Extend data by adding more columns.
Extend data by adding more columns.
Format a single node for printing.
Get a database connection option.
Build a draw-able specification of the join diagram
Build a sequence of statements simulating an if/else block-`if(){}else...
Build a relop node simulating a per-row block-`if(){}else{}`.
Check that a join plan is consistent with table descriptions.
Return all columns as guess of preferred primary keys.
Return all primary key columns as guess at preferred primary keys for ...
Return all primary key columns as guess at preferred primary keys for ...
Construct a table description of a local data.frame.
Use one column to pick values from other columns.
Make a list of assignments, applying many functions to many columns.
Remap values in a set of columns.
Indicate NULLs per row for given column set.
Materialize an optree as a table.
Create a materialize node.
Materialize a user supplied SQL statement as a table.
Make a table description directly.
Make a natural_join node.
Wrap a non-SQL node.
Build an optree pipeline that normalizes a set of columns so each colu...
Create a null_replace node.
Build a diagram of an optree pipeline.
Make an order_expr node.
Make an order_expr node.
Make an orderby node (not a relational operation).
Make an orderby node (not a relational operation).
Build an optree pipeline that selects up to the top k rows from each g...
pre_sql_token function name
pre_sql_identifier: abstract name of a column and where it is coming ...
pre_sql_string
pre_sql_sub_expr
Convert a pre_sql token object to SQL query text.
Convert a pre_sql token object to SQL query text.
Return SQL transform of tokens.
pre_sql_token
Project data by grouping, and adding aggregate columns.
Project data by grouping, and adding aggregate columns.
Compute quantiles of specified columns (without interpolation, needs a...
Compute quantiles over non-NULL values (without interpolation, needs a...
Quote an identifier.
Quote a value
Quote a string
Quote a table name.
Make a rename columns node (copies columns not renamed).
Build an optree pipeline that counts rows.
List table column names.
Get column types by example values as a data.frame.
Get advice for a DB connection (beyond tests).
Build a canonical name for a db connection class.
Try and test database for some option settings.
Copy local R table to remote data handle.
Execute a query, typically an update that is not supposed to return re...
Return function mappings for a connection
Execute a get query, typically a non-update that is supposed to return...
Get head of db table
Count rows and return as numeric
Remove table
Check if a table exists.
rquery: Relational Query Generator for Data Manipulation
Execute optree in an environment where d is the only data.
Build a db information stand-in
An example rquery_db_info object useful for formatting SQL without...
Default to_sql method implementations.
Quick look at remote data
Compute usable summary of columns of remote table.
Create an rsummary relop operator node.
Make a select columns node (not a relational operation).
Make a select rows node.
Make a select rows node.
Make a set indicator node.
Set a database connection option.
Set a database connection option.
Build a query that applies a SQL expression to a set of columns.
Make a general SQL node.
Structure of a pre_sql_sub_expr
Return vector of table names used.
Make a theta_join node.
Make a theta_join node.
Return SQL implementation of operation tree.
Convert an rquery op diagram to a simple representation, appropriate f...
Cross-parse from an R parse tree into SQL.
Topologically sort join plan so values are available before uses.
Make a unionall node (not a relational operation).
Wrap a data frame for later execution.
A piped query generator based on Edgar F. Codd's relational algebra, and on production experience using 'SQL' and 'dplyr' at big data scale. The design represents an attempt to make 'SQL' more teachable by denoting composition by a sequential pipeline notation instead of nested queries or functions. The implementation delivers reliable high-performance data processing on large data systems such as 'Spark', databases, and 'data.table'. Package features include: data processing trees or pipelines as observable objects (able to report both columns produced and columns used), optimized 'SQL' generation as an explicit user-visible table modeling step, plus explicit query reasoning and checking.
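The pipeline notation and the "observable object" idea described above can be illustrated with a minimal sketch. The table name `d` and its columns (`subject`, `score`, `weight`) are hypothetical and chosen only for illustration; the sketch assumes the 'rquery' package (which attaches 'wrapr' and its `%.>%` dot pipe) is installed. No database connection is needed, because the table is described by name and columns only.

```r
library(rquery)

# Describe a table by name and column names only (hypothetical table "d").
d <- mk_td("d", c("subject", "score", "weight"))

# Compose operators left to right with the dot pipe
# instead of nesting queries or function calls.
ops <- d %.>%
  extend(., weighted := score * weight) %.>%
  select_rows(., weighted > 1) %.>%
  select_columns(., c("subject", "weighted"))

# The pipeline is an ordinary object that can be inspected before it runs.
produced <- column_names(ops)   # columns the pipeline produces
used <- columns_used(ops)       # columns required from each source table
print(produced)
print(used)

# SQL generation is an explicit, separate step; here it uses the
# package's example database description rather than a live connection.
sql <- to_sql(ops, rquery_default_db_info())
cat(sql)
```

With a live 'DBI' connection, the same `ops` object could instead be passed to `execute()` to run the query and bring the result back to memory.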