api-perform function

BigQuery jobs: perform a job

These functions are low-level functions designed to be used by experts. Each is paired with a high-level function that you should use instead (see the sketch after this list):

  • bq_perform_copy(): bq_table_copy().
  • bq_perform_query(): bq_dataset_query(), bq_project_query().
  • bq_perform_upload(): bq_table_upload().
  • bq_perform_load(): bq_table_load().
  • bq_perform_extract(): bq_table_save().
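
As an illustration of the pairing, here is a minimal sketch (the project id and table name are hypothetical placeholders, and you need billable credentials): the high-level bq_project_query() submits the query, waits for it, and returns a bq_table, while the low-level bq_perform_query() returns a bq_job that you manage yourself.

library(bigrquery)

billing <- "my-billing-project"   # hypothetical project id
sql <- "SELECT count(*) AS n FROM `my-project.my_dataset.my_table`"

# High-level: submits the job, waits for it to finish, and returns a bq_table
# pointing at the results.
tb <- bq_project_query(billing, sql)

# Low-level pair: submits the job and returns a bq_job immediately; waiting,
# error handling, and result retrieval are up to you.
job <- bq_perform_query(sql, billing = billing)
bq_job_wait(job)
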
Usage

bq_perform_extract(
  x,
  destination_uris,
  destination_format = "NEWLINE_DELIMITED_JSON",
  compression = "NONE",
  ...,
  print_header = TRUE,
  billing = x$project
)

bq_perform_upload(
  x,
  values,
  fields = NULL,
  create_disposition = "CREATE_IF_NEEDED",
  write_disposition = "WRITE_EMPTY",
  ...,
  billing = x$project
)

bq_perform_load(
  x,
  source_uris,
  billing = x$project,
  source_format = "NEWLINE_DELIMITED_JSON",
  fields = NULL,
  nskip = 0,
  create_disposition = "CREATE_IF_NEEDED",
  write_disposition = "WRITE_EMPTY",
  ...
)

bq_perform_query(
  query,
  billing,
  ...,
  parameters = NULL,
  destination_table = NULL,
  default_dataset = NULL,
  create_disposition = "CREATE_IF_NEEDED",
  write_disposition = "WRITE_EMPTY",
  use_legacy_sql = FALSE,
  priority = "INTERACTIVE"
)

bq_perform_query_dry_run(
  query,
  billing,
  ...,
  default_dataset = NULL,
  parameters = NULL,
  use_legacy_sql = FALSE
)

bq_perform_copy(
  src,
  dest,
  create_disposition = "CREATE_IF_NEEDED",
  write_disposition = "WRITE_EMPTY",
  ...,
  billing = NULL
)

Arguments

  • x: A bq_table

  • destination_uris: A character vector of fully-qualified Google Cloud Storage URIs where the extracted table should be written. Can export up to 1 GB of data per file. Use a wildcard URI (e.g. gs://[YOUR_BUCKET]/file-name-*.json) to automatically create any number of files (see the extract/load sketch after this argument list).

  • destination_format: The exported file format. Possible values include "CSV", "NEWLINE_DELIMITED_JSON" and "AVRO". Tables with nested or repeated fields cannot be exported as CSV.

  • compression: The compression type to use for exported files. Possible values include "GZIP", "DEFLATE", "SNAPPY", and "NONE". "DEFLATE" and "SNAPPY" are only supported for Avro.

  • ...: Additional arguments passed on to the underlying API call. snake_case names are automatically converted to camelCase.

  • print_header: Whether to print out a header row in the results.

  • billing: Identifier of project to bill.

  • values: Data frame of values to insert.

  • fields: A bq_fields specification, or something coercible to it (like a data frame). Leave as NULL to allow BigQuery to auto-detect the fields.

  • create_disposition: Specifies whether the job is allowed to create new tables.

    The following values are supported:

    • "CREATE_IF_NEEDED": If the table does not exist, BigQuery creates the table.
    • "CREATE_NEVER": The table must already exist. If it does not, a 'notFound' error is returned in the job result.
  • write_disposition: Specifies the action that occurs if the destination table already exists. The following values are supported:

    • "WRITE_TRUNCATE": If the table already exists, BigQuery overwrites the table data.
    • "WRITE_APPEND": If the table already exists, BigQuery appends the data to the table.
    • "WRITE_EMPTY": If the table already exists and contains data, a 'duplicate' error is returned in the job result.
  • source_uris: The fully-qualified URIs that point to your data in Google Cloud.

    For Google Cloud Storage URIs: Each URI can contain one '*' wildcard character and it must come after the 'bucket' name. Size limits related to load jobs apply to external data sources.

    For Google Cloud Bigtable URIs: Exactly one URI can be specified, and it has to be a fully specified and valid HTTPS URL for a Google Cloud Bigtable table.

    For Google Cloud Datastore backups: Exactly one URI can be specified, and the '*' wildcard character is not allowed.

  • source_format: The format of the data files:

    • For CSV files, specify "CSV".
    • For datastore backups, specify "DATASTORE_BACKUP".
    • For newline-delimited JSON, specify "NEWLINE_DELIMITED_JSON".
    • For Avro, specify "AVRO".
    • For Parquet, specify "PARQUET".
    • For ORC, specify "ORC".
  • nskip: For source_format = "CSV", the number of header rows to skip.

  • query: SQL query string.

  • parameters: Named list of parameters matched to query parameters. Parameter x will be matched to placeholder @x.

    Generally, you can supply R vectors and they will be automatically converted to the correct type. If you need greater control, you can call bq_param_scalar() or bq_param_array() explicitly.

    See https://cloud.google.com/bigquery/docs/parameterized-queries for more details, and the parameterized-query sketch under Examples below.

  • destination_table: A bq_table where results should be stored. If not supplied, results will be saved to a temporary table that lives in a special dataset. You must supply this parameter for large queries (> 128 MB compressed).

  • default_dataset: A bq_dataset used to automatically qualify table names.

  • use_legacy_sql: If TRUE, will use BigQuery's legacy SQL format.

  • priority: Specifies a priority for the query. Possible values include "INTERACTIVE" and "BATCH". Batch queries do not start immediately, but are not rate-limited in the same way as interactive queries.
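
To make the destination_uris, compression, and disposition arguments concrete, here is a hedged sketch of an extract-then-load round trip. The project, dataset, table, and bucket names are hypothetical, and the calls assume you have write access to the GCS bucket.

library(bigrquery)

billing <- "my-billing-project"                               # hypothetical
src  <- bq_table("my-project", "my_dataset", "events")        # hypothetical source table
dest <- bq_table("my-project", "my_dataset", "events_copy")   # hypothetical target table
bucket_uri <- "gs://my-bucket/events-*.json"                  # hypothetical wildcard URI

# Extract the table to GCS as gzipped newline-delimited JSON; the wildcard
# lets BigQuery shard the output across as many files as it needs.
extract_job <- bq_perform_extract(
  src,
  destination_uris = bucket_uri,
  destination_format = "NEWLINE_DELIMITED_JSON",
  compression = "GZIP",
  billing = billing
)
bq_job_wait(extract_job)

# Load those files into another table; CREATE_IF_NEEDED creates the table if
# it does not exist yet, and WRITE_TRUNCATE replaces any existing data.
load_job <- bq_perform_load(
  dest,
  source_uris = bucket_uri,
  source_format = "NEWLINE_DELIMITED_JSON",
  create_disposition = "CREATE_IF_NEEDED",
  write_disposition = "WRITE_TRUNCATE",
  billing = billing
)
bq_job_wait(load_job)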

Returns

A bq_job.
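
For instance (a small sketch, assuming job is the bq_job returned by one of the calls above):

bq_job_wait(job)    # block until the job completes, erroring if it failed
bq_job_status(job)  # inspect the raw job status, including any errors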

Google BigQuery API documentation

Additional information at https://cloud.google.com/bigquery/docs/reference/rest/v2/jobs

Examples

ds <- bq_test_dataset()
bq_mtcars <- bq_table(ds, "mtcars")
job <- bq_perform_upload(bq_mtcars, mtcars)
bq_table_exists(bq_mtcars)

bq_job_wait(job)
bq_table_exists(bq_mtcars)
head(bq_table_download(bq_mtcars))
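
Building on the objects above, a further hedged sketch showing a dry run followed by a parameterized query; bq_test_project() is assumed to return the billing project used by the testing helpers.

billing <- bq_test_project()   # assumed helper: the project used for testing

sql <- "SELECT count(*) AS n FROM mtcars WHERE cyl = @cyl"

# Dry run: estimates how much data the query would process, without running it.
bq_perform_query_dry_run(
  sql,
  billing = billing,
  default_dataset = ds,
  parameters = list(cyl = 4)
)

# Real run: the list element `cyl` is matched to the @cyl placeholder and
# converted to the appropriate BigQuery type automatically.
job <- bq_perform_query(
  sql,
  billing = billing,
  default_dataset = ds,
  parameters = list(cyl = 4)
)
bq_job_wait(job)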