insertTable function

Insert a table on the server

This function sends the data in a data frame to a table on the server. Either a new table is created, or the data is appended to an existing table.

insertTable(
  connection,
  databaseSchema = NULL,
  tableName,
  data,
  dropTableIfExists = TRUE,
  createTable = TRUE,
  tempTable = FALSE,
  oracleTempSchema = NULL,
  tempEmulationSchema = getOption("sqlRenderTempEmulationSchema"),
  bulkLoad = Sys.getenv("DATABASE_CONNECTOR_BULK_UPLOAD"),
  useMppBulkLoad = Sys.getenv("USE_MPP_BULK_LOAD"),
  progressBar = FALSE,
  camelCaseToSnakeCase = FALSE
)

Arguments

  • connection: The connection to the database server created using either connect() or DBI::dbConnect().
  • databaseSchema: The name of the database schema. See Details for how this argument is interpreted on each platform.
  • tableName: The name of the table where the data should be inserted.
  • data: The data frame containing the data to be inserted.
  • dropTableIfExists: Should the table be dropped before writing if it already exists?
  • createTable: Should a new table be created? If FALSE, the data is appended to an existing table.
  • tempTable: Should the table be created as a temp table?
  • oracleTempSchema: DEPRECATED: use tempEmulationSchema instead.
  • tempEmulationSchema: Some database platforms like Oracle and Impala do not truly support temp tables. To emulate temp tables, provide a schema with write privileges where temp tables can be created.
  • bulkLoad: If using Redshift, PDW, Hive, or Postgres, use more performant bulk loading techniques. Does not work for temp tables (except on Hive). See Details for the requirements for the various platforms.
  • useMppBulkLoad: DEPRECATED. Use bulkLoad instead.
  • progressBar: Show a progress bar when uploading?
  • camelCaseToSnakeCase: If TRUE, the data frame column names are assumed to use camelCase and are converted to snake_case before uploading (see the sketch after this list).
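
As a minimal sketch of the camelCaseToSnakeCase argument (the connection object, schema, and table names below are placeholders, not values from this documentation):

  # Hypothetical data frame with camelCase column names
  data <- data.frame(
    personId = c(1, 2),
    startDate = as.Date(c("2000-01-01", "2001-01-01"))
  )
  insertTable(
    connection = conn,
    databaseSchema = "my_schema",
    tableName = "person_episode",
    data = data,
    camelCaseToSnakeCase = TRUE # columns uploaded as person_id, start_date
  )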

Details

The databaseSchema argument is interpreted differently according to the different platforms:

  • SQL Server and PDW: databaseSchema should specify both the database and the schema, e.g. 'my_database.dbo'.
  • Impala: databaseSchema should specify the database.
  • Oracle: databaseSchema should specify the Oracle 'user'.
  • All other platforms: databaseSchema should specify the schema.
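
For illustration, a sketch of the same upload on two platforms (all schema and table names are placeholders):

  # SQL Server / PDW: database and schema combined
  insertTable(conn, databaseSchema = "my_database.dbo", tableName = "my_table", data = data)

  # PostgreSQL and most other platforms: schema only
  insertTable(conn, databaseSchema = "my_schema", tableName = "my_table", data = data)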

This function sends the data in a data frame to a table on the server. Either a new table is created, or the data is appended to an existing table. NA values are inserted as null values in the database.
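
A small sketch of the NA behavior (the schema and table names are hypothetical):

  data <- data.frame(x = c(1, NA), y = c("a", NA))
  insertTable(conn, "my_schema", "na_example", data)
  # Both NA values are stored as NULL on the server.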

Bulk uploading:

Redshift: The MPP bulk loading relies upon the CloudyR S3 library to test a connection to an S3 bucket using AWS S3 credentials. Credentials are configured directly into the System Environment using the following keys:

  Sys.setenv(
    "AWS_ACCESS_KEY_ID" = "some_access_key_id",
    "AWS_SECRET_ACCESS_KEY" = "some_secret_access_key",
    "AWS_DEFAULT_REGION" = "some_aws_region",
    "AWS_BUCKET_NAME" = "some_bucket_name",
    "AWS_OBJECT_KEY" = "some_object_key",
    "AWS_SSE_TYPE" = "server_side_encryption_type"
  )

Spark (DataBricks): The MPP bulk loading relies upon the AzureStor library to test a connection to an Azure ADLS Gen2 storage container using Azure credentials. Credentials are configured directly into the System Environment using the following keys:

  Sys.setenv(
    "AZR_STORAGE_ACCOUNT" = "some_azure_storage_account",
    "AZR_ACCOUNT_KEY" = "some_secret_account_key",
    "AZR_CONTAINER_NAME" = "some_container_name"
  )

PDW: The MPP bulk loading relies upon the client having a Windows OS and the DWLoader exe installed, and the following permissions granted:

  -- Grant BULK Load permissions - needed at a server level
  USE master;
  GRANT ADMINISTER BULK OPERATIONS TO user;
  -- Grant Staging database permissions - we will use the user db.
  USE scratch;
  EXEC sp_addrolemember 'db_ddladmin', user;

Set the R environment variable DWLOADER_PATH to the location of the binary.

PostgreSQL: Uses the 'psql' executable to upload. Set the POSTGRES_PATH environment variable to the Postgres binary path, e.g. 'C:/Program Files/PostgreSQL/11/bin' on Windows or '/Library/PostgreSQL/16/bin' on macOS.
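
A minimal sketch of a PostgreSQL bulk upload, assuming psql is installed at the path shown (the path, connection object, and table name are placeholders):

  Sys.setenv(POSTGRES_PATH = "C:/Program Files/PostgreSQL/11/bin")
  insertTable(
    connection = conn,
    databaseSchema = "my_schema",
    tableName = "big_table",
    data = data,
    bulkLoad = TRUE # uses psql under the hood
  )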

Examples

## Not run:
connectionDetails <- createConnectionDetails(
  dbms = "mysql",
  server = "localhost",
  user = "root",
  password = "blah"
)
conn <- connect(connectionDetails)
data <- data.frame(x = c(1, 2, 3), y = c("a", "b", "c"))
insertTable(conn, "my_schema", "my_table", data)
disconnect(conn)

## bulk data insert with Redshift or PDW
connectionDetails <- createConnectionDetails(
  dbms = "redshift",
  server = "localhost",
  user = "root",
  password = "blah",
  schema = "cdm_v5"
)
conn <- connect(connectionDetails)
data <- data.frame(x = c(1, 2, 3), y = c("a", "b", "c"))
insertTable(
  connection = conn,
  databaseSchema = "scratch",
  tableName = "somedata",
  data = data,
  dropTableIfExists = TRUE,
  createTable = TRUE,
  tempTable = FALSE,
  bulkLoad = TRUE
)
# or, Sys.setenv("DATABASE_CONNECTOR_BULK_UPLOAD" = TRUE)

## End(Not run)
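
As a further sketch, appending rows to the table created in the first example rather than recreating it (assumes the conn object and my_schema.my_table from above):

  moreData <- data.frame(x = c(4, 5), y = c("d", "e"))
  insertTable(
    connection = conn,
    databaseSchema = "my_schema",
    tableName = "my_table",
    data = moreData,
    dropTableIfExists = FALSE,
    createTable = FALSE # append to the existing table
  )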