glue() R function from [paws]

AWS Glue

Glue

Defines the public endpoint for the Glue service.


glue(config = list(), credentials = list(), endpoint = NULL, region = NULL)

Arguments

config: Optional configuration of credentials, endpoint, and/or region.
- credentials :
  - creds :
    - access_key_id : AWS access key ID
    - secret_access_key : AWS secret access key
    - session_token : AWS temporary session token
  - profile : The name of a profile to use. If not given, then the default profile is used.
  - anonymous : Set anonymous credentials.
- endpoint : The complete URL to use for the constructed client.
- region : The AWS Region used in instantiating the client.
- close_connection : Immediately close all HTTP connections.
- timeout : The time in seconds till a timeout exception is thrown when attempting to make a connection. The default is 60 seconds.
- s3_force_path_style : Set this to true to force the request to use path-style addressing, i.e. http://s3.amazonaws.com/BUCKET/KEY.
- sts_regional_endpoint : Set the STS regional endpoint resolver to "regional" or "legacy". See https://docs.aws.amazon.com/sdkref/latest/guide/feature-sts-regionalized-endpoints.html
credentials: Optional credentials shorthand for the config parameter
- creds :
  - access_key_id : AWS access key ID
  - secret_access_key : AWS secret access key
  - session_token : AWS temporary session token
- profile : The name of a profile to use. If not given, then the default profile is used.
- anonymous : Set anonymous credentials.
endpoint: Optional shorthand for complete URL to use for the constructed client.
region: Optional shorthand for AWS Region used in instantiating the client.
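
The two ways of passing settings can be compared in a short sketch. The region and profile name below are hypothetical placeholders, and the client is only constructed when the paws package is installed. Calling the constructor as paws::glue() also avoids a name clash with the unrelated glue string-interpolation package, which exports its own glue() function.

```r
# Hypothetical placeholder values; substitute your own region and profile.
region  <- "us-east-1"
profile <- "my-profile"

# Full form: every setting is nested under `config`.
full_config <- list(
  credentials = list(profile = profile),
  region = region
)

# Shorthand form: the same settings passed as top-level arguments.
shorthand_args <- list(
  credentials = list(profile = profile),
  region = region
)

# Only build the clients if paws is available on this machine.
if (requireNamespace("paws", quietly = TRUE)) {
  svc_full  <- paws::glue(config = full_config)
  svc_short <- do.call(paws::glue, shorthand_args)
}
```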

Returns

A client for the service. You can call the service's operations using syntax like svc$operation(...), where svc is the name you've assigned to the client. The available operations are listed in the Operations section.
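
As an illustration of the svc$operation(...) pattern, the sketch below wraps get_databases in a small helper. The helper name list_database_names is hypothetical, and the response fields DatabaseList and Name are assumed from the AWS Glue API response shape rather than stated on this page.

```r
# Hypothetical helper: return the names of the databases in the Data Catalog.
# `svc` is a client created with paws::glue().
list_database_names <- function(svc) {
  resp <- svc$get_databases()
  # Assumed response shape: a list `DatabaseList` whose entries have a `Name`.
  vapply(resp$DatabaseList, function(db) db$Name, character(1))
}

# Usage (requires valid AWS credentials and network access):
# svc <- paws::glue()
# list_database_names(svc)
```

Only the first page of results is read here; larger catalogs need the NextToken value from the response to be passed to a follow-up call.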

Service syntax

svc <- glue(
  config = list(
    credentials = list(
      creds = list(
        access_key_id = "string",
        secret_access_key = "string",
        session_token = "string"
      ),
      profile = "string",
      anonymous = "logical"
    ),
    endpoint = "string",
    region = "string",
    close_connection = "logical",
    timeout = "numeric",
    s3_force_path_style = "logical",
    sts_regional_endpoint = "string"
  ),
  credentials = list(
    creds = list(
      access_key_id = "string",
      secret_access_key = "string",
      session_token = "string"
    ),
    profile = "string",
    anonymous = "logical"
  ),
  endpoint = "string",
  region = "string"
)

Operations


batch_create_partition	Creates one or more partitions in a batch operation
batch_delete_connection	Deletes a list of connection definitions from the Data Catalog
batch_delete_partition	Deletes one or more partitions in a batch operation
batch_delete_table	Deletes multiple tables at once
batch_delete_table_version	Deletes a specified batch of versions of a table
batch_get_blueprints	Retrieves information about a list of blueprints
batch_get_crawlers	Returns a list of resource metadata for a given list of crawler names
batch_get_custom_entity_types	Retrieves the details for the custom patterns specified by a list of names
batch_get_data_quality_result	Retrieves a list of data quality results for the specified result IDs
batch_get_dev_endpoints	Returns a list of resource metadata for a given list of development endpoint names
batch_get_jobs	Returns a list of resource metadata for a given list of job names
batch_get_partition	Retrieves partitions in a batch request
batch_get_table_optimizer	Returns the configuration for the specified table optimizers
batch_get_triggers	Returns a list of resource metadata for a given list of trigger names
batch_get_workflows	Returns a list of resource metadata for a given list of workflow names
batch_put_data_quality_statistic_annotation	Annotate datapoints over time for a specific data quality statistic
batch_stop_job_run	Stops one or more job runs for a specified job definition
batch_update_partition	Updates one or more partitions in a batch operation
cancel_data_quality_rule_recommendation_run	Cancels the specified recommendation run that was being used to generate rules
cancel_data_quality_ruleset_evaluation_run	Cancels a run where a ruleset is being evaluated against a data source
cancel_ml_task_run	Cancels (stops) a task run
cancel_statement	Cancels the statement
check_schema_version_validity	Validates the supplied schema
create_blueprint	Registers a blueprint with Glue
create_catalog	Creates a new catalog in the Glue Data Catalog
create_classifier	Creates a classifier in the user's account
create_column_statistics_task_settings	Creates settings for a column statistics task
create_connection	Creates a connection definition in the Data Catalog
create_crawler	Creates a new crawler with specified targets, role, configuration, and optional schedule
create_custom_entity_type	Creates a custom pattern that is used to detect sensitive data across the columns and rows of your structured data
create_database	Creates a new database in a Data Catalog
create_data_quality_ruleset	Creates a data quality ruleset with DQDL rules applied to a specified Glue table
create_dev_endpoint	Creates a new development endpoint
create_integration	Creates a Zero-ETL integration in the caller's account between two resources with Amazon Resource Names (ARNs): the SourceArn and TargetArn
create_integration_resource_property	This API can be used for setting up the ResourceProperty of the Glue connection (for the source) or Glue database ARN (for the target)
create_integration_table_properties	This API is used to provide optional override properties for the tables that need to be replicated
create_job	Creates a new job definition
create_ml_transform	Creates a Glue machine learning transform
create_partition	Creates a new partition
create_partition_index	Creates a specified partition index in an existing table
create_registry	Creates a new registry which may be used to hold a collection of schemas
create_schema	Creates a new schema set and registers the schema definition
create_script	Transforms a directed acyclic graph (DAG) into code
create_security_configuration	Creates a new security configuration
create_session	Creates a new session
create_table	Creates a new table definition in the Data Catalog
create_table_optimizer	Creates a new table optimizer for a specific function
create_trigger	Creates a new trigger
create_usage_profile	Creates a Glue usage profile
create_user_defined_function	Creates a new function definition in the Data Catalog
create_workflow	Creates a new workflow
delete_blueprint	Deletes an existing blueprint
delete_catalog	Removes the specified catalog from the Glue Data Catalog
delete_classifier	Removes a classifier from the Data Catalog
delete_column_statistics_for_partition	Delete the partition column statistics of a column
delete_column_statistics_for_table	Deletes table statistics of columns
delete_column_statistics_task_settings	Deletes settings for a column statistics task
delete_connection	Deletes a connection from the Data Catalog
delete_crawler	Removes a specified crawler from the Glue Data Catalog, unless the crawler state is RUNNING
delete_custom_entity_type	Deletes a custom pattern by specifying its name
delete_database	Removes a specified database from a Data Catalog
delete_data_quality_ruleset	Deletes a data quality ruleset
delete_dev_endpoint	Deletes a specified development endpoint
delete_integration	Deletes the specified Zero-ETL integration
delete_integration_table_properties	Deletes the table properties that have been created for the tables that need to be replicated
delete_job	Deletes a specified job definition
delete_ml_transform	Deletes a Glue machine learning transform
delete_partition	Deletes a specified partition
delete_partition_index	Deletes a specified partition index from an existing table
delete_registry	Delete the entire registry including schema and all of its versions
delete_resource_policy	Deletes a specified policy
delete_schema	Deletes the entire schema set, including the schema set and all of its versions
delete_schema_versions	Remove versions from the specified schema
delete_security_configuration	Deletes a specified security configuration
delete_session	Deletes the session
delete_table	Removes a table definition from the Data Catalog
delete_table_optimizer	Deletes an optimizer and all associated metadata for a table
delete_table_version	Deletes a specified version of a table
delete_trigger	Deletes a specified trigger
delete_usage_profile	Deletes the specified Glue usage profile
delete_user_defined_function	Deletes an existing function definition from the Data Catalog
delete_workflow	Deletes a workflow
describe_connection_type	The DescribeConnectionType API provides full details of the supported options for a given connection type in Glue
describe_entity	Provides details regarding the entity used with the connection type, with a description of the data model for each field in the selected entity
describe_inbound_integrations	Returns a list of inbound integrations for the specified integration
describe_integrations	The API is used to retrieve a list of integrations
get_blueprint	Retrieves the details of a blueprint
get_blueprint_run	Retrieves the details of a blueprint run
get_blueprint_runs	Retrieves the details of blueprint runs for a specified blueprint
get_catalog	Retrieves the specified catalog
get_catalog_import_status	Retrieves the status of a migration operation
get_catalogs	Retrieves all catalogs defined in a catalog in the Glue Data Catalog
get_classifier	Retrieve a classifier by name
get_classifiers	Lists all classifier objects in the Data Catalog
get_column_statistics_for_partition	Retrieves partition statistics of columns
get_column_statistics_for_table	Retrieves table statistics of columns
get_column_statistics_task_run	Get the associated metadata/information for a task run, given a task run ID
get_column_statistics_task_runs	Retrieves information about all runs associated with the specified table
get_column_statistics_task_settings	Gets settings for a column statistics task
get_connection	Retrieves a connection definition from the Data Catalog
get_connections	Retrieves a list of connection definitions from the Data Catalog
get_crawler	Retrieves metadata for a specified crawler
get_crawler_metrics	Retrieves metrics about specified crawlers
get_crawlers	Retrieves metadata for all crawlers defined in the customer account
get_custom_entity_type	Retrieves the details of a custom pattern by specifying its name
get_database	Retrieves the definition of a specified database
get_databases	Retrieves all databases defined in a given Data Catalog
get_data_catalog_encryption_settings	Retrieves the security configuration for a specified catalog
get_dataflow_graph	Transforms a Python script into a directed acyclic graph (DAG)
get_data_quality_model	Retrieve the training status of the model along with more information (CompletedOn, StartedOn, FailureReason)
get_data_quality_model_result	Retrieve a statistic's predictions for a given Profile ID
get_data_quality_result	Retrieves the result of a data quality rule evaluation
get_data_quality_rule_recommendation_run	Gets the specified recommendation run that was used to generate rules
get_data_quality_ruleset	Returns an existing ruleset by identifier or name
get_data_quality_ruleset_evaluation_run	Retrieves a specific run where a ruleset is evaluated against a data source
get_dev_endpoint	Retrieves information about a specified development endpoint
get_dev_endpoints	Retrieves all the development endpoints in this Amazon Web Services account
get_entity_records	This API is used to query preview data from a given connection type or from a native Amazon S3 based Glue Data Catalog
get_integration_resource_property	This API is used for fetching the ResourceProperty of the Glue connection (for the source) or Glue database ARN (for the target)
get_integration_table_properties	This API is used to retrieve optional override properties for the tables that need to be replicated
get_job	Retrieves an existing job definition
get_job_bookmark	Returns information on a job bookmark entry
get_job_run	Retrieves the metadata for a given job run
get_job_runs	Retrieves metadata for all runs of a given job definition
get_jobs	Retrieves all current job definitions
get_mapping	Creates mappings
get_ml_task_run	Gets details for a specific task run on a machine learning transform
get_ml_task_runs	Gets a list of runs for a machine learning transform
get_ml_transform	Gets a Glue machine learning transform artifact and all its corresponding metadata
get_ml_transforms	Gets a sortable, filterable list of existing Glue machine learning transforms
get_partition	Retrieves information about a specified partition
get_partition_indexes	Retrieves the partition indexes associated with a table
get_partitions	Retrieves information about the partitions in a table
get_plan	Gets code to perform a specified mapping
get_registry	Describes the specified registry in detail
get_resource_policies	Retrieves the resource policies set on individual resources by Resource Access Manager during cross-account permission grants
get_resource_policy	Retrieves a specified resource policy
get_schema	Describes the specified schema in detail
get_schema_by_definition	Retrieves a schema by the SchemaDefinition
get_schema_version	Get the specified schema by its unique ID assigned when a version of the schema is created or registered
get_schema_versions_diff	Fetches the schema version difference in the specified difference type between two stored schema versions in the Schema Registry
get_security_configuration	Retrieves a specified security configuration
get_security_configurations	Retrieves a list of all security configurations
get_session	Retrieves the session
get_statement	Retrieves the statement
get_table	Retrieves the Table definition in a Data Catalog for a specified table
get_table_optimizer	Returns the configuration of all optimizers associated with a specified table
get_tables	Retrieves the definitions of some or all of the tables in a given Database
get_table_version	Retrieves a specified version of a table
get_table_versions	Retrieves a list of strings that identify available versions of a specified table
get_tags	Retrieves a list of tags associated with a resource
get_trigger	Retrieves the definition of a trigger
get_triggers	Gets all the triggers associated with a job
get_unfiltered_partition_metadata	Retrieves partition metadata from the Data Catalog that contains unfiltered metadata
get_unfiltered_partitions_metadata	Retrieves partition metadata from the Data Catalog that contains unfiltered metadata
get_unfiltered_table_metadata	Allows a third-party analytical engine to retrieve unfiltered table metadata from the Data Catalog
get_usage_profile	Retrieves information about the specified Glue usage profile
get_user_defined_function	Retrieves a specified function definition from the Data Catalog
get_user_defined_functions	Retrieves multiple function definitions from the Data Catalog
get_workflow	Retrieves resource metadata for a workflow
get_workflow_run	Retrieves the metadata for a given workflow run
get_workflow_run_properties	Retrieves the workflow run properties which were set during the run
get_workflow_runs	Retrieves metadata for all runs of a given workflow
import_catalog_to_glue	Imports an existing Amazon Athena Data Catalog to Glue
list_blueprints	Lists all the blueprint names in an account
list_column_statistics_task_runs	List all task runs for a particular account
list_connection_types	The ListConnectionTypes API provides a discovery mechanism to learn available connection types in Glue
list_crawlers	Retrieves the names of all crawler resources in this Amazon Web Services account, or the resources with the specified tag
list_crawls	Returns all the crawls of a specified crawler
list_custom_entity_types	Lists all the custom patterns that have been created
list_data_quality_results	Returns all data quality execution results for your account
list_data_quality_rule_recommendation_runs	Lists the recommendation runs meeting the filter criteria
list_data_quality_ruleset_evaluation_runs	Lists all the runs meeting the filter criteria, where a ruleset is evaluated against a data source
list_data_quality_rulesets	Returns a paginated list of rulesets for the specified list of Glue tables
list_data_quality_statistic_annotations	Retrieve annotations for a data quality statistic
list_data_quality_statistics	Retrieves a list of data quality statistics
list_dev_endpoints	Retrieves the names of all DevEndpoint resources in this Amazon Web Services account, or the resources with the specified tag
list_entities	Returns the available entities supported by the connection type
list_jobs	Retrieves the names of all job resources in this Amazon Web Services account, or the resources with the specified tag
list_ml_transforms	Retrieves a sortable, filterable list of existing Glue machine learning transforms in this Amazon Web Services account, or the resources with the specified tag
list_registries	Returns a list of registries that you have created, with minimal registry information
list_schemas	Returns a list of schemas with minimal details
list_schema_versions	Returns a list of schema versions that you have created, with minimal information
list_sessions	Retrieve a list of sessions
list_statements	Lists statements for the session
list_table_optimizer_runs	Lists the history of previous optimizer runs for a specific table
list_triggers	Retrieves the names of all trigger resources in this Amazon Web Services account, or the resources with the specified tag
list_usage_profiles	List all the Glue usage profiles
list_workflows	Lists names of workflows created in the account
modify_integration	Modifies a Zero-ETL integration in the caller's account
put_data_catalog_encryption_settings	Sets the security configuration for a specified catalog
put_data_quality_profile_annotation	Annotate all datapoints for a Profile
put_resource_policy	Sets the Data Catalog resource policy for access control
put_schema_version_metadata	Puts the metadata key value pair for a specified schema version ID
put_workflow_run_properties	Puts the specified workflow run properties for the given workflow run
query_schema_version_metadata	Queries for the schema version metadata information
register_schema_version	Adds a new version to the existing schema
remove_schema_version_metadata	Removes a key value pair from the schema version metadata for the specified schema version ID
reset_job_bookmark	Resets a bookmark entry
resume_workflow_run	Restarts selected nodes of a previous partially completed workflow run and resumes the workflow run
run_statement	Executes the statement
search_tables	Searches a set of tables based on properties in the table metadata as well as on the parent database
start_blueprint_run	Starts a new run of the specified blueprint
start_column_statistics_task_run	Starts a column statistics task run, for a specified table and columns
start_column_statistics_task_run_schedule	Starts a column statistics task run schedule
start_crawler	Starts a crawl using the specified crawler, regardless of what is scheduled
start_crawler_schedule	Changes the schedule state of the specified crawler to SCHEDULED, unless the crawler is already running or the schedule state is already SCHEDULED
start_data_quality_rule_recommendation_run	Starts a recommendation run that is used to generate rules when you don't know what rules to write
start_data_quality_ruleset_evaluation_run	Once you have a ruleset definition (either recommended or your own), you call this operation to evaluate the ruleset against a data source (Glue table)
start_export_labels_task_run	Begins an asynchronous task to export all labeled data for a particular transform
start_import_labels_task_run	Enables you to provide additional labels (examples of truth) to be used to teach the machine learning transform and improve its quality
start_job_run	Starts a job run using a job definition
start_ml_evaluation_task_run	Starts a task to estimate the quality of the transform
start_ml_labeling_set_generation_task_run	Starts the active learning workflow for your machine learning transform to improve the transform's quality by generating label sets and adding labels
start_trigger	Starts an existing trigger
start_workflow_run	Starts a new run of the specified workflow
stop_column_statistics_task_run	Stops a task run for the specified table
stop_column_statistics_task_run_schedule	Stops a column statistics task run schedule
stop_crawler	If the specified crawler is running, stops the crawl
stop_crawler_schedule	Sets the schedule state of the specified crawler to NOT_SCHEDULED, but does not stop the crawler if it is already running
stop_session	Stops the session
stop_trigger	Stops a specified trigger
stop_workflow_run	Stops the execution of the specified workflow run
tag_resource	Adds tags to a resource
test_connection	Tests a connection to a service to validate the service credentials that you provide
untag_resource	Removes tags from a resource
update_blueprint	Updates a registered blueprint
update_catalog	Updates an existing catalog's properties in the Glue Data Catalog
update_classifier	Modifies an existing classifier (a GrokClassifier, an XMLClassifier, a JsonClassifier, or a CsvClassifier, depending on which field is present)
update_column_statistics_for_partition	Creates or updates partition statistics of columns
update_column_statistics_for_table	Creates or updates table statistics of columns
update_column_statistics_task_settings	Updates settings for a column statistics task
update_connection	Updates a connection definition in the Data Catalog
update_crawler	Updates a crawler
update_crawler_schedule	Updates the schedule of a crawler using a cron expression
update_database	Updates an existing database definition in a Data Catalog
update_data_quality_ruleset	Updates the specified data quality ruleset
update_dev_endpoint	Updates a specified development endpoint
update_integration_resource_property	This API can be used for updating the ResourceProperty of the Glue connection (for the source) or Glue database ARN (for the target)
update_integration_table_properties	This API is used to provide optional override properties for the tables that need to be replicated
update_job	Updates an existing job definition
update_job_from_source_control	Synchronizes a job from the source control repository
update_ml_transform	Updates an existing machine learning transform
update_partition	Updates a partition
update_registry	Updates an existing registry which is used to hold a collection of schemas
update_schema	Updates the description, compatibility setting, or version checkpoint for a schema set
update_source_control_from_job	Synchronizes a job to the source control repository
update_table	Updates a metadata table in the Data Catalog
update_table_optimizer	Updates the configuration for an existing table optimizer
update_trigger	Updates a trigger definition
update_usage_profile	Updates a Glue usage profile
update_user_defined_function	Updates an existing function definition in the Data Catalog
update_workflow	Updates an existing workflow
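
Several of the get_* and list_* operations return results in pages. A minimal sketch of collecting every table name in a database follows, assuming (from the AWS Glue API) that get_tables accepts DatabaseName and NextToken and returns TableList and NextToken. The helper name get_all_table_names is hypothetical.

```r
# Hypothetical helper: follow NextToken until all pages have been read.
get_all_table_names <- function(svc, database) {
  table_names <- character(0)
  token <- NULL
  repeat {
    args <- list(DatabaseName = database)
    if (!is.null(token)) {
      args$NextToken <- token
    }
    resp <- do.call(svc$get_tables, args)
    page <- vapply(resp$TableList, function(tbl) tbl$Name, character(1))
    table_names <- c(table_names, page)
    token <- resp$NextToken
    # Stop when the token is missing, empty, or an empty string.
    if (length(token) == 0 || !nzchar(token)) {
      break
    }
  }
  table_names
}

# Usage (requires valid AWS credentials; "my_database" is a placeholder):
# svc <- paws::glue()
# get_all_table_names(svc, "my_database")
```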

Examples


## Not run:

svc <- glue()
svc$batch_create_partition(
  Foo = 123  # placeholder argument; replace with the operation's real parameters
)
## End(Not run)
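
A somewhat more realistic sketch is starting a job and polling until it finishes. It assumes, based on the AWS Glue API, that start_job_run returns JobRunId, that get_job_run returns JobRun$JobRunState, and that STARTING, RUNNING, STOPPING and WAITING are the in-progress states. The helper name run_job_and_wait and the polling interval are hypothetical.

```r
# Hypothetical helper: start a Glue job and wait for a final state.
run_job_and_wait <- function(svc, job_name, poll_seconds = 30) {
  run_id <- svc$start_job_run(JobName = job_name)$JobRunId
  # Assumed in-progress states; anything else is treated as final.
  in_progress <- c("STARTING", "RUNNING", "STOPPING", "WAITING")
  repeat {
    run <- svc$get_job_run(JobName = job_name, RunId = run_id)
    state <- run$JobRun$JobRunState
    if (!(state %in% in_progress)) {
      return(state)
    }
    Sys.sleep(poll_seconds)
  }
}

# Usage (requires valid AWS credentials; "my-job" is a placeholder):
# svc <- paws::glue()
# run_job_and_wait(svc, "my-job")
```

The loop has no upper bound on waiting time, so a production version would likely add a maximum number of polls.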

paws package

Maintainer: Dyfan Jones
License: Apache License (>= 2.0)
Last published: 2025-03-17
