AWS Glue Data Catalog

This article describes how to create a connection to your AWS Glue Data Catalog using a SQL command.

To create and work with tables in Upsolver, you first need to establish a connection to a metadata store such as AWS Glue Data Catalog.

Tables created through your Glue Data Catalog connection can also be queried from Amazon Athena and from within the SQLake UI.

When you integrate Upsolver with your AWS account, a Glue Data Catalog connection is created by default. You may still want to create your own connection for specific access configurations.

For details on the integration, see the article Deploying Upsolver on AWS.

Syntax

CREATE GLUE_CATALOG CONNECTION 
    <connection_identifier> 
    [{ AWS_ROLE = '<role_arn>' 
       EXTERNAL_ID = '<external_id>'
     | AWS_ACCESS_KEY_ID = '<key_id>' 
       AWS_SECRET_ACCESS_KEY = '<key>' }]
    DEFAULT_STORAGE_CONNECTION = <identifier>
    DEFAULT_STORAGE_LOCATION = 's3://<bucket>/<folder-path>/'
    [ REGION = '<region>' ]  
    [ DATABASE_DISPLAY_FILTER[S] = { '<database_name>' | ('<database_name>' [, ...]) } ]
    [ COMMENT = '<comment>' ] 

Connection options

AWS_ROLE — editable

Type: text

(Optional) The AWS IAM role ARN. Used in conjunction with EXTERNAL_ID.

If omitted, the role created when integrating Upsolver with the AWS account is used.

EXTERNAL_ID — editable

Type: text

(Optional) The external ID of the role to assume. Used in conjunction with AWS_ROLE.

If omitted, the role created when integrating Upsolver with the AWS account is used.
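
As a sketch of role-based authentication, the statement below passes AWS_ROLE together with EXTERNAL_ID. The role ARN, storage connection, and storage location are reused from the examples on this page; the connection name and the external ID value are hypothetical placeholders.

```sql
-- Role-based credentials: the connection uses the given IAM role.
-- 'my-external-id' is a placeholder for the external ID associated with that role.
CREATE GLUE_CATALOG CONNECTION my_role_based_glue_connection
    AWS_ROLE = 'arn:aws:iam::123456789012:role/upsolver-sqlake-role'
    EXTERNAL_ID = 'my-external-id'
    DEFAULT_STORAGE_CONNECTION = my_s3_storage_connection
    DEFAULT_STORAGE_LOCATION = 's3://sqlake/my_glue_catalog_table_files/';
```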

AWS_ACCESS_KEY_ID — editable

Type: text

(Optional) The AWS access key ID. Used in conjunction with AWS_SECRET_ACCESS_KEY.

If omitted, the role created when integrating Upsolver with the AWS account is used.

AWS_SECRET_ACCESS_KEY — editable

Type: text

(Optional) The AWS secret key corresponding to the provided AWS_ACCESS_KEY_ID.

If omitted, the role created when integrating Upsolver with the AWS account is used.
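
Key-based authentication might look like the following sketch, which supplies AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY instead of a role. The connection name and both key values are hypothetical placeholders, not working credentials.

```sql
-- Key-based credentials: an alternative to AWS_ROLE / EXTERNAL_ID.
-- Replace both placeholder values with a real access key pair.
CREATE GLUE_CATALOG CONNECTION my_key_based_glue_connection
    AWS_ACCESS_KEY_ID = 'YOUR_ACCESS_KEY_ID'
    AWS_SECRET_ACCESS_KEY = 'YOUR_SECRET_ACCESS_KEY'
    DEFAULT_STORAGE_CONNECTION = my_s3_storage_connection
    DEFAULT_STORAGE_LOCATION = 's3://sqlake/my_glue_catalog_table_files/';
```

According to the syntax above, the role options and the key options are alternatives, so a single statement should use one pair or the other.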

DEFAULT_STORAGE_CONNECTION

Type: identifier

An Amazon S3 connection with the appropriate credentials to write to the DEFAULT_STORAGE_LOCATION provided.

DEFAULT_STORAGE_LOCATION

Type: text

The Amazon S3 path that serves as the default storage location for the underlying files associated with tables created under this metastore connection.

REGION

Type: text

Default: The region in which Upsolver is deployed within your AWS account

(Optional) The AWS region in which your Glue Data Catalog resides.

DATABASE_DISPLAY_FILTER[S] — editable

Type: text | list

(Optional) A single database or a list of databases to show. If omitted, all databases are visible.
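
The full example on this page shows the list form. The single-database form, following the syntax definition, could be sketched as below; the connection name is a hypothetical placeholder, and demo_db is borrowed from the full example.

```sql
-- Show only one database through this connection.
-- Per the syntax, a single value is a quoted string without parentheses.
CREATE GLUE_CATALOG CONNECTION my_filtered_glue_connection
    DEFAULT_STORAGE_CONNECTION = my_s3_storage_connection
    DEFAULT_STORAGE_LOCATION = 's3://sqlake/my_glue_catalog_table_files/'
    DATABASE_DISPLAY_FILTER = 'demo_db';
```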

COMMENT — editable

Type: text

(Optional) A description or comment regarding this connection.

Minimum example

CREATE GLUE_CATALOG CONNECTION my_glue_catalog_connection
    DEFAULT_STORAGE_CONNECTION = my_s3_storage_connection
    DEFAULT_STORAGE_LOCATION = 's3://sqlake/my_glue_catalog_table_files/';

This example uses the default credentials from Upsolver's integration with AWS.

See: Deploying Upsolver on AWS

Additionally, this example assumes that you have created the Amazon S3 connection my_s3_storage_connection with proper write permissions to the specified storage location.

Refer to Amazon S3 for more information on creating a connection using SQL.
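
For orientation, the prerequisite storage connection could be created along these lines. This is a sketch only: it assumes that the Amazon S3 connection accepts the same AWS_ROLE and COMMENT options described on this page, so check the Amazon S3 article for the authoritative syntax.

```sql
-- Assumed syntax for the S3 connection referenced by DEFAULT_STORAGE_CONNECTION.
-- The role must allow writes to the bucket used as DEFAULT_STORAGE_LOCATION.
CREATE S3 CONNECTION my_s3_storage_connection
    AWS_ROLE = 'arn:aws:iam::123456789012:role/upsolver-sqlake-role'
    COMMENT = 's3 storage connection for glue catalog tables';
```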

Full example

CREATE GLUE_CATALOG CONNECTION my_glue_catalog_connection
    AWS_ROLE = 'arn:aws:iam::123456789012:role/upsolver-sqlake-role'
    DEFAULT_STORAGE_CONNECTION = my_s3_storage_connection
    DEFAULT_STORAGE_LOCATION = 's3://sqlake/my_glue_catalog_table_files/'
    REGION = 'us-east-1'
    DATABASE_DISPLAY_FILTERS = ('demo_db', 'prod_db')
    COMMENT = 'glue catalog connection example';
