AWS Glue Data Catalog

To create and work with data lake tables and Apache Iceberg tables in Upsolver, you must first create a connection to a metadata store such as AWS Glue Data Catalog.

Tables created through your AWS Glue Data Catalog connection can also be queried from Amazon Athena and from within the Upsolver UI.

When you integrate Upsolver with your AWS account, a Glue Data Catalog connection is created by default. You may still want to create your own connection if you need a specific access configuration.

See the how-to guide to Deploy Upsolver on AWS.

Syntax

CREATE GLUE_CATALOG CONNECTION <connection_identifier> 
 [ { AWS_ROLE = '<role_arn>' 
    EXTERNAL_ID = '<external_id>'
  | AWS_ACCESS_KEY_ID = '<key_id>' 
    AWS_SECRET_ACCESS_KEY = '<key>' } ]
 DEFAULT_STORAGE_CONNECTION = <identifier>
 DEFAULT_STORAGE_LOCATION = 's3://<bucket>/<folder-path>/'
 [ REGION = '<region>' ]  
 [ DATABASE_DISPLAY_FILTER[S] = { '<database_name>' | ('<database_name>' [, ...]) } ]
 [ COMMENT = '<comment>' ]

Connection options

`AWS_ROLE` — editable

Type: text

(Optional) The AWS IAM role ARN. Used in conjunction with EXTERNAL_ID.

If omitted, the role created when integrating Upsolver with the AWS account is used.

`EXTERNAL_ID` — editable

Type: text

(Optional) The external ID of the role to assume. Used in conjunction with AWS_ROLE.

If omitted, the role created when integrating Upsolver with the AWS account is used.
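
As a minimal sketch of role-based authentication, the statement below passes AWS_ROLE together with EXTERNAL_ID, following the syntax shown above. The connection name, role ARN, external ID, and bucket path are hypothetical placeholders, and my_s3_storage_connection is assumed to exist already.

```sql
-- Sketch only: every name and value below is a placeholder.
CREATE GLUE_CATALOG CONNECTION my_role_based_glue_connection
    AWS_ROLE = 'arn:aws:iam::123456789012:role/upsolver-sqlake-role'
    EXTERNAL_ID = '12345678'  -- hypothetical external ID
    DEFAULT_STORAGE_CONNECTION = my_s3_storage_connection
    DEFAULT_STORAGE_LOCATION = 's3://sqlake/my_glue_catalog_table_files/';
```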

`AWS_ACCESS_KEY_ID` — editable

Type: text

(Optional) The AWS access key ID. Used in conjunction with AWS_SECRET_ACCESS_KEY.

If omitted, the role created when integrating Upsolver with the AWS account is used.

`AWS_SECRET_ACCESS_KEY` — editable

Type: text

(Optional) The AWS secret key corresponding to the provided AWS_ACCESS_KEY_ID.

If omitted, the role created when integrating Upsolver with the AWS account is used.
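
As an alternative sketch, the statement below authenticates with an access key pair instead of a role, following the second branch of the syntax shown above. The key values are AWS's published documentation placeholders, not real credentials, and the connection name and bucket path are likewise hypothetical.

```sql
-- Sketch only: replace the placeholder key pair with your own credentials.
CREATE GLUE_CATALOG CONNECTION my_key_based_glue_connection
    AWS_ACCESS_KEY_ID = 'AKIAIOSFODNN7EXAMPLE'
    AWS_SECRET_ACCESS_KEY = 'wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY'
    DEFAULT_STORAGE_CONNECTION = my_s3_storage_connection
    DEFAULT_STORAGE_LOCATION = 's3://sqlake/my_glue_catalog_table_files/';
```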

`DEFAULT_STORAGE_CONNECTION`

Type: identifier

An Amazon S3 connection whose credentials allow writing to the specified DEFAULT_STORAGE_LOCATION.

`DEFAULT_STORAGE_LOCATION`

Type: text

The Amazon S3 path that serves as the default storage location for the underlying files associated with tables created under this metastore connection.

`REGION`

Type: text

Default: Region in which Upsolver is deployed within your AWS account

(Optional) The AWS region in which your Glue Data Catalog resides.

`DATABASE_DISPLAY_FILTER[S]` — editable

Type: text | list

(Optional) A single database or a list of databases to show. If omitted, all databases are visible.
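
The syntax accepts either a single quoted name or a parenthesized list. As a brief sketch of the single-database form, the statement below limits the visible databases to one; the connection name, the database name demo_db, and the bucket path are hypothetical placeholders.

```sql
-- Sketch only: show a single database by passing one quoted name.
CREATE GLUE_CATALOG CONNECTION my_filtered_glue_connection
    DEFAULT_STORAGE_CONNECTION = my_s3_storage_connection
    DEFAULT_STORAGE_LOCATION = 's3://sqlake/my_glue_catalog_table_files/'
    DATABASE_DISPLAY_FILTER = 'demo_db';
```

To show several databases, use the list form instead, for example DATABASE_DISPLAY_FILTERS = ('demo_db', 'prod_db').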

`COMMENT` — editable

Type: text

(Optional) A description or comment regarding this connection.

Minimum example

CREATE GLUE_CATALOG CONNECTION my_glue_catalog_connection
    DEFAULT_STORAGE_CONNECTION = my_s3_storage_connection
    DEFAULT_STORAGE_LOCATION = 's3://sqlake/my_glue_catalog_table_files/';

This example uses the default credentials from Upsolver's integration with AWS.

See the how-to guide to Deploy Upsolver on AWS.

This example also assumes that you have already created the Amazon S3 connection my_s3_storage_connection with write permissions to the specified storage location.

Refer to Amazon S3 for more information on creating a connection using SQL.
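
For orientation, a prerequisite storage connection might be created along the lines of the sketch below. It assumes that CREATE S3 CONNECTION accepts AWS_ROLE and EXTERNAL_ID in the same way as the Glue Data Catalog connection; check the Amazon S3 page for the authoritative option list. The role ARN and external ID are hypothetical placeholders.

```sql
-- Sketch only: option names are assumed to mirror the Glue connection syntax.
CREATE S3 CONNECTION my_s3_storage_connection
    AWS_ROLE = 'arn:aws:iam::123456789012:role/upsolver-sqlake-role'
    EXTERNAL_ID = '12345678';  -- hypothetical external ID
```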

Full example

CREATE GLUE_CATALOG CONNECTION my_glue_catalog_connection
    AWS_ROLE = 'arn:aws:iam::123456789012:role/upsolver-sqlake-role'
    DEFAULT_STORAGE_CONNECTION = my_s3_storage_connection
    DEFAULT_STORAGE_LOCATION = 's3://sqlake/my_glue_catalog_table_files/'
    REGION = 'us-east-1'
    DATABASE_DISPLAY_FILTERS = ('demo_db', 'prod_db')
    COMMENT = 'glue catalog connection example';
