Amazon S3
This article describes how to create and maintain connections to your Amazon S3 bucket.
Amazon S3 connections have a wide variety of uses in SQLake. As with other connection types, they can be used to read your data and/or write transformed data to a specified location. However, unlike other types, Amazon S3 connections also serve as a storage location for the underlying files for your Upsolver-managed tables as well as the intermediate files used while running a job.
This means that even if you don't intend to write to an Amazon S3 bucket as a target location, you should still have an Amazon S3 connection that has write permissions to an Amazon S3 bucket.
Note that an Amazon S3 connection is created by default when you deploy Upsolver on your AWS account.
Create an Amazon S3 connection
Simple example
An Amazon S3 connection can be created very simply as follows:
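A minimal sketch of such a statement, assuming a hypothetical connection name my_s3_connection:

```sql
-- my_s3_connection is a placeholder name chosen for illustration.
-- No options are set, so the connection uses the default credentials.
CREATE S3 CONNECTION my_s3_connection;
```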
The connection in this example is created based on the default credentials derived from Upsolver's integration with your AWS account.
Full example
The following example creates an Amazon S3 connection but explicitly configures the credentials by providing a specific role:
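A sketch of what this could look like; the connection name, role ARN, and external ID below are placeholders, not real values:

```sql
-- Placeholder values: substitute the ARN of the role you created
-- and the external ID configured in that role's trust policy.
CREATE S3 CONNECTION my_s3_connection
    AWS_ROLE = 'arn:aws:iam::111111111111:role/<role-name>'
    EXTERNAL_ID = '<external-id>';
```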
To establish a connection with specific permissions, you can configure the AWS_ROLE and EXTERNAL_ID options as per the example above, or you can configure the AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY options to provide the credentials to read from your bucket.
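The access-key variant might be sketched as follows. The option names come from the text above; the connection name and key values are placeholders, and the OPTION = 'value' form is assumed to match the other connection options:

```sql
-- Placeholder credentials: never commit real keys to source control.
CREATE S3 CONNECTION my_s3_key_connection
    AWS_ACCESS_KEY_ID = '<access-key-id>'
    AWS_SECRET_ACCESS_KEY = '<secret-access-key>';
```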
When creating a new connection, SQLake automatically tries to list the bucket and prefixes to allow users to discover their data. This validates that the permissions are defined correctly.
Additionally, you can limit the list of buckets displayed within your catalog by providing a list of paths to display using PATH_DISPLAY_FILTER[S].
All connections have read and write permissions by default, but you can create a connection with read-only access by setting READ_ONLY to true.
The options ENCRYPTION_KMS_KEY or ENCRYPTION_CUSTOMER_MANAGED_KEY can be used to configure your bucket's encryption.
Finally, by using the COMMENT option, you can add a description for your connection.
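Several of these optional settings can be combined in one statement. The sketch below is illustrative only: the connection name, role, bucket path, and comment text are placeholders, the exact value formats are assumptions to check against the SQL reference, and the encryption options are left out because their value format is not described here.

```sql
-- Hypothetical read-only connection that shows only one prefix in the catalog.
CREATE S3 CONNECTION my_s3_readonly_connection
    AWS_ROLE = 'arn:aws:iam::111111111111:role/<role-name>'
    EXTERNAL_ID = '<external-id>'
    PATH_DISPLAY_FILTERS = ('s3://<bucket>/<prefix>/')  -- paths to display
    READ_ONLY = TRUE                                    -- cannot be altered later
    COMMENT = 'Read-only access to raw event data';
```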
For a detailed guide on how to configure permissions to access your Amazon S3 data in SQLake, see Configure access to Amazon S3.
For the full list of connection options with syntax and detailed descriptions, see Amazon S3 connection with SQL.
After creating the connection you can browse your Amazon S3 buckets and prefixes from the navigation tree.
With the connection in place, you are ready to move on to the next step of building your data pipeline: reading your data into SQLake with an ingestion job.
Alter an Amazon S3 connection
Many connection options are considered mutable, meaning that in some cases, you need only run a SQL command to alter an existing Amazon S3 connection, rather than create a new one.
For example, take the Amazon S3 connection we created previously based on default credentials:
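As a reminder, that connection might have been created with a statement like this one (my_s3_connection is a placeholder name):

```sql
-- Connection based on the default credentials; no options set.
CREATE S3 CONNECTION my_s3_connection;
```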
If you only need to change the connection's permissions, you can run the following command:
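A sketch of such an ALTER statement, assuming that each changed option is introduced with SET and using placeholder values for the role and external ID; check the SQL reference for the exact syntax:

```sql
-- Switch the existing connection to an explicitly configured role.
ALTER S3 CONNECTION my_s3_connection
    SET AWS_ROLE = 'arn:aws:iam::111111111111:role/<new-role-name>'
    SET EXTERNAL_ID = '<external-id>';
```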
Note that some options, such as READ_ONLY, cannot be altered once the connection has been created.
To check which specific connection options are mutable, see Amazon S3 connection with SQL.
Drop an Amazon S3 connection
If you no longer need a connection, you can easily drop it with the following SQL command:
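For instance, using the placeholder connection name from the earlier sketches:

```sql
-- Fails if any table or job still depends on the connection.
DROP CONNECTION my_s3_connection;
```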
However, if existing tables or jobs are dependent upon the connection, the connection cannot be deleted.
For more details, see DROP CONNECTION.