Amazon S3

This article describes how to create and maintain connections to your Amazon S3 bucket.

Amazon S3 connections have a wide variety of uses in Upsolver. As with other connection types, they can be used to read your data and/or write transformed data to a specified location. However, unlike other types, Amazon S3 connections also serve as a storage location for the underlying files for your Upsolver-managed tables as well as the intermediate files used while running a job.

This means that even if you don't intend to write to an Amazon S3 bucket as a target location, you should still have an Amazon S3 connection that has write permissions to an Amazon S3 bucket.

An Amazon S3 connection is created by default when you deploy Upsolver on your AWS account. See the Deploy Upsolver on AWS guide for more information.

Create an Amazon S3 connection

Simple example

An Amazon S3 connection can be created very simply as follows:

CREATE S3 CONNECTION my_s3_connection;

The connection in this example is created based on the default credentials derived from Upsolver's integration with your AWS account.

Full example

The following example creates an Amazon S3 connection but explicitly configures the credentials by providing a specific role:

CREATE S3 CONNECTION s3_example
    AWS_ROLE = 'arn:aws:iam::123456789012:role/upsolver-sqlake-role'
    PATH_DISPLAY_FILTERS = ('s3://bucket1/', 's3://bucket2/folder-path/')
    READ_ONLY = TRUE
    ENCRYPTION_KMS_KEY = 'arn:aws:kms:us-west-2:111122223333:key/1234abcd-12ab-34cd-56ef-1234567890ab'
    COMMENT = 'My new S3 connection';

To establish a connection with specific permissions, you can configure the AWS_ROLE option (and, where required, EXTERNAL_ID) as shown in the example above, or you can provide the AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY options with credentials that can read from your bucket. When a new connection is created, Upsolver automatically tries to list the bucket and its prefixes so that you can discover your data; this also validates that the permissions are defined correctly.
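
For example, a connection that authenticates with static credentials might look like the following sketch. The connection name and key values below are placeholders, not values from this guide:

CREATE S3 CONNECTION s3_with_keys
    -- placeholder credentials; replace with your own access key pair
    AWS_ACCESS_KEY_ID = 'AKIAIOSFODNN7EXAMPLE'
    AWS_SECRET_ACCESS_KEY = 'wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY'
    COMMENT = 'S3 connection authenticated with static credentials';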

Additionally, you can limit the list of buckets displayed within your catalog by providing a list of paths to display using PATH_DISPLAY_FILTER[S].

All connections have read and write permissions by default, but you can create a connection with read-only access by setting READ_ONLY to TRUE.

You can use either the ENCRYPTION_KMS_KEY or the ENCRYPTION_CUSTOMER_MANAGED_KEY option to configure your bucket's encryption.

Finally, by using the COMMENT option, you can add a description for your connection.

After creating the connection, you can browse your Amazon S3 buckets and prefixes from the navigation tree.

Alter an Amazon S3 connection

Many connection options are considered mutable, meaning that in some cases, you need only run a SQL command to alter an existing Amazon S3 connection, rather than create a new one.

For example, take the Amazon S3 connection we created previously based on default credentials:

CREATE S3 CONNECTION my_s3_connection;

If you only need to change the connection's permissions, you can run the following command:

ALTER S3 CONNECTION my_s3_connection
    SET AWS_ROLE = 'arn:aws:iam::123456789012:role/new-upsolver-role'; 

Note that some options such as READ_ONLY cannot be altered once the connection has been created.
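
Other options remain mutable after the connection is created. As a sketch, assuming the COMMENT option can be modified on an existing connection, updating its description would look like this:

ALTER S3 CONNECTION my_s3_connection
    SET COMMENT = 'Connection used by the staging and ingestion jobs';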

Drop an Amazon S3 connection

If you no longer need a connection, you can easily drop it with the following SQL command:

DROP CONNECTION my_s3_connection; 

However, if existing tables or jobs are dependent upon the connection, the connection cannot be deleted.
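
In that case, drop the dependent objects first. As a sketch, assuming a hypothetical job named my_ingestion_job and a table named my_staging_table rely on the connection:

DROP JOB my_ingestion_job;
DROP TABLE my_staging_table;
DROP CONNECTION my_s3_connection;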


Learn More

To discover which connection options are mutable, and to learn more about each option, please see the SQL command reference for Amazon S3.
