Configure access to Amazon S3

This section covers how to configure an Amazon S3 connection in SQLake to read from and write to an Amazon S3 location managed by another AWS account.
To create the IAM role and its trust relationship, see the Role Based AWS Credentials documentation; then follow the steps below to create the IAM policy with the required Amazon S3 permissions.

Create an IAM policy with required Amazon S3 permissions

SQLake requires the following permissions:
s3:GetBucketLocation
s3:ListBucket
s3:GetObject
s3:GetObjectVersion
The following permissions are required to perform additional SQL actions:
s3:PutObject: Write data to the target location using COPY FROM, INSERT, and MERGE jobs
s3:DeleteObject: Enable table retention to delete old data
When creating an Amazon S3 connection in SQLake, you can include the PATH_DISPLAY_FILTERS property to restrict which Amazon S3 paths users can see in the SQLake navigation tree. However, this property does not limit a user's ability to read or write objects; object access is still governed by the permissions in the IAM role attached to the connection. Do not use this property to restrict access to data.
If the PATH_DISPLAY_FILTERS property is omitted, SQLake attempts to list all buckets in the account. The available buckets are listed in the SQLake navigation tree to make it easier for users to discover datasets. For this to function correctly, SQLake requires the IAM policy to include s3:ListAllMyBuckets.
If PATH_DISPLAY_FILTERS is included when creating the Amazon S3 connection, you do not need to add the s3:ListAllMyBuckets permission.
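The following is a minimal sketch of a connection that uses this property. PATH_DISPLAY_FILTERS is the property described above; the connection name, role ARN, external ID, bucket names, and prefix are placeholders, and the AWS_ROLE and EXTERNAL_ID property names are assumptions here, so confirm them against the SQLake connection reference before use.
CREATE S3 CONNECTION my_s3_connection
    -- IAM role that carries the policy created below; ARN and external ID are placeholders
    AWS_ROLE = 'arn:aws:iam::<ACCOUNT_ID>:role/<SQLAKE_ROLE>'
    EXTERNAL_ID = '<EXTERNAL_ID>'
    -- Limit the paths shown in the navigation tree (does not restrict object access)
    PATH_DISPLAY_FILTERS = ('s3://<BUCKET_1>/<PREFIX>/', 's3://<BUCKET_2>/<PREFIX>/');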
When creating the IAM policy, add the policy statements that allow SQLake to access the data in your Amazon S3 location:
Note: Make sure to replace <BUCKET_1>, <BUCKET_2>, and <PREFIX> with your actual bucket names and folder prefix.
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "s3:ListAllMyBuckets"
            ],
            "Resource": "*"
        },
        {
            "Effect": "Allow",
            "Action": [
                "s3:ListBucket",
                "s3:GetBucketLocation"
            ],
            "Resource": [
                "arn:aws:s3:::<BUCKET_1>",
                "arn:aws:s3:::<BUCKET_2>"
            ]
        },
        {
            "Effect": "Allow",
            "Action": [
                "s3:PutObject",
                "s3:DeleteObject",
                "s3:DeleteObjectVersion",
                "s3:GetObject",
                "s3:GetObjectVersion"
            ],
            "Resource": [
                "arn:aws:s3:::<BUCKET_1>/<PREFIX>/*",
                "arn:aws:s3:::<BUCKET_2>/<PREFIX>/*"
            ]
        }
    ]
}
The above policy allows SQLake jobs to read data from and write data to the listed buckets. If you only need a read-only connection, use the following policy instead:
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "s3:ListAllMyBuckets"
            ],
            "Resource": "*"
        },
        {
            "Effect": "Allow",
            "Action": [
                "s3:ListBucket",
                "s3:GetBucketLocation"
            ],
            "Resource": [
                "arn:aws:s3:::<BUCKET_1>",
                "arn:aws:s3:::<BUCKET_2>"
            ]
        },
        {
            "Effect": "Allow",
            "Action": [
                "s3:GetObject",
                "s3:GetObjectVersion"
            ],
            "Resource": [
                "arn:aws:s3:::<BUCKET_1>/<PREFIX>/*",
                "arn:aws:s3:::<BUCKET_2>/<PREFIX>/*"
            ]
        }
    ]
}
Note: When you create an Amazon S3 connection using a read-only IAM role, as shown above, and include a PATH_DISPLAY_FILTERS property to limit which paths are discoverable in the SQLake UI, you must also include the READ_ONLY = TRUE property. This tells SQLake that the IAM permissions do not include s3:PutObject, so it skips the write validation.
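As a sketch of how these properties combine, a connection backed by the read-only policy above might be created as follows. As before, the connection name, role ARN, external ID, bucket, and prefix are placeholders, and AWS_ROLE and EXTERNAL_ID are assumed property names; only PATH_DISPLAY_FILTERS and READ_ONLY are the properties described in this section.
CREATE S3 CONNECTION my_readonly_s3_connection
    -- IAM role that carries the read-only policy above
    AWS_ROLE = 'arn:aws:iam::<ACCOUNT_ID>:role/<SQLAKE_READONLY_ROLE>'
    EXTERNAL_ID = '<EXTERNAL_ID>'
    -- Limit discoverable paths in the SQLake UI
    PATH_DISPLAY_FILTERS = ('s3://<BUCKET_1>/<PREFIX>/')
    -- Tell SQLake the role has no s3:PutObject permission, so it skips write validation
    READ_ONLY = TRUE;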
To learn more about setting permissions for Amazon S3, see Policies and Permissions in Amazon S3.