Deployment guide

Learn how to deploy SQLake to a VPC in your AWS account

Prerequisite for integration

To properly deploy SQLake into your AWS account, you must have an AWS IAM user with permission to execute the actions listed below. You will use this IAM user to launch the CloudFormation stack created for you by Upsolver.

  • Create, launch, stop, and rollback CloudFormation stacks

  • Create and modify IAM users, roles, and policies

  • Create and modify S3 buckets and objects

  • Create and modify VPC, security groups, subnets, internet gateway, and route tables
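The permission categories above might be expressed as an IAM policy along these lines. This is a sketch only, not an exhaustive policy; the exact actions required are determined by the CloudFormation template Upsolver generates for you:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "CloudFormationStacks",
      "Effect": "Allow",
      "Action": [
        "cloudformation:CreateStack",
        "cloudformation:DeleteStack",
        "cloudformation:DescribeStacks",
        "cloudformation:DescribeStackEvents"
      ],
      "Resource": "*"
    },
    {
      "Sid": "IamS3AndNetworking",
      "Effect": "Allow",
      "Action": [
        "iam:CreateRole",
        "iam:PutRolePolicy",
        "iam:CreateInstanceProfile",
        "s3:CreateBucket",
        "s3:PutObject",
        "ec2:CreateVpc",
        "ec2:CreateSubnet",
        "ec2:CreateSecurityGroup",
        "ec2:CreateInternetGateway",
        "ec2:CreateRouteTable"
      ],
      "Resource": "*"
    }
  ]
}
```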

Required AWS resources

The following table lists the types and quantities of resources Upsolver will consume when deploying SQLake into your VPC. Ensure that your account's service quotas can accommodate these resources, or the installation may fail.

| Item | Required | Default Limit | Comments |
| --- | --- | --- | --- |
| CloudFormation Stack | At least 1; one is required for every VPC integration | 200 | - |
| IAM Role | 1 | 1000 | - |
| IAM Instance Profile | 1 | 1000 | - |
| S3 Bucket | 1 | 100 | - |
| VPC | 1 | 5 | - |
| Internet Gateway | 1 | 5 | - |
| EC2 Elastic IPs | 2 | 5 | - |
| EC2 Spot Instances | 2 | Dynamic | Instance types: r4.large, m5.2xlarge, m5d.2xlarge, m5a.2xlarge |
| Kinesis Stream Shard | 1 | 200 | Upsolver uses Kinesis Data Streams to balance workload between servers |
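As a quick sanity check before deploying, you can compare the requirements above against your account's remaining capacity. A minimal sketch follows; the usage numbers are placeholders, and in practice you would pull them from the AWS Service Quotas console or API:

```python
# Requirements and default limits from the table above (subset shown).
REQUIRED = {"VPC": 1, "Internet Gateway": 1, "EC2 Elastic IPs": 2, "S3 Bucket": 1}
DEFAULT_LIMIT = {"VPC": 5, "Internet Gateway": 5, "EC2 Elastic IPs": 5, "S3 Bucket": 100}

def at_risk(current_usage: dict) -> list:
    """Return resource types whose remaining headroom is below the requirement."""
    return [
        name
        for name, needed in REQUIRED.items()
        if DEFAULT_LIMIT[name] - current_usage.get(name, 0) < needed
    ]

# Example: an account already using 4 of its 5 Elastic IPs cannot fit 2 more.
print(at_risk({"VPC": 2, "EC2 Elastic IPs": 4}))  # ['EC2 Elastic IPs']
```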

AWS resources created during deployment

The following resources are created when you deploy SQLake to your AWS account:

  • UpsolverServerRole: This IAM role is attached to SQLake EC2 servers and provides access to read and write data from S3 and communicate with other AWS resources.

  • UpsolverManagementRole: This IAM role is attached to the management plane servers used by SQLake to create and manage data processing servers in your AWS account. It does not provide access to data; your data remains inaccessible from outside the account.

  • Default S3 Bucket: This bucket is used as the default output location for Upsolver jobs.

  • Kinesis Data Stream: Used for synchronization of pipeline operations across clusters and between data processing servers.

  • Elastic IP: An EIP used to access the API server.

See: AWS role permissions for a detailed description of the permissions given to each of the roles.

Deploy SQLake into your AWS Account

Step 1: Login to Upsolver Cloud to start your integration

To deploy SQLake in your AWS account, first sign up and log in to Upsolver Cloud. If you already have a SQLake account, log in at https://sqlake.upsolver.com; if not, sign up at https://sqlake.upsolver.com/signup. Once logged in, continue to the next step to initiate the integration.

Step 2: Integrating with your AWS account

If this is your first time signing in to SQLake, you will be greeted with the screen shown below. If this is not your first time, you can reach this screen by clicking the Upsolver icon in the upper left corner of the console.

Click on the Integrate with AWS option to start the integration process.

Step 3: Starting the integration

In the page that appears, choose the Start Integration option.

Step 4: Configure your target deployment

The form that appears asks you to provide information specific to your target environment. In most cases, the defaults can be used when deploying into a new VPC. If you are deploying into an existing VPC, you will need to provide your specific VPC details.

VPC CIDR

This is the range of IPv4 addresses that will be used for SQLake servers. You can leave the default, or change the CIDR block to meet your company's IP allocation guidelines or to avoid conflicts with other connected networks. Note that if you anticipate running large SQLake clusters, you will need enough IP addresses to support the expected number of nodes across all of your clusters.
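To verify that a candidate CIDR block provides enough addresses for your expected cluster sizes and does not collide with other connected networks, Python's standard ipaddress module can help. The specific CIDR values below are illustrative only:

```python
import ipaddress

# Candidate CIDR block for the SQLake VPC (illustrative value).
vpc = ipaddress.ip_network("10.0.0.0/16")
print(vpc.num_addresses)   # 65536 addresses in the block

# Check it does not overlap another connected network.
corp = ipaddress.ip_network("10.1.0.0/16")
print(vpc.overlaps(corp))  # False
```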

Ingress Traffic CIDR List

A list of IP addresses (in CIDR range format) from which SQLake (API server) will be reachable. We recommend restricting access to SQLake for users accessing it from your corporate network or specific VPC IP addresses.
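The effect of an ingress allow-list can be sketched as a simple membership check: a client IP is admitted only if it falls inside one of the listed CIDR ranges. The ranges below are illustrative documentation addresses, not real recommendations:

```python
import ipaddress

# Example ingress allow-list (CIDR ranges are illustrative).
allowed = [ipaddress.ip_network(c) for c in ("198.51.100.0/24", "203.0.113.0/24")]

def is_allowed(client_ip: str) -> bool:
    """True if the client IP falls inside any allowed CIDR range."""
    ip = ipaddress.ip_address(client_ip)
    return any(ip in net for net in allowed)

print(is_allowed("198.51.100.17"))  # True
print(is_allowed("192.0.2.9"))      # False
```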

Allow Upsolver access for support & troubleshooting

If you anticipate needing Upsolver’s help in troubleshooting and quickly resolving issues with your SQLake environment, consider enabling this option. Enabling it will give Upsolver support engineers access to your API server, allowing them to inspect your data pipelines, access servers, and change configuration parameters. Upsolver does not have access to your actual data, only the SQL resources you create (connections, jobs, schemas) and associated metadata.

If this option is disabled, Upsolver cannot access your SQLake environment (API server).

You can enable and disable this option at a later time if requirements change.

Region

Select the AWS region where SQLake will be deployed. This is the region where the CloudFormation stack will execute and all the SQLake resources will be deployed. You can only deploy into one region at a time. If you need to deploy SQLake into multiple regions, you will need to follow this guide multiple times, once per region.

Currently, Upsolver supports deploying into AWS public cloud regions only, not including China.

Deploy to an existing VPC

By default, when deploying SQLake, Upsolver will create a new VPC in your AWS account where all the resources will be deployed. You may choose to deploy SQLake into an existing VPC by enabling this option.

When enabled, you will need to enter the target VPC ID, which you can find in the VPC section of the AWS console.

After entering the VPC ID, enter the Subnets that the Upsolver servers will be deployed to. The Availability Zone and Subnet ID can be found in the Subnets section of the VPC console. You can either enter these into the deployment form manually or import them via a CSV file.
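If you prepare the subnet list as a CSV, a two-column file of Availability Zone and Subnet ID is a natural shape. Note this layout is an assumption for illustration; the exact format SQLake expects for import may differ:

```python
import csv
import io

# Hypothetical CSV layout: one "availability_zone,subnet_id" row per subnet.
data = """availability_zone,subnet_id
us-east-1a,subnet-0a1b2c3d4e5f60001
us-east-1b,subnet-0a1b2c3d4e5f60002
"""

subnets = list(csv.DictReader(io.StringIO(data)))
for row in subnets:
    print(row["availability_zone"], row["subnet_id"])
```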

After you complete making the necessary changes to the configuration screen, click Next.

Step 5: Review the CloudFormation stack in the AWS Console

Click the Launch CloudFormation button; it redirects you to the AWS Console in a new browser window. Log in with the IAM user you created as a prerequisite, which has the IAM permissions required to launch the stack.

Step 6: Launch the CloudFormation stack

The CloudFormation stack is now created, and ready to be reviewed in the AWS console.

Scroll to the bottom of the page and check the box I acknowledge that AWS CloudFormation might create IAM resources with custom names. Then click the Create stack button.

The CloudFormation stack will now begin to run. If the IAM user you are logged in as lacks the permissions needed to deploy all of the resources defined in the stack, you will receive a permission denied error; update the user's permissions and rerun the stack.

Step 7: Monitor CloudFormation Stack Progress

The CloudFormation console refreshes automatically to report progress. You can track the detailed progress of each stage on the Events tab; for quicker updates, refresh the page manually. You will see some events in progress and others that have completed.
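The failure-spotting you do by eye on the Events tab can be sketched as a filter over stack events. The event shape below mirrors what CloudFormation reports, simplified; in practice you would fetch events with the AWS CLI or SDK rather than hard-code them:

```python
# Simplified stack events, newest first, as shown on the Events tab.
events = [
    {"LogicalResourceId": "UpsolverServerRole", "ResourceStatus": "CREATE_COMPLETE"},
    {"LogicalResourceId": "DefaultBucket", "ResourceStatus": "CREATE_FAILED",
     "ResourceStatusReason": "API: s3:CreateBucket Access Denied"},
    {"LogicalResourceId": "Vpc", "ResourceStatus": "CREATE_IN_PROGRESS"},
]

def first_failure(events):
    """Return the first failed event, or None if everything is healthy."""
    for e in events:
        if e["ResourceStatus"].endswith("FAILED"):
            return e
    return None

failed = first_failure(events)
if failed:
    print(failed["LogicalResourceId"], "-", failed["ResourceStatusReason"])
```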


If any errors are encountered during the deployment process, you will see them here and the stack will be rolled back. All resources that were previously created by the stack will be deleted, so there is no need for you to manually clean up any deployment that did not succeed. After the rollback is marked complete, you can fix any errors and try again.

For assistance in troubleshooting any failed stack deployment, please reach out to Upsolver Support or leave a comment in our Slack channel.

Step 8: Finalizing SQLake deployment

After the stack has deployed successfully, switch back to the SQLake interface. Below is an example of a successful deployment.

Back in SQLake, you will be greeted by a message that the integration was successful, followed by a screen detailing the remaining steps in the deployment, such as installing the SQLake software and configuring the servers. This process should take 3-5 minutes to complete.

That’s it! You are now ready to start using SQLake within your own AWS account. You can begin building pipelines using your own data simply by browsing the template gallery and selecting one that matches the sources and targets you are looking for. Check out the Quickstarts and How-to guides to get started now.
