Links

Output to a Snowflake table

Prerequisites

Ensure that you have a Snowflake connection with the correct permissions to write to your target table. Additionally, this target table should already exist within Snowflake before writing to it using SQLake.
You also need a storage connection that has access to the bucket you would like the job to use to store the intermediate files used while running the job.
Finally, you should also have a staging table created previously that contains your ingested data.
For more information regarding ingesting your data into a staging table, see: Ingestion jobs

Create a job writing to Snowflake

Once you have fulfilled the prerequisites, you can create an INSERT job as follows:
CREATE SYNC JOB load_data_to_snowflake
RUN_INTERVAL = 1 MINUTE
START_FROM = BEGINNING
COMMENT = 'insert data into snowflake target table'
AS INSERT INTO SNOWFLAKE your_snowflake_connection.your_schema.target_table MAP_COLUMNS_BY_NAME
SELECT customer_id,
COUNT(DISTINCT order_id) AS num_orders,
SUM(net_total) AS total_spent,
MIN(order_date) AS first_purchase,
MAX(order_date) AS last_purchase
FROM default_glue_catalog.your_database.staging_table
WHERE time_filter()
GROUP BY customer_id;
This example only uses a subset of all job options available when writing to Snowflake.
Depending on your use case, you may want to configure a different set of options. For instance, this example contains an aggregation, which means you may want to configure the AGGREGATION_PARALLELISM option.
For the full list of job options with syntax and detailed descriptions, see: Snowflake transformation job options
For more details regarding the INSERT command syntax, see: INSERT

Alter a job writing to Snowflake

Certain job options are considered mutable, meaning that in some cases, you can run a SQL command to alter an existing transformation job rather than having to create a new one.
For example, take the job we created as an example earlier:
CREATE SYNC JOB load_data_to_snowflake
RUN_INTERVAL = 1 MINUTE
START_FROM = BEGINNING
COMMENT = 'insert data into snowflake target table'
AS INSERT INTO SNOWFLAKE your_snowflake_connection.your_schema.target_table MAP_COLUMNS_BY_NAME
SELECT customer_id,
COUNT(DISTINCT order_id) AS num_orders,
SUM(net_total) AS total_spent,
MIN(order_date) AS first_purchase,
MAX(order_date) AS last_purchase
FROM default_glue_catalog.your_database.staging_table
WHERE time_filter()
GROUP BY customer_id;
If you wanted to keep the job as is but just change the cluster that is running the job, you can run the following command:
ALTER JOB load_data_to_snowflake
SET COMPUTE_CLUSTER = high_memory_cluster;
Note that some options such as RUN_INTERVAL cannot be altered once the connection has been created.
To check which specific job options are mutable, see: Snowflake transformation job options

Drop a job writing to Snowflake

If you no longer need a certain job, you can easily drop it with the following SQL command:
DROP JOB load_data_to_snowflake;
For the full list of job options with descriptions, see: DROP JOB