LogoLogo
OverviewQuickstartsHow To GuidesReferenceArticlesSupport
Quickstarts
Quickstarts
  • Quickstarts
  • DATA INGESTION WIZARD
    • Using the Wizard
      • Source Set-up
        • Amazon Kinesis
        • Amazon S3
        • Apache Kafka
        • Confluent Cloud
        • Microsoft SQL Server
        • MongoDB
        • MySQL
        • PostgreSQL
      • Target Set-up
        • Amazon Redshift
        • AWS Glue Data Catalog
        • ClickHouse
        • Polaris Catalog
        • Snowflake
      • Job Configuration
        • Job Configuration
        • Job Configuration for CDC
      • Review and Run Job
  • CONNECTORS
    • Connectors
      • Amazon Kinesis
      • Amazon Redshift
      • Amazon S3
      • Apache Kafka
      • AWS Glue Data Catalog
      • ClickHouse
      • Confluent Cloud
      • Elasticsearch
      • Microsoft SQL Server
      • MongoDB
      • MySQL
      • Polaris Catalog
      • PostgreSQL
      • Snowflake
  • JOBS
    • Ingestion
      • Job Basics
        • Ingest to a Staging Table
        • Output to a Target Table
      • Stream and File Sources
        • Amazon Kinesis
        • Amazon S3
        • Apache Kafka
        • Confluent Kafka
      • CDC Sources
        • Microsoft SQL Server
        • MongoDB
        • MySQL
        • PostgreSQL
    • Transformation
      • Updating Data
        • Upsert Data to a Target Table
        • Delete Data from a Target Table
        • Aggregate and Output Data
        • Join Two Data Streams
      • Data Targets
        • Output to Amazon Athena
        • Output to Amazon Redshift
        • Output to Amazon S3
        • Output to Elasticsearch
        • Output to Snowflake
  • APACHE ICEBERG
    • Optimize Your Iceberg Tables
    • Install the Iceberg Table Analyzer
Powered by GitBook
On this page
  • Create a CDC job that reads from your database
  • Ingest to the data lake
  • Job options
  • Alter a job
  • Drop a job that reads from SQL Server
  1. JOBS
  2. Ingestion
  3. CDC Sources

Microsoft SQL Server

This page describes how to ingest your data from Microsoft SQL Server.

Last updated 11 months ago

Prerequisites

Ensure you have a connection to your database and that it has been enabled for CDC.

Please read the guide to for more information.

Create a CDC job that reads from your database

You can create a job to ingest your data from SQL Server into a staging table in the data lake.

Jump to

Ingest to the data lake

After completing the prerequisites, you can create your staging tables. The example below creates a table without defining columns or data types, as these will be inferred automatically by Upsolver, though you can define columns if required:

CREATE TABLE default_glue_catalog.upsolver_samples.orders_raw_data()
    PARTITIONED BY $event_date;

Upsolver recommends partitioning by the system column $event_date or another date column within the data in order to optimize your query performance.

Next, create an ingestion job as follows:

CREATE JOB load_orders_raw_data_from_sqlserver
    COMMENT = 'Ingest from production SQL Server'
AS COPY FROM MSSQL my_mssql_connection
    INTO default_glue_catalog.upsolver_samples.orders_raw_data;

Job options

The example above only uses a small subset of all job options available when reading from SQL Server. Depending on your use case, you may want to configure a different set of options.

Transformations can be applied to your ingestion job, for example, to exclude columns, correct issues, or mask data, before it lands in the target. Furthermore, you can use expectations to define data quality rules on your data stream and take appropriate action.

Alter a job

You can alter some of the options of an existing job. For example, if you want to keep the job as is, but only change the cluster that is running the job, execute the following command:

ALTER JOB load_orders_raw_data_from_sqlserver 
    SET COMPUTE_CLUSTER = "my_new_cluster";

Note that some options such as COMPRESSION cannot be altered once the job has been created.

Drop a job that reads from SQL Server

If you no longer need a job, you can easily drop it using the following SQL command:

DROP JOB load_orders_raw_data_from_sqlserver;

Learn More

Please see the SQL command reference for for full details of the available job options and further examples.

Ingest to the data lake
Job options
Enable CDC on SQL Server
SQL Server
Microsoft SQL Server