Job configuration guide

Step 1 - General job configuration

Decide which events to ingest. When you choose 'Start from the earliest events', Upsolver fetches all events from your source. If you choose 'Start from now', only new events fired after the job is launched will be ingested.
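Conceptually, the two modes differ only in which events pass the job's starting point. A minimal Python sketch of that behavior (the function and field names here are illustrative, not Upsolver APIs; the real setting lives in the Wizard):

```python
from datetime import datetime, timezone

def select_events(events, start_from, job_launch_time):
    """Illustrative only: which source events a job ingests per mode.

    'beginning' -> every event already in the source;
    'now'       -> only events fired after the job was launched.
    """
    if start_from == "beginning":
        return list(events)
    return [e for e in events if e["fired_at"] > job_launch_time]

launch = datetime(2024, 1, 1, tzinfo=timezone.utc)
events = [
    {"id": 1, "fired_at": datetime(2023, 12, 31, tzinfo=timezone.utc)},
    {"id": 2, "fired_at": datetime(2024, 1, 2, tzinfo=timezone.utc)},
]
print([e["id"] for e in select_events(events, "beginning", launch)])  # [1, 2]
print([e["id"] for e in select_events(events, "now", launch)])        # [2]
```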

Set how often you want to update the target. The job checks the source for new events every minute, and you can set the target write interval in Minutes, Hours, or Days.

Frequent writes keep the target up to date but can be costly, especially on targets such as Snowflake, where each write consumes warehouse time.
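The cost trade-off is easy to quantify: the write interval directly determines how many target writes the job issues per day. A quick back-of-the-envelope calculation:

```python
# Rough cost intuition: number of target writes per day for a given interval.
def writes_per_day(interval_minutes):
    return 24 * 60 // interval_minutes

print(writes_per_day(1))     # every minute -> 1440 writes/day
print(writes_per_day(60))    # hourly       -> 24 writes/day
print(writes_per_day(1440))  # daily        -> 1 write/day
```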

Select a cluster for ingestion. Upsolver offers one default cluster. The Premium Edition allows you to create multiple clusters and assign dedicated compute resources per job.

Upsolver uses distributed locking to keep events strongly ordered. The ingestion timestamp is stored in a dedicated column that reflects this order. By default the column is named UPSOLVER_EVENT_TIME, but you can change it.
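In effect, each ingested row carries a timestamp column that lets you recover the ingestion order by sorting. A minimal sketch of the idea (the `stamp` helper is hypothetical; only the default column name comes from the documentation):

```python
from datetime import datetime, timezone

def stamp(event, column="UPSOLVER_EVENT_TIME"):
    """Attach the ingestion timestamp under the (configurable) column name."""
    stamped = dict(event)
    stamped[column] = datetime.now(timezone.utc)
    return stamped

rows = [stamp({"id": i}) for i in range(3)]
# Sorting by the timestamp column reproduces the ingestion order.
assert rows == sorted(rows, key=lambda r: r["UPSOLVER_EVENT_TIME"])
```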

Step 2 - Prevent duplicate events

Select fields for deduplication. Upsolver builds an event key from the fields you choose, for example a primary key, identifies duplicate events fired by the source, and processes each event only once.
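The deduplication step can be sketched as keeping the first event seen for each key and skipping later duplicates (a simplified model; Upsolver applies this at scale during ingestion):

```python
def deduplicate(events, key_fields):
    """Keep the first event per key; skip later duplicates."""
    seen, unique = set(), []
    for e in events:
        key = tuple(e[f] for f in key_fields)
        if key not in seen:
            seen.add(key)
            unique.append(e)
    return unique

orders = [
    {"order_id": 7, "status": "new"},
    {"order_id": 7, "status": "new"},   # duplicate fire from the source
    {"order_id": 8, "status": "new"},
]
print([e["order_id"] for e in deduplicate(orders, ["order_id"])])  # [7, 8]
```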

Step 3 - Schema configuration

Upsolver gives you control over the data ingested into your target. Sometimes you want to avoid ingesting sensitive data; for example, you may not need to replicate a customer's phone number. You can exclude fields from the results or mask field values.
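Excluding drops a field entirely; masking keeps the column but replaces its value with an irreversible transformation. A hedged sketch of the two options, using a SHA-256 hash as one possible masking function (the function and parameter names are illustrative, not Upsolver's API):

```python
import hashlib

def shape_event(event, exclude=(), mask=()):
    """Drop excluded fields; replace masked fields with a SHA-256 digest."""
    out = {}
    for field, value in event.items():
        if field in exclude:
            continue                      # sensitive field never reaches the target
        if field in mask:
            value = hashlib.sha256(str(value).encode()).hexdigest()
        out[field] = value
    return out

row = shape_event(
    {"customer_id": 42, "phone": "555-0100", "email": "a@example.com"},
    exclude=("phone",),
    mask=("email",),
)
print(row.keys())          # phone is gone, email is hashed
```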

Automatically check for null values as events arrive by adding a data quality expectation to a column. Choose Warn (report violations to system tables) or Drop to define how Upsolver handles events with null values. To add or customize an expectation with additional conditions, navigate to the next screen in the Wizard and choose Edit in Worksheet.
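The two actions can be modeled simply: both count the violation, but only Drop removes the offending event from the stream. A minimal sketch (hypothetical helper; in Upsolver, Warn violations are surfaced in system tables):

```python
def apply_expectation(events, column, action="warn"):
    """Check each event for a null in `column`.

    'warn' keeps the event and counts the violation;
    'drop' discards the event and counts the violation.
    """
    kept, violations = [], 0
    for e in events:
        if e.get(column) is None:
            violations += 1
            if action == "drop":
                continue
        kept.append(e)
    return kept, violations

batch = [{"email": "a@x.com"}, {"email": None}]
print(apply_expectation(batch, "email", "warn"))  # keeps both, 1 violation
print(apply_expectation(batch, "email", "drop"))  # keeps one,  1 violation
```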

Schema evolution

The target is automatically updated when the schema changes at the source. Upsolver detects newly added fields and columns, updates the target schema, and ingests the values.

When a column is deleted from a source database table, the target schema does not change. New events will have a null value for the deleted column.
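The two evolution rules above, add new columns but never drop existing ones, can be sketched as follows (illustrative model only; Upsolver applies this automatically to the target schema):

```python
def evolve(target_columns, event):
    """Add target columns for newly seen fields; never drop columns.

    Fields absent from the event (e.g. deleted at the source)
    come through as null.
    """
    for field in event:
        if field not in target_columns:
            target_columns.append(field)
    return {col: event.get(col) for col in target_columns}

columns = ["id", "name"]
row1 = evolve(columns, {"id": 1, "name": "a", "tier": "gold"})  # 'tier' is new
row2 = evolve(columns, {"id": 2})                               # 'name' deleted at source
print(columns)  # ['id', 'name', 'tier']
print(row2)     # {'id': 2, 'name': None, 'tier': None}
```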
