Job Configuration
Step 1 - General job configuration
In the Name your job field, enter a name for the job if you want to replace the default name provided by Upsolver.
Decide which events to ingest:
Start from the earliest events: Upsolver fetches all events from your source.
Start from now: only new events fired after the job is launched will be ingested.
Then, set how often you want to update the target. By default, the job checks the source for new events every minute. You can set the target write interval in Minutes, Hours, or Days. Note that frequent writes keep the target up to date but may be costly, especially for Snowflake.
Upsolver offers one default cluster. If you are using the Premium Edition, you can create multiple clusters and assign dedicated compute resources per job. If you have the Premium Edition and want to change the default cluster, choose the cluster you want this job to run on from the Select a cluster list.
To keep events strongly ordered, Upsolver uses distributed locking technology. The ingestion timestamp is stored in a dedicated column reflecting the order. The column name is UPSOLVER_EVENT_TIME by default, but you can change it. Optionally, use the button to prevent Upsolver from creating this column.
Step 2 - Prevent duplicate events
Upsolver can use an event key made up of one or more event fields, e.g. a primary key, to identify duplicate events fired by the source and process them only once. This feature is not enabled by default. To enable it, select the Prevent duplicate events (allow deduplication) option. In the Select fields for deduplication key list, select one or more fields. Then set the Deduplication window interval in Minutes, Hours, or Days.
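The idea of a deduplication key combined with a deduplication window can be illustrated with a minimal Python sketch. This is not Upsolver's implementation: the function name deduplicate, the event_time field, and the rule that a key counts as a duplicate when it was last accepted less than one window ago are all assumptions made for illustration.

```python
from datetime import datetime, timedelta

def deduplicate(events, key_fields, window):
    """Yield events whose key was not accepted within the last `window`.

    Hypothetical helper: assumes events arrive in time order and carry
    an `event_time` datetime field.
    """
    last_seen = {}  # key tuple -> time the key was last accepted
    for event in events:
        key = tuple(event[field] for field in key_fields)
        ts = event["event_time"]
        previous = last_seen.get(key)
        if previous is not None and ts - previous < window:
            continue  # same key inside the window: treat as a duplicate
        last_seen[key] = ts
        yield event

events = [
    {"order_id": 1, "event_time": datetime(2024, 1, 1, 12, 0)},
    {"order_id": 1, "event_time": datetime(2024, 1, 1, 12, 5)},   # duplicate
    {"order_id": 2, "event_time": datetime(2024, 1, 1, 12, 6)},
    {"order_id": 1, "event_time": datetime(2024, 1, 1, 13, 30)},  # outside window
]
unique = list(deduplicate(events, ["order_id"], timedelta(hours=1)))
```

In this sketch, a longer window catches duplicates that arrive further apart, at the cost of remembering keys for longer.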
Step 3 - Schema configuration
Upsolver gives you control over the data ingested into your target. Sometimes you want to avoid ingesting sensitive data. For example, you may not need to replicate a customer's phone number. You can exclude fields from the results, mask field values, or check for NULL values:
Exclude: in the schema tree, deselect the fields you want to exclude from the ingestion.
Hash: hold your mouse over the field name row and, when the options are visible, click + Hash to hide the values in this column. Click again to undo this action.
Check for NULLs: automatically check for null values as events arrive by adding a data quality expectation to a column. Hold your mouse over the field name row and, when the options are visible, click + Set Expectation. Choose between Warn (record in system monitoring table) or Drop events to define how you want Upsolver to handle NULL values. To add or customize an expectation with additional conditions, navigate to the next screen in the Wizard and choose Edit in Worksheet.
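The three options above can be pictured as a per-event transformation. The following Python sketch is illustrative only: apply_schema_rules is a hypothetical helper, SHA-256 is an assumed hash function (this document does not state which one Upsolver uses), and the "warn" and "drop" actions only loosely mirror the Warn and Drop choices in the wizard.

```python
import hashlib

def apply_schema_rules(event, exclude=(), hash_fields=(), not_null=None):
    """Return (row, warnings), or (None, warnings) if the event is dropped.

    `not_null` maps a field name to "warn" or "drop" (assumed names).
    """
    not_null = not_null or {}
    warnings = []
    # Check for NULLs first so a dropped event is never transformed.
    for field, action in not_null.items():
        if event.get(field) is None:
            if action == "drop":
                return None, warnings
            warnings.append(f"{field} is NULL")
    row = {}
    for field, value in event.items():
        if field in exclude:
            continue  # excluded fields never reach the target
        if field in hash_fields and value is not None:
            # Replace the value with its hex digest to mask it.
            value = hashlib.sha256(str(value).encode("utf-8")).hexdigest()
        row[field] = value
    return row, warnings

event = {"customer_id": 7, "phone": "555-0100",
         "email": "a@example.com", "country": None}
row, warnings = apply_schema_rules(
    event, exclude={"phone"}, hash_fields={"email"},
    not_null={"country": "warn"},
)
dropped, _ = apply_schema_rules(event, not_null={"country": "drop"})
```

Here the phone field is left out of the row, the email value is replaced by a digest, and the NULL country either produces a warning or causes the whole event to be dropped, depending on the chosen action.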
Schema evolution
The target is automatically updated when the schema changes at the source. Upsolver detects newly added fields and columns, updates the target schema, and ingests the values.
When a column is deleted from a source database table, the target schema does not change. New events will have a NULL value for the deleted column.
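The behavior described above amounts to an additive-only schema: new fields are appended, and existing columns are never removed. A minimal Python sketch, using the hypothetical helpers evolve_schema and to_row rather than any Upsolver API, could look like this:

```python
def evolve_schema(target_columns, event):
    """Append fields that are new in `event`; never remove a column."""
    for field in event:
        if field not in target_columns:
            target_columns.append(field)
    return target_columns

def to_row(target_columns, event):
    """Build a target row; columns missing from the event become None (NULL)."""
    return {column: event.get(column) for column in target_columns}

columns = ["id", "name"]

# The source adds a "plan" field: the target schema grows to include it.
first = {"id": 1, "name": "a", "plan": "pro"}
evolve_schema(columns, first)
row1 = to_row(columns, first)

# The source later drops "name": the column stays and is filled with NULL.
second = {"id": 2, "plan": "free"}
evolve_schema(columns, second)
row2 = to_row(columns, second)
```

After both events, the target columns are id, name, and plan, and the second row carries a NULL in the name column.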