Monitoring

Drilling into a specific job from the Jobs page will take you to the job monitoring dashboard. Here you will find a list of job executions.

Each job execution represents an interval of data that the job is processing. You can track the latest job executions that have completed, as well as those that are still running.

A job execution can be in any of the following statuses:

| Status Name | Status Description |
| --- | --- |
| Scheduled | All previous executions have completed and the next execution is scheduled to run according to the job's schedule. |
| In-progress | The execution is currently running. |
| Completed | The execution has completed. |
| Failed (Retrying) | The execution is experiencing errors that prevent it from progressing. Thanks to the built-in retry mechanism, Upsolver automatically retries the execution from the exact point where the error occurred. |
| Pending Dependency | If the job is a transformation job created as a SYNC job (i.e., it depends on preceding jobs), an execution that processes a specific data range remains in Pending Dependency until that data range has been processed by the preceding jobs. You can check the job's dependencies in the Lineage tab. |
| Paused | If the cluster is paused, active executions have a status of Paused. |
| Writing Paused | If the job's writes were paused, the execution status will be Writing Paused. The execution still reads and processes data, but does not write it to the target. |

The following KPIs can be tracked for the execution:

| Column Name | Description |
| --- | --- |
| Status | The current status of the execution. |
| Data Start Time | The start time of the data range processed by the execution, in local time. The end time of the data range is calculated by adding the Data Range to the Data Start Time (see the example following this table). |
| Data Range | The time range processed by the execution, e.g. 15 minutes. This range defines the period during which events are processed, starting from the Data Start Time. |
| Bytes Discovered\Read | Ingestion jobs: the total size of the discovered source data. Transformation jobs that read from lake tables: the total size of the source data read. |
| Files\Messages Discovered | Amazon S3 sources: the total number of source files discovered. Kafka and Kinesis sources: the total number of source messages discovered. |
| Rows Parsed\Scanned | Ingestion jobs: the total number of rows\messages successfully parsed from the source. The count may be lower than the number of discovered messages if some messages fail to parse, and may exceed it if a message contains a batch of rows. Transformation jobs: the number of rows read from the source table and processed by the execution. |
| Rows Written | The number of rows that have been successfully written and committed to the target. |
| Alerts | Displays real-time errors currently occurring during the job execution, including parse errors and data rejection errors. |
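To make the relationship between Data Start Time and Data Range concrete, the minimal sketch below computes the end of a data range from those two values. This is an illustration only; the variable names and timestamps are assumptions made for the example, not part of any Upsolver API.

```python
from datetime import datetime, timedelta

# Illustrative values only; in the dashboard these appear as the
# Data Start Time and Data Range columns of an execution.
data_start_time = datetime(2024, 1, 1, 10, 0)  # assumed example timestamp
data_range = timedelta(minutes=15)             # e.g. a 15-minute data range

# The end of the data range is the start time plus the range.
data_end_time = data_start_time + data_range
print(data_end_time)  # 2024-01-01 10:15:00
```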

Principles

  1. The data range (from data start time to data end time) represents the span of data processed by the job execution:

    • The dates correspond to $event_time, which reflects the increment date configured for your job.

    • The range size (e.g. 1 minute, 15 minutes, etc.) is determined by the job interval: COMMIT_INTERVAL when available, otherwise RUN_INTERVAL.

  2. Data ranges can be processed concurrently: newer data ranges may begin processing while executions for previous ranges are still running. This can happen in two scenarios:

    • Backlog of History: the job was created to start reading data from the earliest available time. Processing this historical backlog runs job executions in parallel, with the size of each execution determined by the job's COMMIT_INTERVAL.

    • Job Backlog: processing a single execution takes longer than the time span between executions. For example, a job scheduled to run every minute may take longer than a minute to process a particular minute of data, so newer minutes start processing in parallel.

  3. While executions can proceed concurrently, data commits to the target system are always sequential. Data ranges are loaded in order, ensuring that a newer data range is never loaded before an older one. The parallel processing applies to all operations preceding the commit phase, as illustrated in the sketch following this list.
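The sketch below is a minimal, self-contained simulation of the three points above; it is not Upsolver code. It splits a backlog into data ranges sized by an assumed job interval (standing in for COMMIT_INTERVAL or RUN_INTERVAL), processes those ranges concurrently, and "commits" them strictly in order. All names and values are assumptions made for the illustration.

```python
from datetime import datetime, timedelta
from concurrent.futures import ThreadPoolExecutor
import time

# Assumed values for the illustration only; in a real job these come from
# the job definition (COMMIT_INTERVAL when available, otherwise RUN_INTERVAL).
interval = timedelta(minutes=15)
backlog_start = datetime(2024, 1, 1, 0, 0)
backlog_end = datetime(2024, 1, 1, 1, 0)

def data_ranges(start, end, step):
    """Split the backlog into consecutive data ranges of size `step`."""
    current = start
    while current < end:
        yield (current, current + step)
        current += step

def process(data_range):
    """Stand-in for reading and transforming one data range (runs in parallel)."""
    time.sleep(0.1)  # pretend the work takes a while
    return data_range

ranges = list(data_ranges(backlog_start, backlog_end, interval))

# Executions run concurrently...
with ThreadPoolExecutor(max_workers=4) as pool:
    results = pool.map(process, ranges)  # map() yields results in input order

# ...but commits happen strictly in data-range order: a newer range is never
# committed before an older one.
for start, end in results:
    print(f"committed data range {start:%H:%M} - {end:%H:%M}")
```

The ordering guarantee in the sketch comes from `Executor.map`, which returns results in the order of its inputs: even though the ranges are processed in parallel, the commit loop always runs from the oldest range to the newest, mirroring the sequential commit behaviour described above.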
