Monitoring

This section describes the Monitoring tab for stream and file data sources.

Metric groups

Some metrics may not apply to the job type you created, or may be relevant only to particular data sources or target destinations. Metrics are shown in one of two groups:

  • Summary: Displays the metrics most relevant to your job.

  • All: Displays all metrics applicable to your job.

Job metrics fall into three categories:

Job Execution Status

The Job Execution Status metrics cover all jobs running on the cluster:

  • The number of currently running job executions.

  • The number of job executions pending in the queue.

  • The number of job executions completed today.

  • The number of job executions completed over the lifetime of the job.

  • The number of job executions that are waiting for a dependency to complete.

  • The number of job executions that encountered an error and are currently retrying.

  • The error message detailing why the job failed.
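
These counts describe where each execution sits in its lifecycle. As an illustration only (the states, fields, and function below are hypothetical and not part of the product), a minimal Python sketch of how such tallies roll up might look like this:

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date
from enum import Enum
from typing import Optional


class ExecutionState(Enum):
    """Hypothetical lifecycle states for a job execution."""
    PENDING = "pending"      # queued, waiting for a slot
    WAITING = "waiting"      # waiting for a dependency to complete
    RUNNING = "running"
    RETRYING = "retrying"    # hit an error and is currently retrying
    COMPLETED = "completed"
    FAILED = "failed"


@dataclass
class Execution:
    state: ExecutionState
    finished_on: Optional[date] = None
    error_message: Optional[str] = None


def execution_status_metrics(executions: list[Execution]) -> dict[str, object]:
    """Tally executions by state, mirroring the metrics listed above."""
    by_state = Counter(e.state for e in executions)
    completed = [e for e in executions if e.state is ExecutionState.COMPLETED]
    last_error = next(
        (e.error_message for e in reversed(executions) if e.error_message), None
    )
    return {
        "running": by_state[ExecutionState.RUNNING],
        "pending": by_state[ExecutionState.PENDING],
        "waiting_on_dependency": by_state[ExecutionState.WAITING],
        "retrying": by_state[ExecutionState.RETRYING],
        "completed_today": sum(e.finished_on == date.today() for e in completed),
        "completed_lifetime": len(completed),
        "last_error": last_error,
    }
```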

Data Scanned

The following metrics describe the data processed by your job:

  • The total number of rows scanned by completed executions today. This is a measure of rows that were processed successfully.

  • The number of rows that were filtered out because they did not pass the WHERE clause predicate defined in the job.

  • The average number of rows scanned per job execution.

  • The maximum number of rows scanned in a single job execution today.

  • The number of rows in the source table that have not yet been processed.

  • The number of files to load discovered by the job.

  • The number of bytes to load discovered in the source stream.

  • The number of items that failed to parse. This value is a lower bound, since malformed items may also corrupt subsequent items in the same file.

  • The number of rows written to the target by the job.

  • The number of rows that were filtered out because they did not pass the HAVING clause predicate defined in the job.

  • The number of rows that were filtered out because some or all of the partition columns were NULL or an empty string.

  • The number of rows that were filtered out because some or all of the primary key columns were NULL.

  • The size of the data written by the job.

  • The number of columns written to by the job. This value can change over time if the query uses * in the SELECT clause.

  • The number of sparse columns written to today. A sparse column is a column that appears in less than 0.01% of all rows.
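
Several of these metrics are related by simple row accounting: rows scanned split between rows dropped by the WHERE predicate, rows dropped by the HAVING predicate, rows dropped for NULL or empty partition or primary key columns, and rows written to the target, while a sparse column is one that appears in fewer than 0.01% of rows. The Python sketch below is illustrative only; the row representation, per-row predicate handling (HAVING normally applies after aggregation), and the sparsity denominator are assumptions, not the product's implementation:

```python
from typing import Callable, Iterable

Row = dict[str, object]  # assumed row representation: column name -> value


def data_scanned_metrics(
    rows: Iterable[Row],
    where: Callable[[Row], bool],       # stand-in for the job's WHERE predicate
    having: Callable[[Row], bool],      # stand-in for the job's HAVING predicate
    partition_columns: list[str],
    primary_key_columns: list[str],
) -> dict[str, int]:
    """Illustrative row accounting for the Data Scanned metrics."""
    scanned = filtered_where = filtered_having = 0
    null_partition = null_primary_key = written = 0
    column_counts: dict[str, int] = {}  # occurrences per column among written rows

    for row in rows:
        scanned += 1
        if not where(row):
            filtered_where += 1
            continue
        if not having(row):             # simplification: applied per row here
            filtered_having += 1
            continue
        if any(row.get(c) in (None, "") for c in partition_columns):
            null_partition += 1
            continue
        if any(row.get(c) is None for c in primary_key_columns):
            null_primary_key += 1
            continue
        written += 1
        for column in row:
            column_counts[column] = column_counts.get(column, 0) + 1

    # Sparse column: appears in fewer than 0.01% of all rows
    # (using scanned rows as the denominator is an assumption).
    sparse_columns = sum(
        1 for count in column_counts.values() if scanned and count / scanned < 0.0001
    )

    return {
        "rows_scanned": scanned,
        "rows_filtered_by_where": filtered_where,
        "rows_filtered_by_having": filtered_having,
        "rows_with_null_partition_columns": null_partition,
        "rows_with_null_primary_key": null_primary_key,
        "rows_written": written,
        "columns_written": len(column_counts),
        "sparse_columns": sparse_columns,
    }
```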

Cluster

The following metrics provide information about the cluster running your job:

  • Represents how much of the server's processing capacity is in use.

  • The number of job tasks pending execution in the cluster queue.

  • The percentage of time the server spends on garbage collection rather than on processing.

  • The percentage of bytes reloaded into memory from disk.

  • The number of server crashes in the job's cluster today.
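
The garbage collection and memory figures are percentages computed over a monitoring window. The sketch below is purely illustrative; the window length, field names, and the denominator used for paged-in bytes are assumptions, not the product's telemetry:

```python
def cluster_ratios(
    gc_millis: int,         # time spent in garbage collection during the window
    window_millis: int,     # total length of the monitoring window
    bytes_paged_in: int,    # bytes reloaded into memory from disk
    bytes_read: int,        # total bytes read during the window (assumed denominator)
) -> dict[str, float]:
    """Illustrative percentage calculations for the cluster metrics."""
    return {
        "gc_time_percent": 100.0 * gc_millis / window_millis if window_millis else 0.0,
        "paged_in_percent": 100.0 * bytes_paged_in / bytes_read if bytes_read else 0.0,
    }


# Example: 3 seconds of GC in a 60-second window -> 5% GC time.
print(cluster_ratios(gc_millis=3_000, window_millis=60_000,
                     bytes_paged_in=1_000_000, bytes_read=50_000_000))
```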
