Export metrics to a third-party monitoring system

This article shows you how to export your job metrics to a third-party system for continuous monitoring.

Monitoring data plays a crucial role in ensuring the reliability and performance of your data pipelines. Some of the key advantages of exporting your monitoring data to a third-party system include:

  • Centralized visibility: by integrating with third-party monitoring systems, you can centralize your monitoring data in one location. This simplifies the monitoring and troubleshooting process because you don't need to switch between different tools and dashboards to gain insights into your data pipelines and systems.

  • Customization: third-party monitoring systems often offer advanced visualization and reporting capabilities. You can customize dashboards and alerts to suit your specific monitoring needs, enabling you to focus on the most critical metrics and anomalies for your business.

  • Alerting and notifications: many third-party monitoring tools provide robust alerting and notification features. You can create alerts based on predefined thresholds or anomalies in your data, ensuring you are promptly informed of issues requiring attention.

Upsolver exposes numerous metrics within the system monitoring tables that you can export to a supported platform.
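Before choosing which metrics to export, it can help to look at what the monitoring tables contain. The queries below are a minimal sketch: the table names are the ones used later in this article, while the ability to run an ad hoc SELECT against them from a worksheet is an assumption about your environment.

```sql
// Preview the job-level metrics (assumes ad hoc queries on system tables are available)
SELECT * FROM system.monitoring.jobs;

// Preview the cluster-level metrics
SELECT * FROM system.monitoring.clusters;
```

The column names returned by these queries are the candidates for the SELECT list of an export job.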

Supported Integrations

This example uses Amazon CloudWatch, but the process is very similar for other integrations. See the Job Monitoring page for other options.

Step 1

Connect to your target

The first step is to create a connection to your target monitoring platform where you will send the metrics data for your Upsolver account.

Here's the code:

// Create a connection to Amazon CloudWatch
CREATE CLOUDWATCH CONNECTION my_cloudwatch_connection
    AWS_ROLE = 'arn:aws:iam::123456789012:role/upsolver-sqlake-role'
    REGION = 'us-east-1'
    COMMENT = 'CloudWatch connection for Upsolver metrics';

This script creates a new CloudWatch connection named my_cloudwatch_connection, which we will use to send our metric data. Like any other connection you create in Upsolver, the connection is persistent, so you only need to create it once.

Step 2

Create a job to export job metrics

Next, let's create a job named send_job_metrics_to_cloudwatch that will send data from the system.monitoring.jobs table to the upsolver namespace in CloudWatch, using the connection that we created above:

CREATE JOB send_job_metrics_to_cloudwatch
    START_FROM = NOW
AS INSERT INTO my_cloudwatch_connection
    NAMESPACE = 'upsolver'
    MAP_COLUMNS_BY_NAME
    SELECT
        rows_pending_processing AS rows_pending_processing,
        parse_errors_today AS parse_errors_today,
        tasks_failing_to_load AS tasks_failing_to_load,
        queued_executions AS queued_executions,
        executions_retrying_after_failure AS executions_retrying_after_failure,
        avg_rows_scanned_per_execution_today AS avg_rows_scanned_per_execution_today,
        job_id AS tags.job_id,
        job_name AS tags.job_name,
        RUN_START_TIME() AS time
    FROM system.monitoring.jobs;

The send_job_metrics_to_cloudwatch job includes the job_id and job_name tags to ensure the names and IDs of your jobs are sent to CloudWatch, which is essential when you have multiple jobs running in your organization.

Upsolver creates the send_job_metrics_to_cloudwatch job and immediately begins sending metrics on your jobs to CloudWatch.

If the upsolver namespace does not exist, Upsolver will create it for you.

Step 3

Create a job to export cluster metrics

Now, let's create another job named send_cluster_metrics_to_cloudwatch that will send data from the system.monitoring.clusters table to the same upsolver namespace:

CREATE JOB send_cluster_metrics_to_cloudwatch
    START_FROM = NOW
AS INSERT INTO my_cloudwatch_connection
    NAMESPACE = 'upsolver'
    MAP_COLUMNS_BY_NAME
    SELECT
        utilization_percent AS utilization_percent,
        tasks_in_queue AS tasks_in_queue,
        memory_load_percent AS memory_load_percent,
        cluster_id AS tags.cluster_id,
        cluster_name AS tags.cluster_name,
        RUN_START_TIME() AS time
    FROM system.monitoring.clusters;

The send_cluster_metrics_to_cloudwatch job includes the cluster_id and cluster_name tags to ensure the name and ID of each cluster are sent to CloudWatch, which is helpful if you have more than one cluster in your organization.

As above, Upsolver starts sending the cluster metrics to CloudWatch immediately after the job is created, and the job runs continuously until you delete it.
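If you later want to stop exporting metrics, deleting the export job ends the flow of data. The statement below is a minimal sketch that assumes the DROP JOB statement is available in your environment; the job name is the one created in this step.

```sql
// Stop exporting cluster metrics by deleting the export job
DROP JOB send_cluster_metrics_to_cloudwatch;
```

The connection my_cloudwatch_connection is a separate object, so it should remain available for any other jobs that use it.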

Conclusion

In this how-to guide, you learned how to create jobs in Upsolver that send metric data to your CloudWatch account so that you can monitor your jobs and clusters. By integrating metrics from Upsolver into CloudWatch, you can use a central location to gain visibility into multiple operations within your organization and help ensure optimal performance and uptime.

Try it yourself

To monitor your Upsolver jobs from a third-party monitoring system:

  1. Create a connection to your monitoring system.

  2. Create one or more jobs to select metrics from the system tables and write them to your monitoring system.
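The two steps above can be sketched as a single script. It reuses only the syntax shown earlier in this article for CloudWatch; the names my_monitoring_connection and export_pending_rows are hypothetical, and the role ARN and region are placeholders to replace with your own values.

```sql
// Step 1: create the connection once (hypothetical name, placeholder role and region)
CREATE CLOUDWATCH CONNECTION my_monitoring_connection
    AWS_ROLE = 'arn:aws:iam::123456789012:role/upsolver-sqlake-role'
    REGION = 'us-east-1'
    COMMENT = 'Connection for exporting Upsolver metrics';

// Step 2: export a single job metric, tagged with the job name
CREATE JOB export_pending_rows
    START_FROM = NOW
AS INSERT INTO my_monitoring_connection
    NAMESPACE = 'upsolver'
    MAP_COLUMNS_BY_NAME
    SELECT
        rows_pending_processing AS rows_pending_processing,
        job_name AS tags.job_name,
        RUN_START_TIME() AS time
    FROM system.monitoring.jobs;
```

Starting with one metric keeps the first dashboard simple; further columns from the monitoring tables can be added to the SELECT list in the same way. For a monitoring platform other than CloudWatch, the connection statement will differ, so check the Job Monitoring page for the supported options.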
