MySQL Binlog Retention

This article describes how the binlog keeps track of data modifications in MySQL.

What is a binary log file?

The binary log keeps track of all modifications made to the databases' data and structure. It comprises an index and a collection of binary log files. Binlog retention is the period for which a binary log is retained by MySQL, after that period the logs are deleted and are inaccessible.

Binlog delays

The engine binlog delay metric calculates the difference between the last committed event and the current time. This delay will increase if we didn't get any event in the data source, or we ingest events slower than the speed they are generated. Some possible causes might be:

  • There are no new events in the tracked tables

  • The cdc engine fails to start

  • The connection to the database is slow or throttled

Effects of a large binlog delay

To read the CDC in a MySQL database, Upsolver reads the binary log and parses the events.

If we experience delays we keep trying to read the older binlog files we were not able to read before. If the binlog files that the system is trying to read are deleted by MySQL (due to the binlog retention setting), the CDC input will become stuck and won't be able to advance and read further messages. The missing files can't be skipped since it will lead to an inconsistent state with missing, duplicate, and/or invalid events. The only way to recover from this situation is to re-create the Data Source.

Because of this ensuring the binlog delay doesn't exceed the binlog retention is essential for the stability and health of the Data Source and any outputs that depend on it.

How to configure binlog retention in your MySQL (from RDS)

Log in to the source database and run the following SQL statement to set the retention period of binlog:

call mysql.rds_set_configuration('binlog retention hours', n);

The value n indicates an integer from 1 to 168 (7 days).

Monitoring binlog delays

Connect your monitoring system to Upsolver via the monitoring reports page.

Use the upsolver.binlog-delay metric to monitor the binlog delay of your CDC Data Sources.

We recommend you configure an alert that will let you know if the binlog delay is reaching the binlog retention configured in MySQL.

Once the delay exceeds the retention the data source is no longer recoverable.

Additionally, you should have a metastore connection that can be used to create a staging table as well as a corresponding storage connection that can be used to store your table's underlying files.

Last updated