MySQL Replication Best Practices

Paul Namuag

MySQL Replication has been the most common and widely used solution for high availability by huge organizations such as Github, Twitter, Facebook, etc. Although easy to setup, there are ultimate challenges that are faced when using this solution from doing maintenance such as software upgrades, data drift or data inconsistency across the replica nodes, topology changes, failover, and recovery. When MySQL released version 5.6, it brought a number of significant enhancements especially to replication which includes Global Transaction IDs (GTIDs), event checksums, multi-threaded slaves, and crash-safe slaves/masters. Replication got even better with MySQL 5.7 and MySQL 8.0.

It enables data from one MySQL server (the primary/master) to be replicated to one or more MySQL servers (the replica/slaves). MySQL Replication is very easy to setup, and is used to scale out read workloads, provide high availability and geographic redundancy, and offload backups and analytic jobs.

MySQL Replication in Nature

Let's have a quick overview of how MySQL Replication works in nature. MySQL Replication is broad and there are multiple ways to configure it and how it can be used. By default, it uses asynchronous replication which works as the transaction is completed on the local environment, there is no guarantee that any event will ever reach any slave. It is a loosely coupled master-slave relationship, where:

  • Primary does not wait for a replica.

  • Replica determines how much to read and from which point in the binary log.

  • Replica can be arbitrarily behind master in reading or applying changes.

If the primary crashes, transactions that it has committed might not have possibly been transmitted to any replica. Consequently, failover from primary to the most advanced replica, in this case, may result in a failover to the desired primary that is actually missing transactions relative to the previous server.

Asynchronous replication provides lower write latency since a write is acknowledged locally by a master before being written to slaves. It is great for read-scaling as adding more replicas does not impact replication latency. Good use cases for asynchronous replication include deployment of reading replicas for read-scaling, live backup copy for disaster recovery, and analytics/reporting.

As semi-synchronous

MySQL also supports semi-synchronous replication, where the master does not confirm transactions to the client until at least one slave has copied the change to its relay log, and flushed it to disk. To enable semi-synchronous replication, extra steps for plugin installation are required and must be enabled on the designated MySQL master and slave nodes.

Semi-synchronous seems to be a good and practical solution for many cases where high availability and no data loss are important. But you should consider that semi-synchronous has a performance impact due to the additional round trip and does not provide strong guarantees against data loss. When a commit returns successfully, it is known that the data exists in at least two places (on the master and at least one slave). If the master commits but a crash occurs while the master is waiting for acknowledgment from a slave, it is possible that the transaction may not have reached any slave. This is not that big of an issue as the commit will not be returned to the application in this case. It is the application’s task to retry the transaction in the future. What is important to keep in mind is that, when the master fails and a slave has been promoted, the old master cannot join the replication chain. Under some circumstances, this may lead to conflicts with data on the slaves (when the master crashed after the slave received the binary log event but before the master got the acknowledgment from the slave). Thus the only safe way is to discard the data on the old master and provision it from scratch using the data from the newly promoted master.

Practice makes perfect

We have digressed a bit on how MySQL Replication works by its nature. We have to deal with its imperfection and how it has to be applied to make your MySQL database environment tuned according to your needs. In this blog, we will take a look at the areas that MySQL database experts tend to manage and apply the best practices that should have to deal with on our database systems especially when it is in the production environment.

Using the replication format incorrectly

Since MySQL 5.7.7, the default binary log format or binlog_format variable uses ROW, which was STATEMENT prior to 5.7.7. The different replication formats correspond to the method used when the events are recorded in the source's binary log. Replication works because events written to the binary log are read from the source and then processed on the replica. The events are recorded within the binary log in different replication formats according to the type of event. Not knowing what to use for sure can be a problem. MySQL has three different formats of replication methods to use: STATEMENT, ROW, and MIXED.

  • The STATEMENT-based replication (SBR) format is exactly what it is: a replication stream of every statement run on the master that will be replayed on the slave node. By default, MySQL traditional (asynchronous) replication does not execute the replicated transactions to the slaves in parallel. By that, it means that the order of statements in the replication stream may not be 100% the same. Also replaying a statement may give different results when not executed at the exact same time as when it was executed from the source. This leads to an inconsistent state against the primary and its replica(s). This wasn’t an issue for many years, as not many ran MySQL with many simultaneous threads, but with modern multi-CPU architectures, this actually has become highly probable on a normal day-to-day workload.

  • The ROW replication format provides solutions that the SBR lacks. When using row-based replication (RBR) logging format, the source writes events to the binary log that indicate how individual table rows are changed. Replication of the source to the replica works by copying the events representing the changes to the table rows to the replica. This means that more data can be generated, it may affect disk space in the replica and can also affect network traffic and disk I/O. Consider if a statement changes many rows, let's say with an UPDATE statement, RBR writes more data to the binary log even for statements that are rolled back. Running point-in-time snapshots can take more time as well. Concurrency problems may come into play given the lock times needed to write large chunks of data into the binary log.

  • Then there is a method in between these two: mixed mode replication. This type of replication will always replicate statements, except when the query contains the UUID() function, triggers, stored procedures, UDFs, and a few other exceptions are used. Mixed mode will not solve the issue of data drift and, together with statement-based replication, should be avoided.

Planning to have a multi-master setup?

Circular Replication (also known as ring topology) is a known and common setup for MySQL replication. It is known and practiced if you are running a multi-master setup (see image below). It is often necessary if you have a multi-datacenter environment. Since the application can’t wait for the master in the other datacenter to acknowledge the writes, a local master is preferred. Normally the auto-increment offset is used to prevent data clashes between the masters. Having two masters perform writes to each other in this way is a broadly accepted solution.

However, if you need to write in multiple datacenters into the same database, you end up with multiple masters that need to write their data to each other. Before MySQL 5.7.6 there was no method to do a mesh type of replication, so the alternative would be to use a circular ring replication instead.

Ring replication in MySQL is problematic for the following reasons: latency, high availability, and data drift. Writing some data to server A, it would take three hops to end up on server D (via server B and C). Since (traditional) MySQL replication is single-threaded, any long-running query in the replication may stall the whole ring. Also, if any of the servers would go down, the ring would be broken and currently, there is no failover software that can repair ring structures. Then data drift may occur when data is written to server A and is altered at the same time on server C or D.

In general, circular replication is not a good fit with MySQL and it should be avoided at all costs. Galera Cluster would be a good alternative for multi-datacenter writes, as it has been designed with that in mind.

Stalling your Replication with Large Updates

Often, various housekeeping batch jobs will perform various tasks, ranging from cleaning up old data to calculating averages of ‘likes’ fetched from another source. This means at set intervals, a job will create a lot of database activity and, most likely, write a lot of data back to the database. Naturally, this means the activity within the replication stream will increase equally.

Statement-based replication will replicate the exact queries used in the batch jobs, so if the query took half an hour to process on the master, the slave thread will be stalled for at least the same amount of time. This means no other data can replicate and the slave nodes will start lagging behind the master. If this exceeds the threshold of your failover tool or proxy, it may drop these slave nodes from the available servers in the cluster. If you are using statement-based replication, you can prevent this by crunching the data for your job in smaller batches.

Now, you may think row-based replication isn’t affected by this, as it will replicate the row information instead of the query. This is partly true as, for DDL changes, the replication reverts back to statement-based format. Also, large numbers of CRUD operations will affect the replication stream: in most cases, this is still a single-threaded operation and thus every transaction will wait for the previous one to be replayed via replication. This means that if you have high concurrency on the master, the slave may stall on the overload of transactions during replication.

To get around this, both MariaDB and MySQL offer parallel replication. The implementation may differ per vendor and version. MySQL 5.6 offers parallel replication as long as the queries are separated by the schema. MariaDB 10.0 and MySQL 5.7 both can handle parallel replication across schemas but have other boundaries. Executing queries via parallel slave threads may speed up your replication stream if you are writing heavy, otherwise, it would be better to stick to the traditional single-threaded replication.

Handling your schema change or DDLs

Since the release of 5.7, managing the schema change or DDL (Data Definition Language ) change in MySQL has improved a lot. Until MySQL 8.0, DDL changes algorithms supported are COPY and INPLACE.

  • COPY: This algorithm creates a new temporary table with the altered schema, once it migrates the data completely to the new temporary table it swaps and drops the old table.

  • INPLACE: This algorithm performs operations in place to the original table and avoids the table copy and rebuild, whenever possible.

In MySQL 8.0, the algorithm INSTANT has been introduced which makes instant and in-place table alterations for column addition and allows concurrent DML with improved responsiveness and availability in busy production environments. This helps to avoid huge lags and stalls in the replica that were usually big problems in the application perspective causing stale data to be retrieved as the reads in the slave have not yet updated due to lag.

Although that is a promising improvement, there are still limitations with them, and sometimes it is not possible to apply those INSTANT and INPLACE algorithms. For example, changing a column's data type which is also a common change especially in the development perspective due to data change. These occasions are inevitable, thus you cannot go on with the COPY algorithm as this locks up the table causing delays in the slave, and it also impacts the primary/master server during this execution as it piles up incoming transactions that also references the table affected. You cannot perform a direct ALTER or schema change on a busy server as this accompanies downtime or possibly corrupts your database if you lose your patience especially if the target table is huge.

It is true that performing schema changes on a running production setup is always a pain. A frequently used workaround is to apply the schema change to the slave nodes first. For statement-based replication, this works fine, but for row-based replication, this can work up to a certain degree. Row-based replication allows extra columns to exist at the end of the table, so as long as it is able to write the first columns it will be fine. First apply the change to all slaves, then failover to one of the slaves and then apply the change to the master and attach that as a slave. If your change involves inserting a column in the middle or removal of a column this will work with row-based replication.

There are tools around that can perform online schema changes more reliably. Commonly used by DBAs are The Percona Online Schema Change (as known as pt-osc) and gh-ost by Schlomi Noach. These tools handle schema changes effectively by grouping the affected rows into chunks. Whereas these chunks can be configured accordingly depending on how many you want to group. 

If you are going to jump with pt-osc, this tool will create a shadow table with the new table structure, insert new data via triggers and backfill data in the background. Once it is done creating the new table, it will simply swap the old for the new table inside a transaction. This doesn’t work in all cases, especially if your existing table already has triggers.

Whereas using gh-ost, it will first make a copy of your existing table layout, alter the table to the new layout, and then hook up the process as a MySQL replica. It will make use of the replication stream to find new rows that have been inserted into the original table and, at the same time, backfills the table. Once it is done backfilling, the original and new tables will switch. Naturally, all operations to the new table will end up in the replication stream as well, thus on each replica, the migration happens at the same time.

Memory Tables and Replication

While we are on the subject of DDLs, a common issue is the creation of memory tables. Memory tables are non-persistent tables, their table structure remains but they lose their data after a restart of MySQL. When creating a new memory table on both a master and a slave, they will have an empty table and this will work perfectly fine. Once either one gets restarted, the table will be emptied and replication errors will occur.

Row-based replication will break once the data in the slave node returns different results, and statement-based replication will break once it attempts to insert data that already exists. For memory tables, this is a frequent replication-breaker. The fix is easy: make a fresh copy of the data, change the engine to InnoDB and it should now be replication safe.

Setting the read_only={True|1}

This is of course a possible case when you are using a ring topology. Whereas, we discourage the use of ring topology if possible. We described earlier, that not having the same data in the slave nodes can break replication. Often, this has been caused by something (or someone) altering the data on the slave node, but not on the master node. Once the master node’s data gets altered, this will be replicated to the slave where it can’t apply the change and this causes the replication to break. This can lead to data corruption at the cluster level as well, especially if the slave has been promoted or has failed over due to a crash. That can be a disaster.

An easy prevention for this is to make sure read_only and super_read_only (only on > 5.6) are set in ON or 1. You might have understood how these two variables differ and how it affects if you disable or enable them. With super_read_only (since MySQL 5.7.8) disabled, this allows the root user to prevent any changes in the target or replica. So when both are disabled, this will disallow anyone to make changes to the data, except for the replication. Most failover managers, such as ClusterControl, set this flag automatically to prevent users from writing to the used master during failover. Some of them even retain this after the failover.

Enabling GTID

In MySQL replication, it is essential to start the slave from the correct position in the binary logs. Obtaining this position can be done when making a backup (xtrabackup and mysqldump support this) or when you have stopped slaving on a node that you are making a copy of. Starting replication with the CHANGE MASTER TO command would look like this:

mysql> CHANGE MASTER TO MASTER_HOST='x.x.x.x',
MASTER_USER='replication_user', 
MASTER_PASSWORD='password', 
MASTER_LOG_FILE='master-bin.00001', 
MASTER_LOG_POS=4;

Starting replication at the wrong spot can have disastrous consequences: data may be double written or not updated. This causes data drift between the master and the slave node.

Also when failing over a master to a slave involves finding the correct position and changing the master to the appropriate host. MySQL doesn’t retain the binary logs and positions from its master, but rather creates its own binary logs and positions. For re-aligning a slave node to the new master this could become a serious problem: the exact position of the master on failover has to be found on the new master, and then all slaves can be realigned.

To solve this issue, the Global Transaction Identifier (GTID) has been implemented by both Oracle MySQL and MariaDB. GTIDs allow auto aligning of slaves and the server figures out by itself what the correct position is. However, both have implemented the GTID in a different way and are therefore incompatible. If you need to set up replication from one to another, the replication should be set up with traditional binary log positioning. Also, your failover software should be made aware not to make use of GTIDs.

Crash-Safe Slave

Crash safe means even if a slave MySQL/OS crashes, you can recover the slave and continue replication without restoring MySQL databases onto the slave. To make crash-safe slave work, you have to use InnoDB storage engine only, and in 5.6 you need to set relay_log_info_repository=TABLE and relay_log_recovery=1.

Conclusion

Practice makes perfect indeed but without proper training and knowledge of these vital techniques, it could be troublesome or can lead to a disaster. These practices are commonly adhered to by experts in MySQL and are adapted by large industries as part of their daily routine job when administering the MySQL Replication in the production database servers. We have some additional white papers that might be useful if you’d like to read more about MySQL replication.

More from This Author

ClusterControl
The only management system you’ll ever need to take control of your open source database infrastructure.