Multi-Source Replication with Galera Cluster for MySQL
Multi-source replication means that one server can have multiple masters from which it replicates. Why multi-source? One good reason is to consolidate databases (e.g. merge your shards) for analytical reporting or as a centralized backup server. MariaDB 10 already has this feature, and MySQL 5.7 will also support it.
It is possible to set up your Galera Cluster as an aggregator of your masters in a multi-source replication setup; we'll walk you through the steps in this blog post. Note that this how-to applies to Galera Cluster for MySQL (Codership) and Percona XtraDB Cluster. In a separate post, we'll show you how to configure MariaDB Galera Cluster 10 instead. If you would like to use MySQL Cluster (NDB) as the aggregator, then check out this blog.
Galera Cluster as Aggregator/Slave
A Galera cluster can operate as both MySQL master and slave. Each Galera node can act as a slave channel, accepting replication from a master. The number of slave channels must be less than or equal to the number of nodes in the Galera cluster. So, if you have a three-node Galera cluster, you can have up to three different replication sources connected to it. Note that in MariaDB Galera Cluster 10, you can configure as many sources as you want, since each node supports multi-source replication.
To achieve multi-source replication in MySQL 5.6, you cannot have GTID enabled on the Galera Cluster. With GTID enabled, the Galera cluster would behave as a single unit (imagine one single slave server), since it preserves the MySQL GTID events globally across the cluster, and would therefore be unable to replicate from more than one master. Hence, we will use the "legacy" way of specifying the starting binary log file and position. On a side note, enabling GTID is highly recommended if your Galera Cluster acts as a MySQL master, as described in this blog post.
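With the legacy approach, the replication coordinates are simply a binary log file name and position on the master. You can read the current coordinates on any master with SHOW MASTER STATUS (the file name and position below are purely illustrative):
mysql> SHOW MASTER STATUS;
+---------------+----------+--------------+------------------+
| File          | Position | Binlog_Do_DB | Binlog_Ignore_DB |
+---------------+----------+--------------+------------------+
| binlog.000003 |      847 |              |                  |
+---------------+----------+--------------+------------------+
As we will see below, taking the dumps with mysqldump --master-data records these coordinates in the dump file for us, so we will not need to specify them by hand.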
We will set up multi-source replication as below:
We have 3 standalone MySQL servers (masters), and each master has a separate database: mydb1, mydb2 and mydb3. We would like to consolidate all 3 databases into our Galera cluster.
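The topology can be sketched as follows (host names match the examples used throughout this post):
mysql1 (mydb1) --async--> galera1 -+
mysql2 (mydb2) --async--> galera2  +-- Galera Cluster (mydb1 + mydb2 + mydb3)
mysql3 (mydb3) --async--> galera3 -+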
Setting Up Masters
1. On each standalone MySQL server, configure it as a master by setting a server ID and enabling binary logging in ROW format:
# mysql1 my.cnf
server-id=101
log-bin=binlog
binlog-format=ROW

# mysql2 my.cnf
server-id=102
log-bin=binlog
binlog-format=ROW

# mysql3 my.cnf
server-id=103
log-bin=binlog
binlog-format=ROW
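Restart MySQL on each master for the settings to take effect, then confirm binary logging is enabled:
mysql> SHOW VARIABLES LIKE 'log_bin';
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| log_bin       | ON    |
+---------------+-------+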
2. Then, create and grant a replication user:
mysql> GRANT REPLICATION SLAVE ON *.* TO 'slave'@'%' IDENTIFIED BY 'slavepassword';
mysql> FLUSH PRIVILEGES;
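You can verify the grant with the following statement; the output should list the REPLICATION SLAVE privilege for the 'slave' user:
mysql> SHOW GRANTS FOR 'slave'@'%';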
Setting Up Slaves
The asynchronous replication slave thread is stopped when a node tries to apply replication events while in a non-primary state, and by default it remains stopped after the node successfully rejoins the cluster. It is therefore recommended to set wsrep_restart_slave=1, which restarts the MySQL slave automatically when the node rejoins the cluster.
1. On each Galera node, add the following to the MySQL configuration:
# galera1 my.cnf
server-id=201
log-bin=binlog
log-slave-updates=1
wsrep-restart-slave=1

# galera2 my.cnf
server-id=202
log-bin=binlog
log-slave-updates=1
wsrep-restart-slave=1

# galera3 my.cnf
server-id=203
log-bin=binlog
log-slave-updates=1
wsrep-restart-slave=1
** Perform a rolling restart of the cluster to apply the changes. For ClusterControl users, go to ClusterControl > Upgrades > Rolling Restart.
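After the rolling restart, you can confirm the setting is active on each node:
mysql> SHOW VARIABLES LIKE 'wsrep_restart_slave';
+---------------------+-------+
| Variable_name       | Value |
+---------------------+-------+
| wsrep_restart_slave | ON    |
+---------------------+-------+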
2. Assuming you have already granted the database users access from the Galera hosts on the MySQL masters, dump each MySQL database on its respective Galera node:
galera1:
$ mysqldump -u mydb1 -p -h mysql1 --single-transaction --master-data=1 mydb1 > mydb1.sql
galera2:
$ mysqldump -u mydb2 -p -h mysql2 --single-transaction --master-data=1 mydb2 > mydb2.sql
galera3:
$ mysqldump -u mydb3 -p -h mysql3 --single-transaction --master-data=1 mydb3 > mydb3.sql
** To ensure Galera replicates data smoothly, make sure all tables use the InnoDB storage engine. Before you restore, you can use the following command to convert a dump file that contains MyISAM tables:
$ sed -i 's|MyISAM|InnoDB|g' [the dump file]
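A quick sanity check after the conversion (using mydb1.sql as an example) should return 0. Bear in mind that sed is a blunt instrument here; it would also rewrite the string "MyISAM" if it happened to appear inside row data, so review the dump if that is a possibility:
$ grep -c MyISAM mydb1.sql
0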
3. On each Galera node, create the corresponding database and restore the dump files into the Galera Cluster:
galera1:
$ mysql -uroot -p -e 'CREATE SCHEMA mydb1'
$ mysql -uroot -p mydb1 < mydb1.sql
galera2:
$ mysql -uroot -p -e 'CREATE SCHEMA mydb2'
$ mysql -uroot -p mydb2 < mydb2.sql
galera3:
$ mysql -uroot -p -e 'CREATE SCHEMA mydb3'
$ mysql -uroot -p mydb3 < mydb3.sql
** The above steps must be performed on each Galera node so that each slave positions itself correctly in the binary log, as recorded in the dump file.
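Because the dumps were taken with --master-data=1, each dump file contains a CHANGE MASTER TO statement with the master's binary log coordinates, which is applied during the restore. You can inspect it as follows (the file name and position shown are illustrative):
$ grep -m 1 'CHANGE MASTER' mydb1.sql
CHANGE MASTER TO MASTER_LOG_FILE='binlog.000003', MASTER_LOG_POS=847;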
4. Point each Galera node to its respective master:
galera1:
mysql> STOP SLAVE;
mysql> CHANGE MASTER TO MASTER_HOST='mysql1', MASTER_PORT=3306, MASTER_USER='slave', MASTER_PASSWORD='slavepassword';
mysql> START SLAVE;
galera2:
mysql> STOP SLAVE;
mysql> CHANGE MASTER TO MASTER_HOST='mysql2', MASTER_PORT=3306, MASTER_USER='slave', MASTER_PASSWORD='slavepassword';
mysql> START SLAVE;
galera3:
mysql> STOP SLAVE;
mysql> CHANGE MASTER TO MASTER_HOST='mysql3', MASTER_PORT=3306, MASTER_USER='slave', MASTER_PASSWORD='slavepassword';
mysql> START SLAVE;
Verify that the slaves started correctly:
mysql> SHOW SLAVE STATUS\G
And ensure you get:
...
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
At this point, our Galera cluster has started accepting replication events from three different sources.
ClusterControl automatically detects whether a Galera node is running as a slave. A new slave node indicator and a Slave Nodes table grid will appear, showing the slave monitoring data on the Overview page:
We can now see that our databases from multiple sources have been replicated into the cluster, as shown in the DB Growth screenshot below:
Caveats
- Since MySQL replication is single-threaded, a Galera node can only apply replication events as fast as a native MySQL slave would. As a workaround, you can set wsrep_mysql_replication_bundle=n to group n MySQL replication transactions into one large transaction (see the configuration sketch after this list).
- MySQL replication events are treated like writes from regular MySQL clients; they must go through the Galera replication pipeline at commit time. This adds some delay before commit.
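For the first caveat, the grouping is configured in my.cnf on each Galera node. The value below is only an illustration; the right number depends on your workload, and grouped transactions are replicated and committed together, so test before using it in production:
# my.cnf on each Galera node -- illustrative value, tune for your workload
wsrep_mysql_replication_bundle=500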