blog

Multi-Source Replication with Galera Cluster for MySQL

Ashraf Sharif

Published: November 20, 2014
Last Updated: May 4, 2022

Multi-source replication means that one server can have multiple masters from which it replicates. Why multi-source? One good reason is to consolidate databases (e.g. merge your shards) for analytical reporting or as a centralized backup server. MariaDB 10 already has this feature, and MySQL 5.7 will also support it.

It is possible to set up your Galera Cluster as an aggregator of your masters in a multi-source replication setup, we’ll walk you through the steps in this blog. Note that the howto is for Galera Cluster for MySQL (Codership) and Percona XtraDB Cluster. In a separate post, we’ll show you how to configure MariaDB Cluster 10 instead. If you would like to use MySQL Cluster (NDB) as aggregator, then check out this blog.

Galera Cluster as Aggregator/Slave

Galera cluster can operate both as MySQL master and slave. Each Galera node can act as a slave channel accepting replication from a master. The number of slave channels should be equal or less to the number of Galera master nodes in the cluster. So, if you have a three-node Galera cluster, you can have up to three different replication sources connected to it. Note that in MariaDB Galera Cluster 10, you can configure as many sources as you want since each node supports multi-source replication.

To achieve multi-source replication in MySQL 5.6, you cannot have GTID enabled for Galera Cluster. GTID will cause our Galera cluster to work as a single unit (imagine one single slave server), since it globally preserves the MySQL GTID events on the cluster. So the cluster will not be able to replicate from more than one master. Hence, we will use the “legacy” way to determine the starting binary log file and position. On a side note, enabling GTID is highly recommended if your Galera Cluster acts as a MySQL master, as described in this blog post.

We will setup multi-source replication as below:

We have 3 standalone MySQL servers (masters), and each master has a separate database: mydb1, mydb2 and mydb3. We would like to consolidate all 3 databases into our Galera cluster.

Setting Up Masters

1. On each standalone MySQL server, configure it as a master by adding a server ID, enabling binary logging with ROW format:

# mysql1 my.cnf
server-id=101
log-bin=binlog
binlog-format=ROW

# mysql2 my.cnf
server-id=102
log-bin=binlog
binlog-format=ROW

# mysql3 my.cnf
server-id=103
log-bin=binlog
binlog-format=ROW

2. Then, create and grant a replication user:

mysql> GRANT REPLICATION SLAVE ON *.* TO 'slave'@'%' IDENTIFIED BY 'slavepassword'
mysql> FLUSH PRIVILEGES

Setting Up Slaves

The asynchronous replication slave thread is stopped when a node tries to apply replication events and it is in a non-primary state. It remains stopped after successfully re-joining the cluster as default. It is recommended to configure wsrep_restart_slave=1 which enables the MySQL slave to be restarted automatically when the node rejoins the cluster.

1. On each of the Galera node, configure MySQL configuration as below:

# galera1 my.cnf
server-id=201
log-bin=binlog
log-slave-updates=1
wsrep-restart-slave=1

# galera2 my.cnf
server-id=202
log-bin=binlog
log-slave-updates=1
wsrep-restart-slave=1

# galera3 my.cnf
server-id=203
log-bin=binlog
log-slave-updates=1
wsrep-restart-slave=1

** Perform a rolling restart of the cluster to apply the new changes. For ClusterControl users, go to ClusterControl > Upgrades > Rolling Restart.

2. Assume that you already granted the database user on Galera hosts from MySQL nodes, dump each MySQL database on the respective Galera node:

galera1:

$ mysqldump -u mydb1 -p -h mysql1 --single-transaction --master-data=1 mydb1 > mydb1.sql

galera2:

$ mysqldump -u mydb2 -p -h mysql2 --single-transaction --master-data=1 mydb2 > mydb2.sql

galera3:

$ mysqldump -u mydb3 -p -h mysql3 --single-transaction --master-data=1 mydb3 > mydb3.sql

** To ensure Galera replicates data smoothly, ensure all tables are running on InnoDB. Before you restore, you can use the following command to convert the dump file if it contains MyISAM tables:

$ sed -i 's|MyISAM|InnoDB|g' [the dump file]

3. On each Galera node, create the corresponding database and restore the dump files into the Galera Cluster:

galera1:

$ mysql -uroot -p -e 'CREATE SCHEMA mydb1'
$ mysql -uroot -p mydb1 < mydb1.sql

galera2:

$ mysql -uroot -p -e 'CREATE SCHEMA mydb2'
$ mysql -uroot -p mydb2 < mydb2.sql

galera3:

$ mysql -uroot -p -e 'CREATE SCHEMA mydb3'
$ mysql -uroot -p mydb3 < mydb3.sql

** Above steps should be performed on each of the Galera node so that each slave can position correctly in the binary log as written in the dump file.

4. Point each Galera node to its respective master:

galera1:

mysql> STOP SLAVE
mysql> CHANGE MASTER TO MASTER_HOST = 'mysql1', MASTER_PORT=3306, MASTER_USER='slave', MASTER_PASSWORD='slavepassword'
mysql> START SLAVE

galera2:

mysql> STOP SLAVE
mysql> CHANGE MASTER TO MASTER_HOST = 'mysql2', MASTER_PORT=3306, MASTER_USER='slave', MASTER_PASSWORD='slavepassword'
mysql> START SLAVE

galera3:

mysql> STOP SLAVE
mysql> CHANGE MASTER TO MASTER_HOST = 'mysql3', MASTER_PORT=3306, MASTER_USER='slave', MASTER_PASSWORD='slavepassword'
mysql> START SLAVE

Verify if slaves start correctly:

mysql> SHOW SLAVE STATUSG

And ensure you get:

...
Slave_IO_Running: Yes
Slave_SQL_Running: Yes

At this point, our Galera cluster has starting to accept replication events from three different sources.

ClusterControl will detect if the Galera node is running as a slave node automatically. A new node indicator for slave and Slave Nodes table grid will appear showing the slave monitoring data in the Overview page:

We can now see that our databases from multiple sources have been replicated into the cluster, as shown in the DB Growth screenshot below:

Caveats

Since MySQL replication is single threaded, Galera node will apply replication events as fast as native MySQL slave. As a workaround, you can configure wsrep_mysql_replication_bundle=n, to group n MySQL replication transactions in one large transaction.
MySQL replication events are treated as regular MySQL clients, they must go through Galera replication pipeline at commit time. This adds some delay before commit.

ClickHouse Schema Design and Data Modeling

Building a Modern Analytics Stack Around ClickHouse

Managing ClickHouse Resources in Multi-Tenant Environments

Advanced Partitioning Strategies for PostgreSQL OLTP and Analytics Datasets at Scale

Galera Cluster as Aggregator/Slave

Setting Up Masters

Setting Up Slaves

Caveats

Recommended

ClickHouse Schema Design and Data Modeling

Building a Modern Analytics Stack Around ClickHouse

Managing ClickHouse Resources in Multi-Tenant Environments

Advanced Partitioning Strategies for PostgreSQL OLTP and Analytics Datasets at Scale

Subscribe below to be notified of fresh posts