Severalnines Blog
The automation and management blog for open source databases

Multi-source Replication with MariaDB Galera Cluster

Severalnines
Posted in:

MariaDB 10 supports multi-source replication, and each MariaDB Galera node can have up to 64 masters connected to it. So it is possible to use a MariaDB Cluster as an aggregator for many single-instance MariaDB master servers.

In this blog post, we are going to show you how to setup multi-source replication with MariaDB Galera Cluster, where one of the Galera nodes is acting as slave to 3 MariaDB masters (see diagram below). If you would like to set this up with Percona XtraDB Cluster or Galera Cluster (Codership), please read this post instead.

 

MariaDB GTID vs MySQL GTID

 

MariaDB has a different implementation of Global Transaction ID (GTID), and is enabled by default starting from MariaDB 10.0.2. Multi-source replication in MariaDB works with both GTID and the legacy binlog file and position, as compared to the MySQL implementation

A GTID consists of three separated values:

  • Domain ID - Replication domain. A replication domain is a server or group of servers that generate a single, strictly ordered replication stream.
  • Server ID - Server identifier number to enable master and slave servers to identify themselves uniquely.
  • Event Group ID - A sequence number for a collection of events that are always applied as a unit. Every binlog event group (eg. transaction, DDL, non-transactional statement) is annotated with its GTID.

The figure below illustrates the differences between the two GTIDs:

In MariaDB, there is no special configuration needed on the server to start using GTID. Some of MariaDB GTID advantages:

  • It is easy to identify which server or domain the event group is originating from
  • You do not necessarily need to turn on binary logging on slaves
  • It allows multi-source replication with distinct domain ID
  • Enabling GTID features is dynamic, you don’t have to restart the MariaDB server
  • The state of the slave is recorded in a crash-safe way

Despite the differences between these two, it is still possible to replicate from MySQL 5.6 to MariaDB 10.0 or vice versa. However, you will not be able to use the GTID features to automatically pick the correct binlog position when switching to a new master. Old-style MySQL replication will work. We highly recommend you to read the MariaDB GTID knowledge base.

 

MariaDB Galera Cluster as Slave

 

In our setup, we used MariaDB 10.0.14 on masters and MariaDB Galera Cluster 10.0.14 as slave. We have three master servers (mariadb1, mariadb2, mariadb3) and each master has a separate database: mydb1, mydb2 and mydb3. The 3 servers replicate to a Galera node (mgc1) in multi-source mode.

When using multi-source replication, where a single slave connects to multiple masters, each master needs to be configured with its own distinct domain ID.

MariaDB provides a function to easily determine the GTID value according to the binary log file and position, which is usually recorded by the MySQL backup applications (mysqldump or Xtrabackup) :

MariaDB> SELECT BINLOG_GTID_POS('mysql-bin.000003',155267212);
+-----------------------------------------------+
| BINLOG_GTID_POS('mysql-bin.000003',155267212) |
+-----------------------------------------------+
| 1-101-340                                     |
+-----------------------------------------------+

 

Setting up Masters

 

1. On each standalone MariaDB server, configure it as a master by adding a server ID, domain ID and enable binary logging with ROW format under [mysqld] directive:

# mariadb1 my.cnf
server-id=101
log-bin=binlog
gtid-domain-id=1
binlog-format=ROW

 

# mariadb2 my.cnf
server-id=102
log-bin=binlog
gtid-domain-id=2
binlog-format=ROW

 

# mariadb3 my.cnf
server-id=103
log-bin=binlog
gtid-domain-id=3
binlog-format=ROW

 

2. Perform reset master so we can get a correct binary log entry with assigned domain ID:

MariaDB> RESET MASTER;

 

3. Then, create and grant a replication user:

MariaDB> GRANT REPLICATION SLAVE ON *.* TO 'slave'@'%' IDENTIFIED BY 'slavepassword';
MariaDB> FLUSH PRIVILEGES;

 

Setting up Slaves

 

The asynchronous replication slave thread is stopped when a node tries to apply replication events and it is in a non-primary state. By default, it remains stopped after successfully re-joining the cluster. It is recommended to configure wsrep_restart_slave=1 which enables the MySQL slave to be restarted automatically when the node rejoins the cluster.

 

1. On the corresponding Galera nodes, set the configuration as below:

# mgc1 my.cnf
server-id=201
binlog_format=ROW
log_slave_updates=1
log_bin=binlog
wsrep-restart-slave=1

 

# mgc2 my.cnf
server-id=202
binlog_format=ROW
log_slave_updates=1
log_bin=binlog
wsrep-restart-slave=1

 

# mgc3 my.cnf
server-id=203
binlog_format=ROW
log_slave_updates=1
log_bin=binlog
wsrep-restart-slave=1

 

** Perform a rolling restart of the cluster to apply the new changes. For ClusterControl users, go to ClusterControl > Upgrades > Rolling Restart.

 

2. Assuming that you already granted the database user on Galera hosts from the MariaDB nodes, dump each MariaDB database on the Galera node (mgc1):

$ mysqldump -u mydb1 -p -h 10.0.0.61 --single-transaction --master-data=2 mydb1 > mydb1.sql
$ mysqldump -u mydb2 -p -h 10.0.0.62 --single-transaction --master-data=2 mydb2 > mydb2.sql
$ mysqldump -u mydb3 -p -h 10.0.0.63 --single-transaction --master-data=2 mydb3 > mydb3.sql

 

** To ensure Galera replicates data smoothly, ensure all tables are running on InnoDB. Before you restore, you can use the following command to convert the dump file if it contains MyISAM tables:

$ sed -i 's|MyISAM|InnoDB|g' [the dump file]

 

3. On mgc1, create the corresponding databases and restore the dump files:

$ mysql -uroot -p -e 'CREATE SCHEMA mydb1; CREATE SCHEMA mydb2; CREATE SCHEMA mydb3'
$ mysql -uroot -p mydb1 < mydb1.sql
$ mysql -uroot -p mydb2 < mydb2.sql
$ mysql -uroot -p mydb3 < mydb3.sql

 

4. Before we start replication, we need to determine the GTID value on each master. From the dump file itself, we can get both the binary log position and gtid_slave_pos (generated by --master-data=2). Extract the gtid_slave_pos value:

$ head -100 mydb1.sql | grep gtid_slave_pos
-- SET GLOBAL gtid_slave_pos='1-101-676';
 
$ head -100 mydb2.sql | grep gtid_slave_pos
-- SET GLOBAL gtid_slave_pos='2-102-338';
 
$ head -100 mydb3.sql | grep gtid_slave_pos
-- SET GLOBAL gtid_slave_pos='3-103-338';

 

5. Combine all retrieved GTID values and set them as gtid_slave_pos on mgc1:

MariaDB> SET GLOBAL gtid_slave_pos = "1-101-676,2-102-338,3-103-338";

 

6. Configure master connection for each replication stream, distinguish by default_master_connection session variable:

MariaDB> SET @@default_master_connection='mariadb1';
MariaDB> CHANGE MASTER 'mariadb1' TO MASTER_HOST='10.0.0.61', MASTER_PORT=3306, MASTER_USER='slave', MASTER_PASSWORD='slavepassword', MASTER_USE_GTID=slave_pos;
 
MariaDB> SET @@default_master_connection='mariadb2';
MariaDB> CHANGE MASTER 'mariadb2' TO MASTER_HOST='10.0.0.62', MASTER_PORT=3306, MASTER_USER='slave', MASTER_PASSWORD='slavepassword', MASTER_USE_GTID=slave_pos;
 
MariaDB> SET @@default_master_connection='mariadb3';
MariaDB> CHANGE MASTER 'mariadb3' TO MASTER_HOST='10.0.0.63', MASTER_PORT=3306, MASTER_USER='slave', MASTER_PASSWORD='slavepassword', MASTER_USE_GTID=slave_pos;

 

7. Start all slaves:

MariaDB> START ALL SLAVES;
MariaDB> SHOW WARNINGS;
+-------+------+--------------------------+
| Level | Code | Message                  |
+-------+------+--------------------------+
| Note  | 1937 | SLAVE 'mariadb2' started |
| Note  | 1937 | SLAVE 'mariadb3' started |
| Note  | 1937 | SLAVE 'mariadb1' started |
+-------+------+--------------------------+

 

8. Verify that all slaves started correctly:

MariaDB> SHOW ALL SLAVES STATUS\G

 

And ensure you get on each of the connection:

Connection_name: mariadb1
...
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
...
Connection_name: mariadb2
...
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
...
Connection_name: mariadb3
...
Slave_IO_Running: Yes
Slave_SQL_Running: Yes

 

At this point, our MariaDB Galera Cluster has started to accept replication events from three different sources via mgc1.

 

From your ClusterControl dashboard, you will notice the incoming replication load on mgc1 (10.0.0.71) is propagated to the other nodes of the cluster:

 

We can now see that our databases from our three master sources have been replicated into the cluster, as shown in ClusterControl > Performance >DB Growth: