High Availability Configuration for ClusterControl Nodes Using CMON HA

Paul Namuag

In a previous series, Krzysztof Ksiazek covered ClusterControl CMON HA for Distributed Database High Availability in two separate posts. In this blog, we'll cover distributing the nodes between on-prem and a public cloud (using Google Cloud Platform (GCP)).

The reason we wrote this blog is that we have received questions about how to implement a highly available instance of ClusterControl with CMON node(s) running on-prem and other CMON node(s) running in a different data center (such as a public cloud). In our previous blog ClusterControl CMON HA for Distributed Database High Availability we used Galera Cluster nodes, but this time we'll use MySQL Replication with Percona Server 5.7. The ideal setup is to always encapsulate the communication between your on-prem nodes and the nodes residing in the public cloud via VPN or another secure channel.

ClusterControl CMON HA is still at an early stage and we don't consider it fully mature yet. Still, CMON HA already provides the functionality you need to deploy ClusterControl in a highly available fashion. Let's walk through how to deploy and set it up, distributing the nodes between on-prem and the public cloud.

What is CMON?

Before going to the main topic, let us introduce CMON. CMON stands for ClusterControl Controller, the "primary brain" of ClusterControl. It is a backend service performing automation, management, monitoring, scheduling of tasks, and it also handles high availability. The data it collects is stored in the CMON database, for which we use a MySQL-compatible database as the datastore.
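If you want to take a quick look at the controller on an existing installation, something like the following is enough (paths assumed from a default package install):

$ systemctl status cmon

$ grep -E '^mysql_' /etc/cmon.cnf    # shows which MySQL-compatible datastore CMON points to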

The Architectural Setup

Some of you might not know that ClusterControl can be set up for high availability. If you have multiple ClusterControl (or CMON) nodes running, that is possible at no extra cost. You can run as many ClusterControl nodes as you need.

For this setup, we'll have a ClusterControl instance on top of the ClusterControl nodes in order to deploy the database nodes and manage automatic failover whenever a failure occurs. Although you could use MHA, Orchestrator, or MaxScale to manage the auto-failover, for efficiency and speed I'll use ClusterControl, which offers capabilities the other tools I mentioned do not have.

So let's have a look at the diagram for this setup:

The setup in that diagram shows that, on top of the three CMON nodes, a separate CMON (ClusterControl) instance monitors them and handles the automatic failover. HAProxy then load balances between the three monitored CMON nodes, where one node is located in a separate region hosted in GCP for this blog. You might notice that we didn't include Keepalived; that's because we cannot place a VIP in GCP since it is on a different network.

As you might have noticed, we deploy a total of three nodes. CMON HA requires at least three nodes in order to proceed with the voting process, the so-called quorum. For example, with three controllers the majority is two, so the cluster can tolerate the loss of one node; with only two controllers, no failure could be tolerated. So for this setup, we require that you have at least three nodes for higher availability.

Deploying the On-Prem ClusterControl Nodes

In this section, we expect that you have already set up or installed ClusterControl, whose UI we will use to deploy a three-node MySQL Replication cluster using Percona Server.

Let's first create the cluster by deploying a new MySQL Replication cluster as shown below.

Take note that I am using Percona Server 5.7 here, for which the default setup by ClusterControl works efficiently.

Then define the hostname or IP of your nodes,

At this point, we expect that you have already set up a two-node Master/Slave replication which is hosted or running on-prem. The screenshot below shows how your nodes should look:
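If you want to double-check the replication state outside the ClusterControl UI, a quick look on the slave is enough (assuming you can log in to mysqld as root):

$ mysql -uroot -p -e "SHOW SLAVE STATUS\G" | grep -E 'Master_Host|Slave_IO_Running|Slave_SQL_Running|Seconds_Behind_Master'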

Setup & Install ClusterControl and Enable CMON HA On The First Node

In the previous blog ClusterControl CMON HA for Distributed Database High Availability, we briefly provided the steps on how to do this. Let's go through the steps again, but this time for this particular Master/Slave replication setup.

The first thing to do is pick the node where you want ClusterControl to be installed first (in this setup, I ended up installing it on the 192.168.70.80 node first), then do the steps below.

Step One

Install ClusterControl

$ wget http://www.severalnines.com/downloads/cmon/install-cc

$ chmod +x install-cc

$ sudo ./install-cc   # omit sudo if you run as root

Take note that when you are prompted that an existing MySQL instance has been detected, you need to let ClusterControl use the running mysqld, since one of the goals of this CMON HA setup is to use the MySQL that is already set up.

Step Two

Bind CMON not only to localhost but also to the node's IP address (since we'll be enabling HA)

## edit /etc/default/cmon and modify the line as shown below, or add it if it doesn't exist

RPC_BIND_ADDRESSES="127.0.0.1,192.168.70.80"

Step Three

Then restart CMON,

service cmon restart
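After the restart, you can confirm that CMON is listening on the addresses you configured (you'll also see the RPC port, 9501, in the s9s output later):

$ ss -ltnp | grep cmon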

Step Four

Install s9s CLI tools

$ wget http://repo.severalnines.com/s9s-tools/install-s9s-tools.sh

$ chmod 755 install-s9s-tools.sh

$ ./install-s9s-tools.sh

During this installation, the s9s tool will set up an admin user which you can use when running s9s commands, for example to enable CMON HA.
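You can quickly confirm that the CLI can authenticate against the controller using that admin user:

$ s9s user --list --long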

Step Five

Enable the CMON HA

$ s9s controller --enable-cmon-ha
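Even with a single controller, you can already verify that the HA subsystem is enabled; the node should report itself as the leader:

$ s9s controller --list --long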

Step Six

Lastly, modify the /etc/my.cnf and add,

slave-skip-errors = 1062

under the [mysqld] section. Once added, do not forget to restart mysql as,

service mysql restart

or

systemctl restart mysql

Currently, this is a limitation of CMON HA: it tries to insert log entries into the slave, but this workaround is fine for now.
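To confirm the option was picked up after the restart, you can check it from the MySQL client:

$ mysql -uroot -p -e "SHOW VARIABLES LIKE 'slave_skip_errors'"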

Setup, Install ClusterControl and Enable CMON HA On The Second Node

It was as simple as that for the first node. Now, on the 2nd node (192.168.70.70), we need to do the same steps, but with some adjustments to make HA possible.

Step One

Copy the configuration from the first node (192.168.70.80) to the 2nd node (192.168.70.70)

$ scp -r /etc/cmon* 192.168.70.70:/etc/

Step Two

On the 2nd node, edit /etc/cmon.cnf and ensure that the hostname is correctly configured, e.g.

vi /etc/cmon.cnf

Then assign the hostname parameter as,

hostname=192.168.70.70

Step Three

Install ClusterControl,

$ wget http://www.severalnines.com/downloads/cmon/install-cc

$ chmod +x install-cc

$ sudo ./install-cc   # omit sudo if you run as root

However, skip the installation of CMON (or ClusterControl Controller) when you encounter these lines,

=> An existing Controller installation detected!

=> A re-installation of the Controller will overwrite the /etc/cmon.cnf file

=> Install the Controller? (y/N):

For the rest, just do what you did on the first node, such as setting up the hostname, using the existing running mysqld instance, and providing the MySQL password and the password for your CMON, both of which must be the same as on the first node.

Step Four

Install s9s CLI tools

$ wget http://repo.severalnines.com/s9s-tools/install-s9s-tools.sh

$ chmod 755 install-s9s-tools.sh

$ ./install-s9s-tools.sh

Step Five

Copy the remaining configuration from the 1st node to the 2nd node.

$ scp -r ~/.s9s/ 192.168.70.70:/root/

$ scp /etc/s9s.conf 192.168.70.70:/etc/

$ scp /var/www/html/clustercontrol/bootstrap.php 192.168.70.70:/var/www/html/clustercontrol/
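Since the controllers authenticate RPC calls with a shared token, it's worth confirming that the copied files agree on both nodes (the exact key names may vary slightly between versions):

$ grep -i rpc_key /etc/cmon.cnf

$ grep -i rpc_token /var/www/html/clustercontrol/bootstrap.php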

Step Six

Install clustercontrol-controller package,

For Ubuntu/Debian,

$ apt install -y clustercontrol-controller

For RHEL/CentOS/Fedora,

$ yum install -y clustercontrol-controller
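It's also worth making sure the controller version matches the one on the first node, since mixing versions across the HA cluster can cause trouble. Either of these should tell you, depending on your distro:

$ cmon --version

$ rpm -q clustercontrol-controller    # or: dpkg -l clustercontrol-controller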

Step Seven

Copy the /etc/default/cmon file to the 2nd node and modify the RPC bind address so it uses the 2nd node's IP

scp /etc/default/cmon 192.168.70.70:/etc/default

RPC_BIND_ADDRESSES="127.0.0.1,192.168.70.70"

Then restart CMON as follows,

service cmon restart

Step Eight

Modify the /etc/my.cnf and add,

slave-skip-errors = 1062

under the [mysqld] section. Once added, do not forget to restart mysql as,

service mysql restart

or

systemctl restart mysql

As noted on the first node, this is currently a limitation of CMON HA: it tries to insert log entries into the slave, but this workaround is fine for now.

Step Nine

Finally, check how the CMON HA nodes look:

$ s9s controller --list --long

S VERSION    OWNER  GROUP  NAME          IP            PORT COMMENT
l 1.7.5.3735 system admins 192.168.70.80 192.168.70.80 9501 Acting as leader.
f 1.7.5.3735 system admins 192.168.70.70 192.168.70.70 9501 Accepting heartbeats.

Total: 2 controller(s)

Deploying Your ClusterControl Node In the Cloud

As we mentioned earlier, the ideal setup is to encapsulate the traffic in a VPN or another secure channel. If you are unsure how to do this, check our previous blog Multi-DC PostgreSQL: Setting Up a Standby Node at a Different Geo-Location Over a VPN, in which we covered how to create a simple VPN setup using OpenVPN.

So in this section, we expect that you have already set up the VPN connection. Now, what we're going to do is add a slave so that we can distribute the availability of CMON into Google Cloud Platform. To do this, just go to Add Replication Slave, which can be found by clicking the cluster icon near the right corner. See how it looks below:

Now, this is what we'll end up with:

Now, since we have a new slave added which is hosted in GCP, you need to follow again what we did earlier on the 2nd node. Just repeat those steps and follow the instructions for the 2nd node.
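Before re-running those steps on the GCP node, it helps to confirm that it can reach the on-prem controllers over the VPN (IPs as used in this setup; nc is just one convenient way to test the RPC port):

$ ping -c 3 192.168.70.80

$ nc -zv 192.168.70.80 9501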

Once you have it correctly, you'll end up with the following result:

$ s9s controller --list --long

S VERSION    OWNER  GROUP  NAME          IP            PORT COMMENT
l 1.7.5.3735 system admins 192.168.70.80 192.168.70.80 9501 Acting as leader.
f 1.7.5.3735 system admins 192.168.70.70 192.168.70.70 9501 Accepting heartbeats.
f 1.7.5.3735 system admins 10.142.0.39   10.142.0.39   9501 Accepting heartbeats.

where the nodes are:

  • 192.168.70.80 (node8), residing on-prem
  • 192.168.70.70 (node7), residing on-prem
  • 10.142.0.39 (gnode1), hosted in GCP in a different region

CMON HA In Action

My colleague Krzysztof Ksiazek already covered the HAProxy setup for HA in the blog ClusterControl CMON HA for Distributed Database High Availability - Part Two (GUI Access Setup).

To follow the procedure described in that blog, ensure you have the xinetd and pathlib packages. You can install them as follows,

$ sudo yum install -y xinetd python-pathlib.noarch

Also ensure that you have the cmonhachk service defined in /etc/services, just as below:

$ grep 'cmonhachk' /etc/services

cmonhachk       9201/tcp
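The xinetd side needs a matching service definition as well. Below is a minimal sketch of /etc/xinetd.d/cmonhachk; the server path is an assumption, so point it to wherever you installed the health check script from the referenced post:

# /etc/xinetd.d/cmonhachk -- minimal sketch; adjust the server path to your check script
service cmonhachk
{
        disable         = no
        flags           = REUSE
        socket_type     = stream
        port            = 9201
        wait            = no
        user            = root
        server          = /usr/local/sbin/cmonhachk.py
        log_on_failure  += USERID
}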

Once the changes are in place, restart xinetd,

service xinetd restart

I'll skip the Keepalived and HAProxy procedure and expect that you have set them up accordingly. One takeaway you have to consider in this setup is that Keepalived is not applicable if you are spanning the VIP from on-prem to the public cloud network, because they are totally different networks.
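For reference, the HAProxy part boils down to a listen section that health-checks each controller on the cmonhachk port (9201) and forwards traffic to the ClusterControl UI. The snippet below is only a minimal sketch based on this setup's addresses, assuming the UI is served over HTTPS on port 443 on each controller; the referenced post has the complete configuration.

# minimal sketch of the HAProxy listen section for CMON HA (assumptions noted above)
listen cmon-ha
    bind *:81
    mode tcp
    option httpchk
    balance source
    default-server port 9201 inter 5s downinter 5s rise 2 fall 2
    server node8  192.168.70.80:443 check
    server node7  192.168.70.70:443 check
    server gnode1 10.142.0.39:443 check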

Now, let's see how CMON HA reacts when a node goes down. As shown earlier, node 192.168.70.80 (node8) was acting as the leader, as seen below:

The ClusterControl topology view also shows that node8 is the master of the CMON database. Let's try to kill node8 and see how CMON HA proceeds.
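To simulate the failure, one simple approach (an assumption on my side; any hard failure will do) is to stop both the controller and MySQL on node8, or just power the node off:

$ systemctl stop cmon mysql    # run on node8; alternatively, power the node off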

As you can see, gnode1 (the GCP node) takes over as leader when node8 goes down. Checking HAProxy gives the following results,

and our ClusterControl nodes show that node8 is down, while the GCP node has taken over as the master,

Lastly, accessing my HAProxy node which is running on host 192.168.10.100 at port 81 shows the following UI,

Conclusion

ClusterControl CMON HA has been available since version 1.7.2, but it has also been a challenge for us, given the various questions and preferences about how to deploy it, such as using MySQL Replication instead of Galera Cluster.

Our CMON HA is not mature yet, but it is now ready to cater to your high availability needs. Different approaches can work, as long as your checks reliably determine which node is up and running.

We encourage you to set up and deploy CMON HA and let us know how well it suits your needs. If you run into problems, please let us know so we can help you address your high availability requirements.

 