How to Monitor Your Databases with ClusterControl and Opsgenie

Krzysztof Ksiazek

ClusterControl is a platform for monitoring and managing open source databases. It gives you a single pane of glass for understanding what is happening in the system, alerts you when something is wrong, and provides the tools to bring the cluster back to a normal state. Some functionality is still not available directly in ClusterControl, such as managing the on-call rotation and working with alerts. That’s why ClusterControl lets you integrate with external, dedicated tools that extend what you can achieve. One of the integrations directly supported in ClusterControl is OpsGenie, an on-call management tool. Let’s take a look at how easily we can integrate ClusterControl with it.

There are a couple of prerequisites on the OpsGenie side that we will not go through here. Basically, you need to have teams defined, with members. You may also want to have an on-call rotation in place. In general, you want OpsGenie configured to your liking. Once that’s done, the next step is to set up the integration in ClusterControl.

You can do this in the “Integrations” section of the left-hand side menu. If you don’t have any existing integrations, you will see a screen exactly like the one in the screenshot above. Just click on “Add your first service” to proceed.

Pick OpsGenie from the list of integrations.

Then we fill in the details: the integration name, the region in which our OpsGenie setup is running, and the names of the teams that should get the notifications. We should also fill in the API key for the team, which we can create in OpsGenie. The instructions shown on the screenshot above are quite clear and should be enough to get this done.

In short, in the “Teams” menu we picked our team and then used “Integrations” to add a new API integration. From there we can get the API key to use with ClusterControl.

Once everything is set up in ClusterControl, we can test it by clicking on “Test”. If everything is OK, you will see a notification. Please keep in mind that you have to fill in all the fields here; we removed the API key only for the purpose of taking this screenshot.
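The API key and team name can also be checked from the command line, independently of ClusterControl. The following is a minimal sketch, assuming OpsGenie's public Alert API (a POST to /v2/alerts authenticated with a GenieKey header). The OPSGENIE_API_KEY variable, the function names and the team name are placeholders made up for illustration, and this is not necessarily what the “Test” button does internally.

```shell
#!/bin/sh
# Sketch: create a test alert through the OpsGenie Alert API.
# OPSGENIE_API_KEY and the team name are placeholders, not values from this article.

# Map the region chosen in the integration form to an API base URL.
opsgenie_base_url() {
    case "$1" in
        EU|eu) echo "https://api.eu.opsgenie.com" ;;
        *)     echo "https://api.opsgenie.com" ;;
    esac
}

# Build the JSON body for a low-priority test alert routed to one team.
# No JSON escaping is done, so keep the team name free of quotes.
test_alert_payload() {
    printf '{"message":"ClusterControl integration test","priority":"P5","responders":[{"name":"%s","type":"team"}]}' "$1"
}

# Send the alert and print the API response.
send_test_alert() {
    region="$1"; team="$2"
    curl -sS -X POST "$(opsgenie_base_url "$region")/v2/alerts" \
        -H "Authorization: GenieKey ${OPSGENIE_API_KEY}" \
        -H "Content-Type: application/json" \
        -d "$(test_alert_payload "$team")"
}

# Example (not run automatically):
#   export OPSGENIE_API_KEY=...   # key from the team's API integration
#   send_test_alert US "dba-team"
```

If the key or the team name is wrong, the API response should say so, which makes it easier to tell an OpsGenie-side problem from a ClusterControl-side one.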

As a next step we have to decide which alerts will be sent to OpsGenie. We can pick all or only some of the clusters defined in ClusterControl.

We can also pick the events that we want to be sent to OpsGenie based on their severity and category.
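Conceptually, this selection acts as a filter: an event is forwarded only if both its severity and its category are among the ones you ticked. The sketch below illustrates that logic only. The severity and category names are made-up examples, and the code is not ClusterControl's implementation.

```shell
#!/bin/sh
# Sketch: a severity-and-category filter. The values are illustrative only.
FORWARD_SEVERITIES="critical warning"
FORWARD_CATEGORIES="node cluster backup"

# Return 0 if word $1 appears in the space-separated list $2.
in_list() {
    case " $2 " in
        *" $1 "*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Return 0 when an event with this severity ($1) and category ($2)
# passes both filters and should be forwarded.
should_forward() {
    in_list "$1" "$FORWARD_SEVERITIES" && in_list "$2" "$FORWARD_CATEGORIES"
}

# Example:
#   should_forward critical node && echo "send to OpsGenie"
```

Starting with a narrow selection and widening it later keeps the on-call team from being paged for low-value events.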

Once it is done, we can see our integration added into ClusterControl.

Now, let’s see if it actually works. For that we are going to simulate a master failure on our MariaDB 10.5 replication cluster by killing the mariadbd process:

# killall -9 mariadbd

# killall -9 mariadbd

mariadbd: no process found
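Running killall a second time is one way to confirm that the process is gone. A slightly more explicit check is sketched below, assuming pgrep is available on the host; the function names are made up for illustration. Depending on how the service is supervised, the process may be started again shortly after being killed, so the sketch also polls for a few seconds.

```shell
#!/bin/sh
# Sketch: confirm that the killed server process is gone and stays gone.

# Return 0 if a process with exactly this name is running.
is_running() {
    pgrep -x "$1" >/dev/null 2>&1
}

# Poll once per second for up to $2 seconds. Return 0 if the process
# was never seen, 1 as soon as it shows up.
stays_down() {
    name="$1"; remaining="$2"
    while [ "$remaining" -gt 0 ]; do
        if is_running "$name"; then
            return 1
        fi
        sleep 1
        remaining=$((remaining - 1))
    done
    ! is_running "$name"
}

# Example:
#   stays_down mariadbd 10 && echo "mariadbd is still down"
```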

Next, we’ll wait a bit and see what has been sent to OpsGenie:

As you can see on the screenshot above, multiple alerts have been opened. Some of them have been cleared but some are still open. Those are related to the failed master. We can check the state of the cluster in ClusterControl.

As you can see, the failed master is running as a node with read_only=ON and is no longer part of the replication topology. What we can do here is bring it back into the replication chain as a slave. This step can happen automatically if you want, but by default ClusterControl lets you investigate the failed master before it is rebuilt. In this case we will trigger the rebuild process manually.
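The same state can be confirmed directly on the node before deciding to rebuild it. This is a minimal sketch, assuming the mariadb command-line client can log in locally (for example through credentials in ~/.my.cnf). The function names are hypothetical.

```shell
#!/bin/sh
# Sketch: inspect read_only and replication state on the failed node.
# Assumes the mariadb client can log in locally without extra options.

# Translate the 0/1 value of @@global.read_only into a readable label.
read_only_label() {
    case "$1" in
        1) echo "read_only=ON" ;;
        0) echo "read_only=OFF" ;;
        *) echo "read_only=unknown" ;;
    esac
}

# Print the read_only state, then the replication status of this node.
check_node() {
    value=$(mariadb -N -B -e "SELECT @@global.read_only") || return 1
    read_only_label "$value"
    # Empty output here means no replication is configured on this node.
    mariadb -e "SHOW SLAVE STATUS\G"
}

# Example (run on the failed node):
#   check_node
```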

You can trigger it from the Nodes Actions menu.

We picked the master node to get the data from and clicked “Proceed”.

After a bit of time, which depends on the size of the data as well as disk and network speed, the job should complete and we will see a new, clean replication topology.

In the meantime, in OpsGenie, we see that all of the alerts are now cleared and closed as the situation is back to normal.
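If you prefer to verify this without opening the OpsGenie UI, the open alerts can be listed through the Alert API. The sketch below assumes the public list endpoint (a GET on /v2/alerts with a query parameter). OPSGENIE_API_KEY and the function names are placeholders for illustration.

```shell
#!/bin/sh
# Sketch: list alerts in a given state through the OpsGenie Alert API.
# OPSGENIE_API_KEY is a placeholder for the team's API key.

# Build the search query for a given alert status (open or closed).
alerts_query() {
    case "$1" in
        open|closed) echo "status:$1" ;;
        *) echo "unsupported status: $1" >&2; return 1 ;;
    esac
}

# Fetch matching alerts as JSON. $2 is the API base URL and defaults to
# the US endpoint; EU accounts would pass https://api.eu.opsgenie.com.
list_alerts() {
    query=$(alerts_query "$1") || return 1
    curl -sS -G "${2:-https://api.opsgenie.com}/v2/alerts" \
        -H "Authorization: GenieKey ${OPSGENIE_API_KEY}" \
        --data-urlencode "query=${query}"
}

# Example (not run automatically):
#   export OPSGENIE_API_KEY=...
#   list_alerts open
```

After the rebuild completes, a call for open alerts should no longer include the ones raised for the failed master.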

Conclusion

As you can see, the integration with OpsGenie is really easy to set up, and it allows your team to develop and manage the on-call rotation while using ClusterControl to manage your databases.
