Why database automation?

Kyle Buzzell


Databases are the repositories of a company’s most critical information, and are typically the most complex part of the application stack. They are cumbersome to install and manage, especially when clustering or high availability is involved. However, high-availability cluster configurations are becoming the norm for today’s mission-critical databases.

There are a number of ways to make databases highly available, either with features built into the database or with the help of external components such as the operating system, storage infrastructure or third-party clustering middleware. The sheer number of knobs and dials that control these features makes it easy for databases to be deployed in non-standard ways, which can jeopardize the stability and performance of applications.

High-availability database configurations tend to be highly complex, but once designed, they also tend to be duplicated many times with minimal variation. That makes them good candidates for automation: provisioning, upgrading, patching, failover, recovery, scaling and a number of other database procedures can all be automated. Automating these monotonous tasks is a good way to make sure they are done correctly and on schedule. Without proper maintenance, even a perfectly designed database system will soon cease to function correctly. With routine tasks automated, database and system administrators can focus on more critical work, such as performance tuning, query design, data modeling or providing architectural advice to application developers.

At the core of ClusterControl is automation functionality that handles many of the database tasks you have to perform regularly: deployments, upgrades, discovering and troubleshooting anomalies, recovering from failures, topology changes, running backups, verifying data integrity, scaling and more.

Database automation with ClusterControl:

  • Ensures tasks and procedures are approached the same way every time, which increases business and IT agility
  • Centralizes database management into a single interface
  • Lets DBAs, SysAdmins and developers manage entire clusters efficiently and with minimal risk, using industry best practices

What is the difference between ClusterControl and tools like phpMyAdmin, Nagios, Zabbix, Cacti?

phpMyAdmin is a database administration tool that lets users administer their databases. Nagios, Zabbix and Cacti are system monitoring tools that collect system data from hosts and devices, for example via SNMP, and present the output as graphs through a web interface.

Monitoring tools try to make it easy to determine “what’s different today” when a performance problem crops up, and to see how resources are being utilized. Alerts and notifications are raised when something changes unexpectedly on a monitored resource, for example when collected metrics fall outside standard baselines or a service shuts down without warning.

These are all great tools, and companies have plenty of monitoring. What those tools don’t provide is control and automation. A monitoring tool will not configure and deploy a highly available database cluster, keep it up to date with the latest patch, scale it up or down by adding or removing nodes, or keep it running by automatically recovering from failures.

The reality is that there are no features in ClusterControl that couldn’t, with time and effort, be replicated via the command line or third-party scripting or monitoring tools. However, we’ve already done the work on these specific operations, such as templated, repeatable database server and cluster deployments, deployment and integration of proxy servers, monitoring and alerting, backups, restores and backup scheduling, and automated cluster and node recovery, among others.

We also already support integration with third-party tools such as PagerDuty, further reducing your development and maintenance burden.

While it is possible to create and maintain internal scripts and applications to perform each individual operation as necessary, this is labour-intensive and error-prone, and often requires dedicated resources. It is also difficult and expensive to obtain external support for home-grown tooling in emergency situations.

ClusterControl is deployed across hundreds of companies in many use cases worldwide, so the operations it performs are “battle-tested” and stable. ClusterControl gives you back the time and effort otherwise spent maintaining ad-hoc internal solutions, and reduces the cost of running your database operations team.

Learning from the Automobile Industry

Henry Ford created the automobile industry as we know it today through standardisation and automation of the manufacturing process. He is famously quoted as saying “You can have any color you want, as long as it’s black.” As other manufacturers took up his methods, this inevitably led to certain standards across the automobile industry, creating a comfortable homogeneity for motorists, superficial design variations and additional bells and whistles notwithstanding.

Severalnines’ ClusterControl product offers a similar level of comfort and a configurable level of homogeneity to administrators of Open Source databases. With its support for templated deployments across MySQL, MariaDB, PostgreSQL, MongoDB, Redis, Elastic, SQL Server and Timescale, as well as support for ProxySQL, HAProxy and MaxScale, DBAs and Operations engineers can be confident that deployments are stable and repeatable across their environments.

So Why Should You Automate Your Database?

The old phrase “Everything is possible with time and money” is truest when speaking about databases. Most skilled DBAs can deploy a database manually and get it up and running, but automation removes the human factor by using proven methods that are known to work and are repeatable. Even the most skilled DBA can make a mistake, and when you factor in the time needed to deploy databases manually, plus the potential time spent troubleshooting an issue, the value of automation is apparent. This is even more true for complex setups, where time and cost grow with the complexity of your environment.
