How to use the ClusterControl REST API to Automate Your Database Cluster

Ashraf Sharif


For ops folks with multiple environments and instances to manage, a fully programmable infrastructure is the basis for automation. ClusterControl exposes all functionality through a REST API. The web UI also interacts with the REST API to retrieve monitoring data (cluster load, alarms, backup status, etc.) or to send management commands (add/remove nodes, run backups, upgrade a cluster, add/remove load balancer, etc.). The API is written in PHP and runs under Apache. The diagram below illustrates the architecture of ClusterControl.

Figure: ClusterControl – Agentless Architecture

In this blog post, we will show you how to interact directly with the ClusterControl API to retrieve monitoring data or to perform management tasks.

All requests against ClusterControl API URL should include the ClusterControl API Token as HTTP header (CMON_TOKEN) for authentication. The ClusterControl API URL and token can be retrieved from the Cluster Registrations page in the ClusterControl UI.

Retrieving Monitoring Data

The request URI with query strings should be in the following format:


Which represents the following URL example:

A simple Curl command as below can be used to retrieve all alarms from ClusterControl:

$ curl -s -g -X GET -H 'CMON_TOKEN: 7c9e53faed237b5c5b010fb31a19f7dade175001' ''


HTTP Method : GET
API URL : //.json
Cluster ID : 1
Unix Timestamp : 1394101811
Extra Options : start from 0, and limit the result to 2 (similar to MySQL LIMIT)

Following list are some of the notable API groups available in ClusterControl 1.2.5:

  • Backup
    backups/all, backups/glacier_job_status, backups/reports, backups/storage_location
  • Clusters
    clusters/all, clusters/aws_metadata, clusters/dbload, clusters/info, clusters/mongodb_info, clusters/host_dbload, clusters/master, clusters/settings, clusters/dbconnections
  • Hosts
    hosts/hosts_stats, hosts/all_mysql_host, hosts/galera_stat_all, hosts/ram_history, hosts/network_history, hosts/cpu_history, hosts/disk_history, hosts/cpu_info
  • Alarms
    alarms/all, alarms/unread, alarms/check_cmon_db
  • Counters
    counters/all, counters/mongo_all
  • Nodes
    nodes/all, nodes/repl_info, nodes/datanodestats, nodes/clusternodes, nodes/hostnames, nodes/innodb_status
  • Performance
    performance/performance_meta, performance/performance_data, performance/probe_info, performance/probe_graph_data
  • Queries
    queries/all, queries/top, queries/mysql_histogram
  • Health
    health/criticalmodules, health/memoryusage, health/status, health/modules
  • Jobs
    jobs/all, jobs/unread, jobs/message
  • Processes
    processes/all, processes/top, processes/mongo_processes
  • Email

Some API groups can be called with extra query strings (alarms/all, jobs/all):

start=, limit=, sort=[{“property”:””,”direction”:””}]

The JSON output will be something like this:

    "data": [
            "alarm_count": "9", 
            "alarm_id": "1", 
            "alarm_name": "Excessive RAM Usage", 
            "alarm_sent": "1", 
            "alarm_sent_count": "3", 
            "alarm_type": "host", 
            "cid": "1", 
            "component": "RAM", 
            "description": "RAM Utilization for is 84 percent", 
            "hostid": "2", 
            "hostname": "", 
            "recommendation": "Upgrade Node with more RAM", 
            "report_ts": "1394434707", 
            "severity": "WARNING"
            "alarm_count": "32", 
            "alarm_id": "1", 
            "alarm_name": "System Time is drifting.", 
            "alarm_sent": "1", 
            "alarm_sent_count": "0", 
            "alarm_type": "simple", 
            "cid": "1", 
            "component": "ALARM_SOURCE_CLUSTER", 
            "description": "System time is drifting between servers and the distance between the highest system time and lowest is 5933 seconds.", 
            "hostid": "0", 
            "hostname": "", 
            "recommendation": "Synchronize the system clock on the servers using e.g NTP or make sure NTP is working. Time drifting may lead to unexpected failures or problems, and makes debugging hard.", 
            "report_ts": "1394434710", 
            "severity": "WARNING"
    "success": true, 
    "total": 1

Sending Management Commands

To create a job or task, we can easily do this by sending a POST method with data containing a job command with the respective cluster ID. The job command can be retrieved from the list of Jobs in the Cluster inside ClusterControl UI, as shown in the following screenshot:

The request URI to post a job should be in the following format:


The POST data should contain the cluster ID and job command with or without URL encoded format. A simple Curl command below can be used to post a job to stop MySQL service on node (node ID is 3) under cluster ID 1:

$ curl -s -X POST -H 'CMON_TOKEN: 7c9e53faed237b5c5b010fb31a19f7dade175001' -d "clusterid=1&job_command=' 3 stop mysqld'" ''


HTTP Method : POST
API URL : /jobs/job_command
Cluster ID : 1
Job Command : ‘ 3 stop mysql’

If the job is accepted by ClusterControl, following JSON output should be returned:

    "msg": "", 
    "success": true

Operations Automation

By leveraging the REST API, we can automate a number of things on our cluster – adding a node, creating a full backup, upgrading the cluster to a new version or performing a rolling restart.

We have written some examples in Bash to interact with the ClusterControl API. You can check out the code in the appendix section further down. You can use the client as below:

  • To query data from API group:
    $ ./ -r backups/reports
  • To post a job to the controller:
    $ ./ -j ' 1 backup xtrabackupfull /root/backups'
  • To query data from API group with extra option:
    $ ./ -r alarms/all -o 'start=0&limit=5&sort=[{"property":"alarm_count","direction":"ASC"}]'

We will use the Bash API client ( to build an auto-scaling feature trigger by a simple application logic ( Here is the brief scenario:

We have a two-node Galera cluster with a garbd co-located inside ClusterControl host. If the total queries is higher than 200/s for one minute, then a standby MySQL server (primarily, it is a DNS server) will be automatically started and added to the Galera cluster. Once the QPS is lower than 200/s for more than one minute, this node will be automatically removed from the Galera cluster.

The architecture can be illustrated as in following figure:

In this test scenario, we have actually added Galera #3 into the HAproxy load balancing set and when it joins into the cluster, it will automatically perform SST/IST and start serving up the cluster once synced.

The auto scale operation is recorded in following Asciinema screencast:

Good luck with automating your database infrastructure!


Simple Bash API Client –

# ClusterControl API Client
# curl, openssl, python2.6+ must be installed
# usage: [-r|-j] [-o]
## ClusterControl details
check_binary ()
  for binary in $binaries
    bin_path=`command -v $binary`
    [ -z "$bin_path" ] && echo "Error: Unable to find binary path for $binary" && exit 1
check_var ()
  for var in $vars
    [ -z "$var" ] && echo "Error: $var is empty" && exit 1
check_binary curl openssl python
check_var ccapi_url ccapi_token
## calculate token with sha1
cmon_token=`echo -n "$ccapi_token" | openssl dgst -sha1 | awk {'print $2'}`
unix_timestamp=$(date -d "today" "+%s")
usage() {
  echo "-i : cluster ID [default=1]"
  echo "-r : retrieve data in JSON format"
  echo "-j : post job to ClusterControl"
  echo "-o : extra parameters when retrieve data"
while getopts ":i:r:j:o:" arg; do
  case "$arg" in
    i) cluster_id="$OPTARG" <span>;;< span="">
    r) retrieve="$OPTARG" <span>;;< span="">
    j) job="$OPTARG" <span>;;< span="">
    o) options="$OPTARG" <span>;;< span="">
    -) break <span>;;< span="">
    ?) usage
    exit 1<span>;;< span="">
[ -z "$cluster_id" ] && cluster_id=1
([ -z "$retrieve" ] && [ -z "$job" ]) && echo "Please specify either -r to retrieve data or -j to post job" && exit 1
([ ! -z "$retrieve" ] && [ ! -z "$job" ]) && echo "Error: -r and -j cannot be used simultaneously" && exit 1
if [ ! -z "$retrieve" ]then
  group="backups/all backups/glacier_job_status backups/schedules backups/reports backups/storage_location 
  clusters/all clusters/aws_metadata clusters/dbload clusters/info clusters/mongodb_info clusters/host_dbload clusters/master clusters/settings clusters/dbconnections 
  hosts/hosts_stats hosts/all_mysql_host hosts/galera_stat_all hosts/ram_history hosts/network_history hosts/cpu_history hosts/disk_history hosts/cpu_info 
  alarms/all alarms/unread alarms/check_cmon_db 
  counters/all counters/mongo_all 
  nodes/all nodes/repl_info nodes/datanodestats nodes/clusternodes nodes/hostnames nodes/innodb_status 
  performance/performance_meta performance/performance_data performance/probe_info performance/probe_graph_data 
  queries/all queries/top queries/mysql_histogram 
  health/criticalmodules health/memoryusage health/status health/modules 
  jobs/all jobs/unread jobs/message 
  processes/all processes/top processes/mongo_processes 
  for f in $group
    [ "$f" == "$retrieve" ] && i=0 && break
  [ $i -ne 0 ] && echo "Error: Unknown options "$retrieve"" && exit 1 
if [ ! -z "${job}" ]then
  post_data="-d clusterid=$cluster_id&jobcommand=${job}"
result=`curl -g -s -H "CMON_TOKEN: $cmon_token" "$post_data" "$url"`
echo $result | python -mjson.tool</span>;;<></span>;;<></span>;;<></span>;;<></span>;;<></span>;;<>

Automatic Scaling –

# Perform auto scale if Queries Per Second (QPS) is more than 200 for 1 minute (5 sec interval)
[ -z "$1" ] && echo "Error: Please specify a target host" && exit 1
threshold=200 # QPS threshold
interval=5    # check interval in seconds
duration=60   # duration in seconds
# location
# get nodeid
nodeid=$($cc_api -r hosts/all_mysql_host | grep -A10 $ip | grep nodeid | awk {'print $2'} | sed 's|"||g' | sed 's|,||g')
check=$(( $duration/$interval ))
while truedo
  for n in { 1 .. $check }
    date=$(date +"%T")
    result=$($cc_api -r clusters/info | grep qps | awk {'print $2'} | sed 's|"||g' | sed 's|,||g')
    # convert to int
    if [ $qps -gt $threshold ]then
      # count +1 if over threshold
      echo "Counter: $i - $qps/$threshold - $date"
      [ $i -ne 0 ] && ((i--)) || ((j++))
      echo "Counter: $i - $qps/$threshold - $date"
    sleep $interval
  date=$(date +"%T")
  if [ $i -ge $check ]then
    # check node status
    node_status=$($cc_api -r hosts/all_mysql_host | grep -A10 $ip | grep status | awk {'print $2'} | sed 's|"||g' | sed 's|,||g')
    if [ "$node_status" != 0 ]then
      # start up
      echo "!!! Firing up $ip - $date !!!"
      $cc_api -j "$ip $nodeid start mysqld"
      echo "Host is already up"
  if [ $j -ge $check ]then
    # check node status
    node_status=$($cc_api -r hosts/all_mysql_host | grep -A10 $ip | grep status | awk {'print $2'} | sed 's|"||g' | sed 's|,||g')
    if [ "$node_status" == 0 ]then
      # bring down
      echo "!!! Bringing down $ip - $date !!!"
      $cc_api -j "$ip $nodeid stop mysqld"
      echo "Host is already down"

Subscribe below to be notified of fresh posts