How to use the ClusterControl REST API to Automate Your Database Cluster
For ops folks with multiple environments and instances to manage, a fully programmable infrastructure is the basis for automation. ClusterControl exposes all functionality through a REST API. The web UI also interacts with the REST API to retrieve monitoring data (cluster load, alarms, backup status, etc.) or to send management commands (add/remove nodes, run backups, upgrade a cluster, add/remove load balancer, etc.). The API is written in PHP and runs under Apache. The diagram below illustrates the architecture of ClusterControl.
Figure: ClusterControl – Agentless Architecture
In this blog post, we will show you how to interact directly with the ClusterControl API to retrieve monitoring data or to perform management tasks.
All requests against the ClusterControl API URL must include the ClusterControl API token as an HTTP header (CMON_TOKEN) for authentication. The ClusterControl API URL and token can be retrieved from the Cluster Registrations page in the ClusterControl UI.
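Note that the header value is not necessarily the token string as stored: the Bash client in the appendix sends the SHA-1 digest of its stored API token as the CMON_TOKEN value. A minimal sketch of that step (the token string here is a placeholder, not a real ClusterControl token):

```shell
# Compute the CMON_TOKEN header value the way the appendix client does:
# the hex-encoded SHA-1 digest of the stored API token.
cmon_header_token() {
    echo -n "$1" | openssl dgst -sha1 | awk '{print $2}'
}

# Placeholder token for illustration only:
cmon_header_token 'my-api-token'
```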
Retrieving Monitoring Data
The request URI with query strings should be in the following format:
<api_url>/<api_group>/<command>.json?clusterid=<cluster_id>&_dc=<unix_timestamp>&<extra_options>
Which represents the following URL example:
http://192.168.197.200/cmonapi/alarms/all.json?clusterid=1&_dc=1394101811&start=0&limit=2
A simple curl command like the one below can be used to retrieve all alarms from ClusterControl:
$ curl -s -g -X GET -H 'CMON_TOKEN: 7c9e53faed237b5c5b010fb31a19f7dade175001' 'http://192.168.197.200/cmonapi/alarms/all.json?clusterid=1&_dc=1394101811&start=0&limit=2'
Where,
- HTTP Method: GET
- HTTP Header: CMON_TOKEN: <token>
- API URL: /alarms/all.json
- Cluster ID: 1
- Unix Timestamp: 1394101811
- Extra Options: start from 0, and limit the result to 2 (similar to MySQL LIMIT)
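Putting those pieces together, the request URL can be assembled mechanically. A small helper (using the example host and values from above) makes the construction explicit:

```shell
# Assemble a monitoring request URL from its parts.
build_url() {
    local api_url=$1 command=$2 cluster_id=$3 timestamp=$4 extra=$5
    echo "${api_url}/${command}.json?clusterid=${cluster_id}&_dc=${timestamp}&${extra}"
}

build_url 'http://192.168.197.200/cmonapi' 'alarms/all' 1 1394101811 'start=0&limit=2'
# → http://192.168.197.200/cmonapi/alarms/all.json?clusterid=1&_dc=1394101811&start=0&limit=2
```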
The following are some of the notable API groups available in ClusterControl 1.2.5:
- Backups: backups/all, backups/glacier_job_status, backups/reports, backups/storage_location
- Clusters: clusters/all, clusters/aws_metadata, clusters/dbload, clusters/info, clusters/mongodb_info, clusters/host_dbload, clusters/master, clusters/settings, clusters/dbconnections
- Hosts: hosts/hosts_stats, hosts/all_mysql_host, hosts/galera_stat_all, hosts/ram_history, hosts/network_history, hosts/cpu_history, hosts/disk_history, hosts/cpu_info
- Alarms: alarms/all, alarms/unread, alarms/check_cmon_db
- Counters: counters/all, counters/mongo_all
- Nodes: nodes/all, nodes/repl_info, nodes/datanodestats, nodes/clusternodes, nodes/hostnames, nodes/innodb_status
- Performance: performance/performance_meta, performance/performance_data, performance/probe_info, performance/probe_graph_data
- Queries: queries/all, queries/top, queries/mysql_histogram
- Health: health/criticalmodules, health/memoryusage, health/status, health/modules
- Jobs: jobs/all, jobs/unread, jobs/message
- Processes: processes/all, processes/top, processes/mongo_processes
- Email: email/all
Some API groups (e.g. alarms/all, jobs/all) can be called with extra query strings:
start=<offset>, limit=<count>, sort=[{"property":"<column>","direction":"<ASC|DESC>"}]
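The sort parameter is a small JSON fragment embedded in the query string, so its brackets and braces must reach the server literally — this is why the curl examples in this post pass -g to disable URL globbing. A sketch of composing the options string:

```shell
# Compose the sort option as the API expects it: a JSON fragment in the
# query string (column and direction are caller-supplied).
sort_option() {
    local column=$1 direction=$2
    printf 'sort=[{"property":"%s","direction":"%s"}]' "$column" "$direction"
}

# Example: first five alarms, most frequent first.
echo "start=0&limit=5&$(sort_option alarm_count DESC)"
```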
The JSON output will be something like this:
{
  "data": [
    {
      "alarm_count": "9",
      "alarm_id": "1",
      "alarm_name": "Excessive RAM Usage",
      "alarm_sent": "1",
      "alarm_sent_count": "3",
      "alarm_type": "host",
      "cid": "1",
      "component": "RAM",
      "description": "RAM Utilization for 192.168.197.201 is 84 percent",
      "hostid": "2",
      "hostname": "192.168.197.201",
      "recommendation": "Upgrade Node with more RAM",
      "report_ts": "1394434707",
      "severity": "WARNING"
    },
    {
      "alarm_count": "32",
      "alarm_id": "1",
      "alarm_name": "System Time is drifting.",
      "alarm_sent": "1",
      "alarm_sent_count": "0",
      "alarm_type": "simple",
      "cid": "1",
      "component": "ALARM_SOURCE_CLUSTER",
      "description": "System time is drifting between servers and the distance between the highest system time and lowest is 5933 seconds.",
      "hostid": "0",
      "hostname": "",
      "recommendation": "Synchronize the system clock on the servers using e.g NTP or make sure NTP is working. Time drifting may lead to unexpected failures or problems, and makes debugging hard.",
      "report_ts": "1394434710",
      "severity": "WARNING"
    }
  ],
  "success": true,
  "total": 1
}
Sending Management Commands
To create a job or task, we simply send a POST request whose data contains the job command and the respective cluster ID. The job command can be retrieved from the list of jobs for the cluster in the ClusterControl UI, as shown in the following screenshot:
The request URI to post a job should be in the following format:
/jobs/job_command
The POST data should contain the cluster ID and the job command, in URL-encoded or plain format. A simple curl command like the one below can be used to post a job to stop the MySQL service on node 192.168.197.202 (node ID 3) under cluster ID 1:
$ curl -s -X POST -H 'CMON_TOKEN: 7c9e53faed237b5c5b010fb31a19f7dade175001' -d "clusterid=1&job_command='192.168.197.202 3 stop mysqld'" 'http://192.168.197.200/cmonapi/jobs/job_command'
Where,
- HTTP Method: POST
- HTTP Header: CMON_TOKEN: <token>
- API URL: /jobs/job_command
- Cluster ID: 1
- Job Command: '192.168.197.202 3 stop mysqld'
If the job is accepted by ClusterControl, the following JSON output should be returned:
{ "msg": "", "success": true }
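A calling script can key off that success flag before moving on. Here is a grep-based sketch in the same style as the appendix scripts (a real client might prefer python -mjson.tool or jq for proper JSON parsing):

```shell
# Return success (exit 0) if the API reply reports "success": true.
job_accepted() {
    echo "$1" | grep -q '"success": true'
}

response='{ "msg": "", "success": true }'
job_accepted "$response" && echo "Job accepted"
# → Job accepted
```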
Operations Automation
By leveraging the REST API, we can automate a number of things on our cluster – adding a node, creating a full backup, upgrading the cluster to a new version or performing a rolling restart.
We have written some examples in Bash to interact with the ClusterControl API. You can check out the code in the appendix section further down. You can use the client as below:
- To query data from API group:
$ ./cc_api.sh -r backups/reports
- To post a job to the controller:
$ ./cc_api.sh -j '192.168.197.201 1 backup xtrabackupfull /root/backups'
- To query data from API group with extra option:
$ ./cc_api.sh -r alarms/all -o 'start=0&limit=5&sort=[{"property":"alarm_count","direction":"ASC"}]'
We will use the Bash API client (cc_api.sh) to build an auto-scaling feature triggered by simple application logic (auto_scale.sh). Here is the brief scenario:
We have a two-node Galera cluster with garbd co-located on the ClusterControl host. If the query rate stays above 200 queries per second for one minute, a standby MySQL server (which primarily serves as a DNS server) is automatically started and added to the Galera cluster. Once the QPS stays below 200/s for more than one minute, this node is automatically removed from the Galera cluster.
The architecture is illustrated in the following figure:
In this test scenario, we have also added Galera #3 to the HAProxy load balancing set; when it joins the cluster, it automatically performs SST/IST and starts serving the cluster once synced.
The auto-scaling operation is recorded in the following Asciinema screencast:
Good luck with automating your database infrastructure!
Appendix
Simple Bash API Client – cc_api.sh
#!/bin/bash
# ClusterControl API client
# curl, openssl, python2.6+ must be installed
# usage: cc_api.sh [-i <cluster_id>] [-r <api_group/command> | -j <job_command>] [-o <extra_options>]

## ClusterControl details
ccapi_url='http://192.168.197.200/cmonapi'
ccapi_token='29648bc849fd2888c79ed35943ef839154935f2b'

check_binary () {
    for binary in "$@"
    do
        bin_path=$(command -v "$binary")
        [ -z "$bin_path" ] && echo "Error: Unable to find binary path for $binary" && exit 1
    done
}

check_var () {
    for var in "$@"
    do
        [ -z "$var" ] && echo "Error: $var is empty" && exit 1
    done
}

check_binary curl openssl python
check_var "$ccapi_url" "$ccapi_token"

## calculate the request token with sha1
cmon_token=$(echo -n "$ccapi_token" | openssl dgst -sha1 | awk '{print $2}')
unix_timestamp=$(date -d "today" "+%s")

usage() {
    echo "-i : cluster ID [default=1]"
    echo "-r : retrieve data in JSON format"
    echo "-j : post job to ClusterControl"
    echo "-o : extra parameters when retrieving data"
}

while getopts ":i:r:j:o:" arg; do
    case "$arg" in
        i) cluster_id="$OPTARG" ;;
        r) retrieve="$OPTARG" ;;
        j) job="$OPTARG" ;;
        o) options="$OPTARG" ;;
        -) break ;;
        ?) usage
           exit 1 ;;
    esac
done

[ -z "$cluster_id" ] && cluster_id=1
([ -z "$retrieve" ] && [ -z "$job" ]) && echo "Please specify either -r to retrieve data or -j to post a job" && exit 1
([ ! -z "$retrieve" ] && [ ! -z "$job" ]) && echo "Error: -r and -j cannot be used simultaneously" && exit 1

if [ ! -z "$retrieve" ]; then
    group="backups/all backups/glacier_job_status backups/schedules backups/reports backups/storage_location clusters/all clusters/aws_metadata clusters/dbload clusters/info clusters/mongodb_info clusters/host_dbload clusters/master clusters/settings clusters/dbconnections hosts/hosts_stats hosts/all_mysql_host hosts/galera_stat_all hosts/ram_history hosts/network_history hosts/cpu_history hosts/disk_history hosts/cpu_info alarms/all alarms/unread alarms/check_cmon_db counters/all counters/mongo_all nodes/all nodes/repl_info nodes/datanodestats nodes/clusternodes nodes/hostnames nodes/innodb_status performance/performance_meta performance/performance_data performance/probe_info performance/probe_graph_data queries/all queries/top queries/mysql_histogram health/criticalmodules health/memoryusage health/status health/modules jobs/all jobs/unread jobs/message processes/all processes/top processes/mongo_processes email/all"
    i=0
    for f in $group
    do
        [ "$f" == "$retrieve" ] && i=0 && break
        ((i++))
    done
    [ $i -ne 0 ] && echo "Error: Unknown option \"$retrieve\"" && exit 1
    url="$ccapi_url/$retrieve.json?clusterid=$cluster_id&_dc=$unix_timestamp&$options"
fi

if [ ! -z "${job}" ]; then
    post_data="clusterid=$cluster_id&jobcommand=${job}"
    url="$ccapi_url/jobs/job_command"
    result=$(curl -g -s -X POST -H "CMON_TOKEN: $cmon_token" -d "$post_data" "$url")
else
    result=$(curl -g -s -H "CMON_TOKEN: $cmon_token" "$url")
fi

echo "$result" | python -mjson.tool
Automatic Scaling – auto_scale.sh
#!/bin/bash
# Perform auto scaling if Queries Per Second (QPS) is more than 200 for 1 minute (5 sec interval)

[ -z "$1" ] && echo "Error: Please specify a target host" && exit 1
ip=$1
threshold=200   # QPS threshold
interval=5      # check interval in seconds
duration=60     # duration in seconds

# cc_api.sh location
cc_api=/vagrant/cc_api.sh

# get the node ID of the target host
nodeid=$($cc_api -r hosts/all_mysql_host | grep -A10 $ip | grep nodeid | awk '{print $2}' | sed 's|"||g' | sed 's|,||g')

i=0
j=0
check=$(( duration / interval ))

while true
do
    for (( n=1; n<=check; n++ ))
    do
        date=$(date +"%T")
        result=$($cc_api -r clusters/info | grep qps | awk '{print $2}' | sed 's|"||g' | sed 's|,||g')
        # convert to integer
        qps=${result/.*}
        if [ $qps -gt $threshold ]; then
            # count +1 if over threshold
            ((i++))
            echo "Counter: $i - $qps/$threshold - $date"
        else
            [ $i -ne 0 ] && ((i--)) || ((j++))
            echo "Counter: $i - $qps/$threshold - $date"
        fi
        sleep $interval
    done
    date=$(date +"%T")
    if [ $i -ge $check ]; then
        # check node status
        node_status=$($cc_api -r hosts/all_mysql_host | grep -A10 $ip | grep status | awk '{print $2}' | sed 's|"||g' | sed 's|,||g')
        if [ "$node_status" != 0 ]; then
            # start up
            echo "!!! Firing up $ip - $date !!!"
            $cc_api -j "$ip $nodeid start mysqld"
            i=0
        else
            echo "Host is already up"
        fi
    fi
    if [ $j -ge $check ]; then
        # check node status
        node_status=$($cc_api -r hosts/all_mysql_host | grep -A10 $ip | grep status | awk '{print $2}' | sed 's|"||g' | sed 's|,||g')
        if [ "$node_status" == 0 ]; then
            # bring down
            echo "!!! Bringing down $ip - $date !!!"
            $cc_api -j "$ip $nodeid stop mysqld"
        else
            echo "Host is already down"
        fi
        j=0
    fi
done