How to Cluster Your ProxySQL Load Balancers
A proxy layer between applications and databases typically consists of multiple proxy nodes for high availability. This is no different for ProxySQL. ProxySQL, just like other modern proxies, can be used to build complex logic for routing queries. You can add query rules to send queries to particular hosts, you can cache queries, you can add and remove backend servers, or manage users that are allowed to connect to ProxySQL and MySQL. However, running numerous ProxySQL nodes in the proxy layer introduces another problem: synchronization across distributed instances. Any rules or other logic need to be synchronized across the instances to ensure they behave in the same way. Even the proxies that are not handling traffic still act as standbys; if one of them has to take over, you don’t want any surprises because it lacks the most recent configuration changes.
It is quite cumbersome to ensure this manually, making the changes by hand on all of the nodes. You can utilize tools like Ansible, Chef or Puppet to manage configurations, but the sync process has to be coded and tested. ClusterControl can help you here through an option to sync configurations between ProxySQL instances, and it can also set up and manage the other components required for high availability, e.g., Virtual IP. But starting from version 1.4.2, ProxySQL offers a native clustering and configuration syncing mechanism. In this blog post, we will discuss how to set it up with a mix of actions taken in ClusterControl and in the ProxySQL command-line admin interface.
First of all, let’s take a look at a typical replication environment deployed by ClusterControl.
As you can see from the screenshot, this is a MySQL replication setup with three ProxySQL instances. ProxySQL high availability is implemented through Keepalived and a Virtual IP that is always assigned to one of the ProxySQL nodes. There are a couple of steps we have to take in order to configure ProxySQL clustering. First, we have to define the user ProxySQL should use to exchange information between the nodes. Let’s define a new one on top of the existing administrative user:
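A minimal sketch of how this can be done from the ProxySQL admin interface is shown below. The cluster_admin user and its clusterpass password are placeholders, and the existing admin:admin credentials are an assumption that should match your setup:
mysql> UPDATE global_variables SET variable_value='admin:admin;cluster_admin:clusterpass' WHERE variable_name='admin-admin_credentials';
mysql> LOAD ADMIN VARIABLES TO RUNTIME;
mysql> SAVE ADMIN VARIABLES TO DISK;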
Next, we need to define that user in the admin-cluster_username and admin-cluster_password settings.
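Again, as a sketch through the admin interface, using the hypothetical cluster_admin credentials from above:
mysql> UPDATE global_variables SET variable_value='cluster_admin' WHERE variable_name='admin-cluster_username';
mysql> UPDATE global_variables SET variable_value='clusterpass' WHERE variable_name='admin-cluster_password';
mysql> LOAD ADMIN VARIABLES TO RUNTIME;
mysql> SAVE ADMIN VARIABLES TO DISK;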
This has been done on just one of the nodes (10.0.0.126). Let’s sync this configuration change to the remaining ProxySQL nodes.
As we stated, ClusterControl allows you to synchronize the configuration between ProxySQL nodes with just a couple of steps. Once the job syncing 10.0.0.127 with 10.0.0.126 has finished, there’s just the last node left to sync.
After this, we need to make a small change in the ProxySQL administrative command line interface, which is typically reachable on port 6032. We have to create entries in the ‘proxysql_servers’ table which would define the nodes in our ProxySQL cluster.
mysql> INSERT INTO proxysql_servers (hostname) VALUES ('10.0.0.126'), ('10.0.0.127'), ('10.0.0.128');
Query OK, 3 rows affected (0.00 sec)
mysql> LOAD PROXYSQL SERVERS TO RUNTIME;
Query OK, 0 rows affected (0.01 sec)
mysql> SAVE PROXYSQL SERVERS TO DISK;
Query OK, 0 rows affected (0.01 sec)
After loading the change to runtime, ProxySQL should start syncing the nodes. There are a couple of places where you can track the state of the cluster.
mysql> SELECT * FROM stats_proxysql_servers_checksums;
+------------+------+-------------------+---------+------------+--------------------+------------+------------+------------+
| hostname | port | name | version | epoch | checksum | changed_at | updated_at | diff_check |
+------------+------+-------------------+---------+------------+--------------------+------------+------------+------------+
| 10.0.0.128 | 6032 | admin_variables | 0 | 0 | | 0 | 1539773916 | 0 |
| 10.0.0.128 | 6032 | mysql_query_rules | 2 | 1539772933 | 0x3FEC69A5C9D96848 | 1539773546 | 1539773916 | 0 |
| 10.0.0.128 | 6032 | mysql_servers | 4 | 1539772933 | 0x3659DCF3E53498A0 | 1539773546 | 1539773916 | 0 |
| 10.0.0.128 | 6032 | mysql_users | 2 | 1539772933 | 0xDD5F0BB01235E930 | 1539773546 | 1539773916 | 0 |
| 10.0.0.128 | 6032 | mysql_variables | 0 | 0 | | 0 | 1539773916 | 0 |
| 10.0.0.128 | 6032 | proxysql_servers | 2 | 1539773835 | 0x8EB13E2B48C3FDB0 | 1539773835 | 1539773916 | 0 |
| 10.0.0.127 | 6032 | admin_variables | 0 | 0 | | 0 | 1539773916 | 0 |
| 10.0.0.127 | 6032 | mysql_query_rules | 3 | 1539773719 | 0x3FEC69A5C9D96848 | 1539773546 | 1539773916 | 0 |
| 10.0.0.127 | 6032 | mysql_servers | 5 | 1539773719 | 0x3659DCF3E53498A0 | 1539773546 | 1539773916 | 0 |
| 10.0.0.127 | 6032 | mysql_users | 3 | 1539773719 | 0xDD5F0BB01235E930 | 1539773546 | 1539773916 | 0 |
| 10.0.0.127 | 6032 | mysql_variables | 0 | 0 | | 0 | 1539773916 | 0 |
| 10.0.0.127 | 6032 | proxysql_servers | 2 | 1539773812 | 0x8EB13E2B48C3FDB0 | 1539773813 | 1539773916 | 0 |
| 10.0.0.126 | 6032 | admin_variables | 0 | 0 | | 0 | 1539773916 | 0 |
| 10.0.0.126 | 6032 | mysql_query_rules | 1 | 1539770578 | 0x3FEC69A5C9D96848 | 1539773546 | 1539773916 | 0 |
| 10.0.0.126 | 6032 | mysql_servers | 3 | 1539771053 | 0x3659DCF3E53498A0 | 1539773546 | 1539773916 | 0 |
| 10.0.0.126 | 6032 | mysql_users | 1 | 1539770578 | 0xDD5F0BB01235E930 | 1539773546 | 1539773916 | 0 |
| 10.0.0.126 | 6032 | mysql_variables | 0 | 0 | | 0 | 1539773916 | 0 |
| 10.0.0.126 | 6032 | proxysql_servers | 2 | 1539773546 | 0x8EB13E2B48C3FDB0 | 1539773546 | 1539773916 | 0 |
+------------+------+-------------------+---------+------------+--------------------+------------+------------+------------+
18 rows in set (0.00 sec)
The stats_proxysql_servers_checksums table contains, among other things, a list of nodes in the cluster, the tables that are synced, and the version and checksum of each table. If a checksum does not match, ProxySQL will attempt to get the latest version from a cluster peer. More detailed information about the contents of this table can be found in the ProxySQL documentation.
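As a quick way to spot diverging configuration, you could, for example, group the checksums per synced table. This is just a sketch and assumes the admin interface accepts a standard SQLite aggregate query, which it normally does:
mysql> SELECT name, COUNT(DISTINCT checksum) AS checksums_seen FROM stats_proxysql_servers_checksums WHERE checksum != '' GROUP BY name HAVING COUNT(DISTINCT checksum) > 1;
An empty result set means all nodes agree on every synced table.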
Another source of information about the process is ProxySQL’s log (by default it is located in /var/lib/proxysql/proxysql.log).
2018-10-17 11:00:25 [INFO] Cluster: detected a new checksum for mysql_query_rules from peer 10.0.0.126:6032, version 2, epoch 1539774025, checksum 0xD615D5416F61AA72 . Not syncing yet …
2018-10-17 11:00:27 [INFO] Cluster: detected a peer 10.0.0.126:6032 with mysql_query_rules version 2, epoch 1539774025, diff_check 3. Own version: 2, epoch: 1539772933. Proceeding with remote sync
2018-10-17 11:00:28 [INFO] Cluster: detected a peer 10.0.0.126:6032 with mysql_query_rules version 2, epoch 1539774025, diff_check 4. Own version: 2, epoch: 1539772933. Proceeding with remote sync
2018-10-17 11:00:28 [INFO] Cluster: detected peer 10.0.0.126:6032 with mysql_query_rules version 2, epoch 1539774025
2018-10-17 11:00:28 [INFO] Cluster: Fetching MySQL Query Rules from peer 10.0.0.126:6032 started
2018-10-17 11:00:28 [INFO] Cluster: Fetching MySQL Query Rules from peer 10.0.0.126:6032 completed
2018-10-17 11:00:28 [INFO] Cluster: Loading to runtime MySQL Query Rules from peer 10.0.0.126:6032
2018-10-17 11:00:28 [INFO] Cluster: Saving to disk MySQL Query Rules from peer 10.0.0.126:6032
As you can see, the log shows that a new checksum has been detected and that the sync process has taken place.
Let’s stop for a moment here and discuss how ProxySQL handles configuration updates from multiple sources. First of all, ProxySQL tracks checksums to detect when a configuration has changed. It also stores when the change happened; this data is stored as a timestamp, so it has one-second resolution. ProxySQL has two variables that also impact how changes are synchronized (both can be checked and adjusted through the admin interface, as shown after the list below).
cluster_check_interval_ms – determines how often ProxySQL should check for configuration changes. By default it is 1000ms.
cluster_mysql_servers_diffs_before_sync – tells ProxySQL how many times a check has to detect a configuration change before it gets synced. The default setting is 3.
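Both are exposed as admin variables prefixed with admin-. A minimal sketch of checking and changing them; the new value of 5 is only an illustration:
mysql> SELECT variable_name, variable_value FROM global_variables WHERE variable_name LIKE 'admin-cluster%';
mysql> UPDATE global_variables SET variable_value='5' WHERE variable_name='admin-cluster_mysql_servers_diffs_before_sync';
mysql> LOAD ADMIN VARIABLES TO RUNTIME;
mysql> SAVE ADMIN VARIABLES TO DISK;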
This means that even if you make configuration changes on the same host, if you make them less than 4 seconds apart, the remaining ProxySQL nodes may not be able to synchronize them, because a new change will show up before the previous one has been synchronized. It also means that if you make configuration changes on multiple ProxySQL instances, you should leave at least a 4 second gap between them; otherwise some of the changes will be lost and, as a result, configurations will diverge. For example, you add Server1 on Proxy1, and after 2 seconds you add Server2 on Proxy2. All other proxies will reject the change made on Proxy1 because they will detect that Proxy2 has a newer configuration. 4 seconds after the change on Proxy2, all proxies (including Proxy1) will pull the configuration from Proxy2.
As intra-cluster communication is not synchronous, if the ProxySQL node you made the changes on fails, those changes may not be replicated in time. The best approach is to make the same change on two ProxySQL nodes. This way, unless both fail at exactly the same time, one of them will be able to propagate the new configuration.
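For example, you could run the identical statements on both 10.0.0.126 and 10.0.0.127. The hostgroup and backend address below are purely hypothetical:
mysql> INSERT INTO mysql_servers (hostgroup_id, hostname, port) VALUES (10, '10.0.0.130', 3306);
mysql> LOAD MYSQL SERVERS TO RUNTIME;
mysql> SAVE MYSQL SERVERS TO DISK;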
Also worth noting is that the ProxySQL cluster topology can be quite flexible. In our case we have three nodes, and each of them has three entries in the proxysql_servers table. Such nodes form the core cluster: you can write to any node and the changes will be propagated. On top of that, it is possible to add external nodes which work in a “read-only” mode, meaning that they only synchronize changes made to the “core” cluster but do not propagate changes that were performed directly on themselves. All such a node needs is to have just the “core” cluster nodes configured in proxysql_servers; as a result, it will connect to those nodes and pull their configuration changes, but it will not be queried by the rest of the cluster for its own changes. This setup can be used to keep the source of truth on a few “core” nodes, with the remaining ProxySQL nodes just pulling their configuration from them.
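A sketch of such a setup, assuming a hypothetical additional ProxySQL instance acting as the satellite: on that node only, configure just the three core nodes (along with the cluster credentials defined earlier). The core nodes’ own proxysql_servers tables are left unchanged, so they never pull configuration from the satellite:
mysql> INSERT INTO proxysql_servers (hostname) VALUES ('10.0.0.126'), ('10.0.0.127'), ('10.0.0.128');
mysql> LOAD PROXYSQL SERVERS TO RUNTIME;
mysql> SAVE PROXYSQL SERVERS TO DISK;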
In addition to all of that, the ProxySQL cluster supports automatic rejoining of nodes: they will sync their configuration at startup. It can also be easily scaled out by adding more nodes.
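Scaling out the core cluster could look like the sketch below; the address of the fourth node is hypothetical. Since the proxysql_servers table is itself one of the synced modules, adding the new entry on one core node is enough for the existing members, while the new node additionally needs the cluster credentials and the full proxysql_servers list configured locally:
mysql> INSERT INTO proxysql_servers (hostname) VALUES ('10.0.0.129');
mysql> LOAD PROXYSQL SERVERS TO RUNTIME;
mysql> SAVE PROXYSQL SERVERS TO DISK;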
We hope this blog post gives you some insight into how a ProxySQL cluster can be configured. The ProxySQL cluster is transparent to ClusterControl and does not impact any of the operations you may want to execute from the ClusterControl UI.