Severalnines Blog
The automation and management blog for open source databases

Sharding MySQL with MySQL Fabric and ProxySQL

Posted in:

There are numerous ways to migrate into MySQL Fabric. Most of them require serious changes in your application - you have to switch to the MySQL Fabric connector to actually benefit from the MySQL Fabric infrastructure. You may also have to implement some kind of data-awareness in your application as it will have to pass a sharding key (some value from the column which is used to shard a given table) when connecting to MySQL Fabric. This process may be not possible to do in a single step, so we prepared a proof of concept to show you how to migrate without application changes.

The solution consists of several elements. At the backend, there’s MySQL Fabric with its sharding system of high availability groups and tools around it. Another layer is built using MySQL Router - it allows regular MySQL clients to connect to different high availability groups created in MySQL Fabric. It doesn’t solve the problem of routing queries to the correct shards though. This is where ProxySQL comes in to help - it allows us to perform a flawless failover (from the old database to the sharded setup), but also, it will route the queries to the correct shards exposed by the MySQL Router. Routing is based on ProxySQL’s ability to parse SQL and perform actions (including rerouting) on queries based on whether the query matches a regex or not.

Initial environment will look as described in the chart below. We have two shards configured using MySQL Fabric - this brings us to the total of three high availability groups (two shards and a global group). Each such group is built on top of two MySQL nodes in master - slave replication. Within those groups MySQL Fabric will manage high availability by monitoring the status of MySQL, and promoting a slave to master if required. Each high availability group is configured in MySQL Router, which exposes all of them on different ports. We will split the sharded tables in half, each shard storing half of the sharded data set. Last remaining part of the setup is routing.

ProxySQL gives us ability to change query routing based on regular expression matches. What we need to do is prepare a set of rules - one for every type of query which hits the sharded table. We need to prepare regular expressions which will distinguish which queries are to be routed to the first shard or the second shard. This process may be error prone, you need to test your rules before you actually apply them to production. Limitations may apply here too - your queries have to explicitly use a sharding key to identify rows - similar limitations apply to MySQL Fabric or any other sharding solution, so this is not a constraint induced by our setup. Additional query rules may have to be created for non-sharded, global tables.

Once all query rules are ready and tested, a switchover has to happen - traffic has to be moved from the old setup based on MySQL replication and routed to our sharded setup, based on the rules we created - this again can be done using ProxySQL, without any interruption to your application (as long as you don’t use long transactions).

When traffic hit the sharded setup, another phase begins - maintaining the sharded environment. MySQL Fabric gives us different tools for that - you can add hosts to high availability groups, you can promote hosts within a high availability group. You may also need to split shards. Some of those operations are transparent in our setup, some (like splitting or moving the shard) require detailed preparations as they cannot be executed easily using existing MySQL Fabric CLI commands.

More details about our setup and migration from MySQL replication into a MySQL Fabric sharded environment can be found in our ebook.

The ebook covers the entire process, and also includes an example script showing how you can extend MySQL Fabric and execute a shard split without affecting your traffic.

Database Sharding with MySQL Fabric

Why do we shard? How does sharding work? What are the different ways I can shard my database? This whitepaper goes through some of the theory behind sharding. It also discusses three different tools which are designed to help users shard their MySQL databases. And last but not least, it shows you how to set up a sharded MySQL setup based on MySQL Fabric and ProxySQL.

Download Here

Related Post

Video: Interview with Riaan Nolan on the Upcoming DevOps Webinar

We sat down with Riaan Nolan, DevOps Manager for Mukuru, to discuss the upcoming webinar “DevOps Tutorial: How to Automate Your Database Infrastructure”.

Posted in:

An Expert's Introduction to PostgreSQL, Database Management and ClusterControl

Basic concepts of PostgreSQL and how ClusterControl fills the gap. Besides ClusterControl a range of related tools for PostgreSQL usage are highlighted.

MySQL on Docker: Running Galera Cluster in Production with ClusterControl on Kubernetes

In our previous posts, we showed how one can run Galera Cluster on Kubernetes. This blog post covers how ClusterControl being part of the stack helps in running your Galera Cluster in production environment.

Webinar - DevOps Tutorial: How to Automate Your Database Infrastructure

This is a live “DevOps Tutorial: How to Automate Your Database Infrastructure” with guest speaker Riaan Nolan of, the first Puppet Labs Certified Professional in South Africa; Riaan saves the companies he works for 50% of their monthly IT bills on average. In this new webinar we will talk you through the facets of DevOps integrations and the mission-critical advantages that database automation can bring to your database infrastructure.