Hash Slot Resharding and Rebalancing for Redis Cluster

Ashraf Sharif

Redis Cluster partitions data automatically: every key maps to one of 16,384 hash slots, and each master node in the cluster serves a subset of those slots. Because a master must serve at least one slot, a cluster can have at most 16,384 master nodes, although the suggested cluster size is on the order of 1,000 nodes.

Automatic resharding is only supported on Redis Enterprise. This includes re-sharding, shard migration, and setting up triggers for auto-balancing without impacting your application. If you are running the Redis Community version, you need to perform these maintenance operations manually.

Generally, if you are not using hash tags (a way to force multiple keys into the same hash slot so that multi-key operations work), keys are spread roughly evenly across all hash slots on all nodes, because each key's slot is computed as the CRC16 of the key modulo 16384. However, as your data grows and the workload becomes more demanding, you will probably want to:

  • Add a new, empty node to the cluster so that some hash slots can be moved from the existing nodes to the new node.

  • Remove a node from the cluster so that the hash slots assigned to that node are moved to other existing nodes.

  • Rebalance the cluster so that a given set of hash slots is moved between nodes.

  • Upgrade to a new server by moving all the hash slots from the old node to the new, more powerful node.

If we skip rebalancing after any of these changes, we could end up with an unbalanced cluster or, worse, one full node and many empty nodes.
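
To make the key-to-slot mapping concrete, here is a minimal Python sketch of the calculation described above: CRC16 of the key, modulo 16384, with hash tag handling. It assumes the CRC16 variant is the XMODEM one (polynomial 0x1021, initial value 0); the function names crc16_xmodem and key_slot are illustrative and not part of any Redis client.

```python
HASH_SLOTS = 16384


def crc16_xmodem(data: bytes) -> int:
    """Bitwise CRC16 (polynomial 0x1021, initial value 0, no reflection)."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def key_slot(key: str) -> int:
    """Return the hash slot for a key, honouring a non-empty {hash tag}."""
    raw = key.encode()
    start = raw.find(b"{")
    if start != -1:
        end = raw.find(b"}", start + 1)
        # Only a non-empty tag between the first "{" and the next "}" counts.
        if end != -1 and end > start + 1:
            raw = raw[start + 1:end]
    return crc16_xmodem(raw) % HASH_SLOTS


if __name__ == "__main__":
    for key in ("foo", "{user1000}.following", "{user1000}.followers"):
        print(key, "->", key_slot(key))
```

The two {user1000} keys land in the same slot because only the text inside the braces is hashed, which is what makes multi-key operations on them possible.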

This blog post will look into how to perform hash slot maintenance operations on a Redis Cluster. 

Determining the Hash Slots

Consider a 3-node Redis Cluster (all masters), as illustrated below:

Ideally, a production setup would have at least one slave connected to each master for high availability and redundancy, but we are going to skip that for this blog post.

First of all, connect to the Redis Cluster and use the --cluster check command to list the allocated slots:

(redis1)$ redis-cli --cluster check 192.168.11.131:6379
192.168.11.131:6379 (6ac62aa8...) -> 12651 keys | 5461 slots | 0 slaves.
192.168.11.132:6379 (92385b2e...) -> 12658 keys | 5462 slots | 0 slaves.
192.168.11.133:6379 (9026f2af...) -> 12655 keys | 5461 slots | 0 slaves.
[OK] 37964 keys in 3 masters.
2.32 keys per slot on average.
>>> Performing Cluster Check (using node 192.168.11.133:6379)
M: 6ac62aa8dbb80f982ab1b0fa0623fc54d2bbd77b 192.168.11.131:6379
   slots:[0-5460] (5461 slots) master
M: 92385b2ea26a1a27c786cbea34a75f155f87f762 192.168.11.132:6379
   slots:[5461-10922] (5462 slots) master
M: 9026f2af5a683123abfdd7494da2c73a61803dd3 192.168.11.133:6379
   slots:[10923-16383] (5461 slots) master
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.

From the above output, we can summarize the hash slot distribution of the 3-node Redis Cluster as follows:

Host                      Hash Slots #     Total Slots
redis1, 192.168.11.131    0 - 5460         5,461
redis2, 192.168.11.132    5461 - 10922     5,462
redis3, 192.168.11.133    10923 - 16383    5,461
TOTAL                                      16,384

It looks like our hash slots are evenly distributed between our nodes, where each node holds 5461 or 5462 slots.
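
The summary above can also be derived programmatically. The following Python sketch parses the per-node summary lines of the check output shown earlier and totals the slots and keys. The line format is assumed from that output only (other redis-cli versions may word it differently), and parse_summary is a hypothetical helper name.

```python
import re

# Summary lines copied from the --cluster check output shown earlier.
CHECK_OUTPUT = """\
192.168.11.131:6379 (6ac62aa8...) -> 12651 keys | 5461 slots | 0 slaves.
192.168.11.132:6379 (92385b2e...) -> 12658 keys | 5462 slots | 0 slaves.
192.168.11.133:6379 (9026f2af...) -> 12655 keys | 5461 slots | 0 slaves.
"""

# Assumed line format: "<host:port> (<id prefix>...) -> N keys | N slots | N slaves."
SUMMARY = re.compile(
    r"^(\S+) \((\w+)\.\.\.\) -> (\d+) keys \| (\d+) slots \| (\d+) slaves\.$"
)


def parse_summary(text):
    """Return one dict per node summary line found in the check output."""
    rows = []
    for line in text.splitlines():
        match = SUMMARY.match(line.strip())
        if match:
            address, id_prefix, keys, slots, slaves = match.groups()
            rows.append({
                "address": address,
                "id_prefix": id_prefix,
                "keys": int(keys),
                "slots": int(slots),
                "slaves": int(slaves),
            })
    return rows


if __name__ == "__main__":
    rows = parse_summary(CHECK_OUTPUT)
    for row in rows:
        print(f"{row['address']:<22}{row['slots']:>6} slots{row['keys']:>8} keys")
    print("total slots:", sum(row["slots"] for row in rows))
```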

Hash Slots Rebalancing

Now, let's add 2 more Redis masters into the picture:

  • redis4, 192.168.11.134

  • redis5, 192.168.11.135

Once added, how would the hash slots be allocated? To distribute the hash slots evenly, divide the total number of slots by the number of Redis masters:

Slots per master = 16,384 / 5
                 = 3,276.8 (or 3,276 remainder 4)

Therefore, four of the five nodes should each hold one extra slot to absorb the remainder of 4, which gives:

Total slots, 16,384 = (3277 x 4) + 3276
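
The same arithmetic can be sketched in a few lines of Python. This only illustrates the division above; it is not how redis-cli assigns slots, and the helper names slots_per_master and contiguous_ranges are made up for this example.

```python
def slots_per_master(total_slots, masters):
    """Split total_slots as evenly as possible; the first nodes absorb the remainder."""
    base, extra = divmod(total_slots, masters)
    return [base + 1 if i < extra else base for i in range(masters)]


def contiguous_ranges(counts):
    """Turn per-node slot counts into contiguous (first, last) slot ranges."""
    ranges, start = [], 0
    for count in counts:
        ranges.append((start, start + count - 1))
        start += count
    return ranges


if __name__ == "__main__":
    counts = slots_per_master(16384, 5)
    print(counts)  # [3277, 3277, 3277, 3277, 3276]
    for first, last in contiguous_ranges(counts):
        print(f"{first} - {last}")
```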

Now, back to the distribution table: our estimated slot distribution after adding two new Redis masters should look like this:

Host                      Old Hash Slots #   Old Total Slots   New Hash Slots #   New Total Slots
redis1, 192.168.11.131    0 - 5460           5,461             0 - 3276           3,277
redis2, 192.168.11.132    5461 - 10922       5,462             3277 - 6553        3,277
redis3, 192.168.11.133    10923 - 16383      5,461             6554 - 9830        3,277
redis4, 192.168.11.134                                         9831 - 13107       3,277
redis5, 192.168.11.135                                         13108 - 16383      3,276
TOTAL                                        16,384                               16,384

The above calculation and estimation show our attempt to visualize the hash slots distribution before and after the scale-out operation. Luckily, we don't have to perform the calculation and manually rebalance every hash slot individually since Redis Cluster comes with the necessary tools and features to deal with this.

Let's put this into an actual real-life exercise. The following architecture illustrates what we are going to achieve:

To get started, we add the fourth and fifth nodes into the Redis cluster:

$ redis-cli --cluster add-node 192.168.11.134:6379 192.168.11.131:6379 # add redis4
$ redis-cli --cluster add-node 192.168.11.135:6379 192.168.11.131:6379 # add redis5

When looking at the cluster info, we should notice that the cluster_known_nodes is now 5 but the cluster_size is still 3:

(redis1)$ redis-cli cluster info
cluster_state:ok
cluster_slots_assigned:16384
cluster_slots_ok:16384
cluster_slots_pfail:0
cluster_slots_fail:0
cluster_known_nodes:5
cluster_size:3
cluster_current_epoch:4
cluster_my_epoch:1
cluster_stats_messages_ping_sent:227
cluster_stats_messages_pong_sent:238
cluster_stats_messages_sent:465
cluster_stats_messages_ping_received:234
cluster_stats_messages_pong_received:227
cluster_stats_messages_meet_received:4
cluster_stats_messages_received:465
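
The gap between cluster_known_nodes and cluster_size can be checked with a small script. This Python sketch parses the key:value lines of the output above. Treating the difference as "masters without slots" is an assumption that only holds here because the cluster has no slaves, and parse_cluster_info is a hypothetical helper name.

```python
def parse_cluster_info(text):
    """Parse 'key:value' lines from CLUSTER INFO into a dict (ints where possible)."""
    info = {}
    for line in text.splitlines():
        key, sep, value = line.strip().partition(":")
        if sep:
            info[key] = int(value) if value.isdigit() else value
    return info


# A few lines copied from the output above.
SAMPLE = """\
cluster_state:ok
cluster_slots_assigned:16384
cluster_known_nodes:5
cluster_size:3
"""

if __name__ == "__main__":
    info = parse_cluster_info(SAMPLE)
    # Assumption: no slaves, so every known node is a master.
    masters_without_slots = info["cluster_known_nodes"] - info["cluster_size"]
    print("masters without slots:", masters_without_slots)
```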

Now, check the cluster nodes to understand the slots distribution:

(redis1)$ redis-cli --cluster check 192.168.11.131:6379
192.168.11.131:6379 (6ac62aa8...) -> 12651 keys | 5461 slots | 0 slaves.
192.168.11.132:6379 (92385b2e...) -> 12658 keys | 5462 slots | 0 slaves.
192.168.11.133:6379 (9026f2af...) -> 12655 keys | 5461 slots | 0 slaves.
192.168.11.134:6379 (a987d7a9...) -> 0 keys | 0 slots | 0 slaves.
192.168.11.135:6379 (9644ed11...) -> 0 keys | 0 slots | 0 slaves.
[OK] 37964 keys in 3 masters.
2.32 keys per slot on average.
>>> Performing Cluster Check (using node 192.168.11.133:6379)
M: 6ac62aa8dbb80f982ab1b0fa0623fc54d2bbd77b 192.168.11.131:6379
   slots:[0-5460] (5461 slots) master
M: 92385b2ea26a1a27c786cbea34a75f155f87f762 192.168.11.132:6379
   slots:[5461-10922] (5462 slots) master
M: 9026f2af5a683123abfdd7494da2c73a61803dd3 192.168.11.133:6379
   slots:[10923-16383] (5461 slots) master
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.

Notice that redis4 (192.168.11.134) and redis5 (192.168.11.135) have no hash slots assigned to them. This means these two new nodes are not serving any data and are not yet a working part of the cluster. We need to rebalance the slots across all 5 nodes first.

Take note that by default, the rebalance option only rebalances slots among the masters that already hold slots and ignores empty masters such as the two we just added. For example, when you run the command without any extra flag, as below, you get the "No rebalancing needed" response:

(redis1)$ redis-cli --cluster rebalance 192.168.11.135:6379
>>> Performing Cluster Check (using node 127.0.0.1:6379)
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
*** No rebalancing needed! All nodes are within the 2.00% threshold.
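
The following Python sketch is a simplified model of why the command reports nothing to do. It compares each node's slot count with an even share and flags nodes that deviate by more than the 2% threshold shown in the output. This is an approximation written for illustration, not redis-cli's actual code, and over_threshold and its parameters are hypothetical names.

```python
def over_threshold(slot_counts, threshold_pct=2.0, use_empty_masters=False):
    """Return the nodes whose slot count deviates from the even share by more than threshold_pct.

    Simplified model: empty masters are left out unless use_empty_masters is True.
    """
    nodes = {
        addr: count
        for addr, count in slot_counts.items()
        if use_empty_masters or count > 0
    }
    if not nodes:
        return []
    expected = sum(slot_counts.values()) / len(nodes)
    flagged = []
    for addr, count in nodes.items():
        if count == 0:
            flagged.append(addr)  # an empty master always misses its share
        elif abs(100 - 100 * expected / count) > threshold_pct:
            flagged.append(addr)
    return flagged


# Slot counts taken from the check output above.
COUNTS = {
    "192.168.11.131": 5461,
    "192.168.11.132": 5462,
    "192.168.11.133": 5461,
    "192.168.11.134": 0,
    "192.168.11.135": 0,
}

if __name__ == "__main__":
    print(over_threshold(COUNTS))
    print(over_threshold(COUNTS, use_empty_masters=True))
```

With the empty masters ignored, the three original nodes are all within a fraction of a percent of the even share, so nothing is flagged. Once the empty masters are counted, every node is far from its share of roughly 3,277 slots.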

To force Redis to rebalance the hash slots on the new empty masters (redis4 and redis5), we have to use the --cluster-use-empty-masters flag:

(redis1)$ redis-cli --cluster rebalance 192.168.11.135:6379 --cluster-use-empty-masters
>>> Performing Cluster Check (using node 192.168.11.135:6379)
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
>>> Rebalancing across 5 nodes. Total weight = 5.00
Moving 2186 slots from 192.168.11.132:6379 to 192.168.11.135:6379
#########################################################################################
Moving 1092 slots from 192.168.11.131:6379 to 192.168.11.135:6379
#########################################################################################
Moving 1093 slots from 192.168.11.131:6379 to 192.168.11.134:6379
#########################################################################################
Moving 2185 slots from 192.168.11.133:6379 to 192.168.11.134:6379
#########################################################################################

At this point, the hash slots are now balanced out across 5 nodes. We can further see the cluster slots on one of the servers by using the check option:

(redis5)$ redis-cli --cluster check 192.168.11.131:6379
192.168.11.131:6379 (6ac62aa8...) -> 7586 keys | 3276 slots | 0 slaves.
192.168.11.132:6379 (92385b2e...) -> 7617 keys | 3276 slots | 0 slaves.
192.168.11.133:6379 (9026f2af...) -> 7602 keys | 3276 slots | 0 slaves.
192.168.11.134:6379 (a987d7a9...) -> 7580 keys | 3278 slots | 0 slaves.
192.168.11.135:6379 (9644ed11...) -> 7579 keys | 3278 slots | 0 slaves.
[OK] 37964 keys in 5 masters.
2.32 keys per slot on average.
>>> Performing Cluster Check (using node 192.168.11.131:6379)
M: 6ac62aa8dbb80f982ab1b0fa0623fc54d2bbd77b 192.168.11.131:6379
   slots:[2185-5460] (3276 slots) master
M: 92385b2ea26a1a27c786cbea34a75f155f87f762 192.168.11.132:6379
   slots:[7647-10922] (3276 slots) master
M: 9026f2af5a683123abfdd7494da2c73a61803dd3 192.168.11.133:6379
   slots:[13108-16383] (3276 slots) master
M: a987d7a94052c6e9426c96ebcd896c2003923eb4 192.168.11.134:6379
   slots:[1092-2184],[10923-13107] (3278 slots) master
M: 9644ed114a59dfb8dc95c5d87d9d4d10ca6c4f1b 192.168.11.135:6379
   slots:[0-1091],[5461-7646] (3278 slots) master
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.

From the above output, we can tell that we now have a total of 5 Redis nodes, each holding around 3276 slots and roughly 7,500 keys. It also reports that we have 0 slaves connected to every master (not recommended for production). Further down, it reports an average of 2.32 keys per slot and a summary of the slot configuration, where all nodes agree about the slot configuration and all 16,384 slots are covered.
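
As a sanity check, the "Moving N slots" lines from the rebalance output can be replayed against the starting slot counts to confirm they produce the totals reported by the check command. This is a minimal Python sketch with the numbers copied from the outputs above; apply_moves is a hypothetical helper.

```python
def apply_moves(counts, moves):
    """Replay (slots, source, target) moves on a copy of the per-node slot counts."""
    result = dict(counts)
    for slots, source, target in moves:
        if result[source] < slots:
            raise ValueError(f"{source} does not hold {slots} slots")
        result[source] -= slots
        result[target] += slots
    return result


# Slot counts before the rebalance.
BEFORE = {
    "192.168.11.131": 5461,
    "192.168.11.132": 5462,
    "192.168.11.133": 5461,
    "192.168.11.134": 0,
    "192.168.11.135": 0,
}

# "Moving N slots from A to B" lines from the rebalance output.
MOVES = [
    (2186, "192.168.11.132", "192.168.11.135"),
    (1092, "192.168.11.131", "192.168.11.135"),
    (1093, "192.168.11.131", "192.168.11.134"),
    (2185, "192.168.11.133", "192.168.11.134"),
]

if __name__ == "__main__":
    for address, slots in apply_moves(BEFORE, MOVES).items():
        print(address, slots)
```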

At this point, our hash slots balancing exercise is complete.

Hash Slots Resharding and Rebalancing

Now, let's say we want to remove redis1 and redis2 from the picture and leave only redis3, redis4 and redis5 (back to the three-master cluster topology), as illustrated in the following diagram:

We can simply reshard redis1 and redis2 as follows:

To reshard (migrate out) all slots from redis1 to redis3:

(redis1)$ redis-cli --cluster reshard 192.168.11.131:6379 \
--cluster-from 6ac62aa8dbb80f982ab1b0fa0623fc54d2bbd77b \
--cluster-to  9026f2af5a683123abfdd7494da2c73a61803dd3 \
--cluster-slots 3276 \
--cluster-yes

Here, 6ac62aa8... is the node ID of redis1 and 9026f2af... is the node ID of redis3. The --cluster-slots value is the number of slots to move; in this case, it is every slot that redis1 holds, and all of them go to redis3. Use the --cluster check option to find out how many slots a particular node holds.

To reshard (migrate out) all slots from redis2 to redis4:

(redis1)$ redis-cli --cluster reshard 192.168.11.132:6379 \
--cluster-from 92385b2ea26a1a27c786cbea34a75f155f87f762 \
--cluster-to a987d7a94052c6e9426c96ebcd896c2003923eb4 \
--cluster-slots 3276 \
--cluster-yes

Here, 92385b2e... is the node ID of redis2 and a987d7a9... is the node ID of redis4. As before, the --cluster-slots value covers every slot that redis2 holds, and all of them go to redis4.

Now, when we check the current slot distribution, redis1 and redis2 should report 0 keys and 0 slots:
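
The two reshard invocations differ only in the entry point, the node IDs, and the slot count, so they are easy to generate from a script. The Python sketch below builds the argument list using only the flags shown above. It does not run anything, and reshard_argv is a hypothetical helper name.

```python
import shlex


def reshard_argv(entry_point, source_id, target_id, slots):
    """Build the redis-cli argument list for a non-interactive reshard."""
    return [
        "redis-cli", "--cluster", "reshard", entry_point,
        "--cluster-from", source_id,
        "--cluster-to", target_id,
        "--cluster-slots", str(slots),
        "--cluster-yes",
    ]


if __name__ == "__main__":
    # Move all 3276 slots from redis1 to redis3, as in the first command above.
    argv = reshard_argv(
        "192.168.11.131:6379",
        "6ac62aa8dbb80f982ab1b0fa0623fc54d2bbd77b",
        "9026f2af5a683123abfdd7494da2c73a61803dd3",
        3276,
    )
    # Print the command for review; pass argv to subprocess.run() to execute it.
    print(shlex.join(argv))
```

Keeping the command as a list rather than a single string avoids quoting problems if it is later handed to subprocess.run.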

(redis1)$ redis-cli --cluster check 192.168.11.131:6379
192.168.11.131:6379 (6ac62aa8...) -> 0 keys | 0 slots | 0 slaves.
192.168.11.132:6379 (92385b2e...) -> 0 keys | 0 slots | 0 slaves.
192.168.11.133:6379 (9026f2af...) -> 15188 keys | 6552 slots | 0 slaves.
192.168.11.134:6379 (a987d7a9...) -> 15197 keys | 6554 slots | 0 slaves.
192.168.11.135:6379 (9644ed11...) -> 7579 keys | 3278 slots | 0 slaves.

You should also notice that the hash slots are no longer evenly distributed: redis3 and redis4 each hold roughly twice as many hash slots as redis5. Therefore, it is a good idea to rebalance the hash slots once more between redis3, redis4 and redis5:

(redis3)$ redis-cli --cluster rebalance 192.168.11.135:6379
>>> Performing Cluster Check (using node 192.168.11.135:6379)
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
>>> Rebalancing across 3 nodes. Total weight = 3.00
Moving 1093 slots from 192.168.11.134:6379 to 192.168.11.135:6379
###########################################################################
Moving 1091 slots from 192.168.11.133:6379 to 192.168.11.135:6379
###########################################################################

The hash slots are now balanced out evenly across the 3 active Redis nodes that we wanted:

(redis3)$ redis-cli --cluster check 192.168.11.131:6379
192.168.11.131:6379 (6ac62aa8...) -> 0 keys | 0 slots | 0 slaves.
192.168.11.132:6379 (92385b2e...) -> 0 keys | 0 slots | 0 slaves.
192.168.11.133:6379 (9026f2af...) -> 12651 keys | 5461 slots | 0 slaves.
192.168.11.134:6379 (a987d7a9...) -> 12658 keys | 5461 slots | 0 slaves.
192.168.11.135:6379 (9644ed11...) -> 12655 keys | 5462 slots | 0 slaves.

We can now remove redis1 (192.168.11.131) and redis2 (192.168.11.132) from the Redis Cluster since they no longer hold any slots:

Removing redis1 from the cluster:

(redis3)$ redis-cli --cluster del-node 192.168.11.131:6379 6ac62aa8dbb80f982ab1b0fa0623fc54d2bbd77b
>>> Removing node 6ac62aa8dbb80f982ab1b0fa0623fc54d2bbd77b from cluster 192.168.11.131:6379
>>> Sending CLUSTER FORGET messages to the cluster...
>>> Sending CLUSTER RESET SOFT to the deleted node.

Removing redis2 from the cluster:

(redis3)$ redis-cli --cluster del-node 192.168.11.132:6379 92385b2ea26a1a27c786cbea34a75f155f87f762
>>> Removing node 92385b2ea26a1a27c786cbea34a75f155f87f762 from cluster 192.168.11.132:6379
>>> Sending CLUSTER FORGET messages to the cluster...
>>> Sending CLUSTER RESET SOFT to the deleted node.

Verify our final Redis node topology and the distribution of the hash slots:

(redis3)$ redis-cli --cluster check 192.168.11.133:6379
192.168.11.133:6379 (9026f2af...) -> 12651 keys | 5461 slots | 0 slaves.
192.168.11.135:6379 (9644ed11...) -> 12655 keys | 5462 slots | 0 slaves.
192.168.11.134:6379 (a987d7a9...) -> 12658 keys | 5461 slots | 0 slaves.

At this point, we can safely decommission the removed nodes, redis1 and redis2. All of the above operations can be performed live, without interrupting the Redis service. However, the application must use a cluster-aware Redis client (driver or connector), so that requests are redirected to the right shard while rebalancing or resharding is in progress.
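
To illustrate what "redirected" means for a client: when a node no longer serves a slot, Redis Cluster answers with a MOVED or ASK error that names the slot and the node to contact instead. A cluster-aware client parses that reply and retries against the named node. The Python sketch below shows only the parsing step; the sample reply is invented for illustration, and Redirect and parse_redirect are hypothetical names rather than part of any client library.

```python
from typing import NamedTuple, Optional


class Redirect(NamedTuple):
    kind: str   # "MOVED" (permanent) or "ASK" (slot is being migrated)
    slot: int
    host: str
    port: int


def parse_redirect(error: str) -> Optional[Redirect]:
    """Parse 'MOVED <slot> <host>:<port>' or 'ASK <slot> <host>:<port>'."""
    parts = error.split()
    if len(parts) != 3 or parts[0] not in ("MOVED", "ASK"):
        return None
    host, sep, port = parts[2].rpartition(":")
    if not sep or not parts[1].isdigit() or not port.isdigit():
        return None
    return Redirect(parts[0], int(parts[1]), host, int(port))


if __name__ == "__main__":
    # Invented sample: slot 5461 now lives on redis5.
    print(parse_redirect("MOVED 5461 192.168.11.135:6379"))
    print(parse_redirect("ERR unknown command"))
```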

Final Thoughts

Redis Cluster comes with live cluster resharding and rebalancing capabilities out-of-the-box. These operations can be performed online without interruption to the Redis service as a whole, allowing the cluster to scale transparently to the applications.
