A Guide to Configuring a Load Balancer in a MongoDB Sharded Cluster

Akash Kathiriya

For any database, balancing the load of incoming client requests is a fundamental mechanism for ensuring scalability. A proper load balancing solution spreads client requests evenly across all of the database resources. Without one, your database cluster won’t be able to handle an increased traffic load.

Fortunately, MongoDB provides built-in support for handling high traffic through horizontal scaling with sharding. Sharding lets you distribute the data of your collections across multiple servers, and you can add new servers to your cluster as the traffic on the database grows. A separate guide covers converting a MongoDB replica set into a sharded cluster.

In this article, we will look at the behavior of the balancer process that runs in MongoDB sharded clusters and how to modify it. The balancer takes care of distributing the chunks of your sharded collections evenly across the shards. For example, if one shard contains too many chunks of a sharded collection, that shard can receive more traffic than the others, so the balancer migrates chunks until they are spread evenly. In most MongoDB deployments, the default balancer configuration is sufficient for normal operations. In some situations, however, database administrators might want to alter the default behavior to meet application-level needs or operational requirements, and this guide shows how.

Let’s start with some basic commands to get some information about the balancer process state and status.

Balancer State Status

This command checks whether the balancer is enabled, that is, permitted to run. It returns false if the balancer is disabled. It does not check whether the balancer process is currently running.

sh.getBalancerState()

Enable the Balancer Process

If balancing has been disabled for a sharded collection, you can re-enable it by running the following command with the collection’s full namespace. This command will not start the balancer process; it only ensures that chunk balancing for that collection won’t be blocked the next time the balancer runs. To enable the balancer itself for the whole cluster, use sh.startBalancer().

sh.enableBalancing("<db_name>.<collection_name>")

Disable the Balancer Process

By default, the balancer process can run at any time. If you want to disable it for a specific period, use the following command. A typical scenario is when you are taking a backup of your database.

sh.stopBalancer()

Make sure that the balancer process is stopped before taking the backup. If the balancer is enabled while the backup is running, you may end up with an inconsistent copy of your database. This can happen when the balancer moves chunks across the shards during the backup process.
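The stop, back up, restart sequence described above can be sketched as a small wrapper. This is only an illustration: withBalancerStopped and runBackup are hypothetical names, and the wrapper assumes it is handed an object exposing getBalancerState(), stopBalancer() and startBalancer(), as the shell's global sh object does. A stand-in object is used here so the sketch runs without a cluster.

```javascript
// Hypothetical wrapper (not a MongoDB API): pause the balancer around a task.
// `shell` must expose getBalancerState(), stopBalancer() and startBalancer(),
// as the shell's global `sh` object does.
function withBalancerStopped(shell, task) {
  const wasEnabled = Boolean(shell.getBalancerState());
  if (wasEnabled) {
    shell.stopBalancer();
  }
  try {
    return task();
  } finally {
    // Re-enable only if it was enabled before, even when the task throws.
    if (wasEnabled) {
      shell.startBalancer();
    }
  }
}

// Stand-in for `sh` that records calls, so the sketch runs without a cluster.
const balancerCalls = [];
const recordingShell = {
  getBalancerState: () => true,
  stopBalancer: () => balancerCalls.push("stopBalancer"),
  startBalancer: () => balancerCalls.push("startBalancer"),
};

// runBackup is a placeholder for whatever backup step you use.
const runBackup = () => balancerCalls.push("backup");
withBalancerStopped(recordingShell, runBackup);
console.log(balancerCalls.join(" -> ")); // stopBalancer -> backup -> startBalancer
```

Restarting in a finally block is a design choice: it keeps a failed backup from leaving the cluster with balancing switched off.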

You can also disable balancing for a specific collection by passing the collection’s full namespace to the following command.

sh.disableBalancing("<db_name>.<collection_name>")
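As a rough sketch of how the per-collection commands fit together, the helper below builds the full namespace from a database name and a collection name and then calls the matching shell helper. setCollectionBalancing is a hypothetical name, the "shop" and "orders" names are made up for illustration, and the shell object is a stand-in for the global sh.

```javascript
// Hypothetical helper (not a MongoDB API): toggle balancing for one collection.
// `shell` must expose enableBalancing(ns) and disableBalancing(ns), like `sh`.
function setCollectionBalancing(shell, dbName, collName, enabled) {
  // Database names cannot contain dots; collection names may.
  if (!dbName || !collName || dbName.includes(".")) {
    throw new Error("expected a database name without dots and a collection name");
  }
  const namespace = `${dbName}.${collName}`;
  if (enabled) {
    shell.enableBalancing(namespace);
  } else {
    shell.disableBalancing(namespace);
  }
  return namespace;
}

// Stand-in for `sh` that records which helper was called with which namespace.
const namespaceCalls = [];
const namespaceShell = {
  enableBalancing: (ns) => namespaceCalls.push(`enable ${ns}`),
  disableBalancing: (ns) => namespaceCalls.push(`disable ${ns}`),
};

// "shop" and "orders" are illustrative names only.
setCollectionBalancing(namespaceShell, "shop", "orders", false);
setCollectionBalancing(namespaceShell, "shop", "orders", true);
console.log(namespaceCalls);
```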

Balancer Running Status

This command checks whether the balancer process is currently running, that is, whether it is actively migrating chunks. It returns true if the process is running and false otherwise.

sh.isBalancerRunning()
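To make the difference between "enabled" and "running" concrete, here is a minimal sketch that reports both values together. describeBalancer is a hypothetical helper, and the shell object is a stand-in for the global sh. Newer shells may return a status document from sh.isBalancerRunning() instead of a boolean, so the sketch accepts both shapes; the inBalancerRound field name is an assumption about that document.

```javascript
// Hypothetical helper (not a MongoDB API): summarize the balancer using the
// two shell helpers discussed in this article.
function describeBalancer(shell) {
  const enabled = Boolean(shell.getBalancerState());
  const status = shell.isBalancerRunning();
  // Assumption: older shells return a boolean, newer ones return a document
  // with an inBalancerRound field.
  const running = typeof status === "boolean"
    ? status
    : Boolean(status && status.inBalancerRound);
  return { enabled, running };
}

// Stand-in for `sh`: the balancer is enabled but idle.
const idleShell = {
  getBalancerState: () => true,
  isBalancerRunning: () => ({ mode: "full", inBalancerRound: false }),
};
console.log(describeBalancer(idleShell)); // { enabled: true, running: false }
```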

Default Chunk Size Configurations

By default, the chunk size in a MongoDB sharded cluster is 64 MB (128 MB starting with MongoDB 6.0). For most scenarios, this is good enough for migrating or splitting chunks. Sometimes, however, the normal migration process involves more I/O operations than your hardware can handle. In such situations, you may want to reduce the chunk size. You can do so by running the following set of commands.

use config

db.settings.save( { _id:"chunksize", value: <sizeInMB> } )

If you change the default chunk size in the sharded cluster, keep the following things in mind:

  • You can only specify a chunk size between 1 and 1024 MB.
  • Automatic splitting only happens on insert or update.
  • Smaller chunk sizes increase the time spent on the splitting process.
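The size limit above can be checked before anything is written to the config database. The sketch below is a hypothetical helper, buildChunkSizeSetting, that validates the value and returns the settings document used in the command earlier in this section. Requiring a whole number of megabytes is an assumption made here for simplicity.

```javascript
// Hypothetical helper: validate a chunk size and build the settings document.
function buildChunkSizeSetting(sizeInMB) {
  // Assumption: only whole numbers of megabytes are accepted here.
  if (!Number.isInteger(sizeInMB) || sizeInMB < 1 || sizeInMB > 1024) {
    throw new RangeError("chunk size must be a whole number of MB between 1 and 1024");
  }
  return { _id: "chunksize", value: sizeInMB };
}

console.log(buildChunkSizeSetting(32)); // { _id: 'chunksize', value: 32 }
```

In the shell, the returned document could then be passed to the settings command shown above.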

Schedule Balancing for a Specific Time

When your database is huge, balancing and migration can affect its overall performance. Therefore, it is wise to schedule balancing for a specific time window when the load on the database is low. You can use the following commands to set the time window in which the balancer process may run.

use config

db.settings.update({ _id : "balancer" }, { $set : { activeWindow : { start : "<start-time>", stop : "<stop-time>" } } }, true )

Example

The following command sets the time window for the balancing process to run from 1:00 AM to 5:00 AM.

db.settings.update({ _id : "balancer" }, { $set : { activeWindow : { start : "01:00", stop : "05:00" } } }, true )

Make sure that the given time window is long enough for a complete balancing process.
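To sanity-check a window before applying it, a small sketch like the following can validate the "HH:MM" strings and test whether a given time falls inside the window. All function names here are hypothetical. The handling of windows that wrap past midnight (for example 23:00 to 06:00) and the choice to treat the stop time as exclusive are assumptions of this sketch, not documented balancer behavior.

```javascript
// 24-hour "HH:MM" strings, as used for activeWindow in this article.
const HHMM = /^([01]\d|2[0-3]):([0-5]\d)$/;

// Convert "HH:MM" to minutes since midnight, rejecting malformed input.
function toMinutes(hhmm) {
  const match = HHMM.exec(hhmm);
  if (!match) {
    throw new Error(`expected HH:MM in 24-hour format, got "${hhmm}"`);
  }
  return Number(match[1]) * 60 + Number(match[2]);
}

// Build the update document for the balancer settings after validating both times.
function buildActiveWindowUpdate(start, stop) {
  toMinutes(start);
  toMinutes(stop);
  return { $set: { activeWindow: { start, stop } } };
}

// Assumption: start is inclusive, stop is exclusive, and a start later than
// the stop means the window wraps past midnight.
function isInsideWindow(now, start, stop) {
  const n = toMinutes(now);
  const s = toMinutes(start);
  const e = toMinutes(stop);
  return s <= e ? n >= s && n < e : n >= s || n < e;
}

console.log(JSON.stringify(buildActiveWindowUpdate("01:00", "05:00")));
console.log(isInsideWindow("02:30", "01:00", "05:00")); // true
console.log(isInsideWindow("12:00", "23:00", "06:00")); // false
```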

You can also remove an existing balancing time window by running the following command.

db.settings.update({ _id : "balancer" }, { $unset : { activeWindow : true } })

Apart from the above commands, you can also change the replication behavior during chunk migration by using the _secondaryThrottle parameter. In addition, you can use the _waitForDelete property with the moveChunk command to tell the balancing process to wait for the current migration’s delete phase to finish before starting the next chunk migration.
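As a hedged illustration, both settings can also be expressed as fields on the balancer document in the config database's settings collection, in the same way as the activeWindow example. The sketch below only builds the filter, update and options objects; buildMigrationSettingsUpdate is a hypothetical helper, and storing these fields on the balancer document is an assumption you should check against the documentation for your MongoDB version.

```javascript
// Hypothetical helper: build the pieces of an update against the balancer
// settings document. It does not talk to a database.
function buildMigrationSettingsUpdate(options) {
  const fields = {};
  if (options.secondaryThrottle !== undefined) {
    // Either a boolean or a write concern document such as { w: "majority" }.
    fields._secondaryThrottle = options.secondaryThrottle;
  }
  if (options.waitForDelete !== undefined) {
    fields._waitForDelete = Boolean(options.waitForDelete);
  }
  if (Object.keys(fields).length === 0) {
    throw new Error("specify secondaryThrottle, waitForDelete, or both");
  }
  return {
    filter: { _id: "balancer" },
    update: { $set: fields },
    options: { upsert: true },
  };
}

const migrationSpec = buildMigrationSettingsUpdate({
  secondaryThrottle: { w: "majority" },
  waitForDelete: true,
});
console.log(JSON.stringify(migrationSpec, null, 2));
```

In a shell session, the three parts could be passed to an update on the settings collection after switching to the config database.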

Conclusion

Hopefully, this is all you need to change the default behavior of the MongoDB balancer process. Balancing is a very important aspect of any MongoDB sharded cluster, and once you understand the balancing process in detail, it becomes easy to adapt the balancer’s behavior to your needs and use cases.
