A Development & Operations Checklist for MongoDB

Onyancha Brian Henry

Published: October 2, 2020
Last Updated: June 8, 2022

MongoDB operation and development checklists are meant to help database administrators avoid encountering issues in the MongoDB production environment. A Development Checklist should address issues such as…

Schema design
Data durability
Replication
Drivers
Sharding

An Operation Checklist, on the other hand, addresses…

Replication
Filesystem
Sharding
Hardware
Journaling (WiredTiger Storage Engine)
Operating system configurations
Deployment to cloud hardware
Monitoring
Backups and load balancing

Before starting a project, it is advisable to work through operations and development checklists so that MongoDB runs smoothly in production. This article walks through both checklists to complete before deploying MongoDB.

MongoDB Operations Checklist

Replication

All non-hidden replica set members should be provisioned identically in terms of disk, RAM, network setup, and CPU.

The oplog size should be configured to meet operational needs:

To avoid the need for a full resync, the replication oplog window should cover normal maintenance and downtime windows.
The replication oplog window should always cover the time needed to restore a replica set member.
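As a rough illustration of the sizing rule above, the following Python sketch compares an oplog window against a planned maintenance window. The function names and the 1.5 safety factor are assumptions made for this example; the two timestamps would be the times of the first and last oplog entries (for example as reported by rs.printReplicationInfo()).

```python
from datetime import datetime, timedelta


def oplog_window(first_entry: datetime, last_entry: datetime) -> timedelta:
    """Return the time span currently covered by the oplog."""
    return last_entry - first_entry


def covers_downtime(window: timedelta, planned_downtime: timedelta,
                    safety_factor: float = 1.5) -> bool:
    """True if the oplog window exceeds the planned downtime with headroom.

    safety_factor is a hypothetical margin chosen for this sketch.
    """
    return window >= planned_downtime * safety_factor


# Example: the oplog spans 30 hours and maintenance takes 8 hours.
window = oplog_window(datetime(2022, 6, 1, 0, 0), datetime(2022, 6, 2, 6, 0))
print(window)                                        # 1 day, 6:00:00
print(covers_downtime(window, timedelta(hours=8)))   # True
```

If the check fails, the usual remedy is a larger oplog or a shorter maintenance window.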

A production replica set should include at least three data-bearing nodes that run with journaling enabled. In addition, writes should be issued with the w: “majority” write concern to ensure data availability and durability.

The deployment should contain an odd number of voting members so that an election can reach a majority whenever the primary node in the cluster fails.

Use logical DNS hostnames rather than IP addresses, so that a changed IP address does not force a configuration change.

Make sure that mongod instances have 0 or 1 votes.

All mongod instances should have full bidirectional network connectivity so that every node can communicate with every other node.
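Several of the replication items above can be checked mechanically. The sketch below validates a replica set configuration document of the shape returned by rs.conf() (a members list whose entries carry host, votes, and arbiterOnly). The function names and message strings are made up for this example, and the check covers only the rules listed in this section.

```python
import ipaddress


def _is_ip_literal(host: str) -> bool:
    """True if the host part of 'host:port' is an IP address, not a DNS name."""
    name = host.rsplit(":", 1)[0].strip("[]")
    try:
        ipaddress.ip_address(name)
        return True
    except ValueError:
        return False


def check_replica_set(config: dict) -> list:
    """Return a list of human-readable problems found in the configuration."""
    problems = []
    members = config["members"]
    # votes defaults to 1 and arbiterOnly to False when omitted.
    voting = [m for m in members if m.get("votes", 1) > 0]
    data_bearing = [m for m in members if not m.get("arbiterOnly", False)]
    if any(m.get("votes", 1) not in (0, 1) for m in members):
        problems.append("each member must have 0 or 1 votes")
    if len(voting) % 2 == 0:
        problems.append("voting member count should be odd")
    if len(data_bearing) < 3:
        problems.append("fewer than three data-bearing members")
    for m in members:
        if _is_ip_literal(m["host"]):
            problems.append(f"{m['host']}: use a DNS hostname instead of an IP")
    return problems


config = {"members": [
    {"host": "db1.example.net:27017"},
    {"host": "db2.example.net:27017"},
    {"host": "10.0.0.5:27017"},
]}
print(check_replica_set(config))
```

For this sample configuration the only finding is the member addressed by IP.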

Journaling

Journaling is a write-ahead logging strategy that uses on-disk journal files to ensure data durability in the event of a failure. All instances should have journaling enabled for this reason. For write-intensive workloads, place the journal on its own low-latency disk.

However, keep in mind that a separate journal disk affects snapshot-style backups, because the files that make up the state of the database will then reside on separate volumes.

File System

Do not use Network File System (NFS) drives for dbPath. NFS drives can result in degraded and unstable performance. VMware users should use VMware virtual drives rather than NFS.

Ensure that your disk partitions are aligned with your RAID configuration.

For Linux/Unix users, use of XFS is recommended. XFS is known to perform better with MongoDB.

For Windows users, the NTFS file system is recommended. Avoid using any FAT file system.

Deployment to Cloud Hardware

Windows Azure: Adjust the TCP keepalive (tcp_keepalive_time) to 100-120. The TCP idle timeout on the Azure load balancer is too slow for MongoDB’s connection pooling behavior.

Use MongoDB 2.6.4 or newer on systems with high-latency storage, such as Windows Azure, as these versions include performance improvements for those systems.

Sharding

Place your config servers on dedicated hardware for optimal performance in large clusters.

Ensure that the hardware has sufficient RAM to hold the data files entirely in memory and has dedicated storage.

Deploy mongos routers in accordance with the Production Configuration guidelines.

Synchronize the clocks on all components of your sharded cluster by using NTP.

Ensure full bidirectional network connectivity between mongos, mongod, and config servers.

Use CNAMEs to identify your config servers to the cluster so that you can rename and renumber your config servers without downtime.

Monitoring

You can use tools like MongoDB Cloud Manager, ClusterControl, or another monitoring system to monitor key database metrics and set up alerts. Include alerts for the following metrics:

Queues
Replication oplog window
Assertions
Page faults
Replication lag

Monitor hardware metrics for your servers. Pay particular attention to available disk space, disk use, and CPU.
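To make the replication lag alert concrete, here is a minimal sketch that derives lag from a status document of the shape returned by rs.status(), using each member's name, stateStr, and optimeDate fields. The function names and the 10-second threshold are assumptions for illustration, not recommended values.

```python
from datetime import datetime


def replication_lag_seconds(status: dict) -> dict:
    """Map each secondary's name to its lag behind the primary, in seconds."""
    members = status["members"]
    primary = next(m for m in members if m["stateStr"] == "PRIMARY")
    return {
        m["name"]: (primary["optimeDate"] - m["optimeDate"]).total_seconds()
        for m in members
        if m["stateStr"] == "SECONDARY"
    }


def lagging_members(status: dict, threshold_seconds: float = 10.0) -> list:
    """Names of secondaries whose lag exceeds the (hypothetical) threshold."""
    lags = replication_lag_seconds(status)
    return sorted(name for name, lag in lags.items() if lag > threshold_seconds)


status = {"members": [
    {"name": "db1.example.net:27017", "stateStr": "PRIMARY",
     "optimeDate": datetime(2022, 6, 8, 12, 0, 30)},
    {"name": "db2.example.net:27017", "stateStr": "SECONDARY",
     "optimeDate": datetime(2022, 6, 8, 12, 0, 29)},
    {"name": "db3.example.net:27017", "stateStr": "SECONDARY",
     "optimeDate": datetime(2022, 6, 8, 12, 0, 0)},
]}
print(lagging_members(status))  # ['db3.example.net:27017']
```

A monitoring tool would run the same comparison on a schedule and raise an alert when the list is not empty.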

Hardware

Use RAID10 and SSD drives for optimal performance.

SAN and Virtualization:

Ensure that each mongod instance has provisioned IOPS for its dbPath, or has its own physical drive or LUN.

Avoid dynamic memory features, such as memory ballooning, when running in virtual environments.

Avoid placing all replica set members on the same SAN, as the SAN can be a single point of failure.

Load Balancing

Configure load balancers to enable “sticky sessions” or “client affinity” with an adequate timeout for existing connections.

Avoid placing load balancers between MongoDB cluster or replica set components.

Backups

Schedule periodic tests of your backup and restore process so that you have time estimates on hand and can confirm that it works.

Operating System Configuration

Windows

Consider disabling NTFS “last access time” updates.

Format NTFS disks using the default Allocation unit size of 4096 bytes.

Linux

Turn off transparent huge pages.
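One way to verify this setting, sketched here under the assumption of a typical Linux system, is to read /sys/kernel/mm/transparent_hugepage/enabled, where the active mode is the one shown in square brackets. The helper names below are made up for this example, and the parser works on the file's text so it can be tried without touching the system.

```python
def active_thp_mode(text: str) -> str:
    """Extract the bracketed (active) mode, e.g. 'never' from 'always madvise [never]'."""
    for token in text.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    raise ValueError(f"no active mode found in {text!r}")


def thp_is_disabled(text: str) -> bool:
    """True when transparent huge pages are switched off."""
    return active_thp_mode(text) == "never"


# On a real host, pass in the contents of
# /sys/kernel/mm/transparent_hugepage/enabled instead of these literals.
print(thp_is_disabled("always madvise [never]"))   # True
print(thp_is_disabled("[always] madvise never"))   # False
```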

Adjust the readahead settings on the devices that store your database files. For the WiredTiger storage engine, set readahead between 8 and 32.

If you use tuned on RHEL / CentOS, you must customize your tuned profile. Many of the tuned profiles that ship with RHEL / CentOS can hurt performance with their default settings. Customize your chosen tuned profile to:

Disable transparent hugepages.

Set readahead between 8 and 32 regardless of storage media type.

Use the noop or deadline disk schedulers for SSD drives.

Use the noop disk scheduler for virtualized drives in guest VMs.

Disable NUMA or set vm.zone_reclaim_mode to 0 and run mongod instances with node interleaving.

Adjust the ulimit values on your hardware to match your use case. If multiple mongod or mongos instances run under the same user, scale the ulimit values accordingly.

Configure sufficient file handles (fs.file-max), kernel pid limit (kernel.pid_max), maximum threads per process (kernel.threads-max), and maximum number of memory map areas per process (vm.max_map_count) for your deployment. For large systems, the following values provide a good starting point:

fs.file-max value of 98000

kernel.pid_max value of 64000

kernel.threads-max value of 64000

vm.max_map_count value of 128000
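The starting values above can be compared against a host's current settings. The sketch below parses text in the "key = value" form printed by sysctl and reports any setting below the listed value. Treating the listed values as minimums is an assumption of this example, as are the function and constant names.

```python
# Starting points from the list above, treated here as minimums (an assumption).
RECOMMENDED_MINIMUMS = {
    "fs.file-max": 98000,
    "kernel.pid_max": 64000,
    "kernel.threads-max": 64000,
    "vm.max_map_count": 128000,
}


def parse_sysctl(text: str) -> dict:
    """Parse 'key = value' lines into a dict, keeping only integer values."""
    values = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep and value.strip().isdigit():
            values[key.strip()] = int(value.strip())
    return values


def below_recommended(current: dict) -> dict:
    """Return {key: (current, recommended)} for settings under the starting point."""
    return {
        key: (current.get(key, 0), minimum)
        for key, minimum in RECOMMENDED_MINIMUMS.items()
        if current.get(key, 0) < minimum
    }


sample = """fs.file-max = 50000
kernel.pid_max = 4194304
kernel.threads-max = 64000
vm.max_map_count = 65530
"""
print(below_recommended(parse_sysctl(sample)))
# {'fs.file-max': (50000, 98000), 'vm.max_map_count': (65530, 128000)}
```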

Ensure that your system has swap space configured.

Refer to your operating system’s documentation for details on the correct sizing.

Ensure that the system default TCP keepalive is set correctly. A value of 300 often provides better performance for replica sets and sharded clusters.

MongoDB Development Checklist

Replication

Use an odd number of voting members to ensure that elections proceed successfully. You can have up to 7 voting members. If you have an even number of voting members, and constraints such as cost prevent you from adding another secondary as a voting member, you can add an arbiter to ensure an odd number of votes.

Ensure that your secondaries stay up-to-date by using monitoring tools and by specifying an appropriate write concern.

Do not use secondary reads to scale overall read throughput.

Schema Design

Data in MongoDB has a dynamic schema. Collections do not enforce document structure. This facilitates iterative development and polymorphism. Nevertheless, collections often hold documents with highly homogeneous structures.

Determine the set of collections that you will need and the indexes required to support your queries. With the exception of the _id index, you must create all indexes explicitly: MongoDB does not automatically create any indexes other than _id.

Ensure that your schema design supports your deployment type: if you plan to use sharded clusters for horizontal scaling, design your schema to include a strong shard key. The shard key affects read and write performance by determining how MongoDB partitions data. Changing the shard key after it is set is impossible in older MongoDB versions and expensive in newer ones, so choose it carefully.

Ensure that your schema design does not rely on indexed arrays that grow in length without bound. Typically, best performance is achieved when such indexed arrays have fewer than 1000 elements.

Consider the document size limits when designing your schema. The BSON document size limit is 16 MB per document. If you need larger documents, use GridFS.

Drivers

Make use of connection pooling. Most MongoDB drivers support connection pooling. Adjust the connection pool size to suit your use case, starting at 110-115% of the typical number of concurrent database requests.
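The 110-115% starting point is simple arithmetic, sketched below. The function name is made up for this example; in PyMongo, the resulting number would typically be passed as the maxPoolSize option, and other drivers have an equivalent setting.

```python
def pool_size_range(expected_concurrent_requests: int) -> tuple:
    """Return (low, high) pool sizes at 110% and 115% of expected concurrency.

    Integer ceiling division avoids floating-point rounding surprises.
    """
    low = (expected_concurrent_requests * 110 + 99) // 100
    high = (expected_concurrent_requests * 115 + 99) // 100
    return low, high


print(pool_size_range(200))  # (220, 230)
print(pool_size_range(150))  # (165, 173)
```

Treat the result as a first guess and adjust it based on observed connection wait times.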

Make sure that your applications handle transient write and read errors during replica set elections.

Ensure that your applications handle failed requests and retry them if appropriate. Drivers do not automatically retry failed requests.

Use exponential backoff logic for database request retries.
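A minimal sketch of exponential backoff with jitter follows. It is a generic helper, not part of any MongoDB driver: the function name, defaults, and the use of ConnectionError as the retryable error are assumptions for this example. With PyMongo, a natural choice for the retryable exception would be pymongo.errors.AutoReconnect.

```python
import random
import time


def retry_with_backoff(operation, retryable=(ConnectionError,), attempts=5,
                       base_delay=0.1, max_delay=5.0, sleep=time.sleep):
    """Call operation(); on a retryable error, wait and try again.

    The wait before retry n is a random value up to base_delay * 2**n,
    capped at max_delay. The last failure is re-raised to the caller.
    """
    for attempt in range(attempts):
        try:
            return operation()
        except retryable:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Full jitter spreads retries out so clients do not retry in lockstep.
            sleep(random.uniform(0, delay))


# Example: an operation that fails twice (as during an election), then succeeds.
calls = {"n": 0}


def flaky_write():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("primary stepped down")
    return "ok"


print(retry_with_backoff(flaky_write))  # ok
```

Only retry operations that are safe to repeat; a non-idempotent write that is retried blindly may be applied twice.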

Use cursor.maxTimeMS() for reads and wtimeout for writes if you need to cap the execution time of database operations.

Data Durability

Ensure that your replica set includes at least three data-bearing nodes with w: majority write concern. Three data-bearing nodes are required for replica-set-wide data durability.

Ensure that all instances use journaling.
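On the client side, these durability settings can be expressed in the connection string. The sketch below assembles a URI using the standard replicaSet, w, journal, and wtimeoutMS connection string options. The helper name, hostnames, replica set name, and 5000 ms timeout are placeholders chosen for this example.

```python
from urllib.parse import urlencode


def durable_uri(hosts, replica_set, wtimeout_ms=5000):
    """Build a MongoDB URI that requests majority, journaled writes."""
    options = urlencode({
        "replicaSet": replica_set,
        "w": "majority",            # acknowledge after a majority has the write
        "journal": "true",          # acknowledge only after the journal write
        "wtimeoutMS": wtimeout_ms,  # cap how long a write waits for acknowledgment
    })
    return f"mongodb://{','.join(hosts)}/?{options}"


uri = durable_uri(
    ["db1.example.net:27017", "db2.example.net:27017", "db3.example.net:27017"],
    replica_set="rs0",
)
print(uri)
```

The same options can also be set per operation in most drivers when only some writes need the stronger guarantee.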

Sharding

Ensure that your shard key distributes the load evenly across your shards.
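One simple way to quantify how evenly a shard key spreads data is to compare the busiest shard with a perfectly even share. The sketch below does that for per-shard document counts, which could be taken from the output of db.collection.getShardDistribution(). The function name and the idea of using this particular ratio are assumptions of this example.

```python
def load_skew(docs_per_shard: dict) -> float:
    """Ratio of the busiest shard's count to a perfectly even share (1.0 = even)."""
    total = sum(docs_per_shard.values())
    if total == 0:
        return 1.0  # an empty collection is trivially balanced
    even_share = total / len(docs_per_shard)
    return max(docs_per_shard.values()) / even_share


balanced = {"shard0": 100, "shard1": 100, "shard2": 100}
skewed = {"shard0": 900, "shard1": 50, "shard2": 50}
print(load_skew(balanced))          # 1.0
print(round(load_skew(skewed), 2))  # 2.7
```

A ratio well above 1.0 suggests that the shard key concentrates data on one shard and is worth revisiting.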

Use targeted operations for workloads that need to scale with the number of shards.

For MongoDB 3.6 and later, secondaries no longer return orphaned data unless using read concern “available” (which is the default read concern for reads against secondaries when not associated with causally consistent sessions).

Starting in MongoDB 3.6, all members of the shard replica set maintain chunk metadata, allowing them to filter out orphans when not using “available”. As such, non-targeted or broadcast queries that are not using “available” can be safely run on any member and will not return orphaned data.

The “available” read concern can return orphaned documents from secondary members since it does not check for updated chunk metadata. However, if the return of orphaned documents is immaterial to an application, the “available” read concern provides the lowest latency reads possible among the various read concerns.

Pre-split and manually balance chunks when inserting large data sets into a new non-hashed sharded collection. Pre-splitting and manually balancing enables the insert load to be distributed among the shards, increasing performance for the initial load.

Conclusion

Working through the operations and development checklists is a crucial step when running MongoDB in production. Both checklists matter because they keep a project running smoothly once it is live. A production MongoDB environment must be stable and reliable because it stores real-world working data. Data integrity depends on the stability of the database, which comes from addressing every item on both checklists before going to production.
