Severalnines Blog
The automation and management blog for open source databases

How to Deploy MongoDB for High Availability

Introduction

MongoDB has great support for high availability through ReplicaSets. However, deploying a ReplicaSet is not enough for a production-ready system; that requires a bit of planning. Deployment is just the initial step. We then need to arm the operational teams with monitoring, alerting, security, anomaly or failure detection, automatic recovery/failover, backup management, and other tools to keep the environment up and running.

Prerequisites

Before you can start your MongoDB deployment with ClusterControl, some preparation is required. The supported platforms are RedHat/CentOS 6.x/7.x, Ubuntu 12.04/14.04/16.04 LTS, and Debian 7.x/8.x. The minimum resource requirements are 2 GB of RAM, 2 CPU cores and 20 GB of disk space on x86 architecture. ClusterControl itself can run on regular VMs or bare-metal hosts on-prem, behind a firewall, or on cloud VMs.

Additionally, ClusterControl requires ports used by the following services to be opened/enabled:
ICMP (echo reply/request)
SSH (default is 22)
HTTP (default is 80)
HTTPS (default is 443)
MySQL (default is 3306) (internal database)
CMON RPC (default is 9500)
CMON RPC TLS (default is 9501)
CMON Events (default is 9510)
CMON SSH (default is 9511)
CMON Cloud (default is 9518)
Streaming port for backups through netcat (default is 9999)
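Before installing, it can help to confirm that the TCP ports listed above are reachable from another host. The following is a minimal sketch and not part of ClusterControl; the helper names `is_port_open` and `check_ports` are assumptions made for illustration. ICMP is not a TCP port, so it is left out.

```python
import socket

# Default TCP ports taken from the list above. ICMP is not a TCP port,
# so it is not covered by this check.
DEFAULT_PORTS = {
    "SSH": 22,
    "HTTP": 80,
    "HTTPS": 443,
    "MySQL": 3306,
    "CMON RPC": 9500,
    "CMON RPC TLS": 9501,
    "CMON Events": 9510,
    "CMON SSH": 9511,
    "CMON Cloud": 9518,
    "Backup streaming (netcat)": 9999,
}


def is_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_ports(host, ports=None, timeout=2.0):
    """Map each service name to True/False depending on whether its port accepts connections."""
    ports = DEFAULT_PORTS if ports is None else ports
    return {name: is_port_open(host, port, timeout) for name, port in ports.items()}

# Example (hypothetical host name): check_ports("clustercontrolhost")
```

A `False` result means either that a firewall blocks the port or that nothing is listening on it yet, which is expected for the CMON ports before ClusterControl is installed.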

The easiest and most convenient way to install ClusterControl is to use the installation script provided by Severalnines. Simply download the script and execute it as root or as a user with sudo privileges. If you need a more manual approach, for instance if your servers have no internet access at all, you can follow the instructions provided in the ClusterControl documentation.

$ wget http://www.severalnines.com/downloads/cmon/install-cc 
$ chmod +x install-cc
$ ./install-cc   # as root or sudo user

Follow the installation wizard, which guides you through setting up the internal ClusterControl database server and its credentials, the cmon password for ClusterControl usage, and so on. You should see the following lines once the installation has completed:

Determining network interfaces. This may take a couple of minutes. Do NOT press any key.
Public/external IP => http://{public_IP}/clustercontrol
Installation successful. 

The next step is to generate an SSH key, which we will use to set up passwordless SSH later on. If you already have a key pair that you would like to use, you can skip creating a new one.

You can use any user on the system, but it must be able to perform super-user operations (sudoer). In this example, we picked the root user:

$ whoami
root
$ ssh-keygen -t rsa #generates ssh key

Set up passwordless SSH to all nodes that you would like to monitor/manage via ClusterControl. In this case, we will set this up on all nodes in the stack (including the ClusterControl node itself). On the ClusterControl node, run the following commands to copy the SSH key, and enter the root password when prompted:

$ ssh-copy-id root@clustercontrolhost # clustercontrol
$ ssh-copy-id root@mongodbnode1
$ ssh-copy-id root@mongodbnode2
$ ssh-copy-id root@mongodbnode3
...

You can then verify that it works by running the following command on the ClusterControl node:

$ ssh root@192.168.55.151 "ls /root"

Make sure you are able to see the result of the command above without the need to enter a password.
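To repeat this check for every node without risking a hanging password prompt, the test can be scripted. The sketch below is an illustration only: the helper names are assumptions, and it relies on the standard OpenSSH `BatchMode` and `ConnectTimeout` options so that `ssh` fails instead of asking for a password.

```python
import subprocess


def build_ssh_check(host, user="root", connect_timeout=5):
    """Build an ssh command that fails instead of prompting for a password."""
    return [
        "ssh",
        "-o", "BatchMode=yes",                      # never fall back to interactive prompts
        "-o", f"ConnectTimeout={connect_timeout}",  # give up quickly on unreachable hosts
        f"{user}@{host}",
        "true",                                     # harmless remote command
    ]


def has_passwordless_ssh(host, user="root"):
    """Return True if key-based login to the host works without any prompt."""
    result = subprocess.run(build_ssh_check(host, user), capture_output=True)
    return result.returncode == 0

# Example, using the host names from the ssh-copy-id step:
# for host in ["clustercontrolhost", "mongodbnode1", "mongodbnode2", "mongodbnode3"]:
#     print(host, has_passwordless_ssh(host))
```

Any host that reports `False` still needs its key copied, or is unreachable from the ClusterControl node.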
When the installation is complete, you should be able to log in to the web interface via

https://<your_vm_name>/clustercontrol/#

After the first login, you will see a window with options to start with your first deployment or import an existing cluster.

ClusterControl Deploy and import existing cluster

Configure repositories

Before we start deploying, let’s take a look at the package management system. The ClusterControl deployment process covers the entire cluster installation, including OS adjustments and package download and installation. If your database nodes have limited access to the internet and you cannot download packages directly from the node, you can create a package repository directly on the ClusterControl host.

ClusterControl package repository

There are three ways to maintain MongoDB packages in ClusterControl.

Use Vendor Repositories

Install the software by setting up and using the database vendor’s preferred software repository. ClusterControl will install the latest version available in the MongoDB repository.

Do Not Setup Vendor Repositories

Install the software by using the pre-existing software repository already set up on the OS. The user has to set up the software repository manually on each database node, and ClusterControl will use this repository for package deployment. This is useful if the database nodes have no internet connection and your company has an external package system with MongoDB packages in place.

Use Mirrored Repositories (Create new repository)

Create and mirror the current vendor’s repository, and then deploy using the local mirrored repository. This also allows you to “freeze” the current versions of the software packages used to provision a database cluster for a specific vendor (i.e., use only Percona packages).

ClusterControl automates the creation of internal package repository

Deploy ReplicaSet

ClusterControl supports MongoDB/Percona Server for MongoDB 3.x ReplicaSet. To start the deployment of a new cluster, go to the deploy option in the top right corner. When you install your database nodes, always use clean and minimal VMs: while provisioning a node with the required software, ClusterControl installs new packages and may remove existing packages and their dependencies.

The very first step of the deployment process is to provide ssh credentials appropriate to the hosts on which you are deploying your cluster. As ClusterControl uses password-less ssh to connect to and configure your hosts, an ssh key is required.

ClusterControl deploy MongoDB cluster wizard

It is advisable to use an unprivileged user account to log into the hosts; in that case, a sudo password can be provided to facilitate administrative tasks. If the user account is not prompted for a sudo password, this is not needed. You also have the option to disable iptables and AppArmor or SELinux on the hosts to avoid issues during the initial deployment.

On the following screen, you can choose to install MongoDB binaries from either MongoDB Inc or from Percona. Here also, you must specify your MongoDB administrative user account and password as user-level security is mandated.

ClusterControl deploy MongoDB wizard, ReplicaSet

On this screen, you can also see which configuration template is being used. ClusterControl uses configuration file templates to ensure repeatable deployments. Templates are stored on the ClusterControl host and can be edited directly from the command line or through the ClusterControl UI. You can also choose to use the vendor repositories, if you wish, or choose your own repository. In addition, you can automatically create a new repository on the ClusterControl host. This allows you to freeze the version of MongoDB that ClusterControl will deploy to the current release. Once you have completed the configuration here, click Deploy to proceed.
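Once the ReplicaSet is up, applications benefit from the high availability only if they connect with a URI that lists several members and names the set, so the driver can find the new primary after a failover. The following is a minimal sketch of building such a standard MongoDB connection string. The function name, the replica set name `rs0` and the credentials are assumptions for illustration; the host names are the ones used earlier in this post.

```python
from urllib.parse import quote_plus


def replica_set_uri(hosts, replica_set, user, password, port=27017, auth_db="admin"):
    """Build a mongodb:// URI that lists several seed members of one replica set."""
    seeds = ",".join(f"{host}:{port}" for host in hosts)
    # Credentials must be percent-encoded, since characters like '@' or '/' are URI delimiters.
    credentials = f"{quote_plus(user)}:{quote_plus(password)}"
    return f"mongodb://{credentials}@{seeds}/?replicaSet={replica_set}&authSource={auth_db}"


uri = replica_set_uri(
    ["mongodbnode1", "mongodbnode2", "mongodbnode3"],
    replica_set="rs0",       # hypothetical replica set name
    user="admin",            # hypothetical administrative user
    password="p@ss/word",    # placeholder password
)
```

Listing more than one member means the application can still discover the set when one of the seed hosts is down.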

Deploy sharding

ClusterControl can also deploy Sharded Clusters. Two methods of doing so are supported. First, you can convert an existing MongoDB ReplicaSet into a Sharded Cluster, as shown below.

ClusterControl Deploy MongoDB shards

When “Convert to Shard” is clicked, you are prompted to add at least one Config server (for production environments, you should add three), and a router, also known as a “mongos” process. The final stage is to choose your MongoDB configuration templates for config server and router, as well as your data directory. Finally, click deploy. When complete, it will show up in your Database Clusters view. It will show your shard health instead of individual instances. It is also possible to add additional shards as needed.

Convert to shard

If you run into scaling issues, you can scale this ReplicaSet either by adding more secondaries or by scaling out through sharding. You can convert an existing ReplicaSet into a sharded cluster manually, but this is a long and error-prone process. ClusterControl automates it: it adds the Config servers and shard routers and enables sharding.

To convert a ReplicaSet into a sharded cluster, you can simply trigger it via the actions drop down:

ClusterControl Convert to shard

Schedule backup policy

It is essential to keep backups of your database and to have a simple, reliable backup process. ClusterControl supports fully consistent backups and restores of your MongoDB replica set or sharded cluster.

Backups can be taken manually or scheduled. Centralized backups are supported, with backups stored either on the Controller filesystem (including network-mounted directories) or uploaded to a pre-configured cloud provider. Currently supported providers are Google Cloud Platform, Amazon Web Services and Microsoft Azure. This allows you to take full advantage of the advanced lifecycle management functionality provided by Amazon and Google for features such as custom retention schedules, long-term archival, and encryption at rest, among others.

Backup retention is configurable; you can choose to retain your backup for any time period or to never delete backups. AES256 encryption is employed to secure your backups against rogue elements.
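To make the retention rule concrete, here is a minimal sketch of how a retention period selects backups for deletion. It only illustrates the idea and is not ClusterControl's implementation; the function name, the backup names and the use of `None` for "never delete" are assumptions.

```python
from datetime import datetime, timedelta


def expired_backups(backups, retention_days, now=None):
    """Return the names of backups older than the retention period.

    backups maps a backup name to the datetime it was created.
    retention_days=None means backups are never deleted.
    """
    if retention_days is None:
        return []
    now = datetime.now() if now is None else now
    cutoff = timedelta(days=retention_days)
    return sorted(name for name, created in backups.items() if now - created > cutoff)


# Hypothetical backup catalogue
catalogue = {
    "backup-full-1": datetime(2024, 1, 1),
    "backup-full-2": datetime(2024, 1, 25),
}
to_delete = expired_backups(catalogue, retention_days=7, now=datetime(2024, 1, 31))
```

With a seven-day retention evaluated on 31 January, only the backup from 1 January is selected.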

For rapid recovery, backups can be restored directly into the backup cluster - ClusterControl handles the full restore process from launch to cluster recovery, removing error-prone manual steps from the process.

Enable operational reports

With ClusterControl you can schedule cross environment reports like "Daily System Report," "Package Upgrade Report," "Schema Change Report" as well as "Backups" and "Availability." These reports will help you keep your environment secure and operational. You will also see recommendations on how to fix gaps. Reports in HTML format can be emailed to SysOps, DevOps or even managers who would like to get regular status updates about a given system’s health.

Performance Advisors

Advisors provide specific advice on how to address issues in areas such as performance, security, log management, configuration, storage space, and others. ClusterControl comes with a list of pre-defined advisors that track different metrics and the state of your databases. When needed, an alert is created. Advisors can be extended with manual scripts. For more information, please see our recent blog on “How to Automate Database Workload Analysis with ClusterControl Performance Advisors”.

Among the various operating system and performance advisors, you can find the following ones related to MongoDB.

MongoDB sharding advisors
connections used
replication check
replication window
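The metrics behind these advisors are simple to reason about. The sketch below shows the general calculations, assuming the usual definitions: the replication window is the time span covered by the oplog, and connection usage is the share of `current` connections out of `current` plus `available`, as reported by MongoDB's `serverStatus`. How ClusterControl's own advisors compute and threshold these values is not described here, so treat the function names and inputs as assumptions.

```python
from datetime import datetime, timedelta


def replication_window(oldest_oplog_entry, newest_oplog_entry):
    """Time span covered by the oplog: roughly how long a secondary may be
    offline and still catch up without a full resync."""
    return newest_oplog_entry - oldest_oplog_entry


def replication_lag(primary_optime, secondary_optime):
    """How far a secondary's last applied operation trails the primary's."""
    return primary_optime - secondary_optime


def connections_used_pct(current, available):
    """Percentage of the connection limit currently in use."""
    total = current + available
    return 100.0 * current / total if total else 0.0


window = replication_window(datetime(2024, 1, 1, 0, 0), datetime(2024, 1, 2, 12, 0))
```

A short replication window combined with growing lag is the combination to watch for, since a secondary that falls behind by more than the window can no longer catch up from the oplog.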

Deploy in the cloud

Starting with version 1.6, ClusterControl enables you to create MongoDB 3.4 ReplicaSets in the cloud. The supported cloud platforms are Amazon AWS, Google Cloud and Microsoft Azure.

The wizard walks you through VM creation and MongoDB settings, all in one place.

ClusterControl deploy MongoDB ReplicaSet in cloud

The process lets you choose OS parameters, including network setup. There is no need to copy SSH keys; they are added automatically. After the job is done, you will see your cluster in the main dashboard. From now on, you can manage your MongoDB cluster like any other in ClusterControl.

ClusterControl deploy MongoDB ReplicaSet in cloud, VM network settings

Security Tips

At this point your new cluster should be up and running. Before you allow users and application processes to access data, you need to define cluster security settings. In our previous blogs, we raised concerns about default security configuration. Here are some of the main things you need to consider before you pass your new cluster to other teams.

Change default ports - by default, MongoDB binds to standard ports: 27017 for MongoDB ReplicaSets or Shard Routers, 27018 for shards and 27019 for Config servers. Using standard ports is not recommended, as it makes the service easier for an attacker to find.

Enable authentication - without authentication, users can log in without a password. Enable authentication in all your environments (development, certification and production).

security:
    authorization: enabled

Use strong passwords - if needed, use a password generator to generate complex passwords.

Add replication key file - with a keyfile enabled, replica set members must authenticate to each other, so only hosts holding the key can join the replication stream.

Encrypt your backups - ClusterControl enables you to encrypt your backups.
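For the password and keyfile tips above, Python's `secrets` module is one way to generate suitable random values. This is a minimal sketch under stated assumptions: the function names, the password alphabet and the default lengths are choices made for illustration. MongoDB keyfiles must contain between 6 and 1024 base64 characters and must not be readable by group or others.

```python
import base64
import os
import secrets
import string

# Hypothetical alphabet: letters, digits and a few symbols that are easy to quote.
PASSWORD_ALPHABET = string.ascii_letters + string.digits + "!#%+-_=."


def generate_password(length=24):
    """Return a random password drawn from a cryptographically secure source."""
    return "".join(secrets.choice(PASSWORD_ALPHABET) for _ in range(length))


def generate_keyfile_content(num_bytes=756):
    """Return base64 text usable as a shared replica set key.

    756 random bytes encode to 1008 base64 characters, within MongoDB's limit of 1024.
    """
    return base64.b64encode(secrets.token_bytes(num_bytes)).decode("ascii")


def write_keyfile(path, content):
    """Create the keyfile readable and writable by its owner only; fail if it already exists."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    with os.fdopen(fd, "w") as handle:
        handle.write(content)
```

The same keyfile content has to be distributed to every member of the ReplicaSet, and the file must be owned by the user that runs the MongoDB process.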

For further reading, we have a blog on how to secure MongoDB.

Enable cluster auto recovery

Last but not least, enable node and cluster auto recovery.

ClusterControl can work for you as an extended 24/7 DBA team member. There are two main functions here: automatic node recovery and automatic cluster recovery.

When node auto recovery is enabled, ClusterControl reacts to node issues and, in the case of failures, attempts to recover individual nodes. This addresses situations such as a process that runs out of memory or a service that needs to be started after a power outage, whatever the cause of the service being down.

The cluster recovery option is even more sophisticated. It will perform a switchover if needed.

In that case, any changes that were not replicated to the secondaries are rolled back and placed in a ‘rollback’ folder, so it is up to the administrator to restore them.
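A switchover can only succeed while a majority of the voting members is still reachable, because a MongoDB replica set needs a majority to elect a new primary. The short sketch below works out how many members a set can lose; the function names are assumptions for illustration.

```python
def majority(voting_members):
    """Number of votes needed to elect a primary."""
    return voting_members // 2 + 1


def fault_tolerance(voting_members):
    """How many voting members can fail while a primary can still be elected."""
    return voting_members - majority(voting_members)


for members in (3, 4, 5):
    print(members, "members:", majority(members), "needed,", fault_tolerance(members), "may fail")
# 3 members: 2 needed, 1 may fail
# 4 members: 3 needed, 1 may fail
# 5 members: 3 needed, 2 may fail
```

This is why an odd number of members is usually preferred: a fourth member adds cost but tolerates no more failures than three.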

To set up node and cluster auto recovery, you just need to enable them in the main dashboard.