
Failover for PostgreSQL Replication 101

Working in the IT industry, we have probably heard the word "failover" many times, but it can also raise questions like: What is a failover, really? What can we use it for? Is it important to have? How can we do it?

While these may seem like pretty basic questions, it is important to take them into account in any database environment. And more often than not, we do not take the basics into account...

To begin, let's look at some basic concepts.

What is Failover?

Failover is the ability of a system to continue functioning even when a failure occurs. It means that the functions of the system are taken over by secondary components when the primary components fail.

In the case of PostgreSQL, there are different tools that allow you to implement a database cluster that is resilient to failures. One redundancy mechanism available natively in PostgreSQL is replication, and a novelty in PostgreSQL 10 is the implementation of built-in logical replication.

What is Replication?

It is the process of copying data and keeping it updated in one or more database nodes. It uses the concept of a master node that receives the modifications, and slave nodes where they are replicated.

We have several ways of categorizing replication (a configuration sketch follows the list):

  • Synchronous Replication: There is no loss of data even if our master node is lost, but the commits in the master must wait for a confirmation from the slave, which can affect performance.
  • Asynchronous Replication: There is a possibility of data loss in case we lose our master node. If the replica for some reason is not updated at the time of the incident, the information that has not been copied may be lost.
  • Physical Replication: Disk blocks are copied.
  • Logical Replication: Streaming of the data changes.
  • Warm Standby Slaves: They do not accept client connections.
  • Hot Standby Slaves: Support read-only connections, useful for reports or queries.
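
As a rough illustration of how these categories map to configuration, here is a minimal sketch for streaming replication in PostgreSQL 10. The standby name is an assumption; pg_stat_replication on the master shows what is actually connected and in which mode:

    # On the master, in postgresql.conf: an empty synchronous_standby_names
    # gives asynchronous replication; naming a standby makes it synchronous.
    #   synchronous_standby_names = ''              # asynchronous (default)
    #   synchronous_standby_names = 'pgsql_node_0'  # synchronous
    #
    # On the slave, hot_standby = on in postgresql.conf allows read-only
    # queries (hot standby); with it off, the node is a warm standby.

    # From the master, check which standbys are connected and how:
    psql -c "SELECT application_name, state, sync_state FROM pg_stat_replication;"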

What is failover used for?

There are several possible uses of failover. Let's see some examples.

Migration

If we want to migrate from one datacenter to another by minimizing our downtime, we can use failover.

Suppose that our master is in datacenter A, and we want to migrate our systems to datacenter B.

Migration Diagram 1

We can create a replica in datacenter B. Once it is synchronized, we must stop our system, promote the replica to be the new master (the failover itself), and point our systems to the new master in datacenter B.
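
A rough command-line sketch of these steps, where hostnames and paths are assumptions (on PostgreSQL 10, pg_basebackup -R writes the recovery.conf pointing back at the master for us):

    # On the new node in datacenter B: build the replica from the master in A.
    pg_basebackup -h postgres-dc-a -U replication -D /var/lib/pgsql/10/data -P -R

    # Start the replica and let it catch up:
    pg_ctl start -D /var/lib/pgsql/10/data

    # During the cutover window, promote it to master:
    pg_ctl promote -D /var/lib/pgsql/10/data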

Migration Diagram 2

Failover is not just about the database, but also the application(s). How do they know which database to connect to? We certainly do not want to have to modify our application, as this would only extend our downtime. So, we can configure a load balancer so that when we take down our master, it automatically points to the next server that is promoted.

Another option is the use of DNS. When promoting the replica in the new datacenter to master, we directly modify the IP address of the hostname that points to the master. In this way we avoid having to modify our application, and although it cannot be done automatically, it is an alternative if we do not want to implement a load balancer.
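
If the DNS zone supports dynamic updates, the record change itself can be scripted. A minimal sketch using nsupdate, in which the name server, record, TTL, and address are all hypothetical:

    bash-4.2$ cat failover.nsupdate
    server ns1.example.com
    update delete master.example.com A
    update add master.example.com 60 A 10.20.0.10
    send
    bash-4.2$ nsupdate failover.nsupdate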

Having a single load balancer instance is not great, as it can become a single point of failure. Therefore, you can also implement failover for the load balancer itself, using a service such as Keepalived (a configuration sketch appears in the Load Balancers section below). In this way, if we have a problem with our primary load balancer, Keepalived is responsible for migrating the IP to our secondary load balancer, and everything continues to work transparently.

Maintenance

If we must perform any maintenance on our PostgreSQL master database server, we can promote our slave, perform the task, and rebuild a slave on our old master.

Maintenance Diagram 1

After this, we can re-promote the old master, repeat the slave rebuild process on the other node, and return to the initial state.

Maintenance Diagram 2

In this way, we can work on our server without the risk of being offline or losing information while performing maintenance.
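
A minimal sketch of the rebuild step on the old master once the slave has been promoted (the new master's hostname and the paths are assumptions):

    # On the old master: stop it, wipe its data directory, and rebuild it
    # as a slave of the newly promoted master.
    pg_ctl stop -D /var/lib/pgsql/10/data
    rm -rf /var/lib/pgsql/10/data/*
    pg_basebackup -h postgres-new-master -U replication -D /var/lib/pgsql/10/data -P -R
    pg_ctl start -D /var/lib/pgsql/10/data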

Upgrade

Although PostgreSQL 11 is not yet available, it would technically be possible to upgrade from PostgreSQL 10 using logical replication, as can be done with other engines.

The steps would be the same as when migrating to a new datacenter (see the Migration section), except that our slave would run PostgreSQL 11.
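
A sketch of the logical replication setup this upgrade would rely on, with hypothetical database, publication, and subscription names. The publisher needs wal_level = logical, and the schema must first be copied to the new node (for example with pg_dump -s):

    # On the PostgreSQL 10 master (publisher):
    psql -d appdb -c "CREATE PUBLICATION upgrade_pub FOR ALL TABLES;"

    # On the PostgreSQL 11 node (subscriber), after restoring the schema:
    psql -d appdb -c "CREATE SUBSCRIPTION upgrade_sub
      CONNECTION 'host=postgres1 dbname=appdb user=replication password=****'
      PUBLICATION upgrade_pub;"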

Upgrade Diagram 1

Issues

The most important function of failover is to minimize downtime and avoid loss of information when there is a problem with our main database.

If for some reason we lose our master database, we can perform a failover promoting our slave to master, and keep our systems running.

PostgreSQL does not provide an automated solution for this. We can do it manually, or automate it by means of a script or an external tool.

To promote our slave to master, we can use either of the following (a verification check follows these steps):

  • Run pg_ctl promote

    bash-4.2$ pg_ctl promote -D /var/lib/pgsql/10/data/
    waiting for server to promote.... done
    server promoted
  • Create the trigger file that we have previously defined in the recovery.conf of our data directory.
    bash-4.2$ cat /var/lib/pgsql/10/data/recovery.conf
    standby_mode = 'on'
    primary_conninfo = 'application_name=pgsql_node_0 host=postgres1 port=5432 user=replication password=****'
    recovery_target_timeline = 'latest'
    trigger_file = '/tmp/failover.trigger'
    
    bash-4.2$ touch /tmp/failover.trigger
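
In either case, we can verify that the promotion worked: pg_is_in_recovery() returns f once the node has left recovery and accepts writes.

    bash-4.2$ psql -c "SELECT pg_is_in_recovery();"
     pg_is_in_recovery
    -------------------
     f
    (1 row)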

In order to implement a failover strategy, we need to plan it and test it thoroughly through different failure scenarios. Failures can happen in different ways, and the solution should ideally work for the most common scenarios. If we are looking for a way to automate this, we can have a look at what ClusterControl has to offer.

ClusterControl for PostgreSQL Failover

ClusterControl has a number of features related to PostgreSQL replication and automated failover.

Add Slave

If we want to add a slave in another datacenter, either as a contingency or to migrate our systems, we can go to Cluster Actions and select Add Replication Slave.

ClusterControl Add Slave 1

We’ll need to enter some basic data, such as the IP or hostname, the data directory (optional), and whether the slave should be synchronous or asynchronous. We should have our slave up and running after a few seconds.

When using another datacenter, we recommend creating an asynchronous slave, since otherwise the latency can affect performance considerably.

ClusterControl Add Slave 2

Manual failover

With ClusterControl, failover can be done manually or automatically.

ClusterControl Failover 1

To perform a manual failover, go to ClusterControl -> Select Cluster -> Nodes, and in the Node Actions menu of one of our slaves, select "Promote Slave". In this way, after a few seconds, our slave becomes master, and what was previously our master is turned into a slave.

ClusterControl Failover 2

The above is useful for the tasks of migration, maintenance and upgrades that we saw previously.

Automatic failover

In the case of automatic failover, ClusterControl detects failures in the master and promotes the slave with the most current data as the new master. It also reconfigures the rest of the slaves to replicate from the new master.

ClusterControl Failover 3

With the “Autorecovery” option ON, ClusterControl will perform an automatic failover as well as notify us of the problem. In this way, our systems can recover in seconds, without our intervention.

ClusterControl offers us the possibility to configure a whitelist/blacklist to define which servers we want to be taken (or not to be taken) into account when deciding on a master candidate.

From the servers available according to the above configuration, ClusterControl will pick the most advanced slave, using for this purpose pg_current_xlog_location (PostgreSQL 9.6 and earlier) or pg_current_wal_lsn (PostgreSQL 10 and later), depending on the version of our database.
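
As an illustration of the kind of check involved (ClusterControl's internal logic may differ), on PostgreSQL 10 we can compare standbys by how much WAL each one has received and replayed:

    # On each slave: WAL received from the master and WAL already applied.
    psql -c "SELECT pg_last_wal_receive_lsn(), pg_last_wal_replay_lsn();"

    # On a master, the current write position would be:
    psql -c "SELECT pg_current_wal_lsn();"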

ClusterControl also performs several checks over the failover process in order to avoid some common mistakes. One example is that if we manage to recover our old, failed master, it will NOT be reintroduced automatically to the cluster, neither as a master nor as a slave. We need to do it manually. This avoids the possibility of data loss or inconsistency in case the slave that we promoted was behind at the time of the failure. We might also want to analyze the issue in detail before re-adding the old master to the cluster, since rebuilding it right away could destroy diagnostic information.

Also, if the failover fails, no further attempts are made; manual intervention is required to analyze the problem and perform the corresponding actions. This avoids the situation where ClusterControl, as the high availability manager, tries to promote the next slave and then the next one. There might be a problem, and we do not want to make things worse by attempting multiple failovers.

Load Balancers

As we mentioned earlier, the load balancer is an important tool to consider for our failover, especially if we want to use automatic failover in our database topology.

In order for the failover to be transparent for both the user and the application, we need a component in between, since it is not enough to promote a slave to master. For this, we can use HAProxy + Keepalived.

What is HAProxy?

HAProxy is a load balancer that distributes traffic from one origin to one or more destinations, and it can define specific rules and/or protocols for this task. If any of the destinations stops responding, it is marked as offline, and traffic is sent to the remaining available destinations. This prevents traffic from being sent to an inaccessible destination and avoids losing it, by directing it to a valid destination instead.

What is Keepalived?

Keepalived allows you to configure a virtual IP within an active/passive group of servers. This virtual IP is assigned to the active “Primary” server. If this server fails, the IP is automatically migrated to the passive “Secondary” server, which can continue working with the same IP, transparently for our systems.
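
A minimal keepalived.conf sketch for the active load balancer; the interface, router id, and virtual IP are assumptions, and the passive node would use state BACKUP with a lower priority:

    bash-4.2$ cat /etc/keepalived/keepalived.conf
    vrrp_instance VI_HAPROXY {
        state MASTER            # the passive load balancer uses BACKUP
        interface eth0          # interface that holds the virtual IP
        virtual_router_id 51
        priority 101            # higher than on the passive node
        virtual_ipaddress {
            10.20.0.100         # the virtual IP our systems connect to
        }
    }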

To implement this solution with ClusterControl, we proceed as if we were going to add a slave: go to Cluster Actions and select Add Load Balancer (see the ClusterControl Add Slave 1 image).

ClusterControl Load Balancer 1

We add the information of our new load balancer and how we want it to behave (Policy).

In the case of wanting to implement failover for our load balancer, we must configure at least two instances.

Then we can configure Keepalived (Select Cluster -> Manage -> Load Balancer -> Keepalived).

ClusterControl Load Balancer 2

After this, we have the following topology:

ClusterControl Load Balancer 3

HAProxy is configured with two different ports, one read-write and one read-only.

In our read-write port, we have our master server as online and the rest of our nodes as offline. In the read-only port, we have both the master and the slaves online. In this way we can balance the reading traffic between our nodes. When writing, the read-write port will be used, which will point to the master.
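
A simplified sketch of what such a configuration can look like; the ports, hostnames, and health checks are assumptions, and the configuration ClusterControl actually deploys is more complete:

    bash-4.2$ cat /etc/haproxy/haproxy.cfg
    listen haproxy_rw               # read-write port: only the master is online
        bind *:5433
        mode tcp
        option tcp-check
        server postgres1 postgres1:5432 check
        server postgres2 postgres2:5432 check backup

    listen haproxy_ro               # read-only port: master and slaves online
        bind *:5434
        mode tcp
        balance leastconn
        option tcp-check
        server postgres1 postgres1:5432 check
        server postgres2 postgres2:5432 check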

ClusterControl Load Balancer 3

When HAProxy detects that one of our nodes, either master or slave, is not accessible, it automatically marks it as offline and stops sending traffic to it. This check is done by health check scripts that ClusterControl configures at deployment time. These scripts check whether the instances are up, whether they are in recovery, and whether they are read-only.

When ClusterControl promotes a slave to master, our HAProxy marks the old master as offline (for both ports) and puts the promoted node online (in the read-write port). In this way, our systems continue to operate normally.

If our active HAProxy (which is assigned a Virtual IP address to which our systems connect) fails, Keepalived migrates this IP to our passive HAProxy automatically. This means that our systems are then able to continue to function normally.

Conclusion

As we have seen, failover is a fundamental part of any production database setup. It can be useful when performing common maintenance tasks or migrations. We hope that this blog has been useful as an introduction to the topic, so you can continue researching and create your own failover strategy.