PostgreSQL can run independently on multiple machines holding the same data structure, making the persistence layer of the application more resilient and better prepared for unexpected events that might compromise the continuity of the service.
The idea behind this is to improve the system response time by distributing requests across a “Round Robin” network in which each node is a cluster. In this type of setup it does not matter which node a request is delivered to for processing, as the response will always be the same.
In this blog, we will explain how to replicate a PostgreSQL cluster using the tools provided in the program installation. The version used is PostgreSQL 11.5, the current stable, generally-available version for the operating system Debian Buster. For the examples in this blog it is assumed that you are already familiar with Linux.
Inside the directory /usr/bin/ is the program responsible for managing the cluster.
# 1. Lists the files contained in the directory
# 2. Filters the elements that contain 'pg_' in the name
ls /usr/bin/ | grep pg_
Activities conducted through these programs can be performed sequentially, or even in combination with other programs. Running a block of these activities through a single command is possible thanks to a Linux program found in the same directory, called make.
To list the clusters present use the pg_lsclusters program. You can also use make to run it. Its work depends on a file named Makefile, which needs to be in the same directory where the command will run.
# 1. The current directory is checked
pwd
# 2. Creates a directory
mkdir ~/Documents/Severalnines/
# 3. Change to the chosen directory
cd ~/Documents/Severalnines/
# 4. Create the file Makefile
touch Makefile
# 5. Open the file for editing
The definition of a block is shown below; its name is ls, and it runs a single program, pg_lsclusters.
# 1. Block name
ls:
# 2. Program to be executed
	pg_lsclusters
The file Makefile can contain multiple blocks, each of which can run as many programs as needed and even receive parameters. The lines belonging to an execution block must be indented with tabs, not spaces.
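As a hypothetical illustration of a parameterized block (the block name greet, the variable name, and the file path below are invented for this example), a block can reference a variable passed on the command line:

```shell
# Write a minimal Makefile containing one parameterized block.
# Note: recipe lines must begin with a real tab character, which printf's \t produces.
printf 'greet:\n\t@echo "Hello, $(name)"\n' > /tmp/Makefile.demo

# Run the block, passing a value for the variable on the command line
make -f /tmp/Makefile.demo greet name=world
# Prints: Hello, world
```

This same mechanism is what allows the blocks later in this blog to receive values such as the cluster name.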
The use of make to run the pg_lsclusters program is accomplished by using the make ls command.
# 1. Executes pg_lsclusters
make ls
On a fresh PostgreSQL installation the result shows a single cluster called main, allocated on port 5432 of the operating system. When the pg_createcluster program is used, a new port is allocated to the newly created cluster, starting at 5432 and counting upward until a free port is found.
Write Ahead Logging (WAL)
This replication procedure consists of making a backup of a working cluster which is continuing to receive updates. If this is done on the same machine, however, many of the benefits brought by this technique are lost.
Scaling a system horizontally ensures greater availability of the service: if one machine suffers a hardware problem, it makes little difference, as other machines are ready to take on the workload.
WAL is the term used for a complex algorithm internal to PostgreSQL that ensures the integrity of the transactions made on the system. Only a single cluster, however, may have the responsibility of accessing it with write permission.
The architecture now has three distinct types of clusters:
- A primary with responsibility for writing to WAL;
- A replica ready to take over the primary role;
- Miscellaneous other replicas with WAL reading duty.
Write operations are any activities that are intended to modify the data structure, either by entering new elements, or updating and deleting existing records.
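As an illustrative sketch, these three operations map to SQL statements run through psql, as elsewhere in this blog; the table customers and its columns are hypothetical:

```shell
# Insert a new element (the table 'customers' is hypothetical)
sudo -H -u postgres psql -c "INSERT INTO customers (name) VALUES ('Alice');"
# Update an existing record
sudo -H -u postgres psql -c "UPDATE customers SET name = 'Bob' WHERE id = 1;"
# Delete an existing record
sudo -H -u postgres psql -c "DELETE FROM customers WHERE id = 1;"
```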
PostgreSQL Cluster Configuration
Each cluster has two directories, one containing its configuration files and another with the transaction logs. These are located in /etc/postgresql/11/$(cluster) and /var/lib/postgresql/11/$(cluster), respectively (where $(cluster) is the name of the cluster).
The file postgresql.conf is created as soon as the cluster is created by running the program pg_createcluster, and its properties can be modified to customize the cluster.
Editing this file directly is not recommended, because it contains almost all of the properties. Their values are commented out, with the symbol # at the beginning of each line, and several other commented lines contain instructions for changing the property values.
Adding another file containing the desired changes is possible: simply edit a single property named include, replacing the default value #include = '' with include = 'postgresql.replication.conf'.
Before you start the cluster, you need the presence of the file postgresql.replication.conf in the same directory where you find the original configuration file, called postgresql.conf.
# 1. Block name
create:
# 2. Creates the cluster
	pg_createcluster 11 $(cluster) -- --data-checksums
# 3. Copies the file to the directory
	cp postgresql.replication.conf /etc/postgresql/11/$(cluster)/
# 4. A value is assigned to the property
	sed -i "s|^#include = ''|include = 'postgresql.replication.conf'|g" /etc/postgresql/11/$(cluster)/postgresql.conf
The use of --data-checksums in the creation of the cluster adds a greater level of integrity to the data. This costs a bit of performance, but is very important in order to avoid corruption of the files when they are transferred from one cluster to another.
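To confirm that checksums are enabled on a cluster's data directory, the pg_controldata utility (shipped in PostgreSQL's binary directory, which is not on the PATH by default on Debian) can be consulted; the path below assumes a cluster named primary, as created later in this blog:

```shell
# Inspect the control file of the 'primary' cluster's data directory
sudo -H -u postgres /usr/lib/postgresql/11/bin/pg_controldata /var/lib/postgresql/11/primary | grep checksum
# A "Data page checksum version" of 1 indicates checksums are enabled
```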
The procedures described above can be reused for other clusters, simply passing a value to $(cluster) as a parameter in the execution of the program make.
# 1. Executes the block 'create' by passing a parameter
sudo make create cluster=primary
Now that a brief automation of the tasks has been established, what remains to be done is the definition of the file postgresql.replication.conf according to the need for each cluster.
Replication on PostgreSQL
Two ways to replicate a cluster are possible: one involving the entire cluster (called Streaming Replication) and another that can be partial or complete (called Logical Replication).
The settings that must be specified for a cluster fall into three main categories:
- Master Server
- Standby Servers
- Sending Servers
As we saw earlier, WAL files contain the transactions that are made on the cluster, and replication is the transmission of these files from one cluster to another.
Inside the settings present in the file postgresql.conf, we can see properties that define the behavior of the cluster in relation to the WAL files, such as the size of those files.
# default values
max_wal_size = 1GB
min_wal_size = 80MB
Another important property is max_wal_senders. Belonging to a cluster in the Sending Servers category, it sets the number of processes responsible for sending WAL files to other clusters, and its value must always be greater than the number of clusters that depend on receiving them.
WAL files can be stored for transmission to a cluster that connects late, or that has had problems receiving them and needs earlier files relative to the current time. The property wal_keep_segments specifies how many WAL file segments are to be maintained by a cluster.
A Replication Slot is a feature that allows the cluster to store the WAL files needed to provide another cluster with all the records, with the max_replication_slots option as its property.
# default values
max_wal_senders = 10
wal_keep_segments = 0
max_replication_slots = 10
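A replication slot can be created manually on the primary through the built-in function pg_create_physical_replication_slot; the slot name replica_slot below is hypothetical, and the port 5433 assumes the primary cluster used in this setup:

```shell
# Create a physical replication slot on the primary cluster
sudo -H -u postgres psql -c "SELECT pg_create_physical_replication_slot('replica_slot');" -p 5433
# List existing slots and whether they are in use
sudo -H -u postgres psql -c "SELECT slot_name, active FROM pg_replication_slots;" -p 5433
```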
When the intention is to outsource the storage of these WAL files, another method of processing these files can be used, called Continuous Archiving.
This concept allows you to direct the WAL files to a specific location, using a Linux program and two variables that represent the path of the file and its name: %p and %f, respectively.
This property is disabled by default, but it can easily be enabled, relieving the cluster of the responsibility of storing such important files, and can be added to the file postgresql.replication.conf.
# 1. Creates a directory
mkdir ~/Documents/Severalnines/Archiving
# 2. Implementation on postgresql.replication.conf
archive_mode = on
archive_command = 'cp %p ~/Documents/Severalnines/Archiving/%f'
# 3. Starts the cluster
sudo systemctl start postgresql@11-primary
After the cluster initialization, some properties might need to be modified, and a cluster restart could be required. However, some properties can only be reloaded, without the need for a full reboot of a cluster.
Information on such subjects can be obtained through the comments present in the file postgresql.conf, appearing as # (note: change requires restart).
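The required treatment for each property can also be queried from the cluster itself, through the pg_settings catalog view, whose context column distinguishes properties needing a restart (postmaster) from those that only need a reload (sighup); the port 5433 assumes the primary cluster used here:

```shell
# Check how the archiving properties can be applied
sudo -H -u postgres psql -c "SELECT name, context FROM pg_settings WHERE name IN ('archive_mode', 'archive_command');" -p 5433
```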
If this is the case, a simple way to resolve it is with the Linux program systemctl, used previously to start the cluster, substituting the start option with restart.
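Following the pattern used earlier to start the cluster, a restart of the primary would look like this:

```shell
# Restarts the cluster, applying properties that cannot be reloaded
sudo systemctl restart postgresql@11-primary
```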
When a full reboot is not required, the cluster itself can reassign its properties through a query run within it. However, if multiple clusters are running on the same machine, it is necessary to pass a parameter containing the port that the cluster is allocated on the operating system.
# Reload without restarting
sudo -H -u postgres psql -c 'SELECT pg_reload_conf();' -p 5433
In the example above, the property archive_mode requires a reboot, while archive_command does not. After this brief introduction to this subject, let’s look at how a replica cluster can backup these archived WAL files, using Point In Time Recovery (PITR).
PostgreSQL Replication Point-In-Time Recovery
This suggestive name allows a cluster to be restored to its state at a certain point in time. This is done through recovery target properties: recovery_target_time expects a value in date format, such as 2019-08-22 12:05 GMT, while recovery_target_timeline accepts the assignment latest, indicating the need for a recovery up to the last existing record.
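As a sketch of how this lands in recovery.conf, the lines could take either form; the timestamp below is illustrative:

```
# Recover up to the last existing record
recovery_target_timeline = 'latest'
# Or recover up to a specific moment in time
# recovery_target_time = '2019-08-22 12:05 GMT'
```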
When the program pg_basebackup runs, it makes a copy of the directory containing the data of a cluster to another location. This program accepts multiple parameters, one of them being -R, which creates a file named recovery.conf within the copied directory, which in turn is not the same directory as the one containing the other configuration files previously seen, such as postgresql.conf.
The file recovery.conf stores the parameters passed in the execution of the program pg_basebackup, and its existence is essential to the Streaming Replication implementation, because it is within this file that the reverse operation of Continuous Archiving can be configured.
# 1. Block name
replicate:
# 2. Removes the current data directory
	rm -rf /var/lib/postgresql/11/$(replica)
# 3. Connects to primary cluster as user postgres
# 4. Copies the entire data directory
# 5. Creates the file recovery.conf
	pg_basebackup -U postgres -d postgresql://localhost:$(primaryPort) -D /var/lib/postgresql/11/$(replica) -P -R
# 6. Inserts the restore_command property and its value
	echo "restore_command = 'cp ~/Documents/Severalnines/Archiving/%f %p'" >> /var/lib/postgresql/11/$(replica)/recovery.conf
# 7. The same is done with recovery_target_timeline
	echo "recovery_target_timeline = 'latest'" >> /var/lib/postgresql/11/$(replica)/recovery.conf
The replicate block specified above needs to be run by the operating system's postgres user, in order to avoid potential conflicts over the ownership of the cluster data between the users postgres and root.
All that remains is to start the replica cluster for the replication to begin, with the replica cluster process called pg_walreceiver interacting with the primary cluster process called pg_walsender over a TCP connection.
# 1. Executes the block 'replicate' by passing two parameters
sudo -H -u postgres make replicate replica=replica primaryPort=5433
# 2. Starts the cluster replica
sudo systemctl start postgresql@11-replica
Verification of the health of this replication model, called Streaming Replication, is performed by a query that is run on the primary cluster.
# 1. Checks the Streaming Replication created
sudo -H -u postgres psql -x -c 'select * from pg_stat_replication;' -p 5433
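A healthy asynchronous setup is expected to show one row per replica, with the state column reporting streaming; a narrower query can focus on the most relevant columns:

```shell
# Summarized health check on the primary cluster
sudo -H -u postgres psql -c "SELECT client_addr, state, sync_state FROM pg_stat_replication;" -p 5433
```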
In this blog, we showed how to set up asynchronous Streaming Replication between two PostgreSQL clusters. Remember, though, that vulnerabilities exist in the code above; for example, using the postgres user for such a task is not recommended.
The replication of a cluster provides several benefits when it is used correctly, and it makes the clusters easily accessible to the APIs that interact with them.