blog

PostgreSQL Replication Setup & Maintenance Using Ansible

Ashokraj M

Published: August 8, 2019
Last Updated: July 25, 2022

Replication is a key feature for most setups and it is supported by most database technologies on the market. The PostgreSQL community introduced replication in version 9.0 (called Streaming Replication or SR), since then the replication in PostgreSQL has evolved with additional features like Cascading Replication, Logical Decoding, and several more optimizations.

In this blog, we will look at using the Ansible role postgresql as developed by “Demonware” (a fork of the role “ANXS/postgresql”). I had already talked about using the “ANXS/postgresql” role in my previous blog but I did not discuss the replication feature. The Ansible role “postgresql” adds the ability to set up PostgreSQL replication using repmgr.

About Repmgr

Repmgr is an open-source command line tool developed and maintained by 2ndQuadrant. The tool automates most of the tasks related to managing the PostgreSQL replication cluster. Below is the list of tasks which can be performed smoothly using repmgr command and repmgrd daemon.

Bootstrapping the PostgreSQL replication cluster.
Doing auto-failover and manual switch-over of the primary instance.
Adding and removing the standby (read-replica) instances.

Preparing the Controller Node

Prepare the controller node with the Ansible PostgreSQL role, playbooks, inventories and custom PostgreSQL replication.

$ mkdir demo
$ pushd demo
$ mkdir roles
$ git clone https://github.com/Demonware/postgresql roles/postgresql
$ pushd roles/postgresql
$ git checkout add-repmgr-extension

In the downloaded role, there are two defaults variable files main.yml and repmgr.yml file. However, Ansible will consider only the main.yml file. To make the Ansible also use the repmgr.yml file we are moving both files under the directory defaults/main.

$ mkdir defaults/main
$ mv defaults/main.yml defaults/repmgr.yml defaults/main
$ popd

Ansible Inventory File

For the demo, we will setup the PostgreSQL replication cluster on three nodes. I created three CentOS VMs vm-01, vm-02 and vm-03, all of them are listed under the group postgres_cluster in the development.yaml file.

$ cat development.yaml
all:
  children:
    postgres_cluster:
      hosts:
        vm-01:
        vm-02:
        vm-03:
      vars:
        ansible_user: "vagrant"

Do Ansible ping and make sure we are able to reach all the hosts under the group postgres_cluster.

$ ansible -i development.yaml -m ping  postgres_cluster
vm-01 | SUCCESS => {
    "changed": false,
    "ping": "pong"
}
vm-03 | SUCCESS => {
    "changed": false,
    "ping": "pong"
}
vm-02 | SUCCESS => {
    "changed": false,
    "ping": "pong"
}

Custom Variable File

In the custom variable file custom-vars.yaml, we will define the following things:

PostgreSQL version to install and encoding to use
Modifying the PostgreSQL configuration to enable replication, we will modify the parameters like wal_level, max_wal_senders, max_replication_slots, hot_standby, archive_mode, archive_command
Creating the necessary users and database
Modifying pg_hba.conf file to allow the necessary connection from the application and the repmgr replication
Some repmgr related variables

$ cat custom-vars.yaml 
# Basic settings
postgresql_version: 11
postgresql_encoding: "UTF-8"
postgresql_locale: "en_US.UTF-8"
postgresql_ctype: "en_US.UTF-8"
postgresql_admin_user: "postgres"
postgresql_default_auth_method: "peer"
postgresql_listen_addresses: "*"
postgresql_wal_level: "replica"
postgresql_max_wal_senders: 10
postgresql_max_replication_slots: 10
postgresql_wal_keep_segments: 100
postgresql_hot_standby: on
postgresql_archive_mode: on
postgresql_archive_command: "/bin/true"
postgresql_shared_preload_libraries:
  - repmgr

postgresql_users:
  - name: "{{repmgr_user}}"
    pass: "password"
postgresql_databases:
  - name: "{{repmgr_database}}"
    owner: "{{repmgr_user}}"
    encoding: "UTF-8"
postgresql_user_privileges:
  - name: "{{repmgr_user}}"
    db: "{{repmgr_database}}"
    priv: "ALL"
    role_attr_flags: "SUPERUSER,REPLICATION"
postgresql_pg_hba_custom:
  - { type: "host", database: "all", user: "all", address: "192.168.0.0/24", method: "md5" }
  - { type: "host", database: "replication", user: "repmgr", address: "192.168.0.0/24", method: "md5" }  
  - { type: "host", database: "replication", user: "repmgr", address: "127.0.0.1/32", method: "md5" }  

# repmgr related variables
postgresql_ext_install_repmgr: yes
repmgr_target_group: "postgres_cluster"
repmgr_target_group_hosts: "{{ groups[repmgr_target_group] }}"
repmgr_master: "vm-03"

Following are some of the notable variables defined in custom-vars.yaml:

postgresql_version: 11 – Installs PostgreSQL version 11
postgresql_ext_install_repmgr: yes – Installs repmgr extension on the PostgreSQL cluster
repmgr_target_group: “postgres_cluster” – Repmgr works on the hosts defined under the group “postgres_cluster” defined in the inventory file
repmgr_master: “vm-03” – Host vm-03 will be the PostgreSQL primary instance, vm-01 and vm–02 will replicate from vm-03

Ansible Playbook

In the below postgres-play.yaml playbook, I have assigned the role postgresql against the host group postgres_cluster. I have also included custom variable file custom-vars.yaml which has the configuration for PostgreSQL and repmgr.

$ cat postgres-play.yaml 
- hosts: postgres_cluster
  become: yes
  vars_files:
    - ./custom-vars.yaml
  roles:
    - postgresql

Running Ansible Playbook

We have now created the following Ansible artifacts and we are ready to run the Ansible playbook.

roles/postgresql, Ansible role directory.
custom-vars.yaml, Ansible variable file.
development.yaml, Ansible inventory file.
postgres-play.yam, Ansible playbook file.

Run the below ansible-playbook command from the controller node. Since the postgresql role expects the sudo access of the controller, we are specifying -K option in the command, which in-turn ask us to enter the SUDO password of the controller node.

$ ansible-playbook -Ki development.yaml postgres-play.yaml 
SUDO password: 

PLAY [postgres_cluster] ********************************************************************************************************************************************************************************************************************************************************

TASK [Gathering Facts] *********************************************************************************************************************************************************************************************************************************************************
ok: [vm-01]
ok: [vm-02]
ok: [vm-03]
...
...
PLAY RECAP *********************************************************************************************************************************************************************************************************************************************************************
vm-01                      : ok=41   changed=4    unreachable=0    failed=0
vm-02                      : ok=41   changed=5    unreachable=0    failed=0
vm-03                      : ok=43   changed=5    unreachable=0    failed=0

Check the PLAY RECAP in the command output and make sure the failed count is 0.

Check PostgreSQL Replication

With the below repmgr cluster show command we can check the status of the PostgreSQL replication cluster. It shows the role, status, timeline of all the PostgreSQL instance in the replication cluster.

$ sudo -u postgres /usr/pgsql-11/bin/repmgr -f /etc/postgresql/11/data/repmgr.conf cluster show
 ID | Name  | Role    | Status    | Upstream | Location | Priority | Timeline | Connection string                                     
----+-------+---------+-----------+----------+----------+----------+----------+--------------------------------------------------------
 1  | vm-01 | standby |   running | vm-03    | default  | 100      | 1        | host=vm-01 user=repmgr dbname=repmgr connect_timeout=2
 2  | vm-02 | standby |   running | vm-03    | default  | 100      | 1        | host=vm-02 user=repmgr dbname=repmgr connect_timeout=2
 3  | vm-03 | primary | * running |          | default  | 100      | 1        | host=vm-03 user=repmgr dbname=repmgr connect_timeout=2

From the output of the above command, vm-03 is the primary and vm-01,vm02 are the standby instance replicating from the upstream node vm-03. All the PostgreSQL instances are in the running state.

Checking pg_stat_replication view on primary vm-03 to confirm the both vm-01 and vm-02 are replicating fine.

$ sudo -iu postgres /usr/pgsql-11/bin/psql -h vm-03 -c 'select * from pg_stat_replication'
Password for user postgres: 
 pid  | usesysid | usename | application_name |  client_addr  | client_hostname | client_port |         backend_start         | backend_xmin |   state   | sent_lsn  | write_lsn | flush_lsn | replay_lsn | write_lag | flush_lag | replay_lag | sync_priority | sync_state 
------+----------+---------+------------------+---------------+-----------------+-------------+-------------------------------+--------------+-----------+-----------+-----------+-----------+------------+-----------+-----------+------------+---------------+------------
 8480 |    16384 | repmgr  | vm-02            | 192.168.0.122 |                 |       59972 | 2019-07-18 09:04:44.315859+00 |              | streaming | 0/A000870 | 0/A000870 | 0/A000870 | 0/A000870  |           |           |            |             0 | async
 8481 |    16384 | repmgr  | vm-01            | 192.168.0.121 |                 |       35598 | 2019-07-18 09:04:44.336693+00 |              | streaming | 0/A000870 | 0/A000870 | 0/A000870 | 0/A000870  |           |           |            |             0 | async
(2 rows)

Adding Another Standby Node to the Cluster

For adding another PostgreSQL node to the cluster, we have to just re-run the Ansible playbook after adding the particular host in the inventory. In the steps below, I am adding vm-04 to my existing Repmgr Postgresql replication cluster.

Adding vm-04 to the Ansible inventory file developmeb

$ cat development.yaml
all:
  children:
    postgres_cluster:
      hosts:
        vm-01:
        vm-02:
        vm-03:
        vm-04:
      vars:
        ansible_user: "vagrant"

Run Ansible playbook

$ ansible-playbook -Ki development.yaml postgres-play.yaml
SUDO password:

PLAY [postgres_cluster] ********************************************************************************************************************************************************************************************************************************************************

TASK [Gathering Facts] *********************************************************************************************************************************************************************************************************************************************************
ok: [vm-01]
ok: [vm-04]
ok: [vm-03]
ok: [vm-02]
...
...
RUNNING HANDLER [postgresql : restart postgresql] ******************************************************************************************************************************************************************************************************************************
changed: [vm-04]
changed: [vm-02]
changed: [vm-01]
changed: [vm-03]

PLAY RECAP *********************************************************************************************************************************************************************************************************************************************************************
vm-01                      : ok=41   changed=4    unreachable=0    failed=0
vm-02                      : ok=41   changed=5    unreachable=0    failed=0
vm-03                      : ok=43   changed=5    unreachable=0    failed=0
vm-04                      : ok=46   changed=32   unreachable=0    failed=0

Check replication cluster

$ sudo -u postgres /usr/pgsql-11/bin/repmgr -f /etc/postgresql/11/data/repmgr.conf cluster show
 ID | Name  | Role    | Status    | Upstream | Location | Priority | Timeline | Connection string                                     
----+-------+---------+-----------+----------+----------+----------+----------+--------------------------------------------------------
 1  | vm-01 | standby |   running | vm-03    | default  | 100      | 1        | host=vm-01 user=repmgr dbname=repmgr connect_timeout=2
 2  | vm-02 | standby |   running | vm-03    | default  | 100      | 1        | host=vm-02 user=repmgr dbname=repmgr connect_timeout=2
 3  | vm-03 | primary | * running |          | default  | 100      | 1        | host=vm-03 user=repmgr dbname=repmgr connect_timeout=2
 4  | vm-04 | standby |   running | vm-03    | default  | 100      | 1        | host=vm-04 user=repmgr dbname=repmgr connect_timeout=2

Conclusion

So far we have seen on setting up the Repmgr PostgreSQL replication cluster using Ansible. Once the repmgr cluster has been setup we can use repmgr command to do other maintenance on the replication cluster like doing failover and switch-over of the primary node and setting up cascade replication. Please check the repmgr documentation for more details.

Why Cloud Repatriation Matters Now More Than Ever

Automating Day 2 operations: Scaling, upgrades and maintenance

PostgreSQL Bi-Directional Logical Replication — A Deep Dive

Beyond Semantics: Enhancing Retrieval Augmented Generation with Hybrid Search (pgvector + Elasticsearch)