Validating Your PostgreSQL Backups on Docker

Ashokraj M

Backups are the vital and important part of any disaster recovery plan, taking backups of the production database is also a basic and an important part of PostgreSQL administration. However, DBA’s don’t often validate that those backups are reliable.

Every organization takes PostgreSQL database backups in different form, some may take a file system (physical) backup of the PostgreSQL data directories (using tools like Barman, PGBackRest) while others may take only logical backups (using pg_dump), and even others may take block level snapshots using tools like EBS or VMWare snapshot.

In this blog, we will show you how to validate your PostgreSQL backup by restoring the backup onto a Docker container using the tool pgBackRest for taking and restoring the backup. We are assuming that you already have knowledge on how to use PostgreSQL, Docker and pgBackRest.

Why Should You Use Docker?

Docker makes automation simpler, it also eases the job of integrating our PostgreSQL Backup Validation task in a CI/CD tools like CircleCI, Travis, GitLab or Jenkins. Using Docker avoids the time and resource we have to spend on bringing the new environment for testing the backup.

Demo Setup

Host 

Role

Installed

Packages 

Crontab

node-1 192.168.0.111

CentOS-7

Posgresql-11 primary Instance.

Created user and database “pgbench“ and initialized with pgbench tables.

postgresql-11, pgbackrest-2.15

Running pgbench every 5mins to simulate the workload.

node-2
192.168.0.112
CentOS-7

Test Machine - we will run our Docker validation on this host. 

docker-ce-18.06, pgbackrest-2.15

 

node-3

192.168.0.113

CentOS-7

pgBackRest Repository Host

pgbackrest-2.15

Running pgbackrest to take Incr backup every 4 hour

Diff backup every day

Full backup weekly 

 

For pgbackrest to work, I have setup passwordless SSH access between these nodes.

User “postgres” on node-1 and node-2 can login passwordless to user “pgbackrest” on node-3.

[[email protected] ~]$ sudo -u postgres ssh [email protected] uptime

 13:31:51 up  7:00, 1 user,  load average: 0.00, 0.01, 0.05

[[email protected] ~]$ sudo -u postgres ssh [email protected] uptime

 13:31:27 up  7:00, 1 user,  load average: 0.00, 0.01, 0.05

User “pgbackrest” on node-3 can login passwordless to user “postgres” on node-1 and node-2.

[[email protected] ~]$ sudo -u pgbackrest ssh [email protected] uptime 

 13:32:29 up  7:02, 1 user,  load average: 1.18, 0.83, 0.58

[[email protected] ~]$ sudo -u pgbackrest ssh [email protected] uptime 

 13:32:33 up  7:01, 1 user,  load average: 0.00, 0.01, 0.05

Overview of Backup Validation

Below is a brief overview of the steps we will be following for our PostgreSQL Backup Validation.

  1. Using the pgbackrest restore command we will fetch the latest backup from the pgBackRest Repository Host (node-3) to the Test Machine (node-2) directory /var/lib/pgsql/11/data
  2. During the docker run, we mount the host machine (node-2) directory /var/lib/pgsql on the docker container and start the postgres/postmaster daemon from the mounted directory. We would also expose the port 5432 from container to host machine port 15432. 
  3. Once the docker container started running, we will connect to the PostgreSQL database via node-2:15432 and verify all tables and rows are restored. We would also check the PostgreSQL logs to make sure there is no ERROR message during the recovery and the instance has also reached the consistent state.

Most of the backup validation steps will be performed on host node-2.

Building the Docker Image

On node-2, create Dockerfile and build the docker image “postgresql:11”. In the below Dockerfile, we will apply the following changes over centos:7 base image.

  1. Installing postgresql-11, pgbackrest and openssh-clients. Openssh-clients is needed for pgbackrest.
  2. Configuring pgbackrest - We need pgbackrest configuration in the image to test PITR, without pgbackrest configuration restore_command would fail. As part of pgbackrest configuration 
    1. We are adding the pgbackrest repository host ip (192.168.0.113) in the config file /etc/pgbackrest.conf
    2. We also need password less SSH access between the docker container and pgbackrest repository host. For this, I am copying SSH_PRIVATE_KEY which I have already generated and I have also added it’s public key to the pgbackrest repository host ( [email protected] ) .
  3. VOLUME ["${PGHOME_DIR}"] - Defines the container directory /var/lib/pgsql as a mount point. While running docker run command we will specify node-2 host directory to this mount point.
  4. USER postgres - Any command, runs on the container will be executed as postgres user.
$ cat Dockerfile
FROM  centos:7

ARG PGBACKREST_REPO_HOST
ARG PGHOME_DIR=/var/lib/pgsql

## Adding Postgresql Repo for CentOS7
RUN yum -y install https://download.postgresql.org/pub/repos/yum/reporpms/EL-7-x86_64/pgdg-redhat-repo-latest.noarch.rpm

## Installing PostgreSQL
RUN yum -y install postgresql11 postgresql11-server postgresql11-devel postgresql11-contrib postgresql11-libs pgbackrest openssh-clients

## Adding configuration for pgbackrest, needed for WAL recovery and replication.
RUN echo -ne "[global]\nrepo1-host=${PGBACKREST_REPO_HOST}\n\n[pgbench]\npg1-path=/var/lib/pgsql/11/data\n" > /etc/pgbackrest.conf

## Adding Private Key to the Docker. Docker container would use this private key for pgbackrest wal recovery.
RUN mkdir -p ${PGHOME_DIR}/.ssh &&  chmod 0750 ${PGHOME_DIR}/.ssh
COPY --chown=postgres:postgres ./SSH_PRIVATE_KEY  ${PGHOME_DIR}/.ssh/id_rsa
RUN chmod 0600 ${PGHOME_DIR}/.ssh/id_rsa
RUN echo -ne "Host ${PGBACKREST_REPO_HOST}\n\tStrictHostKeyChecking no\n" >> ${PGHOME_DIR}/.ssh/config

## Making "/var/lib/pgsql" as a mountable directory in the container
VOLUME ["${PGHOME_DIR}"]

## Setting postgres as the default user for any remaining commands
USER postgres

We now have two files, Dockerfile used by docker build and SSH_PRIVATE_KEY which we will be copied to the docker image. 

$ ls

Dockerfile  SSH_PRIVATE_KEY

Run the below command on node-2 to build our docker image. I have mentioned the pgbackrest repository host IP in the command and this IP will be used in pgbackrest parameter “repo-host”. 

$ docker build --no-cache -t postgresql:11 --build-arg PGBACKREST_REPO_HOST=192.168.0.113 .

Sending build context to Docker daemon  230.4kB

Step 1/12 : FROM  centos:7

 ---> 9f38484d220f

Step 2/12 : ARG PGBACKREST_REPO_HOST

 ---> Running in 8b7b36c6f151

Removing intermediate container 8b7b36c6f151

 ---> 31510e46e286

Step 3/12 : ARG PGHOME_DIR=/var/lib/pgsql

...

Step 4/12 : RUN yum -y install https://download.postgresql.org/pub/repos/yum/reporpms/EL-7-x86_64/pgdg-redhat-repo-latest.noarch.rpm

...

...

Step 12/12 : USER postgres

 ---> Running in c91abcf46440

Removing intermediate container c91abcf46440

 ---> bebce78df5ae

Successfully built bebce78df5ae

Successfully tagged postgresql:11

Make sure the image is successfully built, and check “postgresql:11” image is created recently as shown below.

$ docker image ls postgresql:11

REPOSITORY          TAG IMAGE ID            CREATED SIZE

postgresql          11 2e03ed2a5946        3 minutes ago 482MB

Restoring the PostgreSQL Backup

We will now restore our PostgreSQL backup maintained in pgbackrest backup repository host node-3. 

Below is the pgbackrest configuration file present on host node-2 and I have mentioned node-3 as pgbackrest repository host. Directory mentioned in the param pg1-path is where the PostgreSQL data directory would get restored.

[[email protected] ~]$ cat /etc/pgbackrest.conf 

[global]

log-level-file=detail

repo1-host=node-3



[pgbench]

pg1-path=/var/lib/pgsql/11/data

Using below pgbackrest restore command, postgresql data directory will be restored at node-2:/var/lib/pgsql/11/data

To validate PITR with the pgbackrest backup I have set --type=time --target='2019-07-30 06:24:50.241352+00', so that the WAL recovery stops before the mentioned time. 

[[email protected] ~]$ sudo -u postgres bash -c "/usr/bin/pgbackrest --type=time --target='2019-07-30 06:24:50.241352+00' --target-action=promote --recovery-option='standby_mode=on' --stanza=pgbench restore"

Above command may take time depending on the backup size and network bandwidth. Once restored, verify the size of the data directory and also check recovery.conf.

[[email protected] ~]$ sudo -u postgres du -sh /var/lib/pgsql/11/data 

2.1G    /var/lib/pgsql/11/data



[[email protected] ~]$ sudo -u postgres cat /var/lib/pgsql/11/data/recovery.conf

standby_mode = 'on'

restore_command = '/usr/bin/pgbackrest --stanza=pgbench archive-get %f "%p"'

recovery_target_time = '2019-07-30 06:24:50.241352+00'

Disable archive mode for PostgreSQL docker container.

[[email protected] ~]$ sudo -u postgres bash -c "echo 'archive_mode = off' >> /var/lib/pgsql/11/data/postgresql.auto.conf"

Start the docker container with the image “postgresql:11”. In the command we are 

  1. Setting container name as “pgbench”

  2. Mounting docker host(node-2) directory /var/lib/psql to the docker container directory /var/lib/psql 

  3. Exposing container port 5432 to port 15432 on node-2.

  4. Starting the postgres daemon using the command /usr/pgsql-11/bin/postmaster -D /var/lib/pgsql/11/data

[[email protected] ~]$ docker run --rm --name "pgbench" -v /var/lib/pgsql:/var/lib/pgsql -p 15432:5432 -d postgresql:11  /usr/pgsql-11/bin/postmaster -D /var/lib/pgsql/11/data

e54f2f65afa13b6a09236a476cb1de3d8e499310abcec2b121a6b35611dac276

Verify “pgbench” container is created and running.

[[email protected] ~]$ docker ps -f name=pgbench

CONTAINER ID        IMAGE COMMAND                  CREATED STATUS PORTS                     NAMES

e54f2f65afa1        postgresql:11 "/usr/pgsql-11/bin/p…"   34 seconds ago Up 33 seconds 0.0.0.0:15432->5432/tcp   pgbench

Validating PostgreSQL 

Since the host directory /var/lib/pgsql is shared with docker container, the logs generated by the PostgreSQL service is also visible from node-2. Verify today’s log to make sure PostgreSQL has started fine without any ERROR and make sure below log lines are present. 

[[email protected] ~]$ sudo -u postgres tailf /var/lib/pgsql/11/data/log/postgresql-Tue.csv

..

2019-07-30 06:38:34.633 UTC,,,7,,5d3fe5e9.7,5,,2019-07-30 06:38:33 UTC,1/0,0,LOG,00000,"consistent recovery state reached at E/CE000210",,,,,,,,,""

2019-07-30 06:38:34.633 UTC,,,1,,5d3fe5e9.1,2,,2019-07-30 06:38:33 UTC,,0,LOG,00000,"database system is ready to accept read only connections",,,,,,,,,""

2019-07-30 06:38:35.236 UTC,,,7,,5d3fe5e9.7,6,,2019-07-30 06:38:33 UTC,1/0,0,LOG,00000,"restored log file ""000000010000000E000000CF"" from archive",,,,,,,,,""

2019-07-30 06:38:36.210 UTC,,,7,,5d3fe5e9.7,7,,2019-07-30 06:38:33 UTC,1/0,0,LOG,00000,"restored log file ""000000010000000E000000D0"" from archive",,,,,,,,,""

...

2019-07-30 06:39:57.221 UTC,,,7,,5d3fe5e9.7,37,,2019-07-30 06:38:33 UTC,1/0,0,LOG,00000,"recovery stopping before commit of transaction 52181192, time 2019-07-30 06:25:01.576689+00",,,,,,,,,""

...

2019-07-30 06:40:00.682 UTC,,,7,,5d3fe5e9.7,47,,2019-07-30 06:38:33 UTC,1/0,0,LOG,00000,"archive recovery complete",,,,,,,,,""

Message "consistent recovery state reached at E/CE000210", indicates that with the pgbackrest backup data directory we were able to reach a consistent state.

Message "archive recovery complete", indicates that we are able to replay the WAL file backed-up by pgbackrest and able to recover without any issue.

Connect to postgresql instance via local port 15432 and verify tables and row counts.

[[email protected] ~]$ sudo -iu postgres /usr/pgsql-11/bin/psql  -p 15432 -h localhost -U pgbench 

Password for user pgbench: 

psql (11.4)

Type "help" for help.



pgbench=> \dt

              List of relations

 Schema |       Name | Type  | Owner  

--------+------------------+-------+---------

 public | pgbench_accounts | table | pgbench

 public | pgbench_branches | table | pgbench

 public | pgbench_history  | table | pgbench

 public | pgbench_tellers  | table | pgbench

(4 rows)



pgbench=> select * from pgbench_history limit 1;

 tid | bid |   aid | delta |           mtime | filler 

-----+-----+---------+-------+----------------------------+--------

  98 |   3 | 2584617 |   507 | 2019-07-30 06:20:01.412226 | 

(1 row)



pgbench=> select max(mtime) from pgbench_history ;

            max             

----------------------------

 2019-07-30 06:22:01.402245

(1 row)



pgbench=> select count(1) from pgbench_history ;

 count 

-------

 90677

(1 row)



pgbench=> select count(1) from pgbench_accounts ;

  count   

----------

 10000000

(1 row)

We have now restored our PostgreSQL backup on a docker container and also verified PITR. Once validating the backup we can stop the container and remove the data directory.

[[email protected] ~]$ docker stop pgbench

pgbench

[[email protected] ~]$ sudo -u postgres bash -c "rm -rf /var/lib/pgsql/11/data && mkdir -p /var/lib/pgsql/11/data && chmod 0700 /var/lib/pgsql/11/data"

Conclusion

In this blog, I demonstrated the backup validation using a small database on a small VirtualBox VM. Because of this, the backup validation was completed in just a few minutes. It’s important to note that in production you will need to choose a proper VM with enough Memory, CPU, and Disk to allow the backup validation to complete successfully. You can also automate the whole validation process in a bash script or even by integrating with a CI/CD pipeline so that you can regularly validate our PostgreSQL backups.

ClusterControl
The only management system you’ll ever need to take control of your open source database infrastructure.