blog
Validating Your PostgreSQL Backups on Docker
Backups are the vital and important part of any disaster recovery plan, taking backups of the production database is also a basic and an important part of PostgreSQL administration. However, DBA’s don’t often validate that those backups are reliable.
Every organization takes PostgreSQL database backups in different form, some may take a file system (physical) backup of the PostgreSQL data directories (using tools like Barman, PGBackRest) while others may take only logical backups (using pg_dump), and even others may take block level snapshots using tools like EBS or VMWare snapshot.
In this blog, we will show you how to validate your PostgreSQL backup by restoring the backup onto a Docker container using the tool pgBackRest for taking and restoring the backup. We are assuming that you already have knowledge on how to use PostgreSQL, Docker and pgBackRest.
Why Should You Use Docker?
Docker makes automation simpler, it also eases the job of integrating our PostgreSQL Backup Validation task in a CI/CD tools like CircleCI, Travis, GitLab or Jenkins. Using Docker avoids the time and resource we have to spend on bringing the new environment for testing the backup.
Demo Setup
Host |
Role |
Installed Packages |
Crontab |
node-1 192.168.0.111 CentOS-7 |
Posgresql-11 primary Instance. Created user and database “pgbench“ and initialized with pgbench tables. |
postgresql-11, pgbackrest-2.15 |
Running pgbench every 5mins to simulate the workload. |
node-2 |
Test Machine – we will run our Docker validation on this host. |
docker-ce-18.06, pgbackrest-2.15 |
|
node-3 192.168.0.113 CentOS-7 |
pgBackRest Repository Host |
pgbackrest-2.15 |
Running pgbackrest to take Incr backup every 4 hour Diff backup every day Full backup weekly |
For pgbackrest to work, I have setup passwordless SSH access between these nodes.
User “postgres” on node-1 and node-2 can login passwordless to user “pgbackrest” on node-3.
[vagrant@node-1 ~]$ sudo -u postgres ssh pgbackrest@node-3 uptime
13:31:51 up 7:00, 1 user, load average: 0.00, 0.01, 0.05
[vagrant@node-2 ~]$ sudo -u postgres ssh pgbackrest@node-3 uptime
13:31:27 up 7:00, 1 user, load average: 0.00, 0.01, 0.05
User “pgbackrest” on node-3 can login passwordless to user “postgres” on node-1 and node-2.
[vagrant@node-3 ~]$ sudo -u pgbackrest ssh postgres@node-1 uptime
13:32:29 up 7:02, 1 user, load average: 1.18, 0.83, 0.58
[vagrant@node-3 ~]$ sudo -u pgbackrest ssh postgres@node-2 uptime
13:32:33 up 7:01, 1 user, load average: 0.00, 0.01, 0.05
Overview of Backup Validation
Below is a brief overview of the steps we will be following for our PostgreSQL Backup Validation.
- Using the pgbackrest restore command we will fetch the latest backup from the pgBackRest Repository Host (node-3) to the Test Machine (node-2) directory /var/lib/pgsql/11/data
- During the docker run, we mount the host machine (node-2) directory /var/lib/pgsql on the docker container and start the postgres/postmaster daemon from the mounted directory. We would also expose the port 5432 from container to host machine port 15432.
- Once the docker container started running, we will connect to the PostgreSQL database via node-2:15432 and verify all tables and rows are restored. We would also check the PostgreSQL logs to make sure there is no ERROR message during the recovery and the instance has also reached the consistent state.
Most of the backup validation steps will be performed on host node-2.
Building the Docker Image
On node-2, create Dockerfile and build the docker image “postgresql:11”. In the below Dockerfile, we will apply the following changes over centos:7 base image.
- Installing postgresql-11, pgbackrest and openssh-clients. Openssh-clients is needed for pgbackrest.
- Configuring pgbackrest – We need pgbackrest configuration in the image to test PITR, without pgbackrest configuration restore_command would fail. As part of pgbackrest configuration
- We are adding the pgbackrest repository host ip (192.168.0.113) in the config file /etc/pgbackrest.conf.
- We also need password less SSH access between the docker container and pgbackrest repository host. For this, I am copying SSH_PRIVATE_KEY which I have already generated and I have also added it’s public key to the pgbackrest repository host ( pgbackrest@node-3 ) .
- VOLUME [“${PGHOME_DIR}”] – Defines the container directory /var/lib/pgsql as a mount point. While running docker run command we will specify node-2 host directory to this mount point.
- USER postgres – Any command, runs on the container will be executed as postgres user.
$ cat Dockerfile
FROM centos:7
ARG PGBACKREST_REPO_HOST
ARG PGHOME_DIR=/var/lib/pgsql
## Adding Postgresql Repo for CentOS7
RUN yum -y install https://download.postgresql.org/pub/repos/yum/reporpms/EL-7-x86_64/pgdg-redhat-repo-latest.noarch.rpm
## Installing PostgreSQL
RUN yum -y install postgresql11 postgresql11-server postgresql11-devel postgresql11-contrib postgresql11-libs pgbackrest openssh-clients
## Adding configuration for pgbackrest, needed for WAL recovery and replication.
RUN echo -ne "[global]nrepo1-host=${PGBACKREST_REPO_HOST}nn[pgbench]npg1-path=/var/lib/pgsql/11/datan" > /etc/pgbackrest.conf
## Adding Private Key to the Docker. Docker container would use this private key for pgbackrest wal recovery.
RUN mkdir -p ${PGHOME_DIR}/.ssh && chmod 0750 ${PGHOME_DIR}/.ssh
COPY --chown=postgres:postgres ./SSH_PRIVATE_KEY ${PGHOME_DIR}/.ssh/id_rsa
RUN chmod 0600 ${PGHOME_DIR}/.ssh/id_rsa
RUN echo -ne "Host ${PGBACKREST_REPO_HOST}ntStrictHostKeyChecking non" >> ${PGHOME_DIR}/.ssh/config
## Making "/var/lib/pgsql" as a mountable directory in the container
VOLUME ["${PGHOME_DIR}"]
## Setting postgres as the default user for any remaining commands
USER postgres
We now have two files, Dockerfile used by docker build and SSH_PRIVATE_KEY which we will be copied to the docker image.
$ ls
Dockerfile SSH_PRIVATE_KEY
Run the below command on node-2 to build our docker image. I have mentioned the pgbackrest repository host IP in the command and this IP will be used in pgbackrest parameter “repo-host”.
$ docker build --no-cache -t postgresql:11 --build-arg PGBACKREST_REPO_HOST=192.168.0.113 .
Sending build context to Docker daemon 230.4kB
Step 1/12 : FROM centos:7
---> 9f38484d220f
Step 2/12 : ARG PGBACKREST_REPO_HOST
---> Running in 8b7b36c6f151
Removing intermediate container 8b7b36c6f151
---> 31510e46e286
Step 3/12 : ARG PGHOME_DIR=/var/lib/pgsql
...
Step 4/12 : RUN yum -y install https://download.postgresql.org/pub/repos/yum/reporpms/EL-7-x86_64/pgdg-redhat-repo-latest.noarch.rpm
...
...
Step 12/12 : USER postgres
---> Running in c91abcf46440
Removing intermediate container c91abcf46440
---> bebce78df5ae
Successfully built bebce78df5ae
Successfully tagged postgresql:11
Make sure the image is successfully built, and check “postgresql:11” image is created recently as shown below.
$ docker image ls postgresql:11
REPOSITORY TAG IMAGE ID CREATED SIZE
postgresql 11 2e03ed2a5946 3 minutes ago 482MB
Restoring the PostgreSQL Backup
We will now restore our PostgreSQL backup maintained in pgbackrest backup repository host node-3.
Below is the pgbackrest configuration file present on host node-2 and I have mentioned node-3 as pgbackrest repository host. Directory mentioned in the param pg1-path is where the PostgreSQL data directory would get restored.
[vagrant@node-2 ~]$ cat /etc/pgbackrest.conf
[global]
log-level-file=detail
repo1-host=node-3
[pgbench]
pg1-path=/var/lib/pgsql/11/data
Using below pgbackrest restore command, postgresql data directory will be restored at node-2:/var/lib/pgsql/11/data.
To validate PITR with the pgbackrest backup I have set –type=time –target=’2019-07-30 06:24:50.241352+00′, so that the WAL recovery stops before the mentioned time.
[vagrant@node-2 ~]$ sudo -u postgres bash -c "/usr/bin/pgbackrest --type=time --target='2019-07-30 06:24:50.241352+00' --target-action=promote --recovery-option='standby_mode=on' --stanza=pgbench restore"
Above command may take time depending on the backup size and network bandwidth. Once restored, verify the size of the data directory and also check recovery.conf.
[vagrant@node-2 ~]$ sudo -u postgres du -sh /var/lib/pgsql/11/data
2.1G /var/lib/pgsql/11/data
[vagrant@node-2 ~]$ sudo -u postgres cat /var/lib/pgsql/11/data/recovery.conf
standby_mode = 'on'
restore_command = '/usr/bin/pgbackrest --stanza=pgbench archive-get %f "%p"'
recovery_target_time = '2019-07-30 06:24:50.241352+00'
Disable archive mode for PostgreSQL docker container.
[vagrant@node-2 ~]$ sudo -u postgres bash -c "echo 'archive_mode = off' >> /var/lib/pgsql/11/data/postgresql.auto.conf"
Start the docker container with the image “postgresql:11”. In the command we are
-
Setting container name as “pgbench”
-
Mounting docker host(node-2) directory /var/lib/psql to the docker container directory /var/lib/psql
-
Exposing container port 5432 to port 15432 on node-2.
-
Starting the postgres daemon using the command /usr/pgsql-11/bin/postmaster -D /var/lib/pgsql/11/data
[vagrant@node-2 ~]$ docker run --rm --name "pgbench" -v /var/lib/pgsql:/var/lib/pgsql -p 15432:5432 -d postgresql:11 /usr/pgsql-11/bin/postmaster -D /var/lib/pgsql/11/data
e54f2f65afa13b6a09236a476cb1de3d8e499310abcec2b121a6b35611dac276
Verify “pgbench” container is created and running.
[vagrant@node-2 ~]$ docker ps -f name=pgbench
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
e54f2f65afa1 postgresql:11 "/usr/pgsql-11/bin/p…" 34 seconds ago Up 33 seconds 0.0.0.0:15432->5432/tcp pgbench
Validating PostgreSQL
Since the host directory /var/lib/pgsql is shared with docker container, the logs generated by the PostgreSQL service is also visible from node-2. Verify today’s log to make sure PostgreSQL has started fine without any ERROR and make sure below log lines are present.
[vagrant@node-2 ~]$ sudo -u postgres tailf /var/lib/pgsql/11/data/log/postgresql-Tue.csv
..
2019-07-30 06:38:34.633 UTC,,,7,,5d3fe5e9.7,5,,2019-07-30 06:38:33 UTC,1/0,0,LOG,00000,"consistent recovery state reached at E/CE000210",,,,,,,,,""
2019-07-30 06:38:34.633 UTC,,,1,,5d3fe5e9.1,2,,2019-07-30 06:38:33 UTC,,0,LOG,00000,"database system is ready to accept read only connections",,,,,,,,,""
2019-07-30 06:38:35.236 UTC,,,7,,5d3fe5e9.7,6,,2019-07-30 06:38:33 UTC,1/0,0,LOG,00000,"restored log file ""000000010000000E000000CF"" from archive",,,,,,,,,""
2019-07-30 06:38:36.210 UTC,,,7,,5d3fe5e9.7,7,,2019-07-30 06:38:33 UTC,1/0,0,LOG,00000,"restored log file ""000000010000000E000000D0"" from archive",,,,,,,,,""
...
2019-07-30 06:39:57.221 UTC,,,7,,5d3fe5e9.7,37,,2019-07-30 06:38:33 UTC,1/0,0,LOG,00000,"recovery stopping before commit of transaction 52181192, time 2019-07-30 06:25:01.576689+00",,,,,,,,,""
...
2019-07-30 06:40:00.682 UTC,,,7,,5d3fe5e9.7,47,,2019-07-30 06:38:33 UTC,1/0,0,LOG,00000,"archive recovery complete",,,,,,,,,""
Message “consistent recovery state reached at E/CE000210“, indicates that with the pgbackrest backup data directory we were able to reach a consistent state.
Message “archive recovery complete“, indicates that we are able to replay the WAL file backed-up by pgbackrest and able to recover without any issue.
Connect to postgresql instance via local port 15432 and verify tables and row counts.
[vagrant@node-2 ~]$ sudo -iu postgres /usr/pgsql-11/bin/psql -p 15432 -h localhost -U pgbench
Password for user pgbench:
psql (11.4)
Type "help" for help.
pgbench=> dt
List of relations
Schema | Name | Type | Owner
--------+------------------+-------+---------
public | pgbench_accounts | table | pgbench
public | pgbench_branches | table | pgbench
public | pgbench_history | table | pgbench
public | pgbench_tellers | table | pgbench
(4 rows)
pgbench=> select * from pgbench_history limit 1;
tid | bid | aid | delta | mtime | filler
-----+-----+---------+-------+----------------------------+--------
98 | 3 | 2584617 | 507 | 2019-07-30 06:20:01.412226 |
(1 row)
pgbench=> select max(mtime) from pgbench_history ;
max
----------------------------
2019-07-30 06:22:01.402245
(1 row)
pgbench=> select count(1) from pgbench_history ;
count
-------
90677
(1 row)
pgbench=> select count(1) from pgbench_accounts ;
count
----------
10000000
(1 row)
We have now restored our PostgreSQL backup on a docker container and also verified PITR. Once validating the backup we can stop the container and remove the data directory.
[vagrant@node-2 ~]$ docker stop pgbench
pgbench
[vagrant@node-2 ~]$ sudo -u postgres bash -c "rm -rf /var/lib/pgsql/11/data && mkdir -p /var/lib/pgsql/11/data && chmod 0700 /var/lib/pgsql/11/data"
Conclusion
In this blog, I demonstrated the backup validation using a small database on a small VirtualBox VM. Because of this, the backup validation was completed in just a few minutes. It’s important to note that in production you will need to choose a proper VM with enough Memory, CPU, and Disk to allow the backup validation to complete successfully. You can also automate the whole validation process in a bash script or even by integrating with a CI/CD pipeline so that you can regularly validate our PostgreSQL backups.