How Do I Know if My PostgreSQL Backup is Good?

Sebastian Insausti

Backups are a must in all Disaster Recovery Plan. It might not always be enough to guarantee an acceptable Recovery Point Objective, but is a good first approach. The problem is what happens if, in case of failure, you need to use this backup, and it’s not usable for some reason? Probably you don’t want to be in that situation, so, in this blog, we’ll see how to confirm if your backup is good to use.

Types of PostgreSQL Backups

Let’s start talking about the different types of backups. There are different types, but in general, we can separate it in two simple categories:

  • Logical: The backup is stored in a human-readable format like SQL.
  • Physical: The backup contains binary data.

Why are we mentioning this? Because we’ll see that there are some checks we can do for one type and not for the other one.

Checking the Backup Logs

The first way to confirm if everything goes fine is by checking the backup logs.

The simplest command to run a PostgreSQL backup could be for example:

$ pg_dumpall > /path/to/dump.sql

But, how can I know if there was an error when the command was running? You can just add to send the output to some specific log file:

$ pg_dumpall > /path/to/dump.sql > /var/log/postgres/pg_dump.log

So, you can add this line in the server cron to run it every day:

30 0 * * * pg_dumpall > /path/to/dump.sql > /var/log/postgres/pg_dump.log

And you should monitor the log file to look for errors, for example, adding it into some monitoring tool like Nagios.

Checking logs is not enough to confirm that the backup will work, because for example, if the backup file is corrupted for some reason, you probably won’t see that in the log file.

Checking the Backup Content

If you are using logical backups, you can verify the content of the backup file, to confirm you have all databases there.

You can list your current PostgreSQL databases using, for example, this command:

$ psql -l | awk '{ print $1 }'| awk 'FNR > 3' |grep '^[a-zA-Z0-9]' |grep -v 'template0'

postgres

template1

world

And check which databases you have in the backup file:

$ grep '^[\]connect' /path/to/dump.sql |awk '{print $2}'

template1

postgres

world

The problem with this check is you don’t check the size or data, so it could be possible that you have some data loss if there was some error when the backup was executed.

Restoring to Check the Backup Manually

The most secure way to confirm if a backup is working is restoring it and access the database.

After the backup is completed, you can restore it manually in another host by copying the dump file and running for example:

$ psql -f /path/to/dump.sql postgres

Then, you can access it and check the databases:

$ psql

postgres=# \l

                                  List of databases

   Name    | Owner   | Encoding |   Collate | Ctype    | Access privileges

-----------+----------+----------+-------------+-------------+-----------------------

 postgres  | postgres | UTF8     | en_US.utf-8 | en_US.utf-8 |

 template0 | postgres | UTF8     | en_US.utf-8 | en_US.utf-8 | =c/postgres          +

           |          | |             | | postgres=CTc/postgres

 template1 | postgres | UTF8     | en_US.utf-8 | en_US.utf-8 | =c/postgres          +

           |          | |             | | postgres=CTc/postgres

 world     | postgres | UTF8     | en_US.utf-8 | en_US.utf-8 |

(4 rows)

The problem with this method is, of course, you should run it manually, or find a way to automate this, which could be a time-consuming task.

Automatic ClusterControl Backup Verification

Now, let’s see how ClusterControl can automate the verification of PostgreSQL backups and help avoid any surprises or manual tasks.

In ClusterControl, select your cluster and go to the "Backup" section, then, select “Create Backup”.

The automatic verify backup feature is available for the scheduled backups. So, let’s choose the “Schedule Backup” option.

When scheduling a backup, in addition to selecting the common options like method or storage, you also need to specify schedule/frequency.

In the next step, you can compress and encrypt your backup and specify the retention period. Here, you also have the “Verify Backup” feature.

To use this feature, you need a dedicated host (or VM) that is not part of the cluster.

ClusterControl will install the software and it’ll restore the backup in this host. After restoring, you can see the verification icon in the ClusterControl Backup section.

Conclusion

As we mentioned, backups are mandatory in any environment, but backup is not a backup if you can’t use it. So, you should make sure that your backup is useful in case you need it one day. In this blog, we showed different ways to check your backup to avoid problems when you want to restore it.

 
ClusterControl
The only management system you’ll ever need to take control of your open source database infrastructure.