How to verify a backup?

Sebastian Insausti

Backups are a must in all Disaster Recovery Plans, and in general, all companies have implemented them in some way. The question is, what happens if, in case of failure, you need to restore a backup, and it is not usable for some reason? Probably you don’t want to be in that situation, so, in this blog, we will see how to confirm if your backup is good to use and how to improve your backup policy to cope with this in an automated way with ClusterControl.

Types of Backups

Let’s start by talking about the different types of backups. In general, we can separate it into two simple categories:

  1. Logical: The backup is stored in a human-readable format like SQL.

  2. Physical: The backup contains binary data.

In this blog, we will focus on the verification of logical backups.

Checking the Backup Logs

The first way to confirm if everything goes fine is by checking the backup logs.

Depending on the database engine, you would need to run a backup job which could be as simple as:

$ pg_dumpall -U postgres > /path/to/dump.sql 2> /tmp/backup.log

Or even:

$ mysqldump -u root -p world > /path/to/dump.sql 2> /tmp/backup.log

So, you can monitor the output log file /tmp/backup.log looking for errors, for example, and adding it into some monitoring tool like Nagios.

Checking logs is not enough to confirm that the backup will work, because if the backup file is corrupted for some reason, you probably won’t see that in the log file.

Checking the Backup Content

If you are using logical backups, you can verify the content of the backup file, to confirm you have all the existing databases there.

You can list your current databases using:

$ psql -l | awk '{ print $1 }'| awk 'FNR > 3' |grep '^[a-zA-Z0-9]' |grep -v 'template0'
postgres
template1
world

And check which databases you have in the backup file:

$ grep '^[\]connect' /path/to/dump.sql |awk '{print $2}'
template1
postgres
world

You can do the same with other databases like MySQL. The problem with this way is you don’t check the size or data, so it could be possible that you have some data loss if there was some error when the backup was executed.

Restoring to Check the Backup Manually

The most secure way to confirm if a backup is working is by restoring it and accessing the databases.

After the backup is completed, you can restore it manually in another host by copying the dump file and running:

$ mysql -p < /path/to/dump.sql

Then, you can access it and check the restored databases:

$ mysql -p
mysql> show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| mysql              |
| performance_schema |
| test1              |
| world              |
+--------------------+
5 rows in set (0.001 sec)

The problem with this method is, of course, you should run it manually, or find a way to automate this, which could be a time-consuming task, but what about using ClusterControl to help with this?

Automatic ClusterControl Backup Verification

Now, let’s see how ClusterControl can automate the verification of database backups and help to avoid any surprises or manual tasks.

In ClusterControl, select your cluster and go to the "Backup" section, then, select “Create Backup”.

The automatic verify backup feature is available for the scheduled backups. So, let’s choose the “Schedule Backup” option.

When scheduling a backup, in addition to selecting the common options like method or storage, you also need to specify schedule/frequency.

In the next step, you can compress and encrypt your backup and specify the retention period. Here, you also have the “Verify Backup” feature.

To use this feature, you need a dedicated host (or VM) that is not part of the cluster.

ClusterControl will install the software and restore the backup in this host. After restoring, you can see the verification icon in the ClusterControl Backup section.

Conclusion

As we mentioned, backups are mandatory in any environment, but backup is not a backup if you can’t use it. So, you should make sure that your backup is useful in case you need it one day. In this blog, we have gone through different manual ways to check your logical backups to avoid problems when you want to restore them and how to automate this task by using ClusterControl.

ClusterControl
The only management system you’ll ever need to take control of your open source database infrastructure.