Backup Management Tips for TimescaleDB

Sebastian Insausti

Information is one of the most valuable assets in a company, and it goes without saying that one should have a Disaster Recovery Plan (DRP) to prevent data loss in the event of an accident or hardware failure. A backup is the simplest form of disaster recovery (DR). It might not always be enough to guarantee an acceptable Recovery Point Objective (RPO), but it is a good first step.

Whether it is a 24x7 highly loaded server or a low-transaction-volume environment, you will need to make backups a seamless procedure without disrupting the performance of the server in a production environment.

TimescaleDB, the new engine for time-series data, supports several types of backup. Which one we should use depends on many factors, such as the environment, infrastructure, and load.

In this blog, we’ll look at the different types of backups that are available and at how ClusterControl can help us centralize our backup management for TimescaleDB.

Backup Types

There are different types of backups for databases. Let’s look at each of them in detail.

  • Logical: The backup is stored in a human-readable format like SQL.
  • Physical: The backup contains binary data.
  • Full/Incremental/Differential: The definition of these three types of backups is implicit in the name. A full backup is a complete copy of all your data. An incremental backup only contains the data that has changed since the previous backup of any type, and a differential backup only contains the data that has changed since the last full backup. Incremental and differential backups were introduced to reduce the time and disk space that taking a full backup every time would require.
  • Point In Time Recovery compatible: PITR involves restoring the database to its state at any given moment in the past. To be able to do this, we need to restore a full backup and then apply all the changes that happened after that backup, up to the moment right before the failure.
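To make the difference between full, incremental, and differential backups concrete, here is a minimal Python sketch. It assumes a toy model in which each file is described only by the time of its last change; the function `select_files`, the file names, and the timestamps are hypothetical and do not correspond to any backup tool.

```python
from typing import Dict, List


def select_files(changed_at: Dict[str, int], kind: str,
                 last_full: int, last_backup: int) -> List[str]:
    """Return the files a backup of the given kind would copy.

    changed_at maps a file name to the time of its last change.
    last_full is the time of the last full backup; last_backup is the
    time of the most recent backup of any kind.
    """
    if kind == "full":
        # A full backup copies everything, changed or not.
        return sorted(changed_at)
    if kind == "differential":
        reference = last_full      # changes since the last full backup
    elif kind == "incremental":
        reference = last_backup    # changes since the previous backup
    else:
        raise ValueError(f"unknown backup kind: {kind}")
    return sorted(name for name, t in changed_at.items() if t > reference)


# Toy timeline: full backup at t=10, incremental backup at t=20.
files = {"a.dat": 5, "b.dat": 15, "c.dat": 25}
print(select_files(files, "full", last_full=10, last_backup=20))
print(select_files(files, "incremental", last_full=10, last_backup=20))
print(select_files(files, "differential", last_full=10, last_backup=20))
```

In this timeline the incremental backup copies only `c.dat`, while the differential backup copies both `b.dat` and `c.dat`, which is why a differential restore needs only the last full backup plus one differential, whereas an incremental restore needs the whole chain.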

ClusterControl Backup Management Feature

Let’s see how ClusterControl can help us to manage different types of backups.

Creating a Backup

For this task, go to ClusterControl -> Select TimescaleDB Cluster -> Backup -> Create Backup.

We can create a new backup or configure a scheduled one. For our example, we will create a single backup instantly.

Here we have one method for each type of backup that we mentioned earlier.

Backup Type | Tool | Definition
Logical | pg_dumpall | A utility for writing out all TimescaleDB databases of a cluster into one script file. The script file contains SQL commands that can be used to restore the databases.
Physical | pg_basebackup | Used to make a binary copy of the database cluster files, while making sure the system is put in and out of backup mode automatically. Backups are always taken of the entire database cluster of a running TimescaleDB database cluster. They are taken without affecting other clients of the database.
Full/Incr/Diff | pgbackrest | A simple, reliable backup and restore solution that can seamlessly scale up to the largest databases and workloads by utilizing algorithms that are optimized for database-specific requirements. One of its most important features is the support for Full, Incremental, and Differential Backups.
PITR | pg_basebackup + WALs | To create a PITR compatible backup, ClusterControl uses pg_basebackup and the WAL files, to be able to restore the database to any given moment in the past.
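For readers who want to see what these tools look like outside ClusterControl, the following Python sketch builds typical command lines for each of them. It is only an illustration: the exact options ClusterControl passes are not documented here, and the host, user, paths, and stanza name are hypothetical. The commands are printed rather than executed.

```python
import shlex
from typing import List


def pg_dumpall_cmd(host: str, user: str, outfile: str) -> List[str]:
    # Logical backup: every database in the cluster as one SQL script.
    return ["pg_dumpall", "-h", host, "-U", user, "-f", outfile]


def pg_basebackup_cmd(host: str, user: str, target_dir: str) -> List[str]:
    # Physical backup: tar format (-Ft), gzip-compressed (-z), streaming
    # the WAL needed to make the copy consistent (-X stream).
    return ["pg_basebackup", "-h", host, "-U", user,
            "-D", target_dir, "-Ft", "-z", "-X", "stream"]


def pgbackrest_cmd(stanza: str, backup_type: str) -> List[str]:
    # pgBackRest names its backup types full, incr, and diff.
    if backup_type not in ("full", "incr", "diff"):
        raise ValueError(f"unknown pgBackRest backup type: {backup_type}")
    return ["pgbackrest", f"--stanza={stanza}",
            f"--type={backup_type}", "backup"]


# Hypothetical host, user, paths, and stanza name.
for cmd in (
    pg_dumpall_cmd("10.0.0.11", "postgres", "/backups/all.sql"),
    pg_basebackup_cmd("10.0.0.11", "postgres", "/backups/base"),
    pgbackrest_cmd("timescale", "incr"),
):
    print(shlex.join(cmd))
```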

We must choose one method, the server from which the backup will be taken, and where we want to store the backup. We can also upload our backup to the cloud (AWS, Google or Azure) by enabling the corresponding button.

Keep in mind that if we want to create a backup compatible with PITR, we must use pg_basebackup in this step and we must take the backup from the master node.

Then we specify the use of compression, encryption and the retention of our backup.

In the backup section, we can see the progress of the backup, and information like the method, size, location, and more.

Enabling Point In Time Recovery

If we want to use the PITR feature, we must have WAL archiving enabled. For this we can go to ClusterControl -> Select TimescaleDB Cluster -> Node actions -> Enable WAL Archiving, or just go to ClusterControl -> Select TimescaleDB Cluster -> Backup -> Settings and enable the option “Enable Point-In-Time Recovery (WAL Archiving)”.

Keep in mind that enabling WAL archiving requires a restart of our database. ClusterControl can do this for us too.
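The restart is needed because PostgreSQL, on which TimescaleDB runs, only reads the `archive_mode` and `wal_level` parameters at server start. The sketch below renders the settings usually involved in WAL archiving. It is an assumption about a typical manual setup, not a dump of what ClusterControl writes; the helper names and the archive directory are hypothetical, and the `archive_command` follows the copy-if-absent pattern from the PostgreSQL documentation.

```python
from typing import Dict


def wal_archiving_settings(archive_dir: str) -> Dict[str, str]:
    """Return postgresql.conf settings for simple file-based WAL archiving."""
    return {
        # Changing wal_level or archive_mode requires a server restart.
        "wal_level": "replica",
        "archive_mode": "on",
        # %p is the path of the WAL file to archive, %f its file name.
        # Refuse to overwrite a file that is already in the archive.
        "archive_command":
            f"test ! -f {archive_dir}/%f && cp %p {archive_dir}/%f",
    }


def render(settings: Dict[str, str]) -> str:
    """Format the settings as postgresql.conf lines."""
    return "\n".join(f"{key} = '{value}'" for key, value in settings.items())


# Hypothetical archive location.
print(render(wal_archiving_settings("/var/lib/wal_archive")))
```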

In addition to the options common to all backups like the “Backup Directory” and the “Backup Retention Period”, here we can also specify the WAL Retention Period. By default it is 0, which means forever.

To confirm that we have WAL Archiving enabled, we can select our Master node in ClusterControl -> Select TimescaleDB Cluster -> Nodes, and we should see the WAL Archiving Enabled message.
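The meaning of a retention period of 0 can be sketched as follows. This is a simplified model, assuming the retention period is expressed in days and that each archived WAL file has a known archive time; the function name and the sample data are hypothetical, and this is not ClusterControl's implementation. A real tool must also keep any WAL still needed by the base backups it retains.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, List


def wal_files_to_prune(archived_at: Dict[str, datetime],
                       retention_days: int,
                       now: datetime) -> List[str]:
    """Return archived WAL files older than the retention period."""
    if retention_days == 0:
        # 0 means "keep forever": nothing is ever pruned.
        return []
    cutoff = now - timedelta(days=retention_days)
    return sorted(name for name, ts in archived_at.items() if ts < cutoff)


now = datetime(2024, 1, 10, tzinfo=timezone.utc)
archive = {
    "000000010000000000000001": datetime(2024, 1, 1, tzinfo=timezone.utc),
    "000000010000000000000002": datetime(2024, 1, 8, tzinfo=timezone.utc),
}
print(wal_files_to_prune(archive, retention_days=0, now=now))
print(wal_files_to_prune(archive, retention_days=7, now=now))
```

With a retention of 7 days, only the file archived on January 1 falls before the cutoff; with the default of 0, both files are kept.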

Restoring a Backup

Once the backup is finished, we can restore it by using ClusterControl. For this, in our backup section (ClusterControl -> Select TimescaleDB Cluster -> Backup), we can select "Restore Backup", or directly "Restore" on the backup that we want to restore.

We have three options to restore the backup. We can restore the backup on an existing database node, restore and verify the backup on a standalone host, or create a new cluster from the backup.

If we are trying to restore a PITR compatible backup, we also need to specify the time.

The data will be restored as it was at the time specified. Take into account that the UTC timezone is used and that our TimescaleDB service in the master will be restarted.
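Because the restore time is interpreted as UTC, it is worth converting the local time of the incident before entering it. The sketch below does that with Python's standard library. The function name, the UTC-3 offset, the incident time, and the output format are assumptions for illustration; the exact format ClusterControl expects should be checked in its dialog.

```python
from datetime import datetime, timedelta, timezone


def to_utc_target(local_time: datetime) -> str:
    """Convert a timezone-aware local time to a UTC timestamp string."""
    if local_time.tzinfo is None:
        # A naive datetime is ambiguous, so refuse to guess its zone.
        raise ValueError("local_time must be timezone-aware")
    return local_time.astimezone(timezone.utc).strftime("%Y-%m-%d %H:%M:%S")


# Hypothetical incident: 14:30 local time in a UTC-3 timezone.
local_tz = timezone(timedelta(hours=-3))
incident = datetime(2024, 3, 5, 14, 30, 0, tzinfo=local_tz)
print(to_utc_target(incident))  # 2024-03-05 17:30:00
```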

We can monitor the progress of our restore from the Activity section in our ClusterControl.

Automatic Backup Verification

A backup is not a backup if it's not restorable. Verifying backups is a step that is often neglected. Let’s see how ClusterControl can automate the verification of TimescaleDB backups and help avoid any surprises.

In ClusterControl, select your cluster and go to the "Backup" section, then, select “Create Backup”.

The automatic verify backup feature is available for the scheduled backups. So, let’s choose the “Schedule Backup” option.

When scheduling a backup, in addition to selecting the common options like method or storage, we also need to specify schedule/frequency.

In the next step, we can compress and encrypt our backup and specify the retention period. Here, we also have the “Verify Backup” feature.

To use this feature, we need a dedicated host (or VM) that is not part of the cluster.

ClusterControl will install the software and restore the backup on this host. After restoring, we can see the verification icon in the ClusterControl Backup section.

Conclusion

Nowadays, backups are mandatory in any environment. They help you protect your data. Incremental backups can help reduce the amount of time and storage space used for the backup process. Transaction logs are important for Point-in-Time Recovery. ClusterControl can help automate the backup process for your TimescaleDB databases and, in case of failure, restore them with a few clicks. Also, you can minimize the RPO by using a PITR compatible backup and improve your Disaster Recovery Plan.
