Using Barman to Backup PostgreSQL – An Overview
Database backups play a critical role in any effective disaster recovery strategy for production databases. Database administrators and architects must continuously work towards an optimal backup strategy for mission-critical databases and ensure that disaster recovery SLAs are satisfied. In my experience this is not easy, and arriving at a dependable backup strategy can take anywhere from days to weeks. It is not just a matter of writing a good backup script and making sure it works. There are several factors to consider; let us take a look at them:
- Database size: Database size plays an important role when designing backup strategies. In fact, it is one of the core factors, because it determines:
  - The time taken by the backup
  - The load on infrastructure components such as disk, network and CPU
  - The amount of backup storage required and the costs involved
  - For databases hosted in the cloud, the backup storage costs, which depend on the amount of storage required
  - The RTO
- Infrastructure: The backup strategy relies heavily on the infrastructure hosting the databases. The backup procedure for databases hosted on a physical server in an on-prem data centre differs from the procedure for databases hosted in the cloud.
- Backup location: Where are the backups going? Generally, backups are placed at a remote location, for instance on tape or on cloud storage such as AWS S3.
- Backup tool: Identify an optimal tool that performs online database backups and ensures that a consistent backup is taken.
A good database backup strategy must ensure that the RTO (Recovery Time Objective) and RPO (Recovery Point Objective) are met, which in turn helps achieve the disaster recovery objective. File-system level backups of PostgreSQL databases can be performed in several ways. In this blog, my focus is on Barman, a tool widely used to back up PostgreSQL databases.
Barman (Backup and Recovery Manager) is a Python-based open-source tool developed by 2ndQuadrant. It was built to provide an enterprise-grade backup strategy for mission-critical PostgreSQL production databases. Its features and characteristics resemble those of Oracle’s RMAN. In my opinion, Barman is one of the best options for PostgreSQL databases and can deliver several operational benefits to DBAs and infrastructure engineers.
Let us look at some capabilities of Barman:
I will start with a configuration overview and then list the kinds of backups that can be performed.
Barman is a Python-based command-line tool and has two kinds of configuration files to deal with. One is the global configuration file, /etc/barman.conf. The other holds the actual configuration for the database to be backed up; it resides in /etc/barman.d and is named after the database identifier (pgdb.conf in the examples later in this post).
Example contents of the /etc/barman.conf file are shown below:
[barman]
; user that performs backup and recovery of the database
barman_user = barman
; location of the per-database configuration files
configuration_files_directory = /etc/barman.d
; barman home directory
barman_home = /dbbackups/barman
; barman log file location
log_file = /dbbackups/barman/logs/barman.log
; level of logging for barman operations
log_level = INFO
; compression algorithm (in Barman this setting applies to archived WAL files)
compression = gzip
Installation of Barman
Let us take a look at the installation procedure of barman –
Installing from the source
Download Barman from https://www.pgbarman.org/
Untar or unzip the archive and run the following command as the root user:
[root@barman-server barman-2.4]# ./setup.py install
/usr/lib64/python2.7/distutils/dist.py:267: UserWarning: Unknown distribution option: 'setup_requires'
warnings.warn(msg)
/usr/lib64/python2.7/distutils/dist.py:267: UserWarning: Unknown distribution option: 'install_requires'
warnings.warn(msg)
/usr/lib64/python2.7/distutils/dist.py:267: UserWarning: Unknown distribution option: 'tests_require'
warnings.warn(msg)
running install
running build
running build_py
creating build
creating build/lib
creating build/lib/barman
copying barman/utils.py -> build/lib/barman
copying barman/fs.py -> build/lib/barman
copying barman/retention_policies.py -> build/lib/barman
copying barman/diagnose.py -> build/lib/barman
copying barman/backup.py -> build/lib/barman
copying barman/recovery_executor.py -> build/lib/barman
copying barman/backup_executor.py -> build/lib/barman
copying barman/config.py -> build/lib/barman
copying barman/process.py -> build/lib/barman
copying barman/output.py -> build/lib/barman
copying barman/__init__.py -> build/lib/barman
copying barman/remote_status.py -> build/lib/barman
copying barman/xlog.py -> build/lib/barman
copying barman/lockfile.py -> build/lib/barman
copying barman/postgres.py -> build/lib/barman
copying barman/server.py -> build/lib/barman
copying barman/cli.py -> build/lib/barman
copying barman/version.py -> build/lib/barman
copying barman/compression.py -> build/lib/barman
copying barman/wal_archiver.py -> build/lib/barman
copying barman/infofile.py -> build/lib/barman
copying barman/exceptions.py -> build/lib/barman
copying barman/hooks.py -> build/lib/barman
copying barman/copy_controller.py -> build/lib/barman
copying barman/command_wrappers.py -> build/lib/barman
running build_scripts
creating build/scripts-2.7
copying and adjusting bin/barman -> build/scripts-2.7
changing mode of build/scripts-2.7/barman from 644 to 755
running install_lib
creating /usr/lib/python2.7/site-packages/barman
copying build/lib/barman/utils.py -> /usr/lib/python2.7/site-packages/barman
copying build/lib/barman/fs.py -> /usr/lib/python2.7/site-packages/barman
copying build/lib/barman/retention_policies.py -> /usr/lib/python2.7/site-packages/barman
copying build/lib/barman/diagnose.py -> /usr/lib/python2.7/site-packages/barman
copying build/lib/barman/backup.py -> /usr/lib/python2.7/site-packages/barman
copying build/lib/barman/recovery_executor.py -> /usr/lib/python2.7/site-packages/barman
copying build/lib/barman/backup_executor.py -> /usr/lib/python2.7/site-packages/barman
copying build/lib/barman/config.py -> /usr/lib/python2.7/site-packages/barman
copying build/lib/barman/process.py -> /usr/lib/python2.7/site-packages/barman
copying build/lib/barman/output.py -> /usr/lib/python2.7/site-packages/barman
copying build/lib/barman/__init__.py -> /usr/lib/python2.7/site-packages/barman
copying build/lib/barman/remote_status.py -> /usr/lib/python2.7/site-packages/barman
copying build/lib/barman/xlog.py -> /usr/lib/python2.7/site-packages/barman
copying build/lib/barman/lockfile.py -> /usr/lib/python2.7/site-packages/barman
copying build/lib/barman/postgres.py -> /usr/lib/python2.7/site-packages/barman
copying build/lib/barman/server.py -> /usr/lib/python2.7/site-packages/barman
copying build/lib/barman/cli.py -> /usr/lib/python2.7/site-packages/barman
copying build/lib/barman/version.py -> /usr/lib/python2.7/site-packages/barman
copying build/lib/barman/compression.py -> /usr/lib/python2.7/site-packages/barman
copying build/lib/barman/wal_archiver.py -> /usr/lib/python2.7/site-packages/barman
copying build/lib/barman/infofile.py -> /usr/lib/python2.7/site-packages/barman
copying build/lib/barman/exceptions.py -> /usr/lib/python2.7/site-packages/barman
copying build/lib/barman/hooks.py -> /usr/lib/python2.7/site-packages/barman
copying build/lib/barman/copy_controller.py -> /usr/lib/python2.7/site-packages/barman
copying build/lib/barman/command_wrappers.py -> /usr/lib/python2.7/site-packages/barman
byte-compiling /usr/lib/python2.7/site-packages/barman/utils.py to utils.pyc
byte-compiling /usr/lib/python2.7/site-packages/barman/fs.py to fs.pyc
byte-compiling /usr/lib/python2.7/site-packages/barman/retention_policies.py to retention_policies.pyc
byte-compiling /usr/lib/python2.7/site-packages/barman/diagnose.py to diagnose.pyc
byte-compiling /usr/lib/python2.7/site-packages/barman/backup.py to backup.pyc
byte-compiling /usr/lib/python2.7/site-packages/barman/recovery_executor.py to recovery_executor.pyc
byte-compiling /usr/lib/python2.7/site-packages/barman/backup_executor.py to backup_executor.pyc
byte-compiling /usr/lib/python2.7/site-packages/barman/config.py to config.pyc
byte-compiling /usr/lib/python2.7/site-packages/barman/process.py to process.pyc
byte-compiling /usr/lib/python2.7/site-packages/barman/output.py to output.pyc
byte-compiling /usr/lib/python2.7/site-packages/barman/__init__.py to __init__.pyc
byte-compiling /usr/lib/python2.7/site-packages/barman/remote_status.py to remote_status.pyc
byte-compiling /usr/lib/python2.7/site-packages/barman/xlog.py to xlog.pyc
byte-compiling /usr/lib/python2.7/site-packages/barman/lockfile.py to lockfile.pyc
byte-compiling /usr/lib/python2.7/site-packages/barman/postgres.py to postgres.pyc
byte-compiling /usr/lib/python2.7/site-packages/barman/server.py to server.pyc
byte-compiling /usr/lib/python2.7/site-packages/barman/cli.py to cli.pyc
byte-compiling /usr/lib/python2.7/site-packages/barman/version.py to version.pyc
byte-compiling /usr/lib/python2.7/site-packages/barman/compression.py to compression.pyc
byte-compiling /usr/lib/python2.7/site-packages/barman/wal_archiver.py to wal_archiver.pyc
byte-compiling /usr/lib/python2.7/site-packages/barman/infofile.py to infofile.pyc
byte-compiling /usr/lib/python2.7/site-packages/barman/exceptions.py to exceptions.pyc
byte-compiling /usr/lib/python2.7/site-packages/barman/hooks.py to hooks.pyc
byte-compiling /usr/lib/python2.7/site-packages/barman/copy_controller.py to copy_controller.pyc
byte-compiling /usr/lib/python2.7/site-packages/barman/command_wrappers.py to command_wrappers.pyc
running install_scripts
copying build/scripts-2.7/barman -> /usr/bin
changing mode of /usr/bin/barman to 755
running install_data
copying doc/barman.1 -> /usr/share/man/man1
copying doc/barman.5 -> /usr/share/man/man5
running install_egg_info
Writing /usr/lib/python2.7/site-packages/barman-2.4-py2.7.egg-info
Installing from the repo
Installation can also be done via yum, as the root user, as follows:
[root@barman-server ~]# yum install barman
Let us take a look at the different types of backups Barman supports.
Physical Hot Backups
Barman supports physical hot backups, that is, online backups of the physical data files and transaction log files of the database. The files are copied using rsync and can optionally be stored in compressed form.
Let us take a look at the steps and commands to perform an rsync backup using Barman.
#1 PostgreSQL database configuration file for barman
[pgdb]
description="Main PostgreSQL server"
conninfo=host=pgserver user=postgres dbname=postgres
ssh_command=ssh barman@pgserver
archiver=on
backup_method = rsync
“pgdb” is the identifier of the PostgreSQL database for Barman, and the configuration file should be named after it (pgdb.conf in /etc/barman.d/, as used later in this post).
The parameter backup_method defines the type of backup to be taken. In this case backup_method is rsync.
Note: For the barman backup command to succeed, password-less SSH authentication must be configured between the Barman and PostgreSQL servers.
#2 postgresql.conf file parameters
wal_level=replica
archive_mode=on
archive_command=’rsync to ’
Barman’s backup command
#3 Check if barman is ready to perform backups
[barman@pgserver pgdb]$ barman check pgdb
Server pgdb:
PostgreSQL: OK
is_superuser: OK
wal_level: OK
directories: OK
retention policy settings: OK
backup maximum age: OK (no last_backup_maximum_age provided)
compression settings: OK
failed backups: OK (there are 0 failed backups)
minimum redundancy requirements: OK (have 4 backups, expected at least 0)
ssh: OK (PostgreSQL server)
not in recovery: OK
archive_mode: OK
archive_command: OK
continuous archiving: OK
archiver errors: OK
In the above output every check reports “OK”, which means you are good to take a backup.
By contrast, the output below says that a backup cannot be taken because, according to Barman, SSH is not working:
[barman@pgserver ~]$ barman check pgdb
Server pgdb:
PostgreSQL: OK
is_superuser: OK
wal_level: OK
directories: OK
retention policy settings: OK
backup maximum age: OK (no last_backup_maximum_age provided)
compression settings: OK
failed backups: OK (there are 0 failed backups)
minimum redundancy requirements: OK (have 0 backups, expected at least 0)
ssh: FAILED (Connection failed using 'barman@pgserver -o BatchMode=yes -o StrictHostKeyChecking=no' return code 127)
not in recovery: OK
archive_mode: OK
archive_command: OK
continuous archiving: OK
archiver errors: OK
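When barman check is run from a monitoring script, the FAILED lines are the ones that matter. The following is a minimal Python sketch, not part of Barman, that pulls them out of text like the output shown above. The function name failed_checks and the sample string are illustrative assumptions, and the sketch relies only on the "name: status" line layout seen in this post.

```python
def failed_checks(check_output):
    """Return {check name: status text} for every line that reports FAILED."""
    failures = {}
    for line in check_output.splitlines():
        # Each check line looks like "ssh: FAILED (details)" or "wal_level: OK".
        name, separator, status = line.strip().partition(": ")
        if separator and status.startswith("FAILED"):
            failures[name] = status
    return failures


# Shortened sample based on the output above.
sample = """Server pgdb:
PostgreSQL: OK
ssh: FAILED (Connection failed using 'barman@pgserver -o BatchMode=yes -o StrictHostKeyChecking=no' return code 127)
archive_mode: OK
"""

for check, detail in failed_checks(sample).items():
    print(check, "->", detail)
```

In practice the text would come from running barman check for the server in question; how the command is invoked and scheduled is left out of this sketch.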
#4 Perform Database backup
[barman@barman-server ~]$ barman backup pgdb
Starting backup using rsync-exclusive method for server pgdb in /dbbackup/barman_backups/pgdb/base/20180816T153846
Backup start at LSN: 0/1C000028 (00000001000000000000001C, 00000028)
This is the first backup for server pgdb
WAL segments preceding the current backup have been found:
00000001000000000000000B from server pgdb has been removed
00000001000000000000000C from server pgdb has been removed
00000001000000000000000D from server pgdb has been removed
00000001000000000000000E from server pgdb has been removed
00000001000000000000000F from server pgdb has been removed
000000010000000000000010 from server pgdb has been removed
000000010000000000000011 from server pgdb has been removed
000000010000000000000012 from server pgdb has been removed
000000010000000000000013 from server pgdb has been removed
000000010000000000000014 from server pgdb has been removed
000000010000000000000015 from server pgdb has been removed
000000010000000000000016 from server pgdb has been removed
Starting backup copy via rsync/SSH for 20180816T153846
Copy done (time: 1 second)
This is the first backup for server pgdb
Asking PostgreSQL server to finalize the backup.
Backup size: 21.8 MiB
Backup end at LSN: 0/1C0000F8 (00000001000000000000001C, 000000F8)
Backup completed (start time: 2018-08-16 15:38:46.668492, elapsed time: 1 second)
Processing xlog segments from file archival for pgdb
000000010000000000000016
000000010000000000000017
000000010000000000000018
000000010000000000000019
00000001000000000000001A
00000001000000000000001B
00000001000000000000001C
00000001000000000000001C.00000028.backup
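The backup output pairs each LSN with a WAL segment file name and an offset, for example 0/1C000028 with (00000001000000000000001C, 00000028). The following Python sketch shows how that pair can be derived from the LSN. It assumes the default 16 MiB WAL segment size and timeline 1, the function name wal_location is hypothetical, and LSNs that fall exactly on a segment boundary are not handled specially.

```python
WAL_SEGMENT_SIZE = 16 * 1024 * 1024  # assumption: PostgreSQL default segment size


def wal_location(lsn, timeline=1):
    """Map an LSN such as '0/1C000028' to (WAL file name, offset in hex)."""
    high, low = (int(part, 16) for part in lsn.split("/"))
    position = (high << 32) | low  # absolute byte position in the WAL stream
    segment_number, offset = divmod(position, WAL_SEGMENT_SIZE)
    segments_per_id = 0x100000000 // WAL_SEGMENT_SIZE
    file_name = "%08X%08X%08X" % (
        timeline,
        segment_number // segments_per_id,
        segment_number % segments_per_id,
    )
    return file_name, "%08X" % offset


print(wal_location("0/1C000028"))
```

For the start LSN of the backup above this yields the same file name and offset that Barman printed.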
To find out in advance whether the barman backup command will succeed, run barman check pgdb as shown in step #3.
Incremental Backups
Another great capability of Barman is the ability to take incremental backups. This means that only the data files changed since the previous backup are copied again. For databases that undergo few data changes, backing them up incrementally can reduce resource usage.
It depends heavily on rsync and hard links. Below are the benefits of incremental backups:
- Significantly reduces the daily backup time
- Reduces the volume of data being backed up, since only changed data is copied, which in turn reduces the usage of infrastructure resources such as network bandwidth, disk space and I/O
- If you are after a very good RTO, this is the feature you would be looking for
The commands for an incremental backup are pretty much the same. With backup_method = rsync, incremental behaviour is controlled by the reuse_backup option: to my knowledge it is off by default, and setting it to link (or copy) makes subsequent backups reuse unchanged files from the previous backup. WAL files continue to arrive through WAL archiving, or through pg_receivexlog when streaming is configured.
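The idea behind hard-link based incremental backups can be illustrated with a small Python sketch. This is not Barman's implementation (Barman delegates the work to rsync); the function name incremental_copy and the directory layout are assumptions for illustration. Files that are identical to the copy in the previous backup are hard-linked, so they take no extra space, and everything else is copied.

```python
import filecmp
import os
import shutil


def incremental_copy(source, previous, target):
    """Copy source into target, hard-linking files unchanged since previous."""
    stats = {"linked": 0, "copied": 0}
    for root, _dirs, files in os.walk(source):
        relative = os.path.relpath(root, source)
        os.makedirs(os.path.normpath(os.path.join(target, relative)), exist_ok=True)
        for name in files:
            current = os.path.join(root, name)
            old = os.path.normpath(os.path.join(previous, relative, name))
            new = os.path.normpath(os.path.join(target, relative, name))
            # Compare contents; rsync normally uses a cheaper size/mtime check.
            if os.path.isfile(old) and filecmp.cmp(current, old, shallow=False):
                os.link(old, new)  # unchanged: share the data on disk
                stats["linked"] += 1
            else:
                shutil.copy2(current, new)  # new or changed: take a real copy
                stats["copied"] += 1
    return stats
```

Hard links only work when the previous and the new backup live on the same file system, which is why both are kept under one backup directory in this sketch.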
Remote Database Backups and Recovery
In my opinion, this capability of Barman is highly beneficial for DBAs. The first thing DBAs look for is to avoid stressing production database server resources during backups as much as possible, and running the backups remotely is the best way to do that. Barman leverages pg_basebackup, which makes scripting and automating backups a lot easier.
Traditionally, the options available for automated backups have been:
- pg_basebackup
- tar copy
Both options involve a lot of development and testing to ensure that an effective backup strategy is in place to meet SLA demands, and both can pose challenges for large databases with multiple tablespaces.
With Barman, it is pretty simple. Another exceptional capability of Barman is continuous WAL streaming. Let us take a look at that in a bit more detail.
Streaming Backup with continuous WAL streaming
This makes Barman stand out in comparison with other tools in the market. Live WAL files can be streamed continuously to a remote backup location using Barman. This is the feature DBAs would be excited to know about; I certainly was. It is extremely difficult to achieve this with manually built scripts or with a combination of tools like pg_basebackup and pg_receivewal. With continuous WAL streaming, a better RPO can be achieved. If the backup strategy is designed meticulously, it would not be an exaggeration to say that an RPO of almost zero can be achieved.
Let us look at the steps and commands to perform a streaming Barman backup.
#1 postgresql.conf parameter changes
Make the following changes in postgresql.conf:
wal_level=replica
max_wal_senders = 2
max_replication_slots = 2
synchronous_standby_names = 'barman_receive_wal'
archive_mode=on
archive_command = 'rsync -a %p barman@pgserver:INCOMING_WAL_DIRECTORY/%f'
archive_timeout = 3600    # should not be 0 (disabled)
#2 Create Replication Slot using barman
A replication slot is important for streaming backups. If continuous streaming of WALs fails for any reason, all the un-streamed WALs are retained on the PostgreSQL server rather than being removed.
[barman@pgserver ~]$ barman receive-wal --create-slot pgdb
Creating physical replication slot 'barman' on server 'pgdb'
Replication slot 'barman' created
#3 Configure the database server configuration file for barman
The database identifier for Barman is “pgdb”. A configuration file called pgdb.conf must be created in /etc/barman.d/ with the following contents:
[pgdb]
description="Main PostgreSQL server"
conninfo=host=pgserver user=postgres dbname=postgres
streaming_conninfo=host=pgserver user=barman
backup_method=postgres
archiver=on
incoming_wals_directory=/dbbackups/barman_backups/pgdb/incoming
streaming_archiver=on
slot_name=barman
- streaming_conninfo is the connection string Barman uses to perform streaming backups
- backup_method must be set to “postgres” when a streaming backup is to be taken
- streaming_archiver must be set to “on”
- slot_name must be set when you need Barman to use a replication slot; in this case the replication slot name is barman
Once the configuration is done, run barman check to ensure that streaming backups will run successfully.
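Before running barman check, the settings listed above can be sanity-checked with a few lines of Python. This is only a sketch under stated assumptions: the function name streaming_config_problems is hypothetical, the rules are just the four settings described in this post (slot_name is treated as required because a slot was created in step #2), and Barman's own validation is far more thorough.

```python
import configparser

# Settings this post lists for streaming backups; None means "any non-empty value".
REQUIRED_FOR_STREAMING = {
    "streaming_conninfo": None,
    "backup_method": "postgres",
    "streaming_archiver": "on",
    "slot_name": None,
}


def streaming_config_problems(config_text, server):
    """Return a list of human-readable problems found in the server section."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(config_text)
    if not parser.has_section(server):
        return ["no [%s] section" % server]
    section = parser[server]
    problems = []
    for key, expected in REQUIRED_FOR_STREAMING.items():
        value = section.get(key, fallback="").strip()
        if not value:
            problems.append("%s is not set" % key)
        elif expected is not None and value.lower() != expected:
            problems.append("%s is %r, expected %r" % (key, value, expected))
    return problems


rsync_only = "[pgdb]\nbackup_method = rsync\narchiver = on\n"
for problem in streaming_config_problems(rsync_only, "pgdb"):
    print(problem)
```

Run against the pgdb.conf contents shown above, the function returns an empty list; run against the rsync configuration from the earlier section, it reports the settings that are missing for streaming.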
#4 Check if barman receive-wal is running ok
Typically, barman receive-wal does not work immediately after the configuration changes. It might error out, and the barman check command might show the following:
[barman@pgserver archive_status]$ barman check pgdb
Server pgdb:
PostgreSQL: OK
is_superuser: OK
PostgreSQL streaming: OK
wal_level: OK
directories: OK
retention policy settings: OK
backup maximum age: OK (no last_backup_maximum_age provided)
compression settings: OK
failed backups: OK (there are 0 failed backups)
minimum redundancy requirements: OK (have 0 backups, expected at least 0)
pg_basebackup: OK
pg_basebackup compatible: OK
pg_basebackup supports tablespaces mapping: OK
archive_mode: OK
archive_command: OK
continuous archiving: OK
pg_receivexlog: OK
pg_receivexlog compatible: OK
receive-wal running: FAILED (See the Barman log file for more details)
archiver errors: OK
When you run barman receive-wal, it might hang. To make receive-wal work properly the first time, run the following command:
[barman@pgserver arch_logs]$ barman cron
Starting WAL archiving for server pgdb
Starting streaming archiver for server pgdb
Now run barman check again; it should report OK.
[barman@pgserver arch_logs]$ barman check pgdb
Server pgdb:
PostgreSQL: OK
is_superuser: OK
PostgreSQL streaming: OK
wal_level: OK
replication slot: OK
directories: OK
retention policy settings: OK
backup maximum age: OK (no last_backup_maximum_age provided)
compression settings: OK
failed backups: OK (there are 0 failed backups)
minimum redundancy requirements: OK (have 2 backups, expected at least 0)
pg_basebackup: OK
pg_basebackup compatible: OK
pg_basebackup supports tablespaces mapping: OK
archive_mode: OK
archive_command: OK
continuous archiving: OK
pg_receivexlog: OK
pg_receivexlog compatible: OK
receive-wal running: OK
archiver errors: OK
As you can see, the receive-wal status now shows OK. This is one of the issues I faced.
#5 Check if the barman is ready to perform backups
[barman@pgserver ~]$ barman check pgdb
Server pgdb:
PostgreSQL: OK
is_superuser: OK
PostgreSQL streaming: OK
wal_level: OK
replication slot: OK
directories: OK
retention policy settings: OK
backup maximum age: OK (no last_backup_maximum_age provided)
compression settings: OK
failed backups: OK (there are 0 failed backups)
minimum redundancy requirements: OK (have 4 backups, expected at least 0)
pg_basebackup: OK
pg_basebackup compatible: OK
pg_basebackup supports tablespaces mapping: OK
archive_mode: OK
archive_command: OK
continuous archiving: OK
pg_receivexlog: OK
pg_receivexlog compatible: OK
receive-wal running: OK
archiver errors: OK
#6 Check the streaming status using barman
[barman@pgserver pgdb]$ barman replication-status pgdb
Status of streaming clients for server 'pgdb':
Current LSN on master: 0/250008A8
Number of streaming clients: 1
1. #1 Sync WAL streamer
Application name: barman_receive_wal
Sync stage : 3/3 Remote write
Communication : TCP/IP
IP Address : 192.168.1.10 / Port: 52602 / Host: -
User name : barman
Current state : streaming (sync)
Replication slot: barman
WAL sender PID : 26592
Started at : 2018-08-16 16:03:21.422430+10:00
Sent LSN : 0/250008A8 (diff: 0 B)
Write LSN : 0/250008A8 (diff: 0 B)
Flush LSN : 0/250008A8 (diff: 0 B)
The above status means Barman is ready to perform a streaming backup. Perform the backup as shown below:
[barman@pgserver arch_logs]$ barman backup pgdb
Starting backup using postgres method for server pgdb in /dbbackup/barman_backups/pgdb/base/20180816T160710
Backup start at LSN: 0/1F000528 (00000001000000000000001F, 00000528)
Starting backup copy via pg_basebackup for 20180816T160710
Copy done (time: 1 second)
Finalising the backup.
Backup size: 21.9 MiB
Backup end at LSN: 0/21000000 (000000010000000000000020, 00000000)
Backup completed (start time: 2018-08-16 16:07:10.401526, elapsed time: 1 second)
Processing xlog segments from file archival for pgdb
00000001000000000000001F
000000010000000000000020
000000010000000000000020.00000028.backup
000000010000000000000021
Processing xlog segments from streaming for pgdb
00000001000000000000001F
000000010000000000000020
Centralized and Catalogued Backups
Centralized backups are highly beneficial for environments running multiple databases on multiple servers in a networked environment. This is one of the exceptional features of Barman. I have worked in real-time environments where I had to manage and administer hundreds of databases, and I always felt the need for centralized database backups. This is why Oracle RMAN became popular for Oracle database backup strategies, and Barman is now filling that space for PostgreSQL. With Barman, DBAs and DevOps engineers can work towards building a centralized backup server where backups for all the databases are maintained and validated.
Catalogued backups means that Barman maintains a centralized repository in which the status of every backup is recorded. You can check the backups available for a particular database as shown below:
[barman@pgserver ~]$ barman list-backup pgdb
pgdb 20180816T160924 - Thu Aug 16 16:09:25 2018 - Size: 22.0 MiB - WAL Size: 135.7 KiB
pgdb 20180816T160710 - Thu Aug 16 16:07:11 2018 - Size: 21.9 MiB - WAL Size: 105.8 KiB
pgdb 20180816T153913 - Thu Aug 16 15:39:15 2018 - Size: 21.9 MiB - WAL Size: 54.2 KiB
pgdb 20180816T153846 - Thu Aug 16 15:38:48 2018 - Size: 21.9 MiB - WAL Size: 53.0 KiB
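For reporting or scripting, the catalogue listing can be turned into structured records. The Python sketch below assumes the exact line layout shown above (fields separated by " - ") and that the backup ID encodes the start time, as it does in the outputs in this post. The function name parse_list_backup is hypothetical, and lines in any other layout are simply skipped.

```python
from datetime import datetime


def parse_list_backup(output):
    """Parse 'barman list-backup' style lines into a list of dictionaries."""
    backups = []
    for line in output.splitlines():
        fields = [field.strip() for field in line.split(" - ")]
        head = fields[0].split()
        if len(fields) != 4 or len(head) != 2:
            continue  # not a line in the layout this sketch understands
        server, backup_id = head
        backups.append({
            "server": server,
            "backup_id": backup_id,
            # Backup IDs look like 20180816T160924 (start time of the backup).
            "started": datetime.strptime(backup_id, "%Y%m%dT%H%M%S"),
            "ended": fields[1],
            "size": fields[2].split(": ", 1)[1],
            "wal_size": fields[3].split(": ", 1)[1],
        })
    return backups


listing = """pgdb 20180816T160924 - Thu Aug 16 16:09:25 2018 - Size: 22.0 MiB - WAL Size: 135.7 KiB
pgdb 20180816T160710 - Thu Aug 16 16:07:11 2018 - Size: 21.9 MiB - WAL Size: 105.8 KiB
"""

for backup in parse_list_backup(listing):
    print(backup["backup_id"], backup["size"], backup["wal_size"])
```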
Backup Retention Policy
Retention policies can be defined for database backups. Backups become obsolete after a certain period, and obsolete backups can be deleted from time to time.
The following options in the configuration file control how long backups are retained and when they become obsolete:
- minimum_redundancy is the first parameter to configure. Always set minimum_redundancy to a value greater than 0 so that backups are not deleted accidentally.
Example: minimum_redundancy = 1
- retention_policy determines how long base backups must be retained so that disaster recovery SLAs are met.
- wal_retention_policy determines how long WAL backups must be retained, so that the expected RPO is met.
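To make the interplay between a time-based retention policy and minimum_redundancy concrete, here is a small Python sketch of the reasoning. It is not Barman's implementation; it assumes a recovery-window style policy (for example a window of 7 days), and the function name obsolete_backups is hypothetical. The newest backup taken before the window starts is kept, because it is the base needed to recover to the start of the window.

```python
from datetime import datetime, timedelta


def obsolete_backups(backup_times, window, now, minimum_redundancy=1):
    """Return the backup times a recovery-window policy would consider obsolete."""
    newest_first = sorted(backup_times, reverse=True)
    window_start = now - window
    kept = 0
    for taken in newest_first:
        kept += 1
        # Stop once a backup old enough to cover the start of the window is
        # kept and the minimum number of backups has been reached.
        if taken <= window_start and kept >= minimum_redundancy:
            break
    return sorted(newest_first[kept:])


now = datetime(2018, 8, 30)
backups = [datetime(2018, 8, day) for day in (10, 16, 22, 28)]
for taken in obsolete_backups(backups, timedelta(days=7), now):
    print("obsolete:", taken.date())
```

With a 7-day window on 30 August, the backups from 28 and 22 August are kept and the two older ones are reported as obsolete; raising minimum_redundancy to 3 also keeps the backup from 16 August.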
The existing retention and redundancy policies for a database server can be checked using the barman check command as follows:
[barman@pgserver ~]$ barman check pgdb
Server pgdb:
PostgreSQL: OK
is_superuser: OK
PostgreSQL streaming: OK
wal_level: OK
replication slot: OK
directories: OK
retention policy settings: OK
backup maximum age: OK (no last_backup_maximum_age provided)
compression settings: OK
failed backups: OK (there are 0 failed backups)
minimum redundancy requirements: OK (have 4 backups, expected at least 0)
pg_basebackup: OK
pg_basebackup compatible: OK
pg_basebackup supports tablespaces mapping: OK
archive_mode: OK
archive_command: OK
continuous archiving: OK
pg_receivexlog: OK
pg_receivexlog compatible: OK
receive-wal running: OK
archiver errors: OK
Parallel Backups and Recoveries
Backups and recoveries can be performed in parallel by utilizing multiple CPUs, which makes them complete faster. This feature is beneficial for very large databases, up to terabytes in size.
To run backups in parallel, set the following option in the database server configuration file (/etc/barman.d/pgdb.conf) to a value greater than 1. The default of 1 means a single worker, so the value 4 below is only an illustration:
parallel_jobs = 4
I can conclude by saying that Barman is an enterprise-grade tool that can help DBAs design an effective disaster recovery strategy.