blog
PostgreSQL Backups: What is pg_dumpall?
PostgreSQL is an open-source database management system that can be used to store or interact with the data of any application or website. If you are using a PostgreSQL database to store the information of your website or application, it is essential that you are aware of backup tools or utilities for PostgreSQL to protect the database against the data loss. In this article, we will go through some utilities provided by PostgreSQL to backup your databases and restore them.
PostgreSQL installation includes backup utilities called pg_dump and pg_dumpall to take logical backups of your PostgreSQL databases. These are standard utilities to create portable PostgreSQL database backups. We can use these tools to take the backups of any remote PostgreSQL databases as well.
Prerequisites
Before we start this tutorial, make sure that PostgreSQL is installed on your machine. Also, you should have permission to read all the information on the database.
Backup using pg_dump
In this example, we will run all the commands as a “postgres” user which is the default user for the PostgreSQL database and have the required permission for running the backup commands. Run the following command to switch the user to postgres.
$ sudo su - postgres
Now, you can use the following command to take the backup of any single database:
$ pg_dump mydb > /tmp/backups/postgres/mydb.sql
The above command is a very generic command. This command will generate the backup file as a plain text format which is okay for the small databases which have less number of objects inside it. But, for large databases, this is not suitable. Take a look at the following command now:
$ pg_dump -U -W -F t mydb > /tmp/backups/postgres/mydb.tar
Let’s see these flags more in details:
- -U option can be used to provide the user which will be used to connect to the remote server.
- -W option will prompt the user for the password before connecting to the remote server.
- -F parameter specifies the output back up file format. It can have multiple values as follows
- p: Plain text SQL (Default)
- t: tar
- d: directory format archive
- c: Any custom archive format
pg_dump utility is nothing but a client program. Therefore, it can be run from the remote system as well if you have access to the database server. You can use -h option to provide the host string and -p option to specify the port number where the PostgreSQL server is running:
$ pg_dump -U <USER_NAME> -h <HOST_NAME> -p <PORT_NO> <DB_NAME> > <BACKUP_FILE_NAME>
You can run the following command to see the list of all available options of pg_dump:
$ pg_dump -?
Backup using pg_dumpall
pg_dump command can take the backup of any one database at a time. What if we want to take the backup of all the databases we have in the PostgreSQL server? There are two options for that.
- Run the pg_dump command against each database one by one.
- Use another tool provided by the PostgreSQL called pg_dumpall to take the backup of all the databases.
Run the following command to take the backup of all databases using pg_dumpall:
$ pg_dumpall -F t > /tmp/backups/postgres/all_dbs.tar
All the options for pg_dumpall command are similar to pg_dump command. You can also use this command to take the backups of only database schemas.
Backup of only Schema definitions
$ pg_dumpall --schema-only > /tmp/backups/postgres/schemas/all_dbs.sql
Backup of only Role definitions
$ pg_dumpall --roles-only > /tmp/backups/postgres/roles/all_dbs.sql
Backup of only Tablespace definitions
$ pg_dumpall --tablespaces-only > /tmp/backups/postgres/tablespaces/all_dbs.sql
However, sometimes pg_dumpall can be a bad option to take backups for you as compared to pg_dump command.
Disadvantages of pg_dumpall command
- pg_dumpall command creates the backups of each database one by one in a sequential manner which can be a time-consuming process if your databases are big. That’s why you can’t perform parallel restores of your databases. In the case of pg_dump command, you can start the restore process as soon as one database backup file is generated, allowing you to run the backup and restore process simultaneously.
- Taking the backup of all databases together requires more time than taking the backup of a single database so, it can lead to more downtime for your server.
Restoring the PostgreSQL database dumps
We can import the PostgreSQL dumps in two ways.
Restore the database using psql
If you have created the backup using the default option which is plain text SQL file, then you can restore your database directly using the following psql command::
$ psql -U <USER_NAME> <DB_NAME> < <BACKUP_NAME>.sql
For example, you can restore the dump which you created just now like this:
$ psql -U postgres new_db < mydb.sql
Before you run the above command, make sure that a new empty database is already created as this command won’t create a new database if it doesn’t exist. You can run the following command to create a new database.
$ createdb -T template0 new_db
Another thing that should be done before restoring the database is creating all the dependent users who own any objects inside the database.
Restore the database using pg_restore
While creating the backup using pg_dump or pg_dumpall, if you choose any format option other than the default one, then you have to use pg_restore utility to restore your databases.
$ pg_restore -d new_db /tmp/backups/postgres/mydb.tar -c -U postgres
Here is the list of some options which can be used with the above command:
- -c – drop the database objects before recreating it
- -C – To create a new database for restoration
- -F – To provide the format of the backup file
- -e – To exit the process if any error encountered
You can use the following command to see all the available options:
$ pg_restore -?
Conclusion
For any database management system, having proper solutions for taking backup or restoring backups is very critical. Fortunately, PostgreSQL provides in-built functionality to backup and restore the databases. If you are planning to use these tools to take backups of your PostgreSQL databases, make sure to test your backups at regular intervals by restoring it to some dummy databases. So, it can be used to recover your databases in times of crisis without facing any problems.