blog

Migration and upgrades: achieving near zero-downtime in PostgreSQL

Sebastian Insausti

Published:

Migrating or upgrading a PostgreSQL database carries significant risk, particularly for organizations with strict uptime requirements. Whether the objective is a major version upgrade, hardware relocation, or a shift between cloud environments, maintaining uninterrupted service without sacrificing performance is critical. This article explores techniques for achieving near zero-downtime during PostgreSQL migrations and upgrades.

Why minimal downtime matters?

Downtime incurs significant costs, including lost revenue, diminished user satisfaction, and harm to brand image. This is particularly critical in industries like finance, e-commerce, and SaaS, where even brief interruptions can lead to substantial downstream effects.

Understanding PostgreSQL’s upgrade mechanics is the first step toward minimizing these disruptions:

  • Minor Upgrades (e.g., 14.1 to 14.2): Generally safe and quick, involving only a restart after installing updated binaries.
  • Major Upgrades (e.g., 14 to 15): These require a full data migration, which can lead to significant downtime unless properly planned and executed.

Upgrade paths

So, how can you upgrade or migrate your PostgreSQL database while minimizing disruption and ensuring data integrity? The process can be approached in several ways, each with its own considerations regarding downtime, complexity, and resource requirements. The core objective is often to achieve near zero-downtime, especially for critical production environments. This often involves strategies that allow a new database instance to be brought online and synchronized with the old one before a final, rapid cutover. Key strategies include:

pg_dump/pg_dumpall

It is a logical backup tool that allows you to dump your data and restore it in the new PostgreSQL version. Here you will have a downtime period that will vary according to your data size. You need to stop the system or avoid new data in the primary node, run the pg_dump command, move the generated dump to the new database node, and restore it. During this time, you can’t write to your primary PostgreSQL database to avoid data inconsistency.

The steps to perform the upgrade by using this approach are:

  1. Place the system in maintenance mode or stop all writes.
  2. Run pg_dumpall to export all databases, roles, and global configs.
  3. Copy the backup to the new PostgreSQL server.
  4. Import the backup into the new PostgreSQL instance.

Let’s see an example:

$ pg_dumpall -U postgres -h old-db-host -f full_cluster_backup.sql
$ scp full_cluster_backup.sql user@new-db-host:/tmp/
$ psql -U postgres -h new-db-host -f /tmp/full_cluster_backup.sql

N.B. Use this method only when downtime is acceptable.

pg_upgrade (In-Place)

It is a PostgreSQL tool to upgrade your PostgreSQL version in-place. It is fast, retains file paths and configurations, but it could be dangerous in a production environment. Using this method, the downtime will be considerably less than using the previous pg_dump method.

Let’s see an example:

$ pg_upgrade -b /usr/lib/postgresql/13/bin \
-B /usr/lib/postgresql/14/bin \
-d /var/lib/postgresql/13/data \
-D /var/lib/postgresql/14/data \
-U postgres

N.B. Use this method only when downtime is acceptable and full backups are in place.

Logical replication

This replication method allows you to perform version upgrades with near zero-downtime. In this way, you can add a standby node in the PostgreSQL target version, and when the replication is up-to-date, you can perform a failover process to promote the new PostgreSQL node. 

The steps to perform the upgrade by using this approach are:

  1. Set up a new PostgreSQL instance with the target version.
  2. Enable logical replication on the source (configuration).
  3. Create a publication on the source.
  4. Create a subscription on the target.
  5. Cut over once replication is up-to-date.

Let’s see an example:

-- On the source (old version)
CREATE PUBLICATION my_pub FOR ALL TABLES;

-- On the target (new version)
CREATE SUBSCRIPTION my_sub
CONNECTION 'host=old_host port=5432 user=replicationuser password=secretpass dbname=appdb'
PUBLICATION my_pub;

You can monitor the logical replication by running this command in the subscriber node:

SELECT subname, pid, received_lsn, last_msg_send_time, last_msg_receipt_time, latest_end_lsn, pg_wal_lsn_diff(latest_end_lsn, received_lsn) AS byte_lag
FROM pg_stat_subscription;

N.B. Make sure replication is up-to-date before promoting the new PostgreSQL instance.

Rolling upgrades in a multi-node cluster

Rolling upgrades minimize downtime for HA PostgreSQL deployments by sequentially upgrading replicas. Once all replicas are updated, one becomes the new primary, followed by upgrading the original primary. This method ensures continuous service with minimal interruption, requiring careful planning and PostgreSQL replication expertise.

The choice of method depends on several factors: the size and complexity of your database, your acceptable downtime window, your comfort level with different technologies, and your budget. Regardless of the chosen path, thorough planning, testing in a non-production environment, and a robust rollback strategy are paramount to a successful PostgreSQL upgrade or migration.

Manual vs. ClusterControl: What’s the better approach?

Now the question is, should I run this manually or by using an external tool like ClusterControl?

Manual scripts

Let’s see the pros and cons in case you want to do it manually or by using custom scripts:

  • Pros:
    • Full control of each action without external tool dependency.
  • Cons:
    • Difficult to manage dependencies and timing.
    • High risk of human error, especially in large clusters.
    • Requires scripting or manual tasks for rollback, switchover, and verification steps.

Using ClusterControl

ClusterControl provides a comprehensive framework to automate and orchestrate PostgreSQL upgrades and migrations:

  • Pros:
    • Automates the setup of logical replication.
    • Handles backups, configuration tweaks, and switchover operations.
    • Offers UI and CLI tools for monitoring and validation.
  • Cons:
    • Initial setup and learning curve.
    • Needs familiarity with ClusterControl workflows for optimal use.

Step-by-Step: Migration with logical replication

Let’s explore the process of migrating or upgrading from PostgreSQL 16 to PostgreSQL 17, leveraging ClusterControl and logical replication to minimize downtime.

The final topology should be something like this:

1. Deployment: Deploy two streaming replication clusters in PostgreSQL 16 and configure logical replication between them. This topology can be deployed with ClusterControl or imported if already in place.

clustercontrol clusters under management overview showing postgresql clusters

2. Upgrade: Once this topology is deployed (or imported into ClusterControl), utilize ClusterControl to upgrade one of the clusters to PostgreSQL 17.

clustercontrol minor / major upgrade selection ui

ClusterControl offers three methods for upgrading PostgreSQL, each with its own approach and requirements:

  • Copy: This is the default `pg_upgrade` method, utilizing the `–copy` option. It copies files incrementally, offering a faster alternative to traditional backup and restore. This method necessitates sufficient disk storage to accommodate both the old and new data files.
  • Link: This method employs the `–link` option with `pg_upgrade`, creating hard links instead of copying files. This makes it faster and eliminates the need for additional disk space. However, it is imperative not to start the old cluster after this step, as doing so could lead to data corruption in the new cluster due to shared data.
  • Pgdumpall: As an alternative to `pg_upgrade`, this method uses `pgdumpall` to back up the old cluster and then restore it onto the new cluster. This process requires adequate disk storage to hold the backup, in addition to the old and new files.

N.B. It is important to note that the cluster will be non-operational during the upgrade process, regardless of the method chosen.

3. Monitor replication lag: After upgrading, replication lag can be monitored through ClusterControl or by executing relevant queries directly on the PostgreSQL database.

clustercontrol cluster node topology ui

4. Execute a switchover: Once it is safe and there is no replication lag, switch your application to utilize the cluster running the new version. If a load balancer is in use, direct the active cluster to the PostgreSQL 17 instance.

5. Decommission or upgrade the old PostgreSQL 16 cluster: Once you have verified that the new PostgreSQL 16 cluster is fully operational, you can proceed with either decommissioning the old cluster or upgrading it by repeating the migration/upgrade process.

Validation & post-upgrade checks

After the switchover, it’s crucial to confirm that all systems are operating normally:

  • Replication Monitoring: Verify there is no lag or error logs.
  • Query Audits: Check the traffic on the new primary node.
  • ClusterControl Dashboard: Use built-in graphs and logs to validate system health.
  • Backup & Snapshots: Create post-upgrade snapshots for rollback capabilities and compliance.

You can perform all these actions from the ClusterControl UI.

Rollback plans

Achieving near zero-downtime PostgreSQL migrations and upgrades requires rollback readiness. 

The most important best practices are:

  • Keep the old system in read-only mode during the transition window for a seamless switch and fallback.
  • Create pre-upgrade backups using ClusterControl for system restoration in case of failure.
  • Document detailed rollback steps beforehand to minimize disruption and downtime.

Wrapping up

Near zero-downtime PostgreSQL upgrades and migrations are achievable through careful planning, robust tooling, and systematic validation. ClusterControl facilitates a safer and faster path to success by orchestrating logical replication, backup management, and automated failover, whether the upgrade is for performance, compliance, or modernization.For more information, you can refer to PostgreSQL Logical Replication Documentation or ClusterControl Documentation.

Install ClusterControl in 10-minutes. Free 30-day Enterprise trial included!

Script Installation Instructions

The installer script is the simplest way to get ClusterControl up and running. Run it on your chosen host, and it will take care of installing all required packages and dependencies.

Offline environments are supported as well. See the Offline Installation guide for more details.

On the ClusterControl server, run the following commands:

wget https://severalnines.com/downloads/cmon/install-cc
chmod +x install-cc

With your install script ready, run the command below. Replace S9S_CMON_PASSWORD and S9S_ROOT_PASSWORD placeholders with your choice password, or remove the environment variables from the command to interactively set the passwords. If you have multiple network interface cards, assign one IP address for the HOST variable in the command using HOST=<ip_address>.

S9S_CMON_PASSWORD=<your_password> S9S_ROOT_PASSWORD=<your_password> HOST=<ip_address> ./install-cc # as root or sudo user

After the installation is complete, open a web browser, navigate to https://<ClusterControl_host>/, and create the first admin user by entering a username (note that “admin” is reserved) and a password on the welcome page. Once you’re in, you can deploy a new database cluster or import an existing one.

The installer script supports a range of environment variables for advanced setup. You can define them using export or by prefixing the install command.

See the list of supported variables and example use cases to tailor your installation.

Other Installation Options

Helm Chart

Deploy ClusterControl on Kubernetes using our official Helm chart.

Ansible Role

Automate installation and configuration using our Ansible playbooks.

Puppet Module

Manage your ClusterControl deployment with the Puppet module.

ClusterControl on Marketplaces

Prefer to launch ClusterControl directly from the cloud? It’s available on these platforms:

Subscribe below to be notified of fresh posts