Upgrading PostgreSQL 11 to PostgreSQL 13 with TimescaleDB and PostGIS in Linux using pg_upgrade

Paul Namuag

Businesses and enterprises using old versions of PostgreSQL (PG) face challenges when upgrading to at least the most recent stable version from PostgreSQL 12 or PostgreSQL 13. There are many reasons why upgrading to the latest version is a must. Some of the main reasons for this are to take advantage of its critical enhancements to its built-in functionalities, security updates, performance improvements, and new implementations beneficial for database management.

Upgrading PostgreSQL comes with a few challenges as it's not as easy compared to other mainstream databases. If you are facing this type of problem, don’t worry. PostgreSQL does not lock you up on a specific version to use. In this blog, we’ll walk through an example of this challenge while having a TimescaleDB and PostGIS installed on an existing PostgreSQL 11 host. 

Why pg_upgrade?

pg_upgrade has been around for a very long time as a tool for upgrading major versions of PostgreSQL. Using this tool is not required for minor version upgrades, which means that upgrading your current version of 11.9 to 11.13 is not necessary.

When upgrading your PostgreSQL to a major version with pg_upgrade, the tool works by allowing the data stored in PostgreSQL data files to be upgraded to a later PostgreSQL major version. This works without needing a data dump/reload, which can take some time if you have a big dataset. 

Now, here comes the fuss. PostgreSQL, especially for major version releases, is known to have new features added that often change the layout of the system tables,  but the internal data storage format rarely changes. pg_upgrade uses this fact to perform rapid upgrades by creating new system tables and simply reusing the old user data files. If a future major release ever changes the data storage format in a way that makes the old data format unreadable, pg_upgrade will not be usable for such upgrades. (The community will attempt to avoid such situations.)

Some may consider pg_upgrade to be dangerous, especially for the production environment. Well, this tool has been widely used elsewhere from QA, to dev, to production environments. It does have its limitations or caveats, such as the known Unicode or character sets stored in your dataset. In that case, you might consider using pg_dump/pg_restore, but it can take some time to finish depending on how large your data is. For newer versions of PostgreSQL, such as PG 14.0, you can only take a dump/restore (or export/import) or logical replication, otherwise use pg_upgrade.

For larger datasets, using pg_upgrade requires you to run this on the same host, which by default applies a copy of all your physical files from your data directory. In that case, pg_upgrade supports the -k or --link option, which means it will use hard links instead of copying files to the new cluster.

pg_upgrade does its best to make sure the old and new clusters are binary-compatible, e.g., by checking for compatible compile-time settings, including 32/64-bit binaries. It is also important that any external modules are binary compatible, though this cannot be checked by pg_upgrade.

pg_upgrade supports upgrades from 8.4.X and later to the current major release of PostgreSQL, including snapshot and beta releases.

Here's the situation…

In this setup, I used ClusterControl to deploy a PostgreSQL 11 database cluster for a single node. The following have been tested on Centos 7 and Ubuntu Focal (20.04.1):

$ /usr/pgsql-11/bin/postgres --version
postgres (PostgreSQL) 11.13

postgres=# \dx
                                           List of installed extensions

          Name          | Version |   Schema   |                            Description

------------------------+---------+------------+-------------------------------------------------------------------
 fuzzystrmatch          | 1.1     | public     | determine similarities and distance between strings
 pg_stat_statements     | 1.6     | public     | track execution statistics of all SQL statements executed
 plpgsql                | 1.0     | pg_catalog | PL/pgSQL procedural language
 postgis                | 3.1.4   | public     | PostGIS geometry and geography spatial types and functions
 postgis_raster         | 3.1.4   | public     | PostGIS raster types and functions
 postgis_sfcgal         | 3.1.4   | public     | PostGIS SFCGAL functions
 postgis_tiger_geocoder | 3.1.4   | tiger      | PostGIS tiger geocoder and reverse geocoder
 postgis_topology       | 3.1.4   | topology   | PostGIS topology spatial types and functions
 timescaledb            | 2.3.1   | public     | Enables scalable inserts and complex queries for time-series data
(9 rows)

So I got the following,

PostgreSQL server version: 11.13

TimescaleDB version: 2.3.1

PostGIS version: 3.1.4

If you want to test this with ClusterControl, there are two ways to have TimescaleDB. You can deploy a TimescaleDB cluster or have PostgreSQL and enable the TimescaleDB plugin.

 

Setting up for your PostgreSQL 13

Using your package manager setup for the Linux environment with your PostgreSQL and TimescaleDB repository is easier. Here are the steps to do so:

Setup the required repositories

First, let's add the PostgreSQL repository.

For CentOS/RHEL/Oracle Linux

You have to make sure that you have the right repository. For Enterprise Linux (EL) 7, you can do the following:

sudo yum -y install https://download.postgresql.org/pub/repos/yum/reporpms/EL-7-x86_64/pgdg-redhat-repo-latest.noarch.rpm

For other architecture, you can base here https://download.postgresql.org/pub/repos/yum/reporpms/ and replace the EL-7-x86_64 subdirectory.

 

Let's add the TimescaleDB repository as well.

vi /etc/yum.repos.d/timescale_timescaledb.repo

Then add the following contents for this file,

[timescale_timescaledb]
name=timescale_timescaledb
baseurl=https://packagecloud.io/timescale/timescaledb/el/7/\$basearch
repo_gpgcheck=1
gpgcheck=0
enabled=1
gpgkey=https://packagecloud.io/timescale/timescaledb/gpgkey
sslverify=1
sslcacert=/etc/pki/tls/certs/ca-bundle.crt
metadata_expire=300

Just replace the baseurl accordingly if you are using a version other than EL 7.

For Ubuntu/Debian

Add the PG repository for Ubuntu Focal:

deb http://apt.postgresql.org/pub/repos/apt/ focal-pgdg main

For other Ubuntu/Debian distributions, just replace the focal accordingly, which you can find here http://apt.postgresql.org/pub/repos/apt/dists/. For example, replace focal-pgdg with buster-pgdg. 

 

Now, let's add the repository for TimescaleDB,

sudo sh -c "echo 'deb [signed-by=/usr/share/keyrings/timescale.keyring] https://packagecloud.io/timescale/timescaledb/ubuntu/ $(lsb_release -c -s) main' > /etc/apt/sources.list.d/timescaledb.list"

Import the keyring,

wget --quiet -O - https://packagecloud.io/timescale/timescaledb/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/timescale.keyring

and update the package lists for upgrades for packages that need upgrading, as well as new packages that have just come to the repositories.

sudo apt-get update

You can replace the sub-directory in the URL if you are using Debian from ubuntu.

Now that we have the repository ready, we're good to go.

Install PostgreSQL version 13 with TimescaleDB and PostGIS

Installing PostgreSQL 13 can be done on the same host. First, you must make sure things such as the database port are unique. In other words, it has to be different from the current PostgreSQL 11 installed on the same host.

For CentOS/RHEL/Oracle Linux

Run the command below to install PostgreSQL 13 and its dependent packages: 

yum install postgresql13.x86_64 postgresql13-server.x86_64 postgresql13-contrib.x86_64 postgresql13-libs.x86_64  

Then initialize the database cluster and its required collection of databases by running the command below:

$ /usr/pgsql-13/bin/postgresql-13-setup initdb

At this point, there should be two data directories both for PG 11 and PG 13:

[[email protected] ~]# ls -alth /var/lib/pgsql/* -d
drwx------. 4 postgres postgres 51 Sep 22 14:19 /var/lib/pgsql/13
drwx------. 4 postgres postgres 33 Sep 21 18:53 /var/lib/pgsql/11

Now that we're good with PostgreSQL 13 let's install TimescaleDB. We need to make sure that the plugin to be installed is the same version on PostreSQL 11. 

Take note that, in order to make sure that pg_upgrade will work smoothly, the plugins of your source and major version destination should be the same version. This is because pg_upgrade will look for its designated libraries linked to the plugins or extensions that have been loaded or used by your old or source database version of your PostgreSQL. You can verify this in your Enterprise Linux by running showduplicates or by verifying with info just like below either with dnf or yum:

$ yum --showduplicates list timescaledb_13.x86_64 timescaledb-2-postgresql-13.x86_64 timescaledb-2-oss-postgresql-13.x86_64 timescaledb-2-loader-postgresql-13.x86_64|grep '2.3.1'
Repository pgdg-common is listed more than once in the configuration
timescaledb-2-loader-postgresql-13.x86_64  2.3.1-0.el7     timescale_timescaledb
timescaledb-2-oss-postgresql-13.x86_64     2.3.1-0.el7     timescale_timescaledb
timescaledb-2-postgresql-13.x86_64         2.3.1-0.el7     timescale_timescaledb

Or verify it with the info option:

$ yum info timescaledb-2-loader-postgresql-13-2.3.1-0.el7.x86_64 timescaledb-2-postgresql-13-2.3.1-0.el7.x86_64

Now, we're ready to install the TimescaleDB package for the PG 13 version.

$ yum install timescaledb-2-loader-postgresql-13-2.3.1-0.el7.x86_64 timescaledb-2-postgresql-13-2.3.1-0.el7.x86_64

After you’ve installed it, you can try to run timescaledb-tune tool to tune up your postgresql.conf configuration file. Just run the command below:

$  timescaledb-tune --pg-config=/usr/pgsql-13/bin/pg_config

Now, let's install the PostGIS package for the PG 13 version as well.

$ yum install -y postgis31_13.x86_64 

For Ubuntu/Debian

Simply run:

$  apt install postgresql-client-13 postgresql-13

The great thing with Ubuntu/Debian distributions is that there are tools for PostgreSQL that are very handy for managing your PostgreSQL clusters, such as pg_lsclusters,  pg_ctlcluster, etc. 

You can verify your available clusters installed.

$ pg_lsclusters
Ver Cluster Port Status Owner    Data directory              Log file
11  main    5432 online   postgres /var/lib/postgresql/11/main log/postgresql-%Y-%m-%d_%H%M%S.log
13  main    5433 online postgres /var/lib/postgresql/13/main /var/log/postgresql/postgresql-13-main.log

In Ubuntu/Debian, there's no need to change the port since it will be handled during the installation phase and it detects and sets it uniquely, accordingly.

Now, let's install TimescaleDB.

$ apt install timescaledb-2-loader-postgresql-13 timescaledb-2-postgresql-13

Optionally, you can run the timescaledb-tune tool to tune up your postgresql.conf configuration file by simply just invoking the tool as the following:

$ timescaledb-tune

Now, we're ready to install the PostGIS package for PG 13.

$ apt install postgresql-13-postgis-3-scripts postgresql-13-postgis-3

 

Review your postgresql.conf

It's always better to review your postgresql.conf configuration file. In Enterprise Linux versions, you can locate your postgresql.conf either in your data_directory or PGDATA path. Whereas for Ubuntu/Debian, you can locate it  in /etc/postgresql/<version-number>/<cluster-name>/postgresql.conf. Make sure that in your postgresql.conf, the following lines are correctly configured:

shared_preload_libraries = 'pg_stat_statements,timescaledb'     # pg_stat_statements is not required but if you are using ClusterControl, make sure this is appended.
port = 5532   # make sure that the port number is unique than the old version of your PostgreSQL

listen_address = *     # depends on your setup but if you need to specify the available network interfaces to its IP addresses (IPv4 or IPv6) set it accordingly.

It's best practice to compare your old and new versions of your PostgreSQL configuration files to ensure that your postgresql.conf is identical to what is needed and set.

Before proceeding with the next step, we also need to check that your PostgreSQL version 13 is loaded accordingly. Make sure that you have the latest version acquired or installed in your host. Start the database and make sure that it starts and runs correctly.

To start in EL distributions, run the command below:

$ sudo -iu postgres /usr/pgsql-13/bin/pg_ctl -D /var/lib/pgsql/13/data/ -o "-c config_file=/var/lib/pgsql/13/data/postgresql.conf" start

or for Ubuntu/Debian, run the command below:

$ sudo -iu postgres /usr/lib/postgresql/13/bin/pg_ctl -D /var/lib/postgresql/13/main/ -o "-c config_file=/etc/postgresql/13/main/postgresql.conf" start

or use the pg_ctlcluster tool to start, restart, or stop your PG Cluster.

 

When ready, run pg_upgrade…

Before moving on, first make sure that you always have your backup from your old server ready and available. Always take a logical backup and physical backup as good practice before proceeding with a major upgrade.

Now that you're ready, you are good to run pg_upgrade. In practice, you have to initially run pg_upgrade with a check to determine the incompatibility and issues before proceeding to the main procedure of pg_upgrade. Before running the pg_upgrade, make sure that both PG 11 and PG 13 are down while doing this process. That just means downtime is needed for this process.

To do that, run the following command with --check option:

$ sudo -iu postgres /usr/lib/postgresql/13/bin/pg_upgrade -o "-c config_file=/etc/postgresql/11/main/postgresql.conf" --old-datadir=/var/lib/postgresql/11/main/   -O "-c config_file=/etc/postgresql/13/main/postgresql.conf"  --new-datadir=/var/lib/postgresql/13/main/ --old-bindir=/usr/lib/postgresql/11/bin --new-bindir=/usr/lib/postgresql/13/bin --check

Replace the --old-datadir value or config_file accordingly if it's different from your setup.

This shall run compatibility checks just like the result below:

Performing Consistency Checks

-----------------------------
Checking cluster versions                                   ok
Checking database user is the install user                  ok
Checking database connection settings                       ok
Checking for prepared transactions                          ok
Checking for system-defined composite types in user tables  ok
Checking for reg* data types in user tables                 ok
Checking for contrib/isn with bigint-passing mismatch       ok
Checking for tables WITH OIDS                               ok
Checking for invalid "sql_identifier" user columns          ok
Checking for presence of required libraries                 ok
Checking database user is the install user                  ok
Checking for prepared transactions                          ok
Checking for new cluster tablespace directories             ok

*Clusters are compatible*

If all checks signal “ok,” that means a successful check, and the bottom message shows that clusters are compatible, then you should be good to go. 

Finally, run the command again without the --check option: 

$ sudo -iu postgres /usr/lib/postgresql/13/bin/pg_upgrade -o "-c config_file=/etc/postgresql/11/main/postgresql.conf" --old-datadir=/var/lib/postgresql/11/main/   -O "-c config_file=/etc/postgresql/13/main/postgresql.conf"  --new-datadir=/var/lib/postgresql/13/main/ --old-bindir=/usr/lib/postgresql/11/bin --new-bindir=/usr/lib/postgresql/13/bin

Performing Consistency Checks

-----------------------------
Checking cluster versions                                   ok
Checking database user is the install user                  ok
Checking database connection settings                       ok
Checking for prepared transactions                          ok
Checking for system-defined composite types in user tables  ok
Checking for reg* data types in user tables                 ok
Checking for contrib/isn with bigint-passing mismatch       ok
Checking for tables WITH OIDS                               ok
Checking for invalid "sql_identifier" user columns          ok
Creating dump of global objects                             ok
Creating dump of database schemas                           ok
Checking for presence of required libraries                 ok
Checking database user is the install user                  ok
Checking for prepared transactions                          ok
Checking for new cluster tablespace directories             ok

If pg_upgrade fails after this point, you must re-initdb the
new cluster before continuing.
Performing Upgrade

------------------
Analyzing all rows in the new cluster                       ok
Freezing all rows in the new cluster                        ok
Deleting files from new pg_xact                             ok
Copying old pg_xact to new server                           ok
Setting oldest XID for new cluster                          ok
Setting next transaction ID and epoch for new cluster       ok
Deleting files from new pg_multixact/offsets                ok
Copying old pg_multixact/offsets to new server              ok
Deleting files from new pg_multixact/members                ok
Copying old pg_multixact/members to new server              ok
Setting next multixact ID and offset for new cluster        ok
Resetting WAL archives                                      ok
Setting frozenxid and minmxid counters in new cluster       ok
Restoring global objects in the new cluster                 ok
Restoring database schemas in the new cluste                ok
Copying user relation files                                 ok
Setting next OID for new cluster                            ok
Sync data directory to disk                                 ok
Creating script to analyze new cluster                      ok
Creating script to delete old cluster                       ok
Checking for extension updates                              notice
Your installation contains extensions that should be updated
with the ALTER EXTENSION command.  The file
    update_extensions.sql
when executed by psql by the database superuser will update
these extensions.

Upgrade Complete
----------------
Optimizer statistics are not transferred by pg_upgrade so,
once you start the new server, consider running:
    ./analyze_new_cluster.sh

Running this script will delete the old cluster's data files:
    ./delete_old_cluster.sh

Depending on the size of your dataset, it might take a bit of time. In the example command above, it does copy the transaction logs, which are of physical files. However, if your disk size is tight, then you can use the -k or --link option, which uses hard links. If all goes well, it should provide you with messages such as the above notifying you with complete upgrade and notices to update the extensions you might have. In this setup, it goes smoothly. The update_extensions.sql contains only the following:

$ cat update_extensions.sql
\connect postgres
ALTER EXTENSION "pg_stat_statements" UPDATE;

As you noticed, pg_upgrade does a series of actions. It analyzes all rows from the source cluster, copying transaction metadata logs and its multi-transaction status data and the rest.

Once you figured out that everything looks good, then run the analyze_new_cluster.sh script, which shall run

vacuumdb --all --analyze-only.
$ sudo -iu postgres PGPORT=5433 ./analyze_new_cluster.sh

Take note that you should specify the port of your PostgreSQL 13 so that vacuumdb will run correctly to the right target server.

In this case, the upgrade outcome with TimescaleDB and PostGIS extensions enabled goes well. Once all appears in order, make sure to start the server and do a series of tests and checks.

 

Test the upgrade process

Always test and review the upgrade process. Here are a few tables and a user-defined database, which contains hypertables and PostGIS tables that rely on using geometry and geography spatial types and functions.

$ sudo -iu postgres psql -p5433
psql (13.4 (Ubuntu 13.4-1.pgdg20.04+1))
Type "help" for help.

postgres=# \l

                              List of databases

   Name    |  Owner   | Encoding | Collate |  Ctype  |   Access privileges

-----------+----------+----------+---------+---------+-----------------------
 mydb      | postgres | UTF8     | C.UTF-8 | C.UTF-8 |
 postgres  | postgres | UTF8     | C.UTF-8 | C.UTF-8 |
 template0 | postgres | UTF8     | C.UTF-8 | C.UTF-8 | =c/postgres          +
           |          |          |         |         | postgres=CTc/postgres
 template1 | postgres | UTF8     | C.UTF-8 | C.UTF-8 | postgres=CTc/postgres+
           |          |          |         |         | =c/postgres
(4 rows)

postgres=# \c mydb
You are now connected to database "mydb" as user "postgres".

mydb=# set search_path="$user",public;
SET
mydb=# \dt+
                                  List of relations

 Schema |      Name       | Type  |  Owner   | Persistence |    Size    | Description

--------+-----------------+-------+----------+-------------+------------+-------------
 public | conditions      | table | postgres | permanent   | 8192 bytes |
 public | global_points   | table | postgres | permanent   | 16 kB      |
 public | roads           | table | postgres | permanent   | 16 kB      |
 public | sample_table    | table | postgres | permanent   | 8192 bytes |
 public | spatial_ref_sys | table | postgres | permanent   | 6968 kB    |
(5 rows)

Checking some of my PostGIS and hypertables:

mydb=# \d roads
                        Table "public.roads"

   Column   |         Type         | Collation | Nullable | Default
------------+----------------------+-----------+----------+---------
 road_id    | integer              |           |          |
 road_name  | character varying    |           |          |
 roads_geom | geometry(LineString) |           |          |

mydb=# \d sample_table
                                     Table "public.sample_table"

 Column |           Type           | Collation | Nullable |                 Default
--------+--------------------------+-----------+----------+------------------------------------------
 id     | integer                  |           | not null | nextval('sample_table_id_seq'::regclass)
 time   | timestamp with time zone |           | not null |
 name   | character varying        |           | not null |
Indexes:
   "sample_table_pkey" PRIMARY KEY, btree (id, "time")
    "sample_table_time_idx" btree ("time" DESC)
Triggers:
    ts_insert_blocker BEFORE INSERT ON sample_table FOR EACH ROW EXECUTE FUNCTION _timescaledb_internal.insert_blocker()
Number of child tables: 371 (Use \d+ to list them.)

mydb=# \d conditions
                        Table "public.conditions"

   Column    |           Type           | Collation | Nullable | Default
-------------+--------------------------+-----------+----------+---------
 time        | timestamp with time zone |           | not null |
 location    | text                     |           | not null |
 location2   | character(10)            |           | not null |
 temperature | double precision         |           |          |
 humidity    | double precision         |           |          |
Indexes:
    "conditions_time_idx" btree ("time" DESC)
Triggers:
    ts_insert_blocker BEFORE INSERT ON conditions FOR EACH ROW EXECUTE FUNCTION _timescaledb_internal.insert_blocker()
Number of child tables: 366 (Use \d+ to list them.)

mydb=# select count(*) from sample_table;
 count
-------
  2588
(1 row)



mydb=# SELECT name FROM global_points WHERE ST_DWithin(location, 'SRID=4326;POINT(-110 29)'::geography, 1000000);
  name

--------
 Town
 Forest
(2 rows)



mydb=# SELECT n, ST_AsEWKT(ST_GeometryN(the_geom, n)) As geomewkt
mydb-# FROM (
mydb(# VALUES (ST_GeomFromEWKT('MULTIPOINT(1 2 7, 3 4 7, 5 6 7, 8 9 10)') ),
mydb(# ( ST_GeomFromEWKT('MULTICURVE(CIRCULARSTRING(2.5 2.5,4.5 2.5, 3.5 3.5), (10 11, 12 11))') )
mydb(# )As foo(the_geom)
mydb-# CROSS JOIN generate_series(1,100) n
mydb-# WHERE n <= ST_NumGeometries(the_geom);
 n |                geomewkt

---+-----------------------------------------
 1 | POINT(1 2 7)
 1 | CIRCULARSTRING(2.5 2.5,4.5 2.5,3.5 3.5)
 2 | POINT(3 4 7)
 2 | LINESTRING(10 11,12 11)
 3 | POINT(5 6 7)
 4 | POINT(8 9 10)
(6 rows)

Now everything looks ready to serve as my new cluster.

Once you get things going, then you can run:

$ sudo -iu postgres ./delete_old_cluster.sh

Make sure you only run the script since this is a very dangerous script, as it will delete all files in your old PostgreSQL cluster.

Conclusion

pg_upgrade is such a great tool for managing and upgrading your PostgreSQL database server. It can handle complex setups such as with TimescaleDB or PostGIS extensions enabled. Although this tool comes with its limitations, there's no stopping from using it, especially for your DBA work and on production servers apart from QA and development environments.

ClusterControl
The only management system you’ll ever need to take control of your open source database infrastructure.