blog

Multitenancy Options for PostgreSQL

Achilleas Mantzios

Published: August 28, 2018
Last Updated: May 4, 2022

Multi-tenancy in a software system is called the separation of data according to a set of criteria in order to satisfy a set of objectives. The magnitude/extend, the nature and the final implementation of this separation is dependent on those criteria and objectives. Multi-tenancy is basically a case of data partitioning but we’ll try to avoid this term for the obvious reasons (the term in PostgreSQL has a very specific meaning and is reserved, as declarative table partitioning was introduced in postgresql 10).

The criteria might be:

according to the id of an important master table, which symbolizes the tenant id which might represent:
1. a company/organization within a larger holding group
2. a department within a company/organization
3. a regional office/branch of the same company/organization
according to a user’s location/IP
according to a user’s position inside the company/organization

The objectives might be:

separation of physical or virtual resources
separation of system resources
security
accuracy and convenience of management/users at the various levels of the company/organization

Note by fulfilling an objective we also fulfill all the objectives beneath, i.e. by fulfilling A we also fulfill B, C and D, by fulfilling B we also fulfill C and D, and so forth.

If we want to fulfill objective A we may choose to deploy each tenant as a separate database cluster within its own physical/virtual server. This gives maximum separation of resources and security but gives poor results when we need to see the whole data as one, i.e. the consolidated view of the whole system.

If we want to only achieve objective B we might deploy each tenant as a separate postgresql instance in the same server. This would give us control over how much space would be assigned to each instance, and also some control (depending on the OS) on CPU/mem utilization. This case is not essentially different than A. In the modern cloud computing era, the gap between A and B tends to get smaller and smaller, so that A will be most probably the prefered way over B.

If we want to achieve objective C, i.e. security, then it is enough to have one database instance and deploy each tenant as a separate database.

And finally if we care only for “soft” separation of data, or in other words different views of the same system, we can achieve this by just one database instance and one database, using a plethora of techniques discussed below as the final (and major) topic of this blog. Talking about multi-tenancy, from the DBA’s perspective, cases A, B and C bear a lot of similarities. This is since in all cases we have different databases and in order to bridge those databases, then special tools and technologies must be used. However, if the need to do so comes from the analytics or Business Intelligence departments then no bridging maybe needed at all, since the data could be very well replicated to some central server dedicated to those tasks, making bridging unnecessary. If indeed such a bridging is needed then we must use tools like dblink or foreign tables. Foreign tables via Foreign Data Wrappers is nowadays the preferred way.

If we use option D, however, then consolidation is already given by default, so now the hard part is the opposite: separation. So we may generally categorize the various options into two main categories:

Soft separation
Hard separation

Hard Separation via Different Databases in Same Cluster

Let’s suppose that we have to design a system for an imaginary business offering car and boat rentals, but because those two are governed by different legislations, different controls, audits, each company must maintain separate accounting departments and thus we would like to keep their systems separated. In this case we choose to have a different database for each company: rentaldb_cars and rentaldb_boats, which will have identical schemas:

# d customers
                                  Table "public.customers"
   Column    |     Type      | Collation | Nullable |                Default                
-------------+---------------+-----------+----------+---------------------------------------
 id          | integer       |           | not null | nextval('customers_id_seq'::regclass)
 cust_name   | text          |           | not null |
 birth_date  | date          |           |          |
 sex         | character(10) |           |          |
 nationality | text          |           |          |
Indexes:
    "customers_pkey" PRIMARY KEY, btree (id)
Referenced by:
    TABLE "rental" CONSTRAINT "rental_customerid_fkey" FOREIGN KEY (customerid) REFERENCES customers(id)

# d rental
                              Table "public.rental"
   Column   |  Type   | Collation | Nullable |              Default               
------------+---------+-----------+----------+---------------------------------
 id         | integer |           | not null | nextval('rental_id_seq'::regclass)
 customerid | integer |           | not null |
 vehicleno  | text    |           |          |
 datestart  | date    |           | not null |
 dateend    | date    |           |          |
Indexes:
    "rental_pkey" PRIMARY KEY, btree (id)
Foreign-key constraints:
    "rental_customerid_fkey" FOREIGN KEY (customerid) REFERENCES customers(id)

Lets suppose we have the following rentals. In rentaldb_cars:

rentaldb_cars=# select cust.cust_name,rent.vehicleno,rent.datestart FROM rental rent JOIN customers cust on (rent.customerid=cust.id);
    cust_name    | vehicleno | datestart  
-----------------+-----------+------------
 Valentino Rossi | INI 8888  | 2018-08-10
(1 row)

and in rentaldb_boats:

rentaldb_boats=# select cust.cust_name,rent.vehicleno,rent.datestart FROM rental rent JOIN customers cust on (rent.customerid=cust.id);
   cust_name    | vehicleno | datestart  
----------------+-----------+------------
 Petter Solberg | INI 9999  | 2018-08-10
(1 row)

Now the management would like to have a consolidated view of the system, e.g. a unified way to view the rentals. We may solve this via the application, but if we don’t want to update the application or don’t have access to the source code, then we might solve this by creating a central database rentaldb and by making use of foreign tables, as follows:

CREATE EXTENSION IF NOT EXISTS postgres_fdw WITH SCHEMA public;
CREATE SERVER rentaldb_boats_srv FOREIGN DATA WRAPPER postgres_fdw OPTIONS (
    dbname 'rentaldb_boats'
);
CREATE USER MAPPING FOR postgres SERVER rentaldb_boats_srv;
CREATE SERVER rentaldb_cars_srv FOREIGN DATA WRAPPER postgres_fdw OPTIONS (
    dbname 'rentaldb_cars'
);
CREATE USER MAPPING FOR postgres SERVER rentaldb_cars_srv;
CREATE FOREIGN TABLE public.customers_boats (
    id integer NOT NULL,
    cust_name text NOT NULL
)
SERVER rentaldb_boats_srv
OPTIONS (
    table_name 'customers'
);
CREATE FOREIGN TABLE public.customers_cars (
    id integer NOT NULL,
    cust_name text NOT NULL
)
SERVER rentaldb_cars_srv
OPTIONS (
    table_name 'customers'
);
CREATE VIEW public.customers AS
 SELECT 'cars'::character varying(50) AS tenant_db,
    customers_cars.id,
    customers_cars.cust_name
   FROM public.customers_cars
UNION
 SELECT 'boats'::character varying AS tenant_db,
    customers_boats.id,
    customers_boats.cust_name
   FROM public.customers_boats;
CREATE FOREIGN TABLE public.rental_boats (
    id integer NOT NULL,
    customerid integer NOT NULL,
    vehicleno text NOT NULL,
    datestart date NOT NULL
)
SERVER rentaldb_boats_srv
OPTIONS (
    table_name 'rental'
);
CREATE FOREIGN TABLE public.rental_cars (
    id integer NOT NULL,
    customerid integer NOT NULL,
    vehicleno text NOT NULL,
    datestart date NOT NULL
)
SERVER rentaldb_cars_srv
OPTIONS (
    table_name 'rental'
);
CREATE VIEW public.rental AS
 SELECT 'cars'::character varying(50) AS tenant_db,
    rental_cars.id,
    rental_cars.customerid,
    rental_cars.vehicleno,
    rental_cars.datestart
   FROM public.rental_cars
UNION
 SELECT 'boats'::character varying AS tenant_db,
    rental_boats.id,
    rental_boats.customerid,
    rental_boats.vehicleno,
    rental_boats.datestart
   FROM public.rental_boats;

In order to view all the rentals and the customers in the whole organization we simply do:

rentaldb=# select cust.cust_name, rent.* FROM rental rent JOIN customers cust ON (rent.tenant_db=cust.tenant_db AND rent.customerid=cust.id);
    cust_name    | tenant_db | id | customerid | vehicleno | datestart  
-----------------+-----------+----+------------+-----------+------------
 Petter Solberg  | boats     |  1 |          1 | INI 9999  | 2018-08-10
 Valentino Rossi | cars      |  1 |          2 | INI 8888  | 2018-08-10
(2 rows)

This looks good, isolation and security are guaranteed, consolidation is achieved, but still there are problems:

customers must be separately maintained, meaning that the same customer might end up with two accounts
The application must respect the notion of a special column (such as tenant_db) and append this to every query, making it prone to errors
The resulting views are not automatically updatable (since they contain UNION)

Soft Separation in the Same Database

When this approach is chosen then consolidation is given out of the box and now the hard part is separation. PostgreSQL offers a plethora of solutions to us in order to implement separation:

Views
Role Level Security
Schemas

With views, the application must set a queryable setting such as application_name, we hide the main table behind a view, and then in every query on any of the children (as in FK dependency) tables, if any, of this main table join with this view. We will see this in the following example in a database we call rentaldb_one. We embed the tenant company identification into the main table:

rentaldb_one=# d rental_one
                                   Table "public.rental_one"
   Column   |         Type          | Collation | Nullable |              Default               
------------+-----------------------+-----------+----------+------------------------------------
 company    | character varying(50) |           | not null |
 id         | integer               |           | not null | nextval('rental_id_seq'::regclass)
 customerid | integer               |           | not null |
 vehicleno  | text                  |           |          |
 datestart  | date                  |           | not null |
 dateend    | date                  |           |          |
Indexes:
    "rental_pkey" PRIMARY KEY, btree (id)
Check constraints:
    "rental_company_check" CHECK (company::text = ANY (ARRAY['cars'::character varying, 'boats'::character varying]::text[]))
Foreign-key constraints:
    "rental_customerid_fkey" FOREIGN KEY (customerid) REFERENCES customers(id)

PostgreSQL Management & Automation with ClusterControl

Learn about what you need to know to deploy, monitor, manage and scale PostgreSQL

Download the Whitepaper

The table customers’ schema remains the same. Lets see the current contents of the database:

rentaldb_one=# select * from customers;
 id |    cust_name    | birth_date | sex | nationality
----+-----------------+------------+-----+-------------
  2 | Valentino Rossi | 1979-02-16 |     |
  1 | Petter Solberg  | 1974-11-18 |     |
(2 rows)

rentaldb_one=# select * from rental_one ;
 company | id | customerid | vehicleno | datestart  | dateend
---------+----+------------+-----------+------------+---------
 cars    |  1 |          2 | INI 8888  | 2018-08-10 |
 boats   |  2 |          1 | INI 9999  | 2018-08-10 |
(2 rows)

We use the new name rental_one in order to hide this behind the new view which will have the same name of the table that the application expects: rental.The application will need to set the application name to denote the tenant. So in this example we will have three instances of the application, one for cars, one for boats and one for the top management. The application name is set like:

rentaldb_one=# set application_name to 'cars';

We now create the view:

create or replace view rental as select company as "tenant_db",id,customerid,vehicleno,datestart,dateend from rental_one where (company = current_setting('application_name') OR current_setting('application_name')='all');

Note: We keep the same columns, and table/view names as possible, the key point in multi-tenant solutions is to keep things same in the application side, and changes to be minimal and manageable.

Let’s do some selects:

rentaldb_one=# set application_name to ‘cars’;

rentaldb_one=# set application_name to 'cars';
SET
rentaldb_one=# select * from rental;
 tenant_db | id | customerid | vehicleno | datestart  | dateend
-----------+----+------------+-----------+------------+---------
 cars      |  1 |          2 | INI 8888  | 2018-08-10 |
(1 row)

rentaldb_one=# set application_name to 'boats';
SET
rentaldb_one=# select * from rental;
 tenant_db | id | customerid | vehicleno | datestart  | dateend
-----------+----+------------+-----------+------------+---------
 boats     |  2 |          1 | INI 9999  | 2018-08-10 |
(1 row)

rentaldb_one=# set application_name to 'all';
SET
rentaldb_one=# select * from rental;
 tenant_db | id | customerid | vehicleno | datestart  | dateend
-----------+----+------------+-----------+------------+---------
 cars      |  1 |          2 | INI 8888  | 2018-08-10 |
 boats     |  2 |          1 | INI 9999  | 2018-08-10 |
(2 rows)

The 3rd instance of the application which must set application name to “all” is intended for use by the top management with a view to the whole database.

A more robust solution, security-wise, may be based on RLS (row level security). First we restore the name of the table, remember we don’t want to disturb the application:

rentaldb_one=# alter view rental rename to rental_view;
rentaldb_one=# alter table rental_one rename TO rental;

First we create the two groups of users for each company (boats, cars) which must see their own subset of the data:

rentaldb_one=# create role cars_employees;
rentaldb_one=# create role boats_employees;

We now create security policies for each group:

rentaldb_one=# create policy boats_plcy ON rental to boats_employees USING(company='boats');
rentaldb_one=# create policy cars_plcy ON rental to cars_employees USING(company='cars');

After giving the required grants to the two roles:

rentaldb_one=# grant ALL on SCHEMA public to boats_employees ;
rentaldb_one=# grant ALL on SCHEMA public to cars_employees ;
rentaldb_one=# grant ALL on ALL tables in schema public TO cars_employees ;
rentaldb_one=# grant ALL on ALL tables in schema public TO boats_employees ;

we create one user in each role

rentaldb_one=# create user boats_user password 'boats_user' IN ROLE boats_employees;
rentaldb_one=# create user cars_user password 'cars_user' IN ROLE cars_employees;

And test:

postgres@smadev:~> psql -U cars_user rentaldb_one
Password for user cars_user:
psql (10.5)
Type "help" for help.

rentaldb_one=> select * from rental;
 company | id | customerid | vehicleno | datestart  | dateend
---------+----+------------+-----------+------------+---------
 cars    |  1 |          2 | INI 8888  | 2018-08-10 |
(1 row)

rentaldb_one=> q
postgres@smadev:~> psql -U boats_user rentaldb_one
Password for user boats_user:
psql (10.5)
Type "help" for help.

rentaldb_one=> select * from rental;
 company | id | customerid | vehicleno | datestart  | dateend
---------+----+------------+-----------+------------+---------
 boats   |  2 |          1 | INI 9999  | 2018-08-10 |
(1 row)

rentaldb_one=>

The nice thing with this approach is that we don’t need many instances of the application. All the isolation is done at the database level based on the user’s roles. Therefore in order to create a user in the top management all we need to do is grant this user both roles:

rentaldb_one=# create user all_user password 'all_user' IN ROLE boats_employees, cars_employees;
postgres@smadev:~> psql -U all_user rentaldb_one
Password for user all_user:
psql (10.5)
Type "help" for help.

rentaldb_one=> select * from rental;
 company | id | customerid | vehicleno | datestart  | dateend
---------+----+------------+-----------+------------+---------
 cars    |  1 |          2 | INI 8888  | 2018-08-10 |
 boats   |  2 |          1 | INI 9999  | 2018-08-10 |
(2 rows)

Looking at those two solutions we see that the view solution requires changing the basic table name, which may be pretty intrusive in that we may need to run exactly the same schema in a non multitenant solution, or with an app that is not aware of application_name, while the second solution binds people to specific tenants. What if the same person works e.g. on the boats tenant in the morning and on the cars tenant in the afternoon? We will see a 3rd solution based on schemas, which in my opinion is the most versatile, and does not suffer of any of the caveats of the two solutions described above. It allows the application to run in a tenant-agnostic manner, and the system engineers to add tenants on the go as needs arise. We will keep the same design as before, with the same test data (we will keep working on the rentaldb_one example db). The idea here is to add a layer in front of the main table in the form of a database object in a separate schema which will be early enough in the search_path for that specific tenant. The search_path can be set (ideally via a special function, which gives more options) in the connection configuration of the data source at the application server layer (therefore outside of the application code). First we create the two schemas:

rentaldb_one=# create schema cars;
rentaldb_one=# create schema boats;

Then we create the database objects (views) in each schema:

CREATE OR REPLACE VIEW boats.rental AS
 SELECT rental.company,
    rental.id,
    rental.customerid,
    rental.vehicleno,
    rental.datestart,
    rental.dateend
   FROM public.rental
  WHERE rental.company::text = 'boats';

CREATE OR REPLACE VIEW cars.rental AS
 SELECT rental.company,
    rental.id,
    rental.customerid,
    rental.vehicleno,
    rental.datestart,
    rental.dateend
   FROM public.rental
  WHERE rental.company::text = 'cars';

Next step is to set the search path in each tenant as follows:

For the boats tenant:

set search_path TO 'boats, "$user", public';

For the cars tenant:

set search_path TO 'cars, "$user", public';

For the top mgmt tenant leave it at default

Lets test:

rentaldb_one=# select * from rental;
 company | id | customerid | vehicleno | datestart  | dateend
---------+----+------------+-----------+------------+---------
 cars    |  1 |          2 | INI 8888  | 2018-08-10 |
 boats   |  2 |          1 | INI 9999  | 2018-08-10 |
(2 rows)

rentaldb_one=# set search_path TO 'boats, "$user", public';
SET
rentaldb_one=# select * from rental;
 company | id | customerid | vehicleno | datestart  | dateend
---------+----+------------+-----------+------------+---------
 boats   |  2 |          1 | INI 9999  | 2018-08-10 |
(1 row)

rentaldb_one=# set search_path TO 'cars, "$user", public';
SET
rentaldb_one=# select * from rental;
 company | id | customerid | vehicleno | datestart  | dateend
---------+----+------------+-----------+------------+---------
 cars    |  1 |          2 | INI 8888  | 2018-08-10 |
(1 row)

Instead of set search_path we may write a more complex function to handle more complex logic and call this in the connection configuration of our application or connection pooler.

In the example above we used the same central table residing on public schema (public.rental) and two additional views for each tenant, using the fortunate fact that those two views are simple and therefore writeable. Instead of views we may use inheritance, by creating one child table for each tenant inheriting from the public table. This is a fine match for table inheritance, a unique feature of PostgreSQL. The top table might be configured with rules to disallow inserts. In the inheritance solution a conversion would be needed to populate the children tables and to prevent insert access to the parent table, so this is not as simple as in the case with views, which works with minimal impact to the design. We might write a special blog on how to do that.

The above three approaches may be combined to give even more options.

Why Cloud Repatriation Matters Now More Than Ever

Automating Day 2 operations: Scaling, upgrades and maintenance

PostgreSQL Bi-Directional Logical Replication — A Deep Dive

A MySQL backup and recovery tutorial on K8s using the MySQL Operator for Kubernetes