PostgreSQL Deployment and Maintenance with Ansible

Ashokraj M

Ansible is one of the most well-known and widely-used IT automation tools, it helps us in automating IT operational tasks such as...

  • Bootstrapping the host (VM or bare-metal machine) from scratch
  • Configuring hosts and services
  • Managing software deployments and upgrades
  • Ansible also has the support for orchestrating the cloud infrastructure like creating a bunch of EC2 and RDS instance for your applications on public clouds (AWS, GCP, Azure). More on the cloud provisioning can be found here

Since this blog is mostly about managing PostgreSQL using Ansible, we will not go into detail of Ansible usages, however, we will go through some Ansible basics. I recommend going through the Ansible document link if you would like to learn more about it.

Ansible Basics

Ansible is an open-source project written in python whoses source code is available on GitHub. Since it is a python package we can easily install Ansible using pip.

Ansible needs to be installed on only one host from which we will be orchestrating our operational tasks using Ansible commands (Ansible, Ansible-playbook). We call this orchestration host the Control Node.

Ansible commands uses OpenSSH libraries to login to the target hosts for executing operational tasks, we call these target hosts Managed Node. The Managed Node’s hostname or IP are mentioned in a file which is called Inventory, this inventory filename is then specified as an input to the Ansible commands.

In the inventory file, we can list multiple hosts under one single group, this will avoid repeating the same tasks multiple times for different host. More details on the inventory file usage can be found here.

Since Ansible command uses SSH to login, there is no need to install Ansible on all the host, it only needs to be installed on the Control Node. However, all Control Node and Managed Node should have python and any necessary python libraries installed. More on the Ansible installation can be found here.

For the demo, I will be using a laptop as the control node and guest CentOS-7 VM as the managed node. The CentOS-7 VM was provisioned using Vagrant on the provider VirtualBox.

Installing Ansible on the Control Node

We will install Ansible using pip as referenced in the Ansible document page. The following commands were executed as “Ansible” user.

$ curl https://bootstrap.pypa.io/get-pip.py -o get-pip.py
$ python get-pip.py --user

Using --user option, makes pip and Ansible commands to be installed under the HOME directory and we need to add the bin path to our PATH environment variable.

$ echo 'export PATH=$HOME/Library/Python/2.7/bin:$PATH' >> ~/.bash_profile
$ source ~/.bash_profile

The following pip command has installed Ansible version 2.8.0 (which is the latest stable version at the time of writing this blog.)

$ pip install --user ansible 
$ which ansible
/Users/Ansible/Library/Python/2.7/bin/Ansible
$ ansible --version
Ansible 2.8.0
...
... 

Control Node and Managed Node Prechecks

Make sure you have a proper network connection between the control node and the managed node.

Check your Firewall for any rules which may block inbound and outbound connections on the SSH port, if so open the SSH port to provide access on both the control and managed nodes.

First, try to connect via SSH to the managed node. You should be able to login to the managed node from the control node.

You can setup the password-less SSH access to the managed nodes according to your organization security policies. For this demo, I have password-less configured for SSH to my managed node “pg01” (CentOS-7) for the user “vagrant”. This makes the managed node have the sudo power, most of the installation and host configurations tasks will be executed as “vagrant” user with “sudo”.

On the control node, we have the configuration file ansible.cfg which will be used by the Ansible commands. Below are some configuration options which are defined in the config file. To learn more on the other configuration options available, check out the sample configuration file.

  • remote_port - If the SSH server on the managed node runs on a different port other than the default port 22, we can change it
  • remote_user - The login username that will be used by Ansible to connect the managed node, for running the tasks
  • private_key_file - SSH private key which will be used for Ansible to login

Since the above said configuration applies globally for all the managed nodes, if we want to have a different configuration for a specific host or host group we can specify them in the inventory file. You can see an example of this below in the “development.yaml” inventory file.

Performing an Ansible Dry Run

Create an inventory file “development.yaml” as shown below.

$ pwd
/Users/Ansible/postgres-setup

$ cat development.yaml 
all:
  hosts:
  children:
    postgres_clusters:
      hosts:
        pg01:
      vars: 
        ansible_port: 22
        ansible_user: "vagrant"
        ansible_private_key_file: "/Users/Ansible/postgres-setup/private_key"

In the inventory file above host pg01 is one of the members of the host group postgres_clusters. The variables ansible_port, ansible_user, and ansible_private_key_file only apply to the hosts under the group postgres_clusters.

We will now check to see if Ansible can run the tasks on the managed node. In the example below the ansible command executes the module ping on the managed node pg01, if Ansible was able to run the module ping then you should see SUCCESS as the response.

$ ansible -i development.yaml -m ping pg01
pg01 | SUCCESS => {
    "ansible_facts": {
        "discovered_interpreter_python": "/usr/bin/python"
    }, 
    "changed": false, 
    "ping": "pong"
}

When running Ansible on the managed node as the first task, it collects the info like hostname, IP address, the memory of the managed node. To check these we can call the module setup which would return a large JSON. We can make use of any of these in our Ansible playbook.

$ ansible -i development.yaml -m setup pg01 
pg01 | SUCCESS => {
    "ansible_facts": {
        "ansible_all_ipv4_addresses": [
            "192.168.100.4", 
            "10.0.2.15"
        ], 
        "ansible_all_ipv6_addresses": [
            "fe80::a00:27ff:fe29:ac89", 
            "fe80::5054:ff:fe26:1060"
        ],

Ansible Role

Ansible Role is a way of bundling a certain set of related tasks and configuration settings into one single unit just by assigning a role to a particular host or host group. Ansible will apply all the related configuration and tasks. This avoids repeating the tasks multiple times for each different host or host group.

Each role is represented as a directory and within the role directory there will be subdirectories like defaults files, handlers, meta, tasks, templates, tests, vars. The purpose of these directories can be found here.

Ansible commands, by default, search the role directory under the paths mentioned in DEFAULT_ROLES_PATH.

$ ansible-config list | grep -A2 '^DEFAULT_ROLES_PATH'
DEFAULT_ROLES_PATH:
  default: ~/.Ansible/roles:/usr/share/Ansible/roles:/etc/Ansible/roles
  description: Colon separated paths in which Ansible will search for Roles.

Ansible Galaxy

Ansible Galaxy is a portal where the community folks share the GitHub repository of their Ansible Roles. We can browse through the galaxy portal for the required Ansible roles. Using the command ansible-galaxy, we could download and reuse the role. Before using a role, go through in detail all the Ansible YAML files under the directories defaults, vars, tasks, templates, handlers and be aware of how the role works.

For our deployment of PostgreSQL, we will use the role “postgresql” developed by the author ANXS and GitHub repo.

Installing Ansible role “anxs.postgresql”

$ ansible-galaxy  install anxs.postgresql
- downloading role 'postgresql', owned by anxs
- downloading role from https://github.com/ANXS/postgresql/archive/v1.10.1.tar.gz
- extracting anxs.postgresql to /Users/ansible/.Ansible/roles/anxs.postgresql
- anxs.postgresql (v1.10.1) was installed successfully

The above command installs the role directory ”anxs.postgresql” under the directory “/Users/ansible/.Ansible/roles,” this is one of the directories in the DEFAULT_ROLES_PATH and the ansible command will search this directory for any roles.

Ansible Playbook

An Ansible Playbook is a YAML file in which we will list out the tasks or roles which have to be executed on a particular host or the host group. You can read more on developing playbooks as well as learn the definition of tags like hosts, tasks, roles, vars here.

By default, all the tasks are executed as the ansible user which has logged in. In order to execute particular tasks with a different user (or with ‘root’ privilege) we can make use of become. How to use this command can be found here.

In the playbook below (postgres-play.yaml), I have listed the role “anxs.postgresql” under the host group “postgres_clusters”, so all the tasks in the role anxs.postgresql will be executed for all the hosts under the group “postgres_clusters”.

$ cat postgres-play.yaml 
---
- hosts: postgres_clusters
  become: yes
  roles: 
    - role: anxs.postgresql

become: yes in the YAML defines that this role will be executed with higher privilege by using the DEFAULT_BECOME_METHOD “sudo”

$ ansible-config list | grep -A2 '^DEFAULT_BECOME_METHOD'
DEFAULT_BECOME_METHOD:
  default: sudo
  description: Privilege escalation method to use when `become` is enabled.

We will be running this playbook as the user “vagrant” and the user was already provisioned with sudo power.

[[email protected] ~]$ sudo cat /etc/sudoers.d/vagrant
%vagrant ALL=(ALL) NOPASSWD: ALL
Severalnines
 
DevOps Guide to Database Management
Learn about what you need to know to automate and manage your open source databases

Deploying PostgreSQL Using Ansible

We will now run the playbook ‘postgres-play.yaml’ which will install all PostgreSQL related packages and configure it using the default settings.

For this example, Ansible will install PostgreSQL 9.6 on port 5432, with the postgres max_connections set to 100. All the default settings can be found in the file /Users/ansible/.Ansible/roles/anxs.postgresql/defaults/main.yml.

$ grep -E '^postgresql_(version|port|max_connections):' ~/.Ansible/roles/anxs.postgresql/defaults/main.yml 
postgresql_version: 9.6
postgresql_port: 5432
postgresql_max_connections: 100

Running the playbook

$ ansible-playbook -i development.yaml postgres-play.yaml
PLAY [postgres_clusters] ***************************************************************************************************************************************************************************************

TASK [Gathering Facts] *****************************************************************************************************************************************************************************************
ok: [pg01]
...
...


PLAY RECAP *****************************************************************************************************************************************************************************************************
pg01                       : ok=21   changed=14   unreachable=0    failed=0    skipped=32   rescued=0    ignored=0 

Once the Ansible has executed all the tasks, a summary of the task executions will be shown under PLAY RECAP.

  • ok=21, 21 tasks executed with no changes.
  • changed=14, 14 tasks has made changes to the host, like installing postgres, creating directories, files, starting the postgres.
  • skipped=32, 32 tasks has been skipped, may be due to some feature was not enabled. Since we are installing on entOS, Ubuntu related tasks were skipped.

Check PostgreSQL service status and configuration.

[[email protected] ~]$ systemctl status postgresql-9.6
● postgresql-9.6.service - PostgreSQL 9.6 database server
   Loaded: loaded (/usr/lib/systemd/system/postgresql-9.6.service; enabled; vendor preset: disabled)
  Drop-In: /etc/systemd/system/postgresql-9.6.service.d
           └─custom.conf
   Active: active (running) since Wed 2019-05-29 07:15:25 UTC; 24min ago
     Docs: https://www.postgresql.org/docs/9.6/static/
  Process: 7559 ExecStartPre=/usr/pgsql-9.6/bin/postgresql96-check-db-dir /var/lib/pgsql/9.6/data (code=exited, status=0/SUCCESS)
 Main PID: 7564 (postmaster)
   CGroup: /system.slice/postgresql-9.6.service
           ├─7564 /usr/pgsql-9.6/bin/postmaster -D /etc/postgresql/9.6/data
           ├─7567 postgres: checkpointer process   
           ├─7568 postgres: writer process   
           ├─7569 postgres: wal writer process   
           ├─7570 postgres: autovacuum launcher process   
           └─7571 postgres: stats collector process   

[[email protected] ~]$ psql -U postgres
psql (9.6.13)
Type "help" for help.

postgres=# show max_connections ;
 max_connections 
-----------------
 100
(1 row)

postgres=# show statement_timeout ;
 statement_timeout 
-------------------
 
(1 row)

postgres=# show log_min_duration_statement ;
 log_min_duration_statement 
----------------------------
 -1
(1 row)

We have now installed PostgreSQL on the managed host “pg01” using the default configuration.

Changing the PostgreSQL Configuration

Now we will re-configure the PostgreSQL instance using our custom settings.

I have created custom.yaml file (as shown below) which has the list of variables defined to modify PostgreSQL settings like listen_addresses, max_connections, wal_level, hot_standby, statement_timeout, log_checkpoint, log_lock_waits, log_destination, log_min_duration_statement.

$ pwd
/Users/ansible/postgres-setup
$ cat custom.yaml 
postgresql_listen_addresses: "*"
postgresql_max_connections: 300
postgresql_wal_level: "hot_standby"
postgresql_hot_standby: "on"
postgresql_statement_timeout: 60000
postgresql_log_lock_waits: "on"
postgresql_log_destination: "csvlog"
postgresql_log_min_duration_statement: 0

We will now change our playbook postgres-play.yaml to use this custom.yaml.

$ cat postgres-play.yaml  
---
- hosts: postgres_clusters
  become: yes
  vars_files:
    - ./custom.yaml
  roles: 
    - role: anxs.postgresql

Using vars_files tags, I have specified the custom configuration file custom.yaml, which will override the default configuration specified in the role anxs.postgresql. More details on variable precedence can be found here.

We could now re-run the same ansible-playbook command which we had executed previously, but this will execute all the tasks like installing PostgreSQL, configuring, creating users and databases. For this we should restrict Ansible to only execute the tasks related to PostgreSQL configuration by using --tags <TAG NAME> option.

In order to know the list of supported tags, we could run the command with --list-tags.

$ ansible-playbook -i development.yaml postgres-play.yaml --list-tags
playbook: postgres-play.yaml
  play #1 (postgres_clusters): postgres_clusters        TAGS: []
      TASK TAGS: [always, postgresql, postgresql-configure, postgresql-databases, postgresql-extensions, postgresql-install, postgresql-monit, postgresql-users]

From the above tags we will specify only postgresql-configure tag to modify the postgresql settings.

$ ansible-playbook  -i development.yaml postgres-play.yaml --tags postgresql-configure

PLAY [postgres_clusters] ***************************************************************************************************************************************************************************************

TASK [Gathering Facts] *****************************************************************************************************************************************************************************************
ok: [pg01]
...
...

TASK [anxs.postgresql : PostgreSQL | Update configuration - pt. 2 (postgresql.conf)] ***************************************************************************************************************************
changed: [pg01]
...
...
TASK [anxs.postgresql : PostgreSQL | Reload all conf files] ****************************************************************************************************************************************************
changed: [pg01]

PLAY RECAP *****************************************************************************************************************************************************************************************************
pg01                       : ok=13   changed=2    unreachable=0    failed=0    skipped=6    rescued=0    ignored=0

As you see in the PLAY RECAP, only 2 changes has propagated to the managed node pg01. First one is updating the configuration and second one is reloading the configurations.

Verify the configuration changes have taken effect on the managed node.

postgres=# show listen_addresses ;
 listen_addresses
------------------
 localhost
(1 row)

postgres=# show max_connections ;
 max_connections 
-----------------
 100
(1 row)

postgres=# show wal_level ;
 wal_level 
-----------
 minimal
(1 row)

postgres=# show hot_standby ;
 hot_standby 
-------------
 off
(1 row)

postgres=# show statement_timeout;
 statement_timeout 
-------------------
 1min
(1 row)

postgres=# show log_lock_waits ;
 log_lock_waits 
----------------
 on
(1 row)

postgres=# show log_destination ;
 log_destination 
-----------------
 csvlog
(1 row)

postgres=# show log_min_duration_statement;
 log_min_duration_statement 
----------------------------
 
(1 row)

As you can see, some configurations change like listen_addresses, max_connections, wal_level, hot_standby have not yet taken effect. These configuration changes need a PostgreSQL restart and the role anxs.postgresql has only reloaded the service itself.

In order to avoid an abrupt restart of PostgreSQL during production hours, the original author may not have added the restart task to the role. We can manually restart the postgresql service, during the scheduled downtime.

[[email protected] ~]$ sudo systemctl restart postgresql-9.6

[[email protected] ~]$ psql -U postgres
psql (9.6.13)

postgres=# show listen_addresses ;
 listen_addresses 
------------------
 
(1 row)

postgres=# show max_connections ;
 max_connections 
-----------------
 300
(1 row)

postgres=# show wal_level;
 wal_level 
-----------
 replica
(1 row)

postgres=# show hot_standby;
 hot_standby 
-------------
 on
(1 row)

Creating PostgreSQL Users & Databases

We will now create the users “app1” and “app2” and the databases “app1_db” and “app2_db” owned by the users “app1” and “app2” respectively.

I have added two new variables, postgresql_users and postgresql_database to custom.yaml, which has the list of users and databases which need to be created. Role anxs.postgresql uses the Ansible module postgresql_users and postgresql_db for creating the user and the database. You can refer to these documents to add the variables.

$ cat custom.yaml 
...
...
postgresql_users:
  - name: app1
    pass: md5bb0592c05941d14c231da96950c71b60
    encrypted: yes
  - name: app2
    pass: md5bbb1e4d09b64ca54a237727af46cba7c
    encrypted: yes

postgresql_databases:
  - name: app1_db
    owner: app1 
  - name: app2_db
    owner: app2 

We will now run only the tasks associated with the tags postgresql-users and postgresql-databases.

$ ansible-playbook -i development.yaml postgres-play.yaml --tags postgresql-users,postgresql-databases

PLAY [postgres_clusters] ***************************************************************************************************************************************************************************************
...
...
TASK [anxs.postgresql : PostgreSQL | Make sure the PostgreSQL users are present] *******************************************************************************************************************************
changed: [pg01] => (item=None)
changed: [pg01] => (item=None)
changed: [pg01]
...
...
TASK [anxs.postgresql : PostgreSQL | Make sure the PostgreSQL databases are present] ***************************************************************************************************************************
changed: [pg01] => (item={u'owner': u'app1', u'name': u'app1_db'})
changed: [pg01] => (item={u'owner': u'app2', u'name': u'app2_db'})
...
...
PLAY RECAP *****************************************************************************************************************************************************************************************************
pg01                       : ok=6    changed=2    unreachable=0    failed=0    skipped=9    rescued=0    ignored=0

Verify the users and databases are created on the managed host.

postgres=# \du
                                   List of roles
 Role name |                         Attributes                         | Member of 
-----------+------------------------------------------------------------+-----------
 app1      |                                                            | {}
 app2      |                                                            | {}
 postgres  | Superuser, Create role, Create DB, Replication, Bypass RLS | {}

postgres=# \l
                                  List of databases
   Name    |  Owner   | Encoding |   Collate   |    Ctype    |   Access privileges   
-----------+----------+----------+-------------+-------------+-----------------------
 app1_db   | app1     | UTF8     | en_US.UTF-8 | en_US.UTF-8 | 
 app2_db   | app2     | UTF8     | en_US.UTF-8 | en_US.UTF-8 | 
 postgres  | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 | 
 template0 | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =c/postgres          +
           |          |          |             |             | postgres=CTc/postgres
 template1 | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =c/postgres          +
           |          |          |             |             | postgres=CTc/postgres
(5 rows)

Allowing External Hosts to Connect to the PostgreSQL Server

We will now allow the external hosts to connect the PostgreSQL service by adding the variable postgresql_pg_hba_custom to custom.yaml

$ cat custom.yaml
...
...
postgresql_pg_hba_custom:
  - {type: "host", database: "all", user: "all", address: "0.0.0.0/0", method: "md5" }

Running the tasks tagged with postgresql-configure, to apply the configuration.

$ ansible-playbook -i development.yaml postgres-play.yaml --tags postgresql-configure

Verifying if I am able to connect to the PostgreSQL server from my control node.

$ PGPASSWORD=password psql -h pg01 -U app1 -d app1_db -c 'Select true'
 bool
------
 
(1 row)

Conclusion

This blog should give you the basics you need to know to use Ansible for deploying and managing PostgreSQL. We have, however, only covered a few PostgreSQL administration tasks. Depending upon on your organizations infrastructure, you may need to override several of the default configurations and add even more tasks to the Ansible role.

ClusterControl
The only management system you’ll ever need to take control of your open source database infrastructure.