Challenges of Scaling the Moodle PostgreSQL Database

Sebastian Insausti

Moodle is the most popular Learning Management System (LMS) allowing educators to create their own website with courses or content that extend learning. These kinds of platforms are getting more and more important to allow you to continue learning remotely when the traditional education system is not available or just as a complement of it, so the increase of traffic or users require to scale your environment to ensure a low response time.

Scalability is the property of a system/database to handle a growing amount of demands by adding resources. There are different approaches available to scale your database depending on the way that you need to scale it, and, in a production environment, most probably long downtime is not desired, so you should also take this into account.

In this blog, we will look at what scale options are available and how to scale your Moodle PostgreSQL database in an easy way without affecting your running system.

Horizontal Scaling and Vertical Scaling

There are two main ways to scale your database:

  • Horizontal Scaling (scale-out): It is performed by adding more database nodes creating or increasing a database cluster.
  • Vertical Scaling (scale-up): It is performed by adding more hardware resources (CPU, Memory, Disk) to an existing database node.

For Horizontal Scaling, you can add more database nodes as standby nodes. It can help you to improve the read performance balancing the traffic between the nodes. In this case, you will need to add a load balancer to distribute traffic to the correct node depending on the policy and the node state. You should also consider adding two or more load balancer nodes to avoid a single point of failure, and using a tool like “Keepalived”, to ensure the availability. Keepalived is a service that allows you to configure a Virtual IP Address within an active/passive group of servers. This Virtual IP Address is assigned to an active server (active Load Balancer). If this server fails, the IP Address is automatically migrated to the “Secondary” passive server, allowing it to continue working with the same IP Address in a transparent way for the systems.

For Vertical Scaling, it could be needed to change some configuration parameters to allow PostgreSQL to use a new or better hardware resource. Let’s see some of these parameters from the PostgreSQL documentation.

  • work_mem: Specifies the amount of memory to be used by internal sort operations and hash tables before writing to temporary disk files. Several running sessions could be doing such operations concurrently, so the total memory used could be many times the value of work_mem.
  • maintenance_work_mem: Specifies the maximum amount of memory to be used by maintenance operations, such as VACUUM, CREATE INDEX, and ALTER TABLE ADD FOREIGN KEY. Larger settings might improve performance for vacuuming and for restoring database dumps.
  • autovacuum_work_mem: Specifies the maximum amount of memory to be used by each autovacuum worker process.
  • autovacuum_max_workers: Specifies the maximum number of autovacuum processes that may be running at any one time.
  • max_worker_processes: Sets the maximum number of background processes that the system can support. Specify the limit of the process like vacuuming, checkpoints, and more maintenance jobs.
  • max_parallel_workers: Sets the maximum number of workers that the system can support for parallel operations. Parallel workers are taken from the pool of worker processes established by the previous parameter.
  • max_parallel_maintenance_workers: Sets the maximum number of parallel workers that can be started by a single utility command. Currently, the only parallel utility command that supports the use of parallel workers is CREATE INDEX, and only when building a B-tree index.
  • effective_cache_size: Sets the planner's assumption about the effective size of the disk cache that is available to a single query. This is factored into estimates of the cost of using an index; a higher value makes it more likely index scans will be used, a lower value makes it more likely sequential scans will be used.
  • shared_buffers: Sets the amount of memory the database server uses for shared memory buffers. Settings significantly higher than the minimum are usually needed for good performance.
  • temp_buffers: Sets the maximum number of temporary buffers used by each database session. These are session-local buffers used only for access to temporary tables.
  • effective_io_concurrency: Sets the number of concurrent disk I/O operations that PostgreSQL expects can be executed simultaneously. Raising this value will increase the number of I/O operations that any individual PostgreSQL session attempts to initiate in parallel. Currently, this setting only affects bitmap heap scans.
  • max_connections: Determines the maximum number of concurrent connections to the database server. Increasing this parameter allows PostgreSQL to run more backend processes simultaneously.

The challenge here can be how to know if you need to scale your Moodle Database and in what way, and the answer is Monitoring.

Monitoring PostgreSQL for Moodle

Scaling a database is a complex process, so you should check some metrics to be able to determine the best strategy to scale it.

You can monitor the CPU, Memory, and Disk usage to determine if there is some configuration issue or if actually you need to scale your database. For example, if you are seeing a high server load but the database activity is low, probably there is no need to scale it, you only need to check the configuration parameters to match it with your hardware resources.

Checking the disk space used by the PostgreSQL node per database can help you to confirm if you need more disk or even a table partition. To check the disk space used by a database/table you can use some PostgreSQL function like pg_database_size or pg_table_size.

From the database side, you should check:

  • Amount of connection
  • Running queries
  • Index usage
  • Bloat
  • Replication Lag

These could be clear metrics to confirm if the scaling of your database is needed.

ClusterControl as a Scaling and Monitoring System

ClusterControl can help you to cope with both scaling ways that we mentioned earlier and to monitor all the necessary metrics to confirm the scaling requirement.

If you are not using ClusterControl yet, you can install it and deploy or import your current PostgreSQL database selecting the “Import” option and follow the steps, to take advantage of all the ClusterControl features like backups, automatic failover, alerts, monitoring, and more.

Horizontal Scaling

For horizontal scaling, if you go to cluster actions and select “Add Replication Slave”, you can either create a new replica from scratch or add an existing PostgreSQL database as a replica.

Let's see how adding a new replication slave can be a really easy task.

As you can see in the image, you only need to choose your Master server, enter the IP address for your new slave server and the database port. Then, you can choose if you want ClusterControl to install the software for you and if the replication slave should be Synchronous or Asynchronous.

In this way, you can add as many replicas as you want and spread read traffic between them using a load balancer, which you can also implement with ClusterControl.

Now, if you go to cluster actions and select “Add Load Balancer”, you can deploy a new HAProxy Load Balancer or add an existing one.

And then, in the same load balancer section, you can add a Keepalived service that will be running on the load balancer nodes for improving your high availability environment.

After adding a Load Balancer or using a Virtual IP having the Keepalived service in place, you must update your Moodle configuration to use the new database endpoint. For this, go to your Moodle root directory and modify the config.php file with the new IP address:

$CFG->dbhost    = 'IP_ADDRESS';

$CFG->dbname    = 'moodle';

$CFG->dbuser    = 'mdluser';

$CFG->dbpass    = '********';

$CFG->prefix    = 'mdl_';

$CFG->dboptions = array (

  'dbpersist' => 0,

  'dbport' => PORT,

  'dbsocket' => '',

);

Make sure you can access your database through the Load Balancer or Virtual IP address, or if you need to update your pg_hba.conf PostgreSQL file to allow it.

Vertical Scaling

For vertical scaling, with ClusterControl you can monitor your database nodes from both the operating system and the database side. You can check some metrics like CPU usage, Memory, connections, top queries, running queries, and even more. You can also enable the Dashboard section, which allows you to see the metrics in a more detailed and friendlier way.

From ClusterControl, you can also perform different management tasks like Reboot Host, Rebuild Replication Slave, or Promote Slave, with one click.

Conclusion

Scaling your Moodle PostgreSQL database can be a hard task as you will need to know how you need to scale and how to do it without affecting the systems. Having a good monitoring system is the first step to know when and how you need to scale your Moodle Database. Adding a Load Balancer will help you to avoid unnecessary downtime and it will also improve the High Availability in your LMS environment.

All these things that we mentioned can be done using ClusterControl which will make the job easier. ClusterControl provides a whole range of features, like monitoring, alerting, automatic failover, backup, point-in-time recovery, backup verification, scaling, and more.

ClusterControl
The only management system you’ll ever need to take control of your open source database infrastructure.