Severalnines Blog
The automation and management blog for open source databases

Scaling Your Time-Series Database - How to Simply Scale TimescaleDB

In the previous blogs, my colleagues and I showed you how you can monitor performance, manage and deploy clusters, run backups and even enable automatic failover for TimescaleDB.

In this blog, we will show you how to scale your single TimescaleDB instance to a multi-node cluster in just a few simple steps.

We will start with a common setup: a single-node instance running on CentOS. The node is up and running, and it's already being monitored and managed by ClusterControl.

If you would like to learn how to deploy or import your TimescaleDB instance, check out the blog written by my colleague Sebastian Insausti, “How to Easily Deploy TimescaleDB.”

The setup looks as follows...

ClusterControl: Single instance TimescaleDB

So, it's a single production instance, and we want to convert it to a cluster with no downtime. Our main goal is to scale application read operations out to other machines, with the option to use those machines as standby servers for high availability if the write server crashes.

More nodes should also reduce application downtime during maintenance, such as patching applied in rolling-restart mode: one node is patched at a time while the other nodes keep serving database connections.

The last requirement is to create a single address for our new cluster, so that the application can reach all of the nodes through one place.

We can summarize our action plan into two major steps:

  • Add read replicas
  • Install and configure HAProxy

Adding Read Replicas

If we go to cluster actions and select “Add Replication Slave”, we can either create a new replica from scratch or add an existing TimescaleDB database as a replica.

ClusterControl: Add replication slave
ClusterControl: Add new Replication slave, Import existing Replication Slave

As you can see in the image below, we only need to choose our master server and enter the IP address and database port of our new slave server.

ClusterControl: Add replication slave

Then we can choose whether we want ClusterControl to install the software for us and whether the replication slave should be synchronous or asynchronous. When you are importing an existing slave server, you can use the import option as follows:

ClusterControl: Import replication slave for TimescaleDB

Either way, we can add as many replicas as we want. In our example, we will add two nodes. ClusterControl will create an internal job and take care of all the necessary steps, one node at a time.

ClusterControl: add read replica
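Once the job has finished, you may want to confirm from the database side that the new replicas are streaming. Since TimescaleDB runs as a PostgreSQL extension, the standard pg_stat_replication view on the master can be used for this. The snippet below is only a minimal sketch: the IP addresses are made-up sample values, and summarize_replicas is a hypothetical helper that works on rows you would fetch with whatever database driver you use.

```python
# Standard PostgreSQL view, queried on the master node.
REPLICATION_QUERY = """
SELECT client_addr, state, sync_state
FROM pg_stat_replication;
"""


def summarize_replicas(rows):
    """Group replica addresses by their replication mode.

    `rows` is an iterable of (client_addr, state, sync_state) tuples,
    the shape REPLICATION_QUERY returns. Replicas that are not in the
    'streaming' state are collected under the 'not_streaming' key.
    """
    summary = {}
    for client_addr, state, sync_state in rows:
        key = sync_state if state == "streaming" else "not_streaming"
        summary.setdefault(key, []).append(client_addr)
    return summary


# Hypothetical rows for the two asynchronous replicas added above.
sample_rows = [
    ("10.0.0.12", "streaming", "async"),
    ("10.0.0.13", "streaming", "async"),
]
print(summarize_replicas(sample_rows))
# → {'async': ['10.0.0.12', '10.0.0.13']}
```

If a replica shows up under not_streaming, it is probably still catching up or has lost its connection to the master.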

Adding a Load Balancer to TimescaleDB

At this point, our data is replicated across multiple nodes, or across data centers if you chose to add replication slave nodes in a different location. The cluster is now scaled out with two additional read replica nodes.

ClusterControl: Two nodes added

The question now is: how does the application know which database node to access? We will use HAProxy, with separate ports for write and read operations.
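On the application side, the split between the two ports could look roughly like the sketch below. Everything in it is an assumption for illustration: the HAProxy host address is made up, the port numbers 5433 (read/write) and 5434 (read-only) should be replaced with whatever you configure for your load balancer, and pick_endpoint is a hypothetical helper, not part of ClusterControl or HAProxy.

```python
HAPROXY_HOST = "10.0.0.20"  # hypothetical address of the HAProxy server
WRITE_PORT = 5433           # assumed read/write listener (routes to the master)
READ_PORT = 5434            # assumed read-only listener (balances across replicas)

# Statements starting with these keywords are treated as safe for replicas.
READ_ONLY_KEYWORDS = ("select", "show")


def pick_endpoint(sql):
    """Return the (host, port) a statement should be sent to.

    Anything that is not clearly a read goes to the write port,
    which is the safe default.
    """
    words = sql.split(None, 1)
    first_word = words[0].lower() if words else ""
    port = READ_PORT if first_word in READ_ONLY_KEYWORDS else WRITE_PORT
    return HAPROXY_HOST, port


print(pick_endpoint("SELECT * FROM conditions"))
# → ('10.0.0.20', 5434)
print(pick_endpoint("INSERT INTO conditions VALUES (now(), 21.5)"))
# → ('10.0.0.20', 5433)
```

Keyword matching like this is deliberately naive. A SELECT that calls a function with side effects, or a read that must see the very latest write on an asynchronous setup, should still go to the write port.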

From the TimescaleDB cluster's context menu, choose the option to add a load balancer.

Now we need to provide the location of the server where HAProxy should be installed, the policy we want to use for database connections, and the nodes that should take part in the HAProxy configuration.

When everything is set, hit the deploy button. After a few minutes, our cluster configuration should be ready. ClusterControl will take care of all the prerequisites and configuration needed to deploy the load balancer.

After a successful deployment, we can see our new cluster's topology, with load balancing and additional read nodes. With more nodes on board, ClusterControl automatically enables auto recovery. This way, when the master node goes down, the failover operation will start by itself.

ClusterControl: Final topology
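While a failover is in progress, the application may briefly fail to open connections on the write port. A simple way to ride this out is to retry with a growing delay. The following is a minimal sketch under stated assumptions: connect_with_retry is a hypothetical helper, `connect` stands for whatever function opens a connection through HAProxy in your application, and ConnectionError is a stand-in for the exception type your database driver actually raises.

```python
import time


def connect_with_retry(connect, attempts=5, base_delay=0.5, sleep=time.sleep):
    """Call `connect()` until it succeeds or `attempts` is exhausted.

    The delay doubles after every failed attempt (0.5s, 1s, 2s, ...).
    The last error is re-raised if no attempt succeeds.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(attempts):
        try:
            return connect()
        except ConnectionError as exc:  # adapt to your driver's exception
            last_error = exc
            if attempt < attempts - 1:
                sleep(base_delay * (2 ** attempt))
    raise last_error


# Demo with a fake connection function that fails twice, then succeeds.
state = {"calls": 0}


def flaky_connect():
    state["calls"] += 1
    if state["calls"] < 3:
        raise ConnectionError("write port not available yet")
    return "connected"


# A no-op sleep keeps the demo instant.
print(connect_with_retry(flaky_connect, sleep=lambda seconds: None))
# → connected
```

Because the application only ever talks to the HAProxy address, it does not need to know which node was promoted; it just reconnects to the same host and port.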

Conclusion

TimescaleDB is an open-source database designed to make SQL scalable for time-series data. Having an automated way to extend your cluster is key to achieving performance and efficiency. As we have seen above, you can now scale TimescaleDB with ease by using ClusterControl.