Tips for Migrating from Proprietary to Open Source Databases

Krzysztof Ksiazek

Back in the day proprietary databases were the only acceptable options. 

“No one ever got fired for buying from Oracle/Microsoft/IBM” was the saying. 

Huge, monolithic databases used for every single purpose. Paid support - that’s how the database landscape looked like in the 90s and early 00s. Sure, open source databases were there but they were treated like a “toy database,” suitable for a small website, a blog maybe, or a very small e-shop. No one sane would use them for anything critical. 

Things have changed over time and open source databases have matured. More and more are being created every year. We now see specialization, allowing users to pick the best option for given workload - time series, analytical, columnstore, NoSQL, relational, key-value  - you can pick whatever databases you need and, usually, there are numerous options to pick from. This leads to open source being more and more popular in the database world. Companies are now looking at their bills from their proprietary database vendor and wonder if they can reduce it a bit by an adopting free open source database.

As usual, there are pros and cons. What are things you may want to consider before implementing open source database? In this blog post we will share some tips you may want to keep in mind when planning to migrate to open source databases.

Start Small and Expand

If your organisation does not use open source databases, most likely it doesn’t have experience in managing them either. While doable through external consultants, it is probably not a good idea to migrate your whole environment into open source databases. 

The better approach, instead, would be to start with small projects. Maybe you need to build some sort of an internal tool or website, maybe some sort of a monitoring tool - that’s a great opportunity to use open source databases and get the experience of using it in real world environments. This will let you to learn the ins and outs of a database - how to diagnose it, how to track its performance, how to tune it to improve its performance. 

On top of that you would learn more about the high availability options and implementations. Working with it also helps to understand the differences between the proprietary database that you use and the open source databases that you implemented for smaller projects. In time you should see the open source database footprint in your organization increased significantly along with the experience of your teams. 

At some point, after collecting enough experience to run the open source databases, you may decide to pull the lever and start a migration project. Another, also quite likely option, is that you would be gradually moving your operations to open source databases on project-by-project basis. Eventually your existing applications that use proprietary RDBMS will become deprecated and, eventually, replaced by new iteration of the software, which rely on open source databases.

Pick the Right Database for the Job

We already mentioned that open source databases are, typically, quite specialised. You can pick a datastore suitable for different uses - time series, document store, key-value store, columnar store, text search. This, combined with the previous tip, lets you pick exactly the type of a database that your small project requires. Let’s say you want to build a monitoring stack for your environment. 

You may want to look into time series databases like Prometheus or TimeScaleDB to power your monitoring application. Maybe you want to implement some sort of analytical/big data solutions. You would have to design some sort of an ETL process to extract the data from your main RDBMS and load it into an open source solution. There are numerous options to be used as a datastore. Depending on the requirements and data you can choose from a wide variety of databases, for example Clickhouse, Cassandra, Hive and many more.

What about moving some parts of the application from the proprietary RDBMS to an open source solution?

There are options for that as well. Datastores like PostgreSQL or MariaDB have a great deal of features related to easy the migration from datastores like Oracle. They do come with, for example, support for PL/SQL and other SQL features available for Oracle. This is a great help - less code has to be converted alongside with the migration from Oracle to other databases, it is more likely that your stored procedures and functions also will work mostly out of the box and you won’t have to spend a significant amount of time to rewrite, test and debug code that represents a core logic of your application.

You should also not forget about the resources you already have in your team. Maybe someone has some experience in more popular open source datastores? If those match your requirements and fit in your environment, you might be able to utilize existing knowledge in your team and easy transition into the open source world.

Consider Getting Support

Migrating into datastores that you don’t have too much experience with is always tricky. Even if you proceed with small steps you still may be caught by surprise behavior, unexpected situation, bugs or even just operational situations that you are not familiar with. 

Open source databases typically have a great community support - email lists, forums, Slack channels. Usually, when you ask, someone will attempt to help you. This may not be enough in some cases. If that’s the case, you should look for a paid support. There are numerous ways how this can be achieved. 

First of all, not all open source projects are made equal. Larger projects, like PostgreSQL or MySQL may have developed an ecosystem of consulting companies that hire experts who can do consulting for you. If such option, for whatever reason, is not available or feasible, you can always reach out to the project maintainers.

It is very common that the company, which develops the datastore, will be happy to help for a price.

Independent consulting companies have a significant advantage - they are independent and they will propose a solution based on their experience, not based on the datastore and tools they develop. Vendors, well, they will usually push their own solutions and environment. On the other hand, vendors are quite interested in increasing the adoption of their datastore. If you are migrating from a proprietary datastore, they may have been in that situation several times before thus they should be able to provide you with good practices, help you to design the migration process and so on. For an inexperienced company access to the experts in the field of migration might be a great asset.

You may find access to external support useful not only at the migration phase. Dealing with new datastore is always tricky and learning all of the aspects of the operations are time consuming. Even after migration you may still benefit from having someone you can call, ask and, what's most important, learn from. Trainings, remote DBA, all of those are common options in the world of open source databases, especially for those larger, more established projects.

Find a System or Tool to Help

Open source databases come with a variety of tools, some more or less complex. Some adds functionality (high availability, backups, monitoring), some are designed to make the database management easier. 

It is important to leverage them (consulting contracts may be useful here to introduce you to the most important tooling for the open source datastore of your choice) as they can significantly increase your operational velocity and the performance and stability of the whole setup. 

Some tools are free, some require a license to operate, you should look into the pool and pick what suits you the most. We are talking here about load balancers, tools to manage replication topology, Virtual IP management tools, clustering solutions, more or less dedicated monitoring and observability platforms that may track anything starting from typical database performance metrics through giving smart predictions based on the data to specialized analysis of the query performance. 

The open source world is where such tools thrive - this also has pros and cons. It’s easy to find “A” tool, it’s hard to find “THE” tool. You can find numerous projects on GitHub, everyone doing almost the same as others. Which one to choose is the hard part. That’s why, ideally you would have a helping hand in either a support contract or you can also rely on management platforms that will help you to manage multiple aspects of the operations on open source databases, including helping you to standardize and pick the correct tools for that.

There are also platforms like ClusterControl, which was designed to help non-experienced people fully grasp the power of open source databases. ClusterControl supports multiple different types of open source datastores: MySQL and its flavors, PostgreSQL, TimeScaleDB or MongoDB. It provides unified user interface to access management functions for those datastores, making it easy to deploy and manage them. Instead of spending time testing different tools and solutions, using ClusterControl you can easily deploy highly available environment, schedule backups and keep track of metrics in the system. 

Tools like ClusterControl reduce the load on your team, increasing their ability to understand what is happening in the database they are not as familiar with as they would have wanted.

Take Your Time

What’s good to keep in mind is that there is no need to rush. The stability of your environment and the well-being of your users is paramount. Take your time to run the tests, verify all the aspects of your application. Verify that you have proper high availability and disaster recovery processes in place. 

Only when you are 100% sure you are good to switch, it’d be time to pull the lever and make the switch to open source databases.

 
ClusterControl
The only management system you’ll ever need to take control of your open source database infrastructure.