The introduction of DevOps in organizations has changed the development process and also introduced some new challenges. In addition, developers and DevOps teams, along with their own chosen programming languages, also have their favorite database systems.
The product life cycle is getting shorter each year so developers want to be able to develop fast, using technologies they know best.
Having multiple RDBMS backends means your organization will become more agile on the development side, but it also demands additional knowledge from the operations teams.
Extending your infrastructure from one to many databases implies you have to also monitor, manage and scale them.
As every storage backend excels at different use cases, this also means you have to reinvent the wheel for every one of them.
Knowing the similarities and key differences will help you find your way around the different flavors of RDBMS.
In this article we will go through the following points:
- A brief introduction to the platform
- Oracle, MSSQL, MySQL, PostgreSQL
- Platform support
- Installation process
- Database access
- Backup process
- Controlling query execution
- Replication options
- Community support
A brief introduction to the platform
PostgreSQL is recognized by many as the world’s most advanced open source database. It is a fully open source database system released under its own license, the PostgreSQL License, which is comparable to the MIT or BSD licenses. The PostgreSQL community is active and continuously improves existing features and adds new ones. According to the DB-Engines ranking, PostgreSQL was the DBMS of the Year in both 2017 and 2018. The DB-Engines popularity trend shows that this didn’t change over the years.
An interesting fact is that PostgreSQL didn’t support SQL until 1994. Before that, data was queried with POSTQUEL, a language based on QUEL.
PostgreSQL has many of the advanced features that other enterprise database management systems offer, such as views, stored procedures, indexes, and triggers, in addition to primary keys, foreign keys and atomicity.
PostgreSQL can be extended by users, who can modify existing features or add new ones, and it can be distributed freely because it is open source. It runs on major platforms such as UNIX, macOS, Windows, and Linux. It supports video, text, audio, and image data, and offers programming interfaces for different languages, including C/C++, Java, Python, and Perl.
Oracle is one of the largest vendors of RDBMS (relational database management system) software in the IT world. Its flagship product is known as Oracle Database, Oracle DB, or simply Oracle.
Oracle Database is used by many companies in the IT industry for transaction processing, business analytics, business intelligence applications, and more.
Oracle has a long and very interesting history:
On 16th June 1977, Software Development Laboratories (SDL) was created in Santa Clara, California by Larry Ellison, Bob Miner, and Ed Oates. The Oracle name was taken from the codename of a CIA project, and the first commercial Oracle RDBMS was shown to the world in 1979.
Oracle Database is available in different editions such as Enterprise Edition, Standard Edition, Express Edition, and Oracle Lite. The biggest competitor of Oracle Database is Microsoft SQL Server.
Microsoft SQL Server is a very popular RDBMS with restrictive licensing and a high cost of ownership if the database is of significant size or is used by a significant number of clients.
It’s one of the three market-leading database technologies, along with Oracle Database and IBM’s DB2.
It provides a very user-friendly interface and is easy to learn, which has resulted in a large installed user base.
Like other RDBMS software, Microsoft SQL Server is built on top of SQL, a standardized programming language that database administrators (DBAs) and other IT professionals use to manage databases and query the data they contain. SQL Server is tied to Transact-SQL (T-SQL), an implementation of SQL from Microsoft that adds a set of proprietary programming extensions to the standard language.
MySQL is an Oracle-backed open source relational database management system based on SQL.
Originally conceived by the Swedish company MySQL AB, MySQL was acquired by Sun Microsystems in 2008 and then by Oracle when it bought Sun in 2010.
Developers can use MySQL under the GNU General Public License (GPL). The Enterprise version comes with support and additional features for security and high availability.
It’s the second most popular database in the world according to the DB-Engines ranking and probably the most widely deployed database backend on the planet, as it runs most of the internet services around the globe. MySQL runs on virtually all platforms, including Linux, UNIX, and Windows.
MySQL is an important component of an open source enterprise stack called LAMP.
LAMP is a web development platform that uses Linux as the operating system, Apache as the web server, MySQL as the relational database management system and PHP as the object-oriented scripting language.
Platform support
The most popular version of Oracle DB, Oracle 12c, is a true enterprise RDBMS that is supported on a variety of operating systems and platforms. Oracle dominates the database world in part because it runs on dozens of platforms, everything from mainframes and SPARC to Mac and Intel. The list includes the following OS and architecture combinations:
- Linux on x86-64 (only Red Hat Enterprise Linux, Oracle Linux, and SUSE distributions are supported)
- Microsoft Windows on x86-64
- Oracle Solaris on SPARC and x86-64
- IBM AIX on POWER Systems
- Linux on IBM zEnterprise Systems
- HP-UX on Itanium
Being a Microsoft product, SQL Server was designed to be very much compatible with Windows OS. On November 16, 2016, Microsoft announced the beginning of a new story: SQL Server is now supported on Linux and Docker. Hell freezes over!
MySQL runs smoothly on all major platforms, including Windows, UNIX, Linux, and macOS.
In general, PostgreSQL can be expected to work on various (even exotic) CPU architectures and operating systems.
It includes CPU architectures like x86, x86_64, IA64, PowerPC, PowerPC 64, S/390, S/390x, Sparc, Sparc 64, Alpha, ARM, MIPS, MIPSEL, M68K, and PA-RISC. It is often possible to build on an unsupported CPU type by configuring with --disable-spinlocks, but performance will be poor.
PostgreSQL can be expected to work on the following operating systems: Linux (all recent distributions), Windows (Win2000 SP4 and later), FreeBSD, OpenBSD, NetBSD, Mac OS X, AIX, HP/UX, IRIX, Solaris, Tru64 Unix, and UnixWare.
Installation process
Of the four database systems presented, Oracle has the most complex system requirements, which come with a complex installation process. On both Windows and Linux based platforms, Oracle uses a dedicated tool, the Oracle Universal Installer (OUI), to install the Oracle Database software. OUI is a graphical user interface utility that enables you to:
- View the Oracle software that is installed on your machine
- Install new Oracle Database software
- Delete Oracle software that is no longer required.
During the installation process, OUI will start the Oracle Database Configuration Assistant (DBCA) which can install a pre-created default database that contains example schemas or can guide you through the process of creating and configuring a customized database.
If you do not create a database during installation, you can invoke DBCA after you have installed the software, to create one or more databases.
Beginning with SQL Server 2016 (13.x), SQL Server is only available as a 64-bit application.
Installation happens via the Installation Wizard, a command prompt, or the sysprep tool.
The Installation Wizard runs the SQL Server Installation Center. To create a new installation of SQL Server, select the Installation option on the left side, and then click New SQL Server stand-alone installation or add features to an existing installation.
The Linux based installation is very similar to the installation method of the open source databases. It supports packaging for Debian and Red Hat based systems. The steps consist of repository configuration, package installation and post-installation configuration, quite similar to MySQL. The whole process is described in detail in the following article.
Oracle provides a set of binary distributions of MySQL. These include generic binary distributions in the form of compressed tar files (files with a .tar.gz extension) for a number of platforms, and binaries in platform-specific packages. On the Windows platform, the installation process is triggered by the standard installation wizard via GUI.
PostgreSQL is available in the majority of Linux distributions, so it’s very likely you can install it through a simple yum or apt-get command. For an HA configuration, you can use the ClusterControl s9s tool or GUI. The s9s tools can help you create a PostgreSQL cluster with a single command:
$ s9s cluster
--cluster-name='PostgreSQL 11 Streaming Replication'
Creating PostgreSQL Cluster
Job 259 RUNNING [█▋ ] 15% Installing helper packages
For more information, check this blog.
Access to the database and DB creation
Oracle separates the installation of the binaries from database creation. Unlike in other popular database systems, database creation involves many more steps.
The Database Configuration Assistant (DBCA) is the preferred way to create a database because it automates much of the process. DBCA can be launched by the Oracle Universal Installer (OUI), depending on the type of install that you select. You can also launch DBCA as a standalone tool at any time after the Oracle Database installation.
You can run DBCA in interactive mode or non-interactive/silent mode. Interactive mode provides a graphical interface and guided workflow for creating and configuring a database. Non-interactive/silent mode enables you to script the database creation. You can run DBCA in non-interactive/silent mode by specifying command-line arguments, a response file or both.
Once a database is created, you can access it with a dedicated client called SQL*Plus (sqlplus), a terminal client program for accessing Oracle Database.
SQL Server Management Studio (SSMS) is the main tool for administering the Database Engine and writing Transact-SQL code. SSMS is available as a free download from the Microsoft Download Center. The latest version can be used with older versions of the Database Engine.
Management Studio is the preferred method to create a new database. To create a database in Microsoft SQL Server, connect to the computer where Microsoft SQL Server is installed using an administrator account.
Start Microsoft SQL Server Management Studio and choose the option to create a database. The wizard will walk you through the process. If you prefer the command line, this can be done with the CREATE DATABASE statement.
To access your MySQL database, use the mysql client. Database creation is as simple as running a CREATE DATABASE statement.
PostgreSQL database has the option for multiple ‘schemas’ which operate similarly to databases in MySQL.
Schemas contain the tables, indexes, etc., and can be accessed simultaneously by the same connection to the database that houses them. Access methods for PostgreSQL are defined in a file called pg_hba.conf. It can be located in various places. On Ubuntu 14.04 it is located in /etc/postgresql/9.3/main/pg_hba.conf, while on CentOS 7 it’s located by default in /var/lib/pgsql/data/pg_hba.conf.
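To make the role of pg_hba.conf more concrete, here is a minimal Python sketch that reads the basic record layout of that file (connection type, database, user, address, authentication method). It is an illustration only: the sample content, the `HbaRule` class and the `parse_pg_hba` function are hypothetical names made up for this example, and the parser ignores quoted values, separate netmask columns and authentication options that a real file may contain.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class HbaRule:
    conn_type: str          # "local", "host", "hostssl", ...
    database: str
    user: str
    address: Optional[str]  # None for "local" (Unix-socket) records
    method: str             # e.g. "peer", "md5", "scram-sha-256"


def parse_pg_hba(text: str) -> List[HbaRule]:
    """Parse simple pg_hba.conf records; comments and blank lines are skipped."""
    rules = []
    for raw_line in text.splitlines():
        # Drop everything after '#'. Quoted '#' characters are not handled here.
        line = raw_line.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        if fields[0] == "local":
            if len(fields) < 4:
                continue  # malformed record, ignored in this sketch
            conn_type, database, user, method = fields[:4]
            address = None
        else:
            if len(fields) < 5:
                continue  # malformed record, ignored in this sketch
            conn_type, database, user, address, method = fields[:5]
        rules.append(HbaRule(conn_type, database, user, address, method))
    return rules


# Hypothetical sample content, not taken from a real server.
SAMPLE = """\
# TYPE  DATABASE  USER      ADDRESS        METHOD
local   all       postgres                 peer
host    all       all       127.0.0.1/32   md5
host    appdb     appuser   10.0.0.0/24    scram-sha-256
"""

for rule in parse_pg_hba(SAMPLE):
    print(rule)
```

Each record answers the question "who may connect to which database, from where, and how must they authenticate", which is why this file is the first place to look when a PostgreSQL connection is refused.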
Backup process
Oracle has the most complex, dedicated built-in backup tool of all four servers described here; it’s called Recovery Manager (RMAN).
RMAN allows you to run sophisticated backup policies and selective restores. The same operations usually require a lot of manual steps in other RDBMS.
We can take backups in two ways:
- shutting down the database and copying the physical files (a so-called cold backup)
- using RMAN to make a backup without shutting down the database (a hot backup)
To make a hot backup, set the database to ARCHIVELOG mode. This tells Oracle to keep copies of the redo log files as archive logs.
In the MS SQL world, you can use the built-in T-SQL commands to back up and restore databases. There is no need to use tools like mysqlhotcopy and mysqldump.
MS SQL Server offers three different online backup strategies:
- Simple Recovery Model (ALTER DATABASE dbname SET RECOVERY SIMPLE)
- Full Recovery Model (ALTER DATABASE dbname SET RECOVERY FULL)
- Bulk-Logged Recovery Model (ALTER DATABASE dbname SET RECOVERY BULK_LOGGED)
The full recovery model is recommended if no data loss is acceptable. This mode is similar to running MySQL with the binary log enabled. You can recover the database to any point in time, but you should regularly back up the transaction log as well as the database.
The bulk-logged model can be used for large bulk operations such as importing data or creating indexes on big tables. It is a less common way to run a database, especially in production. It does not support point-in-time recovery, so it is generally used as a temporary solution.
The simple model is useful when the database is rarely updated or for testing and development purposes. In the simple model, the transaction log is truncated automatically at each checkpoint, so there are no transaction log backups. In the other models, the log is truncated only after a transaction log backup. Under the simple model, if the database is damaged, only the most recent backup can be recovered and all changes since this backup are lost.
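The relationship between the recovery model and the backups you need can be sketched as a small Python helper that only builds the T-SQL text. It does not connect to a server, and the database name, backup directory and function names are hypothetical. `ALTER DATABASE ... SET RECOVERY`, `BACKUP DATABASE` and `BACKUP LOG` are standard T-SQL statements; everything else is an assumption made for illustration.

```python
RECOVERY_MODELS = ("SIMPLE", "FULL", "BULK_LOGGED")


def quote_ident(name: str) -> str:
    """Quote a T-SQL identifier with brackets, doubling any closing bracket."""
    return "[" + name.replace("]", "]]") + "]"


def quote_literal(value: str) -> str:
    """Quote a T-SQL Unicode string literal, doubling single quotes."""
    return "N'" + value.replace("'", "''") + "'"


def set_recovery_model(db: str, model: str) -> str:
    """Build the statement that switches a database to the given model."""
    model = model.upper()
    if model not in RECOVERY_MODELS:
        raise ValueError("unknown recovery model: " + model)
    return f"ALTER DATABASE {quote_ident(db)} SET RECOVERY {model};"


def backup_statements(db: str, model: str, backup_dir: str) -> list:
    """Build the backup statements that make sense for a recovery model."""
    model = model.upper()
    if model not in RECOVERY_MODELS:
        raise ValueError("unknown recovery model: " + model)
    # Hypothetical file layout: <backup_dir>\<db>.bak and <backup_dir>\<db>.trn
    base = backup_dir.rstrip("\\/") + "\\" + db
    full_path = quote_literal(base + ".bak")
    statements = [f"BACKUP DATABASE {quote_ident(db)} TO DISK = {full_path};"]
    if model != "SIMPLE":
        # Log backups apply only to the FULL and BULK_LOGGED models.
        log_path = quote_literal(base + ".trn")
        statements.append(f"BACKUP LOG {quote_ident(db)} TO DISK = {log_path};")
    return statements


print(set_recovery_model("sales", "full"))
for statement in backup_statements("sales", "FULL", "D:\\backup"):
    print(statement)
```

In the simple model only the database backup is produced, which mirrors the point above: without log backups, everything after the last database backup is lost.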
The most popular backup utilities for MySQL and MariaDB are mysqldump for logical backups, and Percona XtraBackup and MariaBackup (a fork of Percona XtraBackup) for binary backups. The MySQL Enterprise version also offers mysqlbackup, a hot backup tool similar to XtraBackup and MariaBackup.
Most DBMSs provide some built-in backup tools. PostgreSQL has pg_dump and pg_dumpall out of the box. However, you may want to use some other tools for your production databases. More information can be found in the top backup tools for PostgreSQL article.
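As a rough illustration of how the two built-in PostgreSQL tools divide the work, the following Python sketch only assembles the command lines and prints them; it does not run them. The database name `appdb`, the output paths and the function names are hypothetical. The options used (`--format=custom` and `--file` for pg_dump, `--globals-only` and `--file` for pg_dumpall) are standard options of those tools.

```python
import shlex
from typing import List


def pg_dump_command(dbname: str, outfile: str) -> List[str]:
    """Command for a logical backup of one database in the custom format."""
    return ["pg_dump", "--format=custom", "--file", outfile, dbname]


def pg_dumpall_globals_command(outfile: str) -> List[str]:
    """Command for dumping cluster-wide objects such as roles and tablespaces."""
    return ["pg_dumpall", "--globals-only", "--file", outfile]


# Hypothetical database name and paths, for illustration only.
commands = [
    pg_dump_command("appdb", "/backups/appdb.dump"),
    pg_dumpall_globals_command("/backups/globals.sql"),
]

for command in commands:
    # shlex.join (Python 3.8+) renders the list as a shell-safe string.
    print(shlex.join(command))
```

pg_dump works on a single database, so roles and other cluster-wide objects need the separate pg_dumpall call. A custom-format dump is restored with pg_restore rather than psql.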
Controlling Query execution and concurrency support
In Oracle, all database objects are grouped by schemas. A schema is a collection of database objects, and all database objects are shared among all schemas and users. Even though everything is shared, each user can be limited to certain schemas and tables via roles and permissions. This concept is quite similar to databases in MySQL.
MS SQL Server organizes all objects, such as tables, views, and procedures, by database names. Users are assigned to a login, which is granted access to the specific database and its objects. Also, in SQL Server each database has a private, unshared disk file on the server.
In MySQL, MVCC is supported only by InnoDB, the storage engine that is enabled by default. InnoDB also provides ACID-compliant features like foreign key support and transaction handling. By default, each query is treated as a separate transaction, which is a different approach than in Oracle DB.
The Postgres engine performs concurrency control using a method called MVCC (Multiversion Concurrency Control). Every user connected to the database gets a snapshot of the database as of a particular instant. When the database must update an item, it adds the newer version and marks the old version as obsolete. This allows the database to save overhead but requires a regular sweep (VACUUM in PostgreSQL) to delete the old, outdated data.
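The idea of versions, snapshots and a cleanup sweep can be shown with a toy model in Python. This is a deliberately simplified sketch and not how PostgreSQL is implemented: every write is treated as an immediately committed transaction, there are no aborts or concurrent in-progress transactions, and the class and method names are hypothetical. The `xmin` and `xmax` field names are borrowed from PostgreSQL’s row metadata purely as labels.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class RowVersion:
    value: str
    xmin: int                   # id of the transaction that created this version
    xmax: Optional[int] = None  # id of the transaction that replaced it, if any


class ToyMVCCStore:
    def __init__(self) -> None:
        self.next_txid = 1
        self.versions: Dict[str, List[RowVersion]] = {}

    def write(self, key: str, value: str) -> int:
        """Add a new version and mark the previous one as obsolete."""
        txid = self.next_txid
        self.next_txid += 1
        chain = self.versions.setdefault(key, [])
        if chain and chain[-1].xmax is None:
            chain[-1].xmax = txid
        chain.append(RowVersion(value, xmin=txid))
        return txid

    def snapshot(self) -> int:
        """A snapshot sees every write with a transaction id below this number."""
        return self.next_txid

    def read(self, key: str, snapshot: int) -> Optional[str]:
        """Return the version that was current when the snapshot was taken."""
        for version in self.versions.get(key, []):
            created_before = version.xmin < snapshot
            still_current = version.xmax is None or version.xmax >= snapshot
            if created_before and still_current:
                return version.value
        return None

    def vacuum(self, oldest_snapshot: int) -> int:
        """Remove versions that no snapshot at or after oldest_snapshot can see."""
        removed = 0
        for key, chain in list(self.versions.items()):
            keep = [v for v in chain
                    if v.xmax is None or v.xmax >= oldest_snapshot]
            removed += len(chain) - len(keep)
            self.versions[key] = keep
        return removed


store = ToyMVCCStore()
store.write("balance", "100")
old_snapshot = store.snapshot()
store.write("balance", "250")
print(store.read("balance", old_snapshot))      # the reader still sees the old version
print(store.read("balance", store.snapshot()))  # a new reader sees the new version
print(store.vacuum(store.snapshot()))           # number of obsolete versions removed
```

The reader holding the older snapshot is never blocked by the writer, and the obsolete version stays around until the sweep is allowed to remove it.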
Oracle’s security features are extensive. The system provides multi-layered security, including controls to evaluate risks, prevent unauthorized data disclosure, detect and report on database activities, and enforce data access controls.
MS SQL Server’s security features are more modest. The RDBMS offers fewer features than Oracle but still many more than the open source database systems.
MySQL implements security based on Access Control Lists (ACLs) for all connections, queries, and other operations that a user may attempt to perform. There is also some support for SSL-encrypted connections between MySQL clients and servers.
PostgreSQL has ROLES and inherited roles to set and maintain permissions. PostgreSQL has native SSL support for connections to encrypt client/server communications. It also has Row Level Security.
In addition to this, PostgreSQL comes with a built-in enhancement called SE-PostgreSQL which provides additional access controls based on SELinux security policy. More details here.
Community support
Oracle Database, similarly to MySQL, has a large community, mostly organized around https://community.oracle.com and passionate groups in many locations around the world, for example https://poug.org/en/. Paid support gives you access to the support portal previously known as Metalink, now support.oracle.com.
Compared to other database systems, MSSQL probably has the least organized community groups, but they are still very active. Microsoft does a great job of promoting its products in universities. This gives young developers, DevOps engineers and DBAs easy access to the technology (free licenses) and any necessary materials.
MySQL has a large community of contributors who, particularly following the acquisition by Oracle, focus mainly on maintaining existing features with some new features emerging occasionally. The advantage over other open source databases is a very strong external vendor ecosystem. Companies like MariaDB and Percona not only offer great support but also contribute by adding enterprise features into their open source versions.
PostgreSQL has a very strong and active community. The community improves existing features, while its innovative committers strive to ensure it remains the most advanced database with new features and security, narrowing the gap with Oracle and MSSQL. PostgreSQL is known for having more features than other RDBMS on the market.
Replication options
Oracle offers logical and physical replication through the built-in Oracle Data Guard. It is an enterprise feature.
Data Guard is a Ship Redo / Apply Redo technology; “redo” is the information needed to recover transactions.
A production database, referred to as the primary database, broadcasts redo to one or more replicas, referred to as standby databases. When an insert or update is made to a table, the change is captured by the log writer into an archive log and replicated to the standby system.
Standby databases are in a continuous phase of recovery, verifying and applying redo to maintain synchronization with the primary database. A standby database will also automatically re-synchronize if it becomes temporarily disconnected from the primary database due to power outages, network problems, etc.
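The ship-redo / apply-redo idea, including the catch-up after a disconnect, can be modeled in a few lines of Python. This is a toy simulation under simplifying assumptions, not a description of Oracle’s internals: redo records are plain tuples, shipping is a direct method call, and the `Primary` and `Standby` classes are hypothetical names invented for this sketch.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

RedoRecord = Tuple[int, str, str]  # (sequence number, key, new value)


@dataclass
class Standby:
    data: Dict[str, str] = field(default_factory=dict)
    applied_seq: int = 0     # sequence number of the last applied redo record
    connected: bool = True

    def apply(self, records: List[RedoRecord]) -> None:
        """Apply redo strictly in sequence order, skipping anything unexpected."""
        for seq, key, value in records:
            if seq == self.applied_seq + 1:
                self.data[key] = value
                self.applied_seq = seq


@dataclass
class Primary:
    data: Dict[str, str] = field(default_factory=dict)
    redo_log: List[RedoRecord] = field(default_factory=list)
    standbys: List[Standby] = field(default_factory=list)

    def write(self, key: str, value: str) -> None:
        """Record the change as redo, apply it locally, then ship it."""
        seq = len(self.redo_log) + 1
        self.redo_log.append((seq, key, value))
        self.data[key] = value
        self.ship()

    def ship(self) -> None:
        """Send each connected standby every record it has not applied yet."""
        for standby in self.standbys:
            if standby.connected:
                standby.apply(self.redo_log[standby.applied_seq:])


standby = Standby()
primary = Primary(standbys=[standby])

primary.write("order:1", "new")
standby.connected = False          # simulate a network problem
primary.write("order:1", "paid")
primary.write("order:2", "new")
print(standby.data == primary.data)  # standby lags while disconnected

standby.connected = True
primary.ship()                     # standby catches up from its last applied record
print(standby.data == primary.data)
```

Because the standby remembers the last sequence number it applied, re-synchronizing after an outage is simply a matter of shipping everything after that point.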
For more flexible replication options, such as multi-source or selective replication, you should consider an extra paid tool, Oracle GoldenGate.
Microsoft SQL Server provides the following types of replication for use in distributed applications:
- Transactional replication
- Merge replication
- Snapshot replication
It can be greatly extended with Microsoft Integration Services, giving you an option to customize the replication flow out of the box.
PostgreSQL has several options available, each with its own pros and cons, depending on what is needed from replication. The built-in options are based on the Write-Ahead Log (WAL): either log shipping, where WAL files are shipped to a standby server where they are read and replayed, or streaming replication, where a read-only standby server fetches transaction logs over a database connection and replays them. For a more sophisticated replication architecture, you may want to look at Slony (master to multiple slaves) or Bucardo (multimaster).
MySQL Replication is probably the most popular high availability solution for MySQL, and widely used by top web services.
It is easy to set up, but ongoing maintenance such as software upgrades, schema changes, topology changes, failover and recovery has always been tricky.
MySQL replication does not require any third party tools; both master-slave and multimaster setups can be done out of the box.
Recent versions of MySQL added multi-source replication and Global Transaction IDs (GTIDs), which make it even more reliable and easier to maintain.
Proprietary databases like Oracle and MSSQL offer robust management systems and solid support. On top of the long list of supported features, users get the reassurance of access to enterprise support and paid knowledge systems.
On the other hand, the cost of the license, a feature gap that is no longer that big, and the availability of enterprise plugins make the decision to shift to open source easier than ever.
Using predefined processes and automation can not only save you time but also protect you from common mistakes.
A management platform that systematically addresses all the different aspects of the database lifecycle will be more robust than patching together a number of point solutions.