Considerations on Encryption for Data at Rest for MariaDB

Krzysztof Ksiazek

Data security is crucial in times of GDPR, PCI DSS or HIPPA. To comply with the regulations, one has to exercise extreme caution regarding how the data should be stored and protected. Data, typically, can be at rest or in transit. Data in transit is the data transferred from or to the database. Query results sent to the client or application or replicated data between nodes of the cluster are examples of cases when data is in transit. We tend to secure data in that state using the SSL or TLS - encrypted connections between the database nodes or the database and client.

On the other side of the spectrum, we have data at rest - we would say that most data is, at the given moment, at rest. We are talking here about data stored on disk in tablespaces, different data structures (gcache buffer, redo logs) and logs (binary and relay logs). Let’s take a look at the considerations around this subject in MariaDB.

What to Encrypt in MariaDB?

Ideally, you need to encrypt everything. Databases store data in different places and different ways, as mentioned above. The largest set of data is stored in tablespaces - this is the “final” location where the data is stored. Obviously, it is possible to encrypt tablespaces - otherwise, the whole feature would be pointless. MariaDB can store the data in one shared tablespace, several of them or every table may be stored in a separate tablespace - all of those scenarios are supported. Users have some level of flexibility when choosing what to encrypt. You can encrypt everything, individual tables or everything except some individual tables.

MariaDB InnoDB redo log

Another structure that stores the data is InnoDB redo log. InnoDB redo log is a place where data is written after a given row has been upgraded. Data from redo log will eventually be transferred to the tablespace but for a time being InnoDB redo log contains all of the modifications that happened recently. As you can imagine, this data is also critical and should be protected - MariaDB allows you to encrypt InnoDB redo log.

MariaDB Binary logs

Binary logs (as well as relay logs) store information about executed queries that modify the data. As the included information allows us to reconstruct the current state of a row that has undergone modification, this is another form of data that should be protected and encrypted. Both binary and relay logs can be encrypted in MariaDB.

Galera cache

Galera cache (gcache) is an on-disk buffer in Galera Cluster that stores the information about executed modifications. It is used in case of node failure or temporary network issues to allow nodes that join the cluster to catch up using just the data they are missing, avoiding transferring the whole dataset. Similar to binary logs or redo logs, gcache contains the list of modifications and, as such, it can be used to recover and put together pieces of data. In the community version of MariaDB Galera Cluster gcache cannot be encrypted. Such option becomes available in the Enterprise version of MariaDB Galera Cluster.

What can’t be Encrypted in MariaDB?

There are still some places where pieces of data may show up that cannot be encrypted, at least as of now, in MariaDB. Firstly, error logs may contain samples of queries that can potentially expose some data. It is impossible to encrypt error logs, but it is possible to redirect the error log to the syslog and implement some protection mechanism outside of MariaDB.

Logs from Audit Plugin

Audit Plugin generates log as well - this log may contain sensitive information, including the exact queries that have been executed on the database. It is not possible to encrypt this log, but it can be redirected to the syslog and encrypt there.

Query Logs

General and slow query logs - those logs will contain queries (or at least samples of them) that were executed by MariaDB. As of now, it is not possible to encrypt those logs.

InnoDB buffer pool

Memory - MariaDB performs encryption only for the pages that are stored on disk. All of the data that is stored in InnoDB buffer pool will be unencrypted. InnoDB buffer pool is intended to keep the rows recently modified or accessed by SELECT query - those rows will obviously contain data samples. As of now, there is no option to encrypt the InnoDB buffer pool in MariaDB. Please keep in mind that one would require access to the system to read the live memory. It is not a trivial task, even though it is not impossible to accomplish either.

Please keep in mind that we covered encryption options included in MariaDB. There is always the possibility to use another layer of encryption. For example, encrypt whole storage will render logs not readable for anyone who would have physical access to the disk. On the other hand, it won’t protect the data from someone who is able to log into the system.

Compatibility With External Tools

Another thing to consider is compatibility. If you decide to encrypt your MariaDB, you have to keep in mind that this may impact the way you operate. It is not possible to use external tools like XtraBackup or mysqlbinlog to process the data and create a backup or to deal with binary logs. You will have to stick to the tools created by MariaDB (like Mariabackup), which are written with the encryption mechanism in mind. They can handle the data at rest encryption is implemented in MariaDB.

Planning for Encryption Process

This section will not discuss the process in detail, but it looks at what you should consider when planning for encryption, like resources and time. CPU utilization will increase as well as the I/O activity for the duration of the process. From the user standpoint, it all comes down to the configuration settings and then executing ALTER commands to rebuild and encrypt existing tables. For large databases, this alone can be a significant challenge that would require planning. Schema changes can be a serious burden, and it is recommended to use tools like pt-online-schema-change to reduce the impact them on the production systems and gain better control over the process.

Final Thoughts

As we mentioned, data is critical for all organizations, and it is crucial to ensure the data is safe and is protected. Encryption of the data at rest is one of the important elements in the whole picture. We would love to hear from you about your experience with data at rest encryption in MariaDB. If you would like to share your thoughts, you are welcomed to leave a comment below.

ClusterControl
The only management system you’ll ever need to take control of your open source database infrastructure.