MySQL offers multiple storage engines to store its data, with InnoDB and MyISAM being the most popular ones. Each storage engine implements a more specific set of features required for a type of workload, and as a result, works differently from other engines. Since data is stored inside the storage engine, we need to understand how the storage engines work to determine the best backup tool. In general, MySQL backup tools perform a special operation in order to retrieve a consistent data - either lock the tables or establish a transaction isolation level that guarantees data read is unchanged.
MyISAM was the default storage engine for MySQL versions prior to 5.5.5. It is based on the older ISAM code but has many useful extensions. The major deficiency of MyISAM is the absence of transactions support. Aria is another storage engine with MyISAM heritage and is a MyISAM replacement in all MariaDB distributions. The main difference is that Aria is crash safe, whereas MyISAM is not. Being crash safe means that an Aria table can recover from unexpected failures in a much better way than a MyISAM table can. In most circumstances, backup operations for MyISAM and Aria are almost identical.
MyISAM uses table-level locking. It stores indexes in one file and data in another. MyISAM tables are generally more compact in size on disk when compared to InnoDB tables. With the nature of table-level locking and no transaction support, the recommended way to backup MyISAM tables is to acquire the global read lock by using FLUSH TABLE WITH READ LOCK (FTWRL) to make MySQL read-only temporarily or use LOCK TABLES statement explicitly. Without that, MyISAM backups will be inconsistent.
InnoDB is the default storage engine for MySQL and MariaDB. It provides the standard ACID-compliant transaction features, along with foreign key support and row-level locking.
Percona’s XtraDB is an enhanced version of the InnoDB storage engine for MySQL and MariaDB. It features some improvements that make it perform better in certain situations. It is backwards compatible with InnoDB, so it can be used as a drop-in replacement.
There are a number of key components in InnoDB that directly influences the behaviour of backup and restore operation:
- Crash recovery
- Multiversion concurrency control (MVCC)
InnoDB does transactions. A transaction will never be completed unless each individual operation within the group is successful (COMMIT). If any operation within the transaction fails, the entire transaction will fail and any changes will be undone (ROLLBACK).
The following example shows a transaction in MySQL (assuming autocommit is off):
BEGIN; UPDATE account.saving SET balance = (balance - 10) WHERE id = 2; UPDATE account.current SET balance = (balance + 10) WHERE id = 2; COMMIT;
A transaction starts with a BEGIN and ends with a COMMIT or ROLLBACK. In the above example, if the MySQL server crashes after the first UPDATE statement completed (line 2), that update would be rolled back and the balance value won’t change for this transaction. The ability to rollback is vital when performing crash recovery, as explained in the next section.
InnoDB maintains a transaction log, also called redo log. The redo log is physically represented as a set of files, typically named ib_logfile0 and ib_logfile1. The log contains a record of every change to InnoDB data. When InnoDB starts, it inspects the data files and the transaction log, and performs two steps:
- Applies committed transaction log entries to the data files.
- Performs an undo operation (rollback) on any transactions that modified data but did not commit.
The rollback is performed by a background thread, executed in parallel with transactions from new connections. Until the rollback operation is completed, new connections may encounter locking conflicts with recovered transactions. In most situations, even if the MySQL server was killed unexpectedly in the middle of heavy activity, the recovery process happens automatically. No action is needed from the DBA.
Percona Xtrabackup utilizes InnoDB crash recovery functionality to prepare the internally inconsistent backup (the binary copy of MySQL data directory) into a consistent and usable database again.
InnoDB is a multiversion concurrency control (MVCC) storage engine which means many versions of a single row can exist at the same time. Due to this nature, unlike MyISAM, InnoDB does not require a global read lock to get a consistent read. It utilizes its ACID-compliant transaction component called isolation. Isolation is the “i” in the acronym ACID - the isolation level determines the capabilities of a transaction to read/write data that is accessed by other transactions.
In order to get a consistent snapshot of InnoDB tables, one could simply start a transaction with REPEATABLE READ isolation level. In REPEATABLE READ, a read view is created at the start of the transaction, and this read view is held open for the duration of the transaction. For example, if you execute a SELECT statement at 6 AM, and come back in an open transaction at 6 PM, when you run the same SELECT, then you will see the exact same result set that you saw at 6 AM. This is part of MVCC capability and it is accomplished using row versioning and UNDO information.
Logical backup like mysqldump uses this approach to generate a consistent backup for InnoDB without explicit table lock that can cause the MySQL server to be read-only.
The MEMORY storage engine (formerly known as HEAP) creates special-purpose tables with contents that are stored in memory. Because the data is vulnerable to crashes, hardware issues, or power outages, only use these tables as temporary work areas or read-only caches for data pulled from other tables.
Due to the transient nature of data from MEMORY tables (data is not persisted to disk), only logical backup is capable of backing up these tables. Backup in physical format is not possible.
That’s it for today, but you can read more about backups in our whitepaper - The DevOps Guide to Database Backups for MySQL and MariaDB.