MySQL Storage Engine Optimization: Configuring InnoDB Optimization For High Performance

Lukas Vileikis

InnoDB is one of the most widely used storage engines in MySQL. This storage engine is known as a high-reliability and a high-performance storage engine and its key advantages include supporting row-level locking, foreign keys and following the ACID model. InnoDB replaces MyISAM as the default storage engine since MySQL 5.5, which was released in 2010.

This storage engine can be incredibly performant and powerful if optimized properly - today we’re taking a look at the things we can do to make it perform at the very best of its ability, but before we dive into InnoDB though, we should understand what the aforementioned ACID model is.

What is ACID and Why is it Important?

ACID is a set of properties of database transactions.The acronym translates to four words: Atomicity, Consistency, Isolation and Durability. In short, these properties ensure that database transactions are processed reliably and warrant data validity despite errors, power outages or any such issues. A database management system that adheres to these principles is said to be an ACID-compliant DBMS. Here’s how everything works in InnoDB:

  • Atomicity ensures that the statements in a transaction operate as an indivisible unit and that their effects are seen collectively or not at all;
  • Consistency is handled by MySQL’s logging mechanisms which record all changes to the database;
  • Isolation refers to InnoDB’s row-level locking;
  • Durability is also maintained because InnoDB maintains a log file that tracks all changes to the system.

Understanding InnoDB

Now that we have covered ACID, we should probably look at how InnoDB looks under the hood. Here’s how InnoDB looks like from the inside (image courtesy of Percona):

InnoDB Internals
InnoDB Internals

From the image above we can clearly see that InnoDB has a few parameters crucial to its performance and these are as follows:

  • The innodb_data_file_path parameter describes the system tablespace (the system tablespace is the storage area for the InnoDB data dictionary, the double write and change buffers and undo logs). The parameter depicts the file where data derived from InnoDB tables will be stored;
  • The innodb_buffer_pool_size parameter is a memory buffer that InnoDB uses to cache data and indexes of its tables;
  • The innodb_log_file_size parameter depicts the size of InnoDB log files;
  • The innodb_log_buffer_size parameter is used to write to the log files on disk;
  • The innodb_flush_log_at_trx_commit parameter controls the balance between strict ACID compliance and higher performance;
  • The innodb_lock_wait_timeout parameter is the length of time in seconds an InnoDB transaction waits for a row lock before giving up;
  • The innodb_flush_method parameter defines the method used to flush data to InnoDB data files and log files which can affect I/O throughput.

InnoDB also stores the data from its tables in a file called ibdata1 - the logs however are stored in two separate files named ib_logfile0 and ib_logfile1: all of those three files reside in the /var/lib/mysql directory. 

In order to make InnoDB as performant as possible, we must fine tune these parameters and optimize them as much as we can by looking at our available hardware resources.

Tuning InnoDB For High Performance

In order to adjust InnoDB’s performance on your hardware, follow these steps:

  • In order to extend innodb_data_file_path automatically, specify the autoextend attribute in the setting and restart the server. For example:

innodb_data_file_path=ibdata1:10M:autoextend

When the autoextend parameter is used, the data file automatically increases in size by 8MB increments each time space is required. A new auto-extending data file can also be specified like so (in this case, the new data file is called ibdata2):

innodb_data_file_path=ibdata1:10M;ibdata2:10M:autoextend
  • When using InnoDB, the main mechanism used is the buffer pool. InnoDB heavily relies on the buffer pool and as a rule of thumb, the innodb_buffer_pool_size parameter should be about 60% to 80% of the total available RAM on the server. Keep in mind that you should leave some RAM for the processes running in the OS as well;

  • InnoDB’s innodb_log_file_size should be set as big as possible, but not bigger than necessary. In this case, keep in mind that a bigger log file size is better for performance, but the bigger it is, the more recovery time after a crash is required. As such, there is no “one size fits all” solution, but it’s said that the combined size of the log files should be large enough. This helps the MySQL server from regularly working on checkpointing and disk flushing activity. This saves too much CPU and disk IO and can run smoothly during its peak time or high workload activity. Although the recommended approach is to test and experiment it yourself and find the optimal value yourself;

  • The innodb_log_buffer_size value should be set to at least 16M. A large log buffer allows large transactions to run without a need to write the log to disk before the transactions commit saving some disk I/O;

  • When tuning innodb_flush_log_at_trx_commit, keep in mind that this parameter accepts three values - 0, 1 and 2. With a value of 1 you get ACID compliance and with values 0 or 2 you get more performance, but less reliability because in that case transactions for which logs have not yet been flushed to disk can be lost in a crash;

  • In order to set innodb_lock_wait_timeout to a proper value, keep in mind that this parameter defines the time in seconds (the default value is 50) before issuing the following error and rolling back the current statement:

ERROR 1205 (HY000): Lock wait timeout exceeded; try restarting transaction
  • In InnoDB, there are multiple flush methods available. By default this setting is set to “async_unbuffered” on Windows machines if the value is set to NULL and to “fsync” in Linux machines. Here’s what the methods are and what they do:

InnoDB Flush Method

Purpose

normal

InnoDB will use simulated asynchronous I/O and buffered I/O.

unbuffered

InnoDB will use simulated asynchronous I/O and non-buffered I/O.

async_unbuffered

InnoDB will use Windows asynchronous I/O and non-buffered I/O. Default settings on Windows machines.

fsync

InnoDB will use the fsync() function to flush the data and the log files. Default setting on Linux machines.

O_DSYNC

InnoDB will use O_SYNC to open and flush the log files and the fsync()function to flush the data files. O_DSYNC is faster than O_DIRECT, but data may or may not be consistent due to latency or an outright crash.

nosync

Used for internal performance testing - unsupported.

littlesync

Used for internal performance testing - unsupported.

O_DIRECT

InnoDB will use O_DIRECT to open the data files and the fsync()function to flush both the data and the log files. In comparison with O_DSYNC, O_DIRECT is more stable and more data consistent, but slower. The OS cache will be avoided using this setting - this setting is the recommended setting on Linux machines.

O_DIRECT_NO_FSYNC

InnoDB will use O_DIRECT during flushing I/O - the “NO_FSYNC” part defines that the fsync() function will be skipped.

 
  • You should also consider enabling innodb_file_per_table setting. This parameter is ON by default in MySQL 5.6 and higher. This parameter relieves you of management issues relating to InnoDB tables by storing them in separate files and avoiding bloated main dictionaries and system tables. Enabling this variable also avoids from facing data recovery complexity when a certain table is corrupted
  • Now that you modified these settings per the instructions outlined above, you should be almost ready to go! Before you hit the ground running though, you should probably keep an eye on the busiest file in the entire InnoDB infrastructure - the ibdata1.

Dealing with ibdata1

There are several classes of information that are stored in ibdata1:

  1. The data of InnoDB tables;
  2. The indexes of InnoDB tables;
  3. InnoDB table metadata;
  4. Multiversion Concurrency Control (MVCC) data;
  5. The doublewrite buffer - such a buffer enables InnoDB to recover from half-written pages. The purpose of such a buffer is to prevent data corruption;
  6. The insert buffer - such a buffer is used by InnoDB to buffer updates to the same page so they can be performed at once and not one after another.

When dealing with big data sets, the ibdata1 file can get extremely large and this can be the core of a very frustrating problem - the file can only grow and by default, it cannot shrink. You can shut down MySQL and delete this file but this is not recommended unless you know what you are doing. When deleted, MySQL will not function properly as the dictionary and system tables are gone, thus the main system table is corrupted. 

In order to shrink ibdata1 once and for all, follow these steps:

  1. Dump all data from InnoDB databases. You can use mysqldump or mysqlpump for this action;
  2. Drop all databases except for the mysql, performance_schema and information_schema databases;
  3. Stop MySQL;
  4. Add the following to your my.cnf file:
    [mysqld]
    innodb_file_per_table = 1
    innodb_flush_method = O_DIRECT
    innodb_log_file_size = 25% of innodb_buffer_pool_size
    innodb_buffer_pool_size = up to 60-80% of available RAM.
  5. Delete the ibdata1 and ib_logfile* files (these will be recreated upon the next restart of MySQL);
  6. Start MySQL and restore the data from the dump you took before.  After performing the steps outlined above, the ibdata1 file will still grow, but it will no longer contain the data from InnoDB tables - the file will only contain metadata and each InnoDB table will exist outside of ibdata1. Now, if you go to the /var/lib/mysql directory, you will see two files representing each table you have with the InnoDB engine. The files will look like so:
    1. demotable.frm
    2. demotable.ibd

The .frm file contains the storage engine header and the .ibd file contains the table data and indexes of your table.

Before rolling out the changes though, make sure to fine-tune the parameters according to your infrastructure. These parameters can make or break InnoDB performance so make sure to keep an eye on them at all times. Now you should be good to go!

Summary

To summarize, optimizing the performance of InnoDB can be a great benefit if you develop applications that require data integrity and high performance at the same time - InnoDB allows you to change how much memory the engine is allowed to consume, to change the log file size, the flush method the engine uses and so on - these changes can make InnoDB perform extremely well if they are tuned properly. Before performing any enhancements though, beware of the consequences of your actions to both your server and MySQL.

As always, before optimizing anything for performance always take (and test!) backups so you can restore your data if necessary and always test any changes on a local server before rolling out the changes to production.

ClusterControl
The only management system you’ll ever need to take control of your open source database infrastructure.