Understanding Indexes in MySQL: Part One

Lukas Vileikis

Indexes in MySQL are a complex subject. We have covered MySQL indexes in the past, but we have never taken a deeper dive into them - we will do that in this series of blog posts. This post should act as a general guide to indexes, while the other parts of the series will dive a little deeper into these subjects.

What are Indexes?

In general, as already noted in a previous blog post about indexes, an index is an alphabetical list of records with references to the pages on which they are mentioned. In MySQL, an index is a data structure that is most commonly used to find rows quickly. You might also hear the term “keys” - in MySQL syntax, KEY is a synonym for INDEX.

What do Indexes Do?

In MySQL, indexes are used to quickly find rows with specific column values, so that MySQL does not have to read through the entire table to find the rows relevant to a query. Indexes matter most as the data stored in a database system (for example, MySQL) grows: the larger the table, the higher the probability that you will benefit from indexes.

MySQL Index Types

As far as MySQL is concerned, you might have heard about it having multiple types of indexes:

  • A B-Tree INDEX - such an index is frequently used to speed up SELECT queries matching a WHERE clause. It can be used on fields where values do not need to be unique, and it also accepts NULL values.

  • A FULLTEXT INDEX - such an index enables full-text search capabilities. This type of index finds keywords in the text instead of directly comparing values to the values in the index.

  • A UNIQUE INDEX enforces the uniqueness of the values in the indexed column or columns. It is frequently used to keep duplicate values out of a table: an INSERT or UPDATE that would create a duplicate is rejected.

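  • A PRIMARY KEY is also an index - it’s frequently used together with fields having an AUTO_INCREMENT attribute. This type of index does not accept NULL values, its values must be unique, and a table can have only one. Primary key values can be updated, but that is usually best avoided: in InnoDB, every secondary index stores the primary key value, so changing it is comparatively expensive.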

  • A DESCENDING INDEX is an index that stores its key values in descending order. This type of index was introduced in MySQL 8.0 - MySQL can use it when a descending order is requested by the query.
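
To make the list above more concrete, here is a minimal sketch that declares each of these index types on one table. The table and column names (demo_articles, slug and so on) are hypothetical and are not taken from any real schema; the descending index assumes MySQL 8.0 or later.

```sql
-- Hypothetical table: all names are illustrative only.
CREATE TABLE demo_articles (
    id           INT UNSIGNED NOT NULL AUTO_INCREMENT,
    slug         VARCHAR(100) NOT NULL,
    author       VARCHAR(100) NULL,
    body         TEXT NOT NULL,
    published_at DATETIME NOT NULL,
    PRIMARY KEY (id),                              -- PRIMARY KEY: unique, no NULLs
    UNIQUE INDEX uq_slug (slug),                   -- UNIQUE INDEX: duplicate slugs are rejected
    INDEX idx_author (author),                     -- plain B-tree index: duplicates and NULLs allowed
    FULLTEXT INDEX ft_body (body),                 -- FULLTEXT index for MATCH ... AGAINST searches
    INDEX idx_published_desc (published_at DESC)   -- descending index (MySQL 8.0+)
) ENGINE=InnoDB;

-- Inspect what was created; the Index_type column shows BTREE or FULLTEXT.
SHOW INDEX FROM demo_articles;
```

On versions before 8.0 the DESC keyword in an index definition is parsed but ignored, so the last index would be stored in ascending order there.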

Choosing Optimal Data Types for Indexes in MySQL

MySQL supports a wide variety of data types, and some data types cannot be used together with certain kinds of indexes. For example, FULLTEXT indexes can only be used on text-based (CHAR, VARCHAR or TEXT) columns - they cannot be used on any other data types. So before choosing the indexes for your database design:

  1. Decide on the data type of the column in question, that is, what class of data you are going to store: numbers? String values? Both numbers and string values?

  2. Decide on the range of the values you are going to store. Choose a range that you don’t think you will exceed, because widening a data type later on can be a time-consuming task. We recommend opting for the simplest data type that fits.

  3. If you do not intend to store NULL values in a column, specify it as NOT NULL whenever you can - when a nullable column is indexed, it requires an extra byte per entry.
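
As a rough illustration of this advice, the sketch below uses compact numeric types and NOT NULL on the indexed columns. The demo_orders table and its columns are hypothetical, and the right types depend entirely on your own data.

```sql
-- Hypothetical table: compact, NOT NULL columns keep index entries small.
CREATE TABLE demo_orders (
    id             INT UNSIGNED NOT NULL AUTO_INCREMENT,  -- values up to 4294967295
    status         TINYINT UNSIGNED NOT NULL,             -- values 0 to 255
    customer_email VARCHAR(100) NOT NULL,
    PRIMARY KEY (id),
    INDEX idx_status (status),
    INDEX idx_customer_email (customer_email)
) ENGINE=InnoDB;

-- Widening the range later is possible, but changing a column's data type
-- rebuilds the table, which can take a long time on large tables.
ALTER TABLE demo_orders
    MODIFY id BIGINT UNSIGNED NOT NULL AUTO_INCREMENT;
```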

Choosing Optimal Character Sets and Collations for Indexes in MySQL

Aside from data types, also keep in mind that each character in an indexed column takes up space. For example, with the utf8mb4 character set a character may take anywhere between 1 and 4 bytes, so you might want to avoid indexing all 255 characters of a column and index only, say, the first 50 or 100 characters instead.
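
Indexing only the first characters of a column is done with a prefix index. The following is a minimal sketch: the demo_titles table is hypothetical, and the prefix length of 50 is only a starting point that you would tune against your own data.

```sql
-- Hypothetical table with a long, multi-byte text column.
CREATE TABLE demo_titles (
    id    INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    title VARCHAR(255) NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;

-- Index only the first 50 characters of title instead of all 255.
CREATE INDEX idx_title_prefix ON demo_titles (title(50));

-- Compare how selective the prefix is against the full column.
-- The closer the two ratios are, the less the shorter prefix costs you.
-- (On an empty table both ratios are NULL.)
SELECT COUNT(DISTINCT LEFT(title, 50)) / COUNT(*) AS prefix_selectivity,
       COUNT(DISTINCT title) / COUNT(*)           AS full_selectivity
FROM demo_titles;
```

Keep in mind that a prefix index cannot act as a covering index for the column, because the index does not hold the full value.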

The Benefits and Drawbacks of Using Indexes in MySQL

The main benefit of using indexes in MySQL is the increased performance of SELECT queries matching a WHERE clause, because MySQL doesn’t have to read through the entire table to find the rows relevant to the query. However, bear in mind that indexes have their own drawbacks. The main ones are as follows:

  • Indexes consume disk space.

  • Indexes degrade the performance of INSERT, UPDATE and DELETE queries - when data is updated, the index needs to be updated together with it.

  • MySQL does not protect you from using multiple types of indexes at the same time. In other words, you can define a PRIMARY KEY, an INDEX and a UNIQUE INDEX on the same column, and MySQL will not stop you from making that mistake.
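
The last point is easy to reproduce. The sketch below, using a hypothetical demo_redundant table, creates three indexes on the same column, lists them, and then drops the two that add nothing beyond the primary key.

```sql
-- Hypothetical table: MySQL accepts all three indexes on the same column.
CREATE TABLE demo_redundant (
    id INT UNSIGNED NOT NULL,
    PRIMARY KEY (id),
    UNIQUE INDEX uq_id (id),   -- redundant: the primary key is already unique
    INDEX idx_id (id)          -- redundant: the primary key already indexes id
) ENGINE=InnoDB;

-- All three show up, and each one must be maintained on every write.
SHOW INDEX FROM demo_redundant;

-- Keep the primary key and drop the extra indexes.
ALTER TABLE demo_redundant
    DROP INDEX uq_id,
    DROP INDEX idx_id;
```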

If you suspect that some of your queries are becoming slower, consider taking a look into the Query Monitor tab of ClusterControl - by enabling the query monitor you can see when a certain query was last seen and its maximum and average execution time which can help you to choose the best indexes for your table.

How to Choose the Best Index to Use?

To choose the best index to use, you can use MySQL’s built-in mechanisms. For example, you can use the query explainer - the EXPLAIN statement. It will explain what table is used, if it has partitions or not, what indexes are possible to use and what key (index) is used. It will also return the length of the index used and an estimate of the number of rows MySQL expects to examine to run your query:

mysql> EXPLAIN SELECT * FROM demo_table WHERE demo_field = 'demo'\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: demo_table
   partitions: NULL
         type: ref
possible_keys: demo_field
          key: demo_field
      key_len: 1022
          ref: const
         rows: 1
     filtered: 100.00
        Extra: NULL
1 row in set, 1 warning (0.00 sec)

In this case, keep in mind that indexes are frequently used to help MySQL efficiently retrieve data when data sets are larger than usual. If your table is small, you might not need to use indexes, but if you see that your tables are getting bigger and bigger, chances are you might benefit from an index.
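
To see how an index changes the plan, you can run EXPLAIN before and after adding it. This is a sketch under assumptions: the demo_explain table below is hypothetical (the original demo_table definition is not shown in this post), and the exact plan depends on your data and MySQL version.

```sql
-- Hypothetical table resembling the one queried above.
CREATE TABLE demo_explain (
    id         INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    demo_field VARCHAR(255) NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;

-- Without an index on demo_field, expect possible_keys and key to be NULL
-- and the access type to be a full table scan (type: ALL).
EXPLAIN SELECT * FROM demo_explain WHERE demo_field = 'demo';

-- Add a B-tree index on the column used in the WHERE clause.
ALTER TABLE demo_explain ADD INDEX demo_field (demo_field);

-- With the index in place, the optimizer can typically use it (type: ref).
EXPLAIN SELECT * FROM demo_explain WHERE demo_field = 'demo';
```

For a NOT NULL VARCHAR(255) column in utf8mb4, key_len works out to 255 × 4 bytes plus 2 length bytes, which would be consistent with the key_len of 1022 in the output above.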

In order to choose the best index for your specific scenario, bear in mind that indexes can be a leading cause of performance problems too. Whether MySQL will use your indexes effectively depends on several factors, including the design of your queries, the indexes and index types in use, and your database load at the time the query is executed. Here are a few things to consider when using indexes in MySQL:

  • How much data do you have? Perhaps some of it is redundant?

  • What queries do you use? Would your queries use LIKE clauses? What about ordering?

  • What kind of an index would you need to use to improve the performance of your queries?

  • Would your indexes be large or small? Would you need to use an index on a prefix of the column to make its size smaller?

It is worth noting that you should probably avoid using multiple types of indexes (e.g., a B-Tree index, a UNIQUE INDEX and a PRIMARY KEY) on the same column too.

Improving Query Performance with Indexes

To improve query performance with indexes, you need to take a look at your queries - the EXPLAIN statement can help with that. In general, here are a few things you should consider if you want your indexes to improve the performance of your queries:

  • Only ask the database for what you need. In most cases, using SELECT column will be faster than using SELECT * (that is the case without using indexes too).

  • A B-tree index might be a fit if you search for exact values (e.g., SELECT * FROM demo_table WHERE some_field = 'x') or for values matching a trailing wildcard (e.g., SELECT * FROM demo_table WHERE some_field LIKE 'demo%'). Avoid LIKE patterns with a percentage sign in front of the text you’re searching for (e.g., LIKE '%demo'): MySQL cannot use the index to narrow down the search because it doesn’t know what the value begins with. A B-tree index can also be used for column comparisons in expressions that use the equal (=), greater than (>), greater than or equal to (>=), less than (<), less than or equal to (<=) or BETWEEN operators.

  • A FULLTEXT index might be a fit if you find yourself using full-text (MATCH ... AGAINST()) search queries on text-based columns - FULLTEXT indexes can use TEXT, CHAR or VARCHAR columns, they cannot be used on any other types of columns.

  • A covering index might be of use if you want to run queries without additional I/O reads on big tables. To create a covering index, cover the WHERE, GROUP BY and SELECT clauses used by the query.
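
The points above can be tried out with a sketch like the following. The demo_queries table, its columns and its index names are hypothetical, and the plans mentioned in the comments are what you would typically expect rather than guaranteed output.

```sql
-- Hypothetical table with a B-tree, a composite and a FULLTEXT index.
CREATE TABLE demo_queries (
    id          INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    some_field  VARCHAR(100) NOT NULL,
    category    VARCHAR(50) NOT NULL,
    description TEXT NOT NULL,
    INDEX idx_some_field (some_field),
    INDEX idx_category_some_field (category, some_field),
    FULLTEXT INDEX ft_description (description)
) ENGINE=InnoDB;

-- Trailing wildcard: the B-tree index can narrow the search to a range.
EXPLAIN SELECT * FROM demo_queries WHERE some_field LIKE 'demo%';

-- Leading wildcard: the index cannot narrow the search, so expect a full scan.
EXPLAIN SELECT * FROM demo_queries WHERE some_field LIKE '%demo';

-- Full-text search uses the FULLTEXT index instead of comparing whole values.
SELECT id
FROM demo_queries
WHERE MATCH(description) AGAINST('index' IN NATURAL LANGUAGE MODE);

-- Covering index: every column the query touches is in idx_category_some_field,
-- so EXPLAIN typically reports "Using index" in the Extra column.
EXPLAIN SELECT category, some_field
FROM demo_queries
WHERE category = 'x'
GROUP BY category, some_field;
```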

We will look further into the types of indexes in the upcoming parts of this blog series. In general, though: if you use queries like SELECT * FROM demo_table WHERE some_field = 'x', a B-tree INDEX might be a fit; if you use MATCH() AGAINST() queries, you should probably look into a FULLTEXT index; and if your columns hold very long values, you should probably look into indexing only a part of the column.

How Many Indexes Should You Have?

If you have ever used indexes to improve the performance of your SELECT queries, you have probably asked yourself a question: how many indexes should you actually have? In order to understand this, you need to keep the following things in mind:

  1. Indexes are usually the most effective with large amounts of data.

  2. MySQL generally uses only one index per table in each SELECT statement in a query (subqueries are seen as separate statements, and the index merge optimization can combine several indexes in some cases) - use the EXPLAIN statement to find out which indexes are the most effective for the queries you use.

  3. Indexes should make all of your SELECT statements fast enough without compromising too much on disk space - “fast enough”, however, is relative so you would need to experiment.
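
When experimenting, it helps to see which indexes exist, roughly how much space they take and whether they are used at all. The sketch below assumes InnoDB with persistent statistics, MySQL 5.7 or later with the sys schema and the Performance Schema enabled, and read access to the mysql schema. The demo_index_review table is hypothetical.

```sql
-- Hypothetical table used only for this illustration.
CREATE TABLE demo_index_review (
    id         INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    some_field VARCHAR(100) NOT NULL,
    INDEX idx_some_field (some_field)
) ENGINE=InnoDB;

-- List every index on the table with its columns and cardinality estimates.
SHOW INDEX FROM demo_index_review;

-- Approximate size of each index in bytes. For stat_name = 'size',
-- stat_value is a number of pages, so multiply by the page size.
SELECT index_name,
       stat_value * @@innodb_page_size AS approx_size_bytes
FROM mysql.innodb_index_stats
WHERE database_name = DATABASE()
  AND table_name = 'demo_index_review'
  AND stat_name = 'size';

-- Indexes with no recorded use since the server was started.
SELECT object_schema, object_name, index_name
FROM sys.schema_unused_indexes;
```

An index that shows up as unused over a representative period is a candidate for removal, but check it against infrequent jobs such as monthly reports first.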

Indexes and Storage Engines

When dealing with indexes in MySQL, also keep in mind that there might be some kinds of limitations if you use various engines (for example if you use MyISAM as opposed to InnoDB). We will go into more detail in a separate blog, but here are some ideas:

  • The maximum number of indexes per table is 64 for both MyISAM and InnoDB, and the maximum number of columns per index in both storage engines is 16.

  • The internal maximum key length for InnoDB is 3500 bytes, but MySQL itself restricts it to 3072 bytes - the maximum key length for MyISAM is 1000 bytes.

  • FULLTEXT indexes behave differently in certain storage engines - for example, the default InnoDB stopword list has 36 stopwords, while the MyISAM stopword list is considerably longer (several hundred words). The default InnoDB list can be viewed in the INFORMATION_SCHEMA.INNODB_FT_DEFAULT_STOPWORD table and replaced with a table of your own through the innodb_ft_server_stopword_table variable, while MyISAM derives its stopwords from the storage/myisam/ft_static.c file - all words that are found in the file will be treated as stopwords.

  • MyISAM was the only storage engine with support for full-text search until MySQL 5.6 (MySQL 5.6.4 to be exact) came around, meaning that InnoDB has supported full-text indexes since MySQL 5.6.4. When a FULLTEXT index is in use, it finds keywords in the text instead of comparing values directly to the values in the index.

  • Indexes play a very important role for InnoDB - InnoDB locks rows when it accesses them, so a reduced number of rows InnoDB accesses can reduce locks.

  • MySQL allows you to use duplicate indexes on the same column.

  • Certain storage engines have certain default types of indexes (e.g., for the MEMORY storage engine the default index type is hash).
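
A couple of these engine-specific points can be checked directly on a server. This is a minimal sketch: the demo_memory table is hypothetical, and the queries only read standard metadata.

```sql
-- Hypothetical MEMORY table: indexes default to HASH unless told otherwise.
CREATE TABLE demo_memory (
    id   INT UNSIGNED NOT NULL,
    code VARCHAR(20) NOT NULL,
    INDEX idx_id (id),                  -- index type left to the engine default
    INDEX idx_code USING BTREE (code)   -- B-tree requested explicitly
) ENGINE=MEMORY;

-- The Index_type column shows which type each index ended up with.
SHOW INDEX FROM demo_memory;

-- The default InnoDB full-text stopword list.
SELECT * FROM INFORMATION_SCHEMA.INNODB_FT_DEFAULT_STOPWORD;

-- Empty unless a custom server-wide stopword table has been configured.
SHOW VARIABLES LIKE 'innodb_ft_server_stopword_table';
```

A hash index only helps with equality comparisons, so range queries on a MEMORY table are a reason to ask for BTREE explicitly.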

Summary

In this part about indexes in MySQL, we have gone through some general things related to indexes. In the upcoming blog posts we will go through some more in-depth scenarios of using indexes in MySQL, including the usage of indexes in certain storage engines - we will also explain how ClusterControl can be used to achieve your performance goals in MySQL.
