Using the Aria Storage Engine with MariaDB Server

Sebastian Insausti

MariaDB Server is one of the most popular open-source database servers. It was created by the original developers of MySQL and it became popular for being fast, scalable, and robust. MariaDB has a rich ecosystem of storage engines, plugins, and other available tools that make it very versatile for a wide variety of use cases.

As for the MariaDB storage engine, you have different types to choose from such as XtraDB, InnoDB, MyRocks, MyISAM, or even Aria. There is not a best storage engine type, because it depends on the workload itself. The last one mentioned, Aria Storage Engine, is compiled by default from MariaDB 5.1 and it is required to be 'in use' when the MariaDB service is started.

In this blog, we will see what Aria Storage Engine is, and how to use it in a MariaDB Server.

What is Aria Storage?

Aria is a storage engine for MySQL and MariaDB. It was originally developed with the goal of becoming the default transactional and non-transactional storage engine for MariaDB and MySQL.

Currently, it supports encryption and deadlock detection, and it also offers a crash-safe alternative to MyISAM. When MariaDB restarts after a crash, Aria recovers all tables to the state as of the start of a statement or at the start of the last LOCK TABLES statement.

Aria supports external and internal check, repair, and compression of rows, different row formats, different index compress formats, aria_chk, and more. 

This storage engine has been used for the MariaDB system tables since the 10.4 version.

Differences Between Aria and MyISAM

Let’s see some basic differences between Aria and his direct competitor: MyISAM, and then the advantages and disadvantages of the Aria Storage Engine.

  • Aria uses big log files (1G by default).
  • Aria has a log control file (aria_log_control) and log files (aria_log.%). The log files can be automatically purged when not needed or purged on demand.
  • Aria uses 8K pages by default, while MyISAM uses 1K. This makes Aria a bit faster when using keys of fixed size, but slower when using variable-length packed keys.

Advantages of Aria Storage Engine

  • Data and indexes are crash-safe.
  • On a crash, changes will be rolled back to the state of the start of a statement or a last LOCK TABLES statement.
  • Aria can replay almost everything from the log. The things that can't be replayed yet are:
    • Batch INSERT into an empty table.
    • ALTER TABLEs.
  • LOAD INDEX can skip index blocks for unwanted indexes.
  • Supports all MyISAM ROW formats and new PAGE format where data is stored in pages. 
  • Multiple concurrent inserters into the same table.
  • When using PAGE format, row data is cached by page cache.
  • Aria has unit tests of most parts.
  • Supports both crash-safe and not transactional tables.
  • PAGE is the only crash-safe/transactional row format.
  • PAGE format should give a notable speed improvement on systems that have bad data caching.
  • From MariaDB 10.5, the max key length is 2000 bytes, compared to 1000 bytes in MyISAM.

Disadvantages of Aria Storage Engine

  • Aria doesn't support INSERT DELAYED.
  • Aria doesn't support multiple key caches.
  • The storage of very small rows (< 25 bytes) is not efficient for PAGE format.
  • MERGE tables don't support Aria.
  • Aria data pages in block format have an overhead of 10 bytes/page and 5 bytes/row. Transaction and multiple concurrent-writer support will use an extra overhead of 7 bytes for new rows, 14 bytes for deleted rows, and 0 bytes for old compacted rows.
  • No external locking.
  • Aria has one page size for both index and data. MyISAM supports different page sizes per index.
  • Small overhead per index page (15 bytes).
  • The minimum data file size for PAGE format is 16K.
  • Aria doesn't support indexes on virtual fields.

The Aria Storage Formats

It supports three different table storage formats.

Fixed-length

These tables contain records of a fixed-length. Each column is the same length for all records, regardless of the actual contents. It is the default format if a table has no BLOB, TEXT, VARCHAR or VARBINARY fields, and no ROW FORMAT is provided.

Characteristics:

  • Fast, since MariaDB will always know where a record begins.
  • Easy to cache.
  • Take up more space than dynamic tables, as the maximum amount of storage space will be allocated to each record.
  • Reconstructing after a crash is uncomplicated due to the fixed positions.
  • No fragmentation or need to re-organize, unless records have been deleted and you want to free the space up.

Tables containing BLOB or TEXT fields cannot be FIXED as, by design, these are both dynamic fields.

Dynamic

These tables contain records of a variable length. It is the default format if a table has any BLOB, TEXT, VARCHAR, or VARBINARY fields, and no ROW FORMAT is provided.

Characteristics:

  • Each row contains a header indicating the length of the row.
  • Rows tend to become fragmented easily. UPDATING a record to be longer will likely ensure it is stored in different places on the disk.
  • All string columns with a length of four or more are dynamic.
  • They require much less space than fixed-length tables.
  • Restoring after a crash is more complicated than with FIXED tables.

Page

It is the default format for Aria tables, and is the only format that can be used if TRANSACTIONAL is set to 1.

Characteristics:

  • It is cached by the page cache, which gives a better random performance as it uses fewer system calls.
  • It doesn’t fragment as easily as the DYNAMIC format during UPDATES. The maximum number of fragments is very low.
  • Updates more quickly than dynamic tables.
  • Has a slight storage overhead, mainly notable on very small rows.
  • Slower to perform a full table scan.
  • Slower if there are multiple duplicate keys, as Aria will first write a row, then keys, and only then check for duplicates.

To know the storage format used by a table you can use the SHOW TABLE STATUS statement.

Transactional Options for Aria Storage Engine

Actually, for Aria, transactional means crash-safe, and it is not supported for partitioned tables. It also requires the PAGE row format to make it work.

The TRANSACTIONAL and ROW_FORMAT table options interact as follows:

  • If TRANSACTIONAL=1 is set, then the only supported row format is PAGE. If ROW_FORMAT is set to some other value, then Aria issues a warning, but still forces the row format to be PAGE.
  • If TRANSACTIONAL=0 is set, then the table will not be crash-safe, and any row format is supported.
  • If TRANSACTIONAL is not set to any value, then any row format is supported. If ROW_FORMAT is set, then the table will use that row format. Otherwise, the table will use the default PAGE row format. In this case, if the table uses the PAGE row format, then it will be crash-safe. If it uses some other row format, then it will not be crash-safe.

How to Use the Aria Storage Engine on MariaDB Server

First, you need to create a database (if you don’t have one created), and use it:

MariaDB [(none)]> create database db1;

Query OK, 1 row affected (0.003 sec)

MariaDB [(none)]> use db1

Database changed

Then, create a table using the “Aria” engine:

MariaDB [db1]> CREATE TABLE table1 (id int(11) DEFAULT NULL, name text)

    -> ENGINE=Aria

    -> TRANSACTIONAL=1;

Query OK, 0 rows affected (0.025 sec)

We specified the TRANSACTIONAL value in 1 to see it here, but, as we mentioned, is not necessary as it will be 1 by default if we are using Aria without specifying Row Format and Transactional values. Now, you will have the table created:

MariaDB [db1]> SHOW CREATE TABLE table1\G

*************************** 1. row ***************************

       Table: table1

Create Table: CREATE TABLE `table1` (

  `id` int(11) DEFAULT NULL,

  `name` text DEFAULT NULL

) ENGINE=Aria DEFAULT CHARSET=latin1 PAGE_CHECKSUM=1 TRANSACTIONAL=1

1 row in set (0.000 sec)

And in the table status, you can check both the transactional and row format values:

MariaDB [db1]> SHOW TABLE STATUS\G

*************************** 1. row ***************************

            Name: table1

          Engine: Aria

         Version: 10

      Row_format: Page

            Rows: 0

  Avg_row_length: 0

     Data_length: 8192

 Max_data_length: 17592186011648

    Index_length: 8192

       Data_free: 0

  Auto_increment: NULL

     Create_time: 2020-06-30 18:59:17

     Update_time: 2020-06-30 18:59:17

      Check_time: NULL

       Collation: latin1_swedish_ci

        Checksum: NULL

  Create_options: transactional=1

         Comment:

Max_index_length: 137438945280

       Temporary: N

1 rows in set (0.001 sec)

There are many parameters to configure related to Aria Storage Engine. You can find a full list in the official documentation site.

Aria Storage Engine Tools

Let’s see some tools for working with this storage engine.

aria_chk

Aria_chk is used to check, repair, optimize, sort, and get information about Aria tables. With the MariaDB server, you can use CHECK TABLE, REPAIR TABLE, and OPTIMIZE TABLE to do similar things.

This tool should not be used when MariaDB is running as it assumes the table won’t be changed during his usage.

$ aria_chk [OPTIONS] aria_tables[.MAI]

Similar to MyISAM, the Aria table information is stored in 2 different files: 

  • MAI file contains base table information and the index.
  • MAD file contains the data. 
Aria_chk takes one or more MAI files as arguments.

For example, to check all your tables and repairs only those that have an error, run this command in your data directory:

$ aria_chk --check --force --sort_buffer_size=1G */*.MAI

Checking Aria file: db1/table1.MAI

Data records:       0   Deleted blocks:       0

- check file-size

- check key delete-chain

- check index reference

- check record links

...

aria_pack

Aria_pack is a tool for compressing Aria tables. The resulting tables are read-only, and usually about 40% to 70% smaller. The file name used by this tool is the .MAI index file.

$ aria_pack [options] file_name [file_name2...]

Aria_pack compresses each column separately, and, when the resulting data is read, only the individual rows and columns required need to be decompressed, allowing for quicker reading.

$ aria_pack /var/lib/mysql/world/country

Compressing aria_pack /var/lib/mysql/world/country.MAD: (549 records)

- Calculating statistics

- Compressing file

37.71%

Remember to run aria_chk -rq on compressed tables

Once a table has been packed, use the command aria_chk -rq to rebuild its indexes.

$ aria_chk -rq --ignore-control-file /var/lib/mysql/world/country

Recreating table '/var/lib/mysql/world/country'

- check record delete-chain

- recovering (with sort) Aria-table '/var/lib/mysql/world/country'

Data records: 549

- Fixing index 1

State updated

aria_read_log

Aria_read_log is a tool for displaying and applying log records from an Aria transaction log.

$ aria_read_log OPTIONS

You need to use one of “-d” or “-a” options:

  • a: Apply log to tables: modifies tables. You should make a backup first. Displays a lot of information if you don’t use the --silent parameter.
  • d: Display brief info read from records' header.
$ cd /var/lib/mysql

$ aria_read_log -d

You are using --display-only, NOTHING will be written to disk

The transaction log starts from lsn (1,0x2007)

TRACE of the last aria_read_log

Rec#1 LSN (1,0x2007) short_trid 0 redo_create_table(num_type:30) len 1042

Rec#2 LSN (1,0x2421) short_trid 0 redo_create_table(num_type:30) len 527

Rec#3 LSN (1,0x2638) short_trid 61986 long_transaction_id(num_type:36) len 6

Rec#4 LSN (1,0x2641) short_trid 61986 file_id(num_type:35) len 22

Rec#5 LSN (1,0x265d) short_trid 61986 undo_bulk_insert(num_type:39) len 9

Rec#6 LSN (1,0x266a) short_trid 0 incomplete_log(num_type:37) len 2

Rec#7 LSN (1,0x266f) short_trid 61986 commit(num_type:27) len 0

...

Conclusion

As you can see, Aria Storage Engine has many improvements against MyISAM, and it is a great storage engine alternative to be used. It is also easy to use as it is part of the MariaDB Server installation, so just by specifying the ENGINE table parameter is enough to enable it.

MariaDB is still working on this storage engine, so probably we will see new improvements in future versions soon.

ClusterControl
The only management system you’ll ever need to take control of your open source database infrastructure.