Importance of Append-only File in Redis

Krzysztof Ksiazek

While looking around the Redis data directory you may have noticed several files, among them one with the .aof extension.

# ls -alh /var/lib/redis/
total 54M
drwxr-x---  2 redis redis 4.0K Jul  1 10:40 .
drwxr-xr-x 39 root  root  4.0K Jun 17 11:32 ..
-rw-r-----  1 redis redis  39M Jun 25 09:56 appendonly.aof
-rw-rw----  1 redis redis  16M Jul  1 10:36 dump.rdb

You may wonder what it is and what role it plays. It is, in fact, quite an important file for your Redis installation. Let’s have a quick look at what it is for.

Snapshotting basics

First, the basics. Redis is an in-memory data store, which means that all the data you store in it resides in memory. Memory is volatile: its contents are gone as soon as the process or the host goes down, so it cannot be trusted with any serious data on its own. If we use Redis for data that we can easily recreate (for example, a caching layer), this may be acceptable (even though, as we discussed in one of our earlier blogs, it is still better to have backups of your cache nodes). Generally speaking, though, we would like to persist our data on disk so that it can survive a restart of the node, no matter whether it is planned maintenance or a crash.

Luckily, Redis comes with a mechanism for snapshotting the data to disk. It can be invoked by hand through the SAVE or BGSAVE commands.

127.0.0.1:6379> SAVE
OK
127.0.0.1:6379> BGSAVE
Background saving started

The former runs synchronously and blocks the server until the dump is complete, interfering with operations on the database. The latter forks a child process that performs the dump in the background, minimizing the impact on the performance of the Redis datastore.

Redis may also be configured to automatically snapshot the data.

127.0.0.1:6379> CONFIG GET save
1) "save"
2) "10 1000"

Here a snapshot will be performed after 10 seconds if at least 1000 changes to the dataset were made in that time. You can reconfigure this setting to your liking:

127.0.0.1:6379> CONFIG SET save "5 1000"
OK

Here we have increased the frequency of snapshotting: a snapshot is now taken after 5 seconds, as long as at least 1000 writes have happened.

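Automatic snapshotting is fine, but it is not ideal. Even if snapshots were executed every second, you could still lose the writes that happened within the last second, and if fewer changes are made than the configured threshold, no snapshot is triggered at all. When using only RDB snapshots you do not really have proper durability.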
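To make the save rule easier to reason about, here is a minimal Python sketch of how such a setting can be interpreted. It is a simplified model, not Redis source code: the function names parse_save_rules and snapshot_due are made up for this illustration, and the exact comparison Redis performs internally may differ slightly.

```python
def parse_save_rules(value):
    """Split a save string such as "900 1 300 10" into (seconds, changes) pairs."""
    fields = [int(field) for field in value.split()]
    if len(fields) % 2 != 0:
        raise ValueError("save expects pairs of <seconds> <changes>")
    return list(zip(fields[0::2], fields[1::2]))


def snapshot_due(value, seconds_since_last_save, changes_since_last_save):
    """Return True if at least one rule has both of its thresholds met."""
    return any(
        seconds_since_last_save >= seconds and changes_since_last_save >= changes
        for seconds, changes in parse_save_rules(value)
    )


# With "10 1000": enough time has passed, but too few writes, so no snapshot.
print(snapshot_due("10 1000", seconds_since_last_save=12, changes_since_last_save=999))
# Same elapsed time with 1500 writes: a snapshot would be triggered.
print(snapshot_due("10 1000", seconds_since_last_save=12, changes_since_last_save=1500))
```

Note the first case: a quiet dataset that receives only a handful of writes never reaches the change threshold, so those writes stay unprotected until a later snapshot happens.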

Enter the Append Only File

Given that RDB snapshots can’t deliver proper durability, the Append Only File (AOF) was created. The idea behind it is to store all of the changes that happen in the database in a file. If you are familiar with other database systems like PostgreSQL or MySQL, you can think of it as a WAL or a binary log. New entries are always appended (thus the name), so the writes are always sequential. This helps with performance: even on SSDs, sequential access is faster than random access.
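The idea can be sketched in a few lines of Python. This is a toy model, not how Redis is implemented: ToyStore, encode_command and read_commands are hypothetical names, only SET and DEL are handled, and the file name toy.aof is arbitrary. The commands are written in the Redis protocol (RESP) format, which is also what a plain AOF contains.

```python
import os
import tempfile


def encode_command(*args):
    """Encode a command as a RESP array of bulk strings."""
    out = [b"*%d\r\n" % len(args)]
    for arg in args:
        data = str(arg).encode()
        out.append(b"$%d\r\n%s\r\n" % (len(data), data))
    return b"".join(out)


def read_commands(path):
    """Yield every logged command as a list of strings, in write order."""
    with open(path, "rb") as log:
        while True:
            header = log.readline()
            if not header:
                return
            count = int(header[1:].strip())
            args = []
            for _ in range(count):
                length = int(log.readline()[1:].strip())
                args.append(log.read(length).decode())
                log.read(2)  # skip the trailing \r\n
            yield args


class ToyStore:
    """In-memory dict whose changes are appended to a log file."""

    def __init__(self, path):
        self.data = {}
        if os.path.exists(path):
            for command in read_commands(path):
                self._apply(command)  # rebuild state by replaying the log
        self.log = open(path, "ab")  # append mode: writes only go to the end

    def _apply(self, command):
        name, *args = command
        if name == "SET":
            self.data[args[0]] = args[1]
        elif name == "DEL":
            self.data.pop(args[0], None)

    def execute(self, *command):
        self._apply([str(part) for part in command])
        self.log.write(encode_command(*command))
        self.log.flush()

    def close(self):
        self.log.close()


path = os.path.join(tempfile.mkdtemp(), "toy.aof")
store = ToyStore(path)
store.execute("SET", "user:1", "alice")
store.execute("SET", "user:2", "bob")
store.execute("DEL", "user:1")
store.close()

restored = ToyStore(path)  # a "restart": the state comes back from the log
print(restored.data)       # {'user:2': 'bob'}
restored.close()
```

The log keeps all three commands even though only one key survives, which is why such a file keeps growing until it is rewritten.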

AOF has to be enabled in the Redis configuration:

appendonly yes

Once enabled, it takes over the role of the main source of truth regarding the state of the data. This means that whenever Redis needs to load the data at startup, for example after a restart or a crash, the AOF is used for that rather than the RDB snapshot.

The Redis configuration file has a couple more settings that govern durability. The first, appendfsync, defines when fsync is executed on the AOF. There are three options. With ‘no’, Redis never calls fsync itself, so when the data is persisted on disk depends on the settings of the operating system: it happens only when the filesystem cache is flushed to the device. This is the fastest option, but it does not provide proper durability when the whole node crashes or is restarted. The second option, ‘everysec’, means that fsync is performed once per second, so, assuming that the disk persists data immediately after receiving the write, it is theoretically possible to lose up to a second of data. The third option, ‘always’, means that fsync is performed after every write. This is the most expensive option performance-wise, but it gives the best durability.
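The difference between the three policies can be illustrated with a small Python sketch. It is a simplified model under stated assumptions: AppendLog is a hypothetical class, the once-per-second check is done inline on each append (Redis handles this differently internally), and the fsync_calls counter exists only to make the behaviour observable.

```python
import os
import tempfile
import time


class AppendLog:
    """Append-only file with a configurable fsync policy."""

    def __init__(self, path, policy="everysec", clock=time.monotonic):
        if policy not in ("no", "everysec", "always"):
            raise ValueError("unknown policy: %r" % policy)
        self.policy = policy
        self.clock = clock
        self.file = open(path, "ab")
        self.last_fsync = clock()
        self.fsync_calls = 0

    def append(self, record):
        self.file.write(record)
        self.file.flush()  # hands the data to the OS page cache only
        if self.policy == "always":
            self._fsync()
        elif self.policy == "everysec" and self.clock() - self.last_fsync >= 1.0:
            self._fsync()
        # policy "no": the operating system decides when to flush to disk

    def _fsync(self):
        os.fsync(self.file.fileno())  # ask the OS to push the data to the device
        self.last_fsync = self.clock()
        self.fsync_calls += 1

    def close(self):
        self.file.close()


directory = tempfile.mkdtemp()
log = AppendLog(os.path.join(directory, "always.aof"), policy="always")
for record in (b"one\n", b"two\n", b"three\n"):
    log.append(record)
print(log.fsync_calls)  # 3: one fsync per write
log.close()
```

With "always" every append pays for a disk flush, while with "everysec" a burst of writes within the same second shares a single flush, which is where the performance difference comes from.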

The AOF will eventually have to be rewritten, otherwise it would grow indefinitely. When this happens depends on the configuration: auto-aof-rewrite-percentage and auto-aof-rewrite-min-size define when exactly the AOF should be rewritten. To reduce the size of the AOF it is also possible to combine RDB and AOF in one file. When the setting aof-use-rdb-preamble is enabled, the rewritten AOF consists of two parts: an RDB section followed by an AOF tail. The RDB part contains a snapshot of the database at a given moment, and the AOF tail keeps track of the changes made after it.
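How the two rewrite settings interact can be sketched as follows. This is a hedged model of the check rather than the actual Redis code: should_rewrite is a hypothetical function, base_size stands for the size of the AOF right after the previous rewrite, and the defaults used here (100 percent and 64 MB) are the values shipped in the stock redis.conf.

```python
MB = 1024 * 1024


def should_rewrite(current_size, base_size,
                   rewrite_percentage=100, rewrite_min_size=64 * MB):
    """Decide whether an automatic AOF rewrite would be triggered."""
    if rewrite_percentage == 0:
        return False  # a percentage of 0 disables automatic rewrites
    if current_size <= rewrite_min_size:
        return False  # small files are never rewritten automatically
    base = base_size if base_size > 0 else 1
    growth = current_size * 100 // base - 100  # growth since last rewrite, in %
    return growth >= rewrite_percentage


# A 39 MB file is below the minimum size, no matter how much it has grown.
print(should_rewrite(39 * MB, 10 * MB))   # False
# 70 MB -> 130 MB is about 85% growth, still below the 100% threshold.
print(should_rewrite(130 * MB, 70 * MB))  # False
# 70 MB -> 140 MB is 100% growth, so a rewrite would start.
print(should_rewrite(140 * MB, 70 * MB))  # True
```

The minimum size guards against constant rewrites of a small file, where even a few kilobytes of new commands can mean a large percentage growth.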

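As we mentioned, AOF is quite useful. First and foremost, it is a way to persist the data stored in Redis, and it is what Redis loads at startup when it is enabled. Keep in mind that replication works independently of it: when replicas reach out to the master and ask for the missing data, they are served from the master’s in-memory replication backlog, or from a fresh RDB snapshot when a full resynchronization is needed. The AOF is not read for that purpose, so it does not replace replication as a way of keeping replicas up to date.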

You can clearly see that the Append-only File plays an important role. While not a must-have, it is the only way to obtain proper durability in Redis.
