Machine failures are widespread. Since we can't prevent them, we need a way to recover from the damage they cause. Imagine a machine failing during program execution. Upon reboot, the machine must know what it had already done, what it was doing, and what was still left to do.
For example, suppose four clients send requests to perform different operations on a server. The server queues all the requests and executes them sequentially. While executing the request sent by Client C, the server crashes. After rebooting, the server asks all the clients to resend their requests. The server receives the requests from all the clients, but not in the same order. The server redoes the operations, and the data is no longer consistent.
We need to ensure data integrity and durability in scenarios like the one above. For this purpose, we log each operation that modifies data onto the disk in append-only mode. Such a log file is called the write-ahead log (WAL). It enables recovery and preserves the atomicity of transactions in case of failure. This log file has other aliases, such as transaction log and commit log.
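The core idea can be sketched in a few lines of Python. The class name, record format, and file path below are illustrative, not a real library's API; the essential point is that the record is flushed to disk before the in-memory state is mutated:

```python
import json
import os

class WriteAheadLog:
    """Minimal append-only write-ahead log (illustrative sketch)."""

    def __init__(self, path):
        self.path = path

    def append(self, operation):
        # Serialize the operation and append it as one line.
        with open(self.path, "a") as f:
            f.write(json.dumps(operation) + "\n")
            f.flush()
            os.fsync(f.fileno())  # force the record to stable storage

    def read_all(self):
        # Replay helper: return every logged operation in order.
        if not os.path.exists(self.path):
            return []
        with open(self.path) as f:
            return [json.loads(line) for line in f]

# Log the operation first, then apply it to the in-memory state.
wal = WriteAheadLog("ops.wal")
store = {}
op = {"type": "set", "key": "x", "value": 1}
wal.append(op)                  # durable record written first
store[op["key"]] = op["value"]  # then the actual mutation
```

If the process crashes between the two final lines, the mutation is lost from memory but the log still records it, so recovery can redo it.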
All the operations that modify the data are first logged onto the disk before the operation is performed. Each log record should contain enough information to let the system either UNDO or REDO the operation:

REDO: reapply a logged operation whose effect never reached the data on disk.
UNDO: roll back an operation belonging to a transaction that never committed.

Because of the WAL, the system only performs the missing operations, thereby reducing the number of disk operations and increasing performance. During recovery, only the instructions recorded after the last commit statement are re-executed.
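The recovery behavior described above can be sketched as a redo-only replay: records up to each commit marker are reapplied, while the uncommitted tail of the log is discarded, which amounts to an implicit undo. The record format with a `{"type": "commit"}` marker is an assumption for illustration:

```python
def recover(log_records):
    """Rebuild state by redoing only committed operations."""
    state = {}
    pending = []  # operations seen since the last commit marker
    for record in log_records:
        if record["type"] == "commit":
            for op in pending:          # redo everything up to this commit
                state[op["key"]] = op["value"]
            pending = []
        else:
            pending.append(record)
    # The uncommitted tail in `pending` is discarded: implicit undo.
    return state

log = [
    {"type": "set", "key": "a", "value": 1},
    {"type": "commit"},
    {"type": "set", "key": "a", "value": 2},  # crashed before commit
]
print(recover(log))  # {'a': 1} — the uncommitted write is rolled back
```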
Imagine writing a to-do list every day and appending each day's list to the same notebook. When we need to check a particular day's list after a month, finding the right page won't be easy. Keeping a single ever-growing log file causes the same problem. We can handle such issues by using the segmented-log and low-water-mark techniques.
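A rough sketch of those two techniques together, with illustrative names and sizes: the log is split into fixed-size segments, and a low-water mark identifies the oldest segment still needed for recovery, so everything before it can be deleted:

```python
class SegmentedLog:
    """Toy segmented log: entries are grouped into fixed-size segments
    so that segments below the low-water mark can be discarded."""

    def __init__(self, segment_size=2):
        self.segment_size = segment_size
        self.segments = [[]]  # list of segments, each a list of entries

    def append(self, entry):
        if len(self.segments[-1]) >= self.segment_size:
            self.segments.append([])  # roll over to a new segment
        self.segments[-1].append(entry)

    def truncate(self, low_water_mark):
        # Drop whole segments whose entries are already checkpointed.
        self.segments = self.segments[low_water_mark:]

log = SegmentedLog(segment_size=2)
for i in range(5):
    log.append(f"op-{i}")
# Segments: [['op-0', 'op-1'], ['op-2', 'op-3'], ['op-4']]
log.truncate(1)  # segment 0 is fully checkpointed; discard it
```

In a real system the low-water mark would be derived from a checkpoint or snapshot, not passed in by hand.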
Besides the size of the log file, there can be other issues, such as a client getting disconnected and retrying the same operation after reconnecting. So while logging operations in the log file, it is crucial to ensure that duplicate operations are not applied twice.
For example, let's say a client requested the server to append some data to a file. The server logs that operation and then performs it. However, before the server can send the acknowledgment, the client gets disconnected. From the client's point of view, the operation was never performed. Upon reconnecting, the client sends the append request again. If we don't detect duplicate operations, the server can end up with inconsistent data. We can achieve this with a data structure such as a HashMap: each request carries a unique identifier that the server stores as a key, so a retried request is recognized and not applied a second time.
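A minimal sketch of that deduplication idea, assuming each client tags its request with a unique ID (the field names and ack format are illustrative):

```python
class DedupServer:
    """Server that applies each uniquely identified request at most once."""

    def __init__(self):
        self.applied = {}  # request_id -> cached result (the HashMap above)
        self.data = []

    def handle(self, request_id, payload):
        if request_id in self.applied:
            # Duplicate retry: return the cached result, do not re-apply.
            return self.applied[request_id]
        self.data.append(payload)  # perform the append operation once
        result = f"ack:{request_id}"
        self.applied[request_id] = result
        return result

server = DedupServer()
server.handle("req-1", "hello")
server.handle("req-1", "hello")  # client retried after losing the ack
print(server.data)  # ['hello'] — applied only once
```

Returning the cached result on a retry also lets the client finally receive the acknowledgment it missed.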
Let’s discuss some examples where WAL is used: