Introduction

Databases are made for scale and are a highly concurrent system. Thus it is normal for them to expect multiple concurrent connections.

Also, in most situations, we’ll want our database to be the source of truth and always contain consistent data.

There are many concurrency related phenomena that can occur in a database when multiple transactions try to access/modify the same block of data. Ex dirty reads, dirty writes etc.

One such phenomenon is the lost update problem.

Before we jump in to discuss what the lost update problem is and how we can tackle it, let’s quickly brush the different isolation levels that are supported in most databases.

Database Transaction Isolation Level

Most databases provide the following transaction isolation levels.

Read Committed Isolation

The most basic level of transaction isolation is read committed.

It makes two promises:

You’ll only read data that has been committed. (no dirty reads)
You’ll only overwrite the data that has been already committed.

This is the default isolation level on most databases including postgres and mysql.

In postgres, we can check the transaction isolation level using the following command:

SHOW default_transaction_isolation;

Snapshot Isolation or Repeatable Reads

Here the idea is that each transaction reads from a consistent snapshot of the database.

That is, the transaction sees all the data that was committed in the database at the time the transaction started.

Databases generally implement snapshot isolation using a technique called as MVCC.

Serializable Isolation Level

Even though snapshot isolation looks like it should suffice in resolving our concurrency perils, it doesn’t protect us from problems like Phantom reads (we’ll cover this in another blog post).

This is the most strict isolation level that databases provide. These are logical equivalent of as if there was no concurrency in the database and all transactions were performed sequentially one after the other.

This is generally implemented using two-phase locking.

The Lost Update Problem

Lost update problem occurs when multiple transactions try to touch the same rows in the database.

The lost update problem can occur if an application reads a value, modifies it and writes it back to the database. If multiple transactions are trying to do the same thing on this same row, one or more updates will be lost.

This can have serious implications depending on the type of application.

Imagine, two transactions reading the account balance, modifying it (adding $100) and committing it back. What is one of the updates for lost?

142d943e2b6b9dbcbadc39a72a4bbd68

In the above figure, the counter is incremented twice but the value gets incremented by just one. An update got lost!

Ways To Tackle The Lost Update Problem

In my opinion for any application, having the lost update problem is unacceptable as it can compromise the durability (updates getting lost) and consistency (updates not causing the intended effect) for the database.

There are a few ways which can be used to tackle the problem:

1. Using SELECT FOR UPDATE:

In this method, transactions will have to take locks on the rows that they are trying to alter when reading them.

Other transactions will have to wait for the lock to be released to acquire the lock.

Though this solves the problem, the database won’t throw an error or let the user know that something wrong has happened even if the application developer forgets to acquire the lock via code.

BEGIN;
SELECT * FROM LOG WHERE ID = 4444 FOR UPDATE;

2. Repeatable Read Isolation Level

One way to ensure that the lost updates don’t happen at the database level is to use stricter isolation level.

SET default_transaction_isolation = "repeatable read"

This, in my opinion, puts us in a much safer state. We can still acquire the lock to make sure that transaction doesn’t try to overwrite each other’s updates, but if that ever happens, the database will fail the transaction that does so.

In most of the cases, retrying the failing transaction should suffice.

Please note that stricter isolation level like Serializable will also solve the issue.

Conclusion

Lost updates can happen in the databases, and depending upon the severity of the application it can have calamitous effects.

Though acquiring lock can prevent this problem but a miss in the application code can harm database’s ACID guarantees.

Thus it can be a better design choice to let the database throw an error and let the application know that it’s trying to overwrite an update instead of silently doing the damage.

That’s all, folks!

2 responses to “Tackling Lost Updates Problem In Database Using Stricter Transaction Isolation Level”

Trying to Wrap My Head Around The Dynamo Storage System: A Deep Dive – Priyank Verma

October 5, 2019 at 3:15 pm

[…] Dynamo is an eventual consistent storage system. The term eventual, is tricky. There is no upper bound here. Data will be replicated to all the replicas in 1 second, 10 seconds, 10 minutes… even 10 days, no one knows.Thus we must take care of scenarios like dirty reads and lost updates. […]

LikeLike

Charter Tunisie

May 26, 2024 at 4:36 pm

Very niice blog you have here

LikeLike