Reza Tabrizi

Detecting deadlocks for Databases

August 12, 2025

Databases are so magical, and I wanted to learn more about the inner workings. I was reading up on locking, MVCC, and concurrency control, and fell down a deadlock rabbit hole. Once I had the problem statement, I wrote two approaches.

Problem statement

You’re given a set of locks and a transaction ID. Each lock has:

A simple Python-ish struct:


class Lock:
    def __init__(self, id, holders, waiters):
        self.id = id
        self.holders = holders
        self.waiters = waiters

Each transaction may wait on at most one lock at a time. If a transaction is waiting on a lock, it's blocked by all transactions in that lock's holders.

Task: Given the set of locks and a target transaction ID, determine whether that transaction is currently in a deadlock.

Lock-based dependency graph

First pass: model locks as nodes and ask, “can the locks I need eventually become unlockable?” If the lock dependency chain has a cycle, I'm stuck.

How I do it:

  1. Get all the locks the target transaction is waiting on (roots).
  2. Build a map from txn to locks it's currently waiting on.
  3. Run DFS over locks; follow: lock |> its holders |> the locks those holders are waiting on.

Cycle check:

  1. Mark a lock as “visiting” (gray) when you enter it.
  2. If you see a gray lock again, that's a cycle |> deadlock.
  3. When done with a lock, mark it “visited” (black).

# We use a DFS with white/gray/black coloring encoded as:
#   visited[lock_id] = False  -> GRAY  (currently exploring this lock)
#   visited[lock_id] = True   -> BLACK (finished exploring this lock)
#   Unvisited locks are simply absent from 'visited'.

def detect_cycle_for_locks(lock, visited, graph) -> bool:
    # Mark current lock as GRAY
    visited[lock.id] = False

    # For every transaction that currently HOLDS this lock...
    for holder in lock.holders:
        # ...look up every lock that holder is WAITING ON
        # 'graph' maps txn_id -> List[Lock] it is waiting for
        for lock_waiting in graph.get(holder, []):
            # If we see a dependent lock that is already GRAY,
            # we found a back edge |> cycle |> deadlock
            if lock_waiting.id in visited and visited[lock_waiting.id] is False:
                return True

            # If not visited, recurse into that dependent lock
            if lock_waiting.id not in visited:
                if detect_cycle_for_locks(lock_waiting, visited, graph):
                    return True

    # All dependencies explored; mark current as BLACK (done visiting)
    visited[lock.id] = True
    return False


def detect_deadlock_using_lock_graph(txn_id: str, locks: Dict[str, Lock]) -> bool:
    # 1) Gather all locks that the target txn is CURRENTLY waiting on (roots)
    locks_waiting_on = []

    # 2) Build txn -> locks it is waiting on (the dependency graph we walk)
    txn_id_to_wanted_locks = {}

    for lock in locks.values():
        # Root set: locks our txn is waiting for right now.
        if txn_id in lock.waiters:
            locks_waiting_on.append(lock)

        # Populate wait map: each waiter txn -> list of locks it wants
        for waiter in lock.waiters:
            if waiter not in txn_id_to_wanted_locks:
                txn_id_to_wanted_locks[waiter] = []
            txn_id_to_wanted_locks[waiter].append(lock)

    # 3) Run DFS from every root lock; if any DFS sees a cycle, it's a deadlock
    for waiting_lock in locks_waiting_on:
        visited = {}
        if detect_cycle_for_locks(waiting_lock, visited, txn_id_to_wanted_locks):
            return True

    # No cycle found along any root path
    return False

Transaction based dependency graph

The lock graph works, but it's more work than needed when I only care about one txn. A cleaner path is to build a wait-for graph: nodes are txns; edges go from a waiting txn to the txn(s) blocking it. Locks are just how we derive these edges.

Why this is nicer:

Basic idea:

  1. Build txn to blockers from the lock table.
  2. Add edges from later waiters to earlier waiters for FIFO.
  3. Run DFS from the target txn; back-edge |> deadlock.

# DFS over the transaction wait-for graph
def detect_cycle_for_transactions(txn_id, visited, blockers) -> bool:
    visited[txn_id] = False  # GRAY
    for next_txn_id in blockers.get(txn_id, ()):
        # back-edge to a GRAY node |> cycle (deadlock)
        if next_txn_id in visited:
            if visited[next_txn_id] is False:
                return True
        else:
            if detect_cycle_for_transactions(next_txn_id, visited, blockers):
                return True
    visited[txn_id] = True  # BLACK
    return False

def detect_deadlock_using_transaction_graph(txn_id, locks) -> bool:
    # Build wait-for graph: txn -> {txns blocking it}
    transaction_blockers = {}
    for lock in locks.values():
        for waiter in lock.waiters:
            if waiter not in transaction_blockers: 
                transaction_blockers[waiter] = []
            transaction_blockers[waiter].extend(lock.holders)

        # FIFO fairness: earlier waiters block later waiters
        for i, w in enumerate(lock.waiters):
            transaction_blockers[w].update(lock.waiters[:i])

    # Run DFS from the given transaction
    return detect_cycle_for_transactions(txn_id, {}, transaction_blockers)

Comparing the 2 algorithms: