Day 235: Lock-Free Data Structures - Progress Without a Mutex

Locks preserve invariants by forcing one thread at a time through a critical section. Lock-free structures ask a harder question: can we preserve correctness with atomic state transitions and retries, so that one stalled thread does not stop everyone else?


Today's "Aha!" Moment

After learning locks, it is easy to think concurrency has a standard recipe: find the shared state, protect it with a mutex, and keep the critical section short.

That is often the right first answer.

But locks have a cost: waiting threads block, a lock holder that gets descheduled stalls everyone queued behind it, and under contention a single lock can serialize the whole program.

Lock-free data structures appear when we want a different kind of progress guarantee.

The aha is: instead of excluding other threads while we update shared state, we can publish each update as one atomic conditional step and retry when we lose a race.

For example, instead of saying:

  acquire the lock, update the structure, release the lock

we say:

  read the current state, compute the new state, and atomically install it only if the state has not changed

If another thread wins the race, we do not block behind it. We retry with the new state.

That changes the failure mode completely: a stalled or descheduled thread delays only its own operation, not everyone else's.

Why This Matters

Imagine many worker threads pushing jobs onto a shared stack or queue.

With a coarse lock, correctness is straightforward: lock, modify the container, unlock.

But under heavy contention, that one lock becomes a choke point.

Now consider a lock-free stack.

Each push can try to:

  1. read the current head pointer
  2. make the new node point to that old head
  3. atomically replace head with the new node, but only if head still equals the old value

If another thread changed the head first, the compare-and-swap fails and we retry.

No one waits for a lock holder. No thread can freeze the entire structure just by getting descheduled while "holding" something.

That sounds strictly better, but it is not.

Now the design has to answer harder questions: how expensive are retries under contention, what happens when a value changes and then changes back (ABA), and when is it safe to free a node that another thread might still be reading?

This is why lock-free structures matter: they trade the simplicity of exclusive access for a stronger progress guarantee, and that trade pays off only when the workload justifies it.

Learning Objectives

By the end of this session, you will be able to:

  1. Explain why lock-free structures exist - Describe when mutex contention or blocking progress makes a retry-based design attractive.
  2. Trace the core mechanism - Show how compare-and-swap style updates preserve correctness without a traditional lock.
  3. Evaluate the real trade-off - Connect lock-freedom to retry costs, ABA, memory reclamation, and the coming need to reason about memory ordering.

Core Concepts Explained

Concept 1: Lock-Free Means Global Progress Without Requiring a Thread to Hold a Mutex

The word "lock-free" is easy to misread.

It does not mean: no waiting, no contention, and no coordination.

What it usually means is: whenever threads keep taking steps, at least one of them completes its operation in a finite number of steps, no matter what the others do.

That is a different guarantee from a lock-based design, where one unlucky thread can delay everyone by holding a lock too long.

A simple contrast:

Lock-based:
  one thread owns the critical section
  others wait

Lock-free:
  many threads may race
  losers retry
  some thread still succeeds

This is why lock-free structures are especially attractive under: heavy contention, unpredictable scheduling delays, or paths where blocking is unacceptable.

But notice what changed: the cost moved from waiting on a lock holder to retrying lost races, and the correctness argument moved from one critical section to every possible interleaving of atomic steps.

That is the heart of the design.

Concept 2: The Core Primitive Is Usually Compare-and-Swap Plus Retry

Take a lock-free stack push.

Suppose head points to the current top node.

Thread 1 wants to push node N.

It does something conceptually like:

old_head = head
N.next = old_head
CAS(head, old_head, N)

Where CAS means: atomically compare head with old_head and, only if they are still equal, set head to N; otherwise fail without changing anything.

ASCII sketch:

initial:
head -> A -> B

Thread 1 prepares:
N -> A -> B

CAS succeeds:
head -> N -> A -> B

If another thread changed head first, the CAS fails:

Thread 1 saw head = A
Thread 2 changed head to X
Thread 1 CAS(head, A, N) -> fail
Thread 1 retries with head = X

The structure remains correct because no thread installs an update based on stale assumptions without rechecking the shared state atomically.

That is the central pattern of lock-free design: read, compute, attempt an atomic conditional install, retry on failure.

Concept 3: Removing Locks Also Removes Some Simplicity, Especially Around ABA and Memory Reclamation

This is where lock-free programming becomes much harder than it first appears.

Two classic problems:

  1. ABA - a value changes from A to something else and back to A, so a CAS based on "it still looks like A" succeeds even though the structure changed underneath it.
  2. Memory reclamation - a node unlinked from the structure may still be referenced by a thread that read an older snapshot, so freeing it immediately is unsafe.

Those problems do not disappear just because the top-level algorithm looks elegant.

Example intuition:

Thread 1 reads head = A
Thread 2 pops A, reuses/frees it, then head becomes A-looking again
Thread 1 CAS succeeds based on an assumption that is no longer logically valid

That is why practical lock-free designs often need extra machinery: version-tagged pointers, hazard pointers, epoch-based reclamation, or a garbage collector.

And there is one more subtle cost: without a lock to decide it for you, every atomic operation now carries an explicit memory-ordering decision.

That is the bridge to the next lesson.

A lock hid some of the ordering complexity behind a stronger primitive. Lock-free code exposes more of that complexity directly.

Troubleshooting

Issue: "Lock-free means threads never wait."

Why it happens / is confusing: The name suggests all waiting disappears.

Clarification / Fix: Threads may still spin, retry, and lose races repeatedly. The key property is not zero waiting but the absence of a single mutex holder who can block global progress.

Issue: "If CAS succeeds, the algorithm is automatically correct."

Why it happens / is confusing: Atomic success sounds like proof that the whole invariant is safe.

Clarification / Fix: CAS only protects the specific compare-and-install step. Correctness still depends on the surrounding invariant, on how nodes are linked, and on safe memory reclamation.

Issue: "Lock-free is always faster than locking."

Why it happens / is confusing: Avoiding a mutex sounds like pure win.

Clarification / Fix: Under low contention, simple locks may be faster and easier to maintain. Lock-free algorithms shine only when their progress properties or contention profile match the workload.

Advanced Connections

Connection 1: Lock-Free Data Structures <-> Locks & Synchronization

The parallel: Both protect shared invariants. Locks do it by exclusive access; lock-free structures do it by atomic conditional updates and retries.

Connection 2: Lock-Free Data Structures <-> Memory Models & Ordering

The parallel: Lock-free algorithms depend directly on the guarantees of atomic operations and memory ordering, which is why the next lesson becomes essential rather than optional.

Key Insights

  1. Lock-free does not remove coordination, it changes its form - Atomic compare-and-swap plus retry replaces mutex ownership as the core coordination mechanism.
  2. The main win is progress under contention or delays - A stalled thread does not hold a mutex that blocks everyone else.
  3. The main cost is subtle correctness work - ABA, reclamation, and ordering semantics make lock-free designs much harder to implement and verify.

Knowledge Check

  1. What is the main idea behind a lock-free update?

    • A) Disable the scheduler while the update runs
    • B) Use an atomic conditional update such as CAS and retry if another thread changed the state first
    • C) Remove all shared memory from the process
  2. Why can lock-free structures still perform poorly under contention?

    • A) Because retries and failed CAS operations can waste work
    • B) Because lock-free code cannot use atomic instructions
    • C) Because lock-free algorithms force all threads into one critical section
  3. Why is memory reclamation a central difficulty in lock-free structures?

    • A) Because the compiler removes all freed nodes automatically
    • B) Because another thread may still hold a stale pointer to a node that looks removable
    • C) Because lock-free algorithms never delete nodes

Answers

1. B: A lock-free operation typically reads shared state optimistically, attempts an atomic conditional install, and retries if the state changed first.

2. A: Lock-free avoids mutex blocking, but heavy contention can still create lots of wasted retry work.

3. B: A removed node may still be visible to another thread that observed older state, so freeing it safely is a major part of real lock-free design.


