Day 234: Locks & Synchronization - Preserving Invariants Under Interleaving
Threads make shared memory cheap. Locks and synchronization are the price we pay to keep that shared memory from destroying our invariants when execution interleaves in unlucky ways.
Today's "Aha!" Moment
Once multiple threads share the same heap, the hard problem is no longer "how do they communicate?"
It is:
- how do they communicate without stepping on each other?
Suppose two worker threads both update the same queue length, or both remove the next job from a shared queue, or one thread waits for data while another thread produces it.
The code for each thread may look correct in isolation. The bug appears only when the scheduler interleaves them at the wrong point.
That is the aha:
- a lock is not about slowing threads down
- it is about making a shared invariant temporarily private while one thread repairs or updates it
And synchronization is broader than locking.
Locks answer:
- who may touch this shared state right now?
Other primitives such as condition variables and semaphores answer:
- when is it safe to continue?
- when should this thread wait?
- what event should wake it up?
So the real topic is not "mutex syntax." It is how we preserve order and meaning inside a shared-memory process.
Why This Matters
Imagine a process with several worker threads consuming jobs from one in-memory queue.
If two threads both run:
read head pointer
take current job
advance head pointer
without coordination, both may read the same old head before either one updates it.
Now the system can:
- process the same job twice
- skip a job
- corrupt the queue structure entirely
That is a real correctness problem, not a style issue.
And mutual exclusion is only half the story.
If the queue is empty, a worker should not spin pointlessly or poll in a tight loop forever. It should sleep until a producer thread inserts a new job and signals that the condition changed.
This is why the lesson matters:
- locks protect shared state during updates
- synchronization primitives coordinate waiting and wake-up around shared state changes
Without this model, teams write code that passes tests sometimes, fails under load, and becomes almost impossible to reason about from logs alone.
Learning Objectives
By the end of this session, you will be able to:
- Explain why locks exist - Describe how shared-memory interleaving breaks invariants even when each thread's code looks correct on its own.
- Differentiate mutual exclusion from waiting coordination - Explain what locks, condition variables, and related primitives each solve.
- Evaluate the trade-off - Recognize when synchronization preserves correctness cheaply and when it introduces contention, deadlock risk, or throughput limits.
Core Concepts Explained
Concept 1: A Lock Protects a Critical Section Around a Shared Invariant
Take a very small shared counter:
counter = 0
Two threads each do:
tmp = counter
tmp = tmp + 1
counter = tmp
This looks harmless, but it is not atomic.
An unlucky interleaving can look like:
Thread 1: read counter -> 0
Thread 2: read counter -> 0
Thread 1: write counter -> 1
Thread 2: write counter -> 1
One increment disappears.
The real invariant is:
- every increment should be reflected exactly once in the final shared state
A mutex protects the critical section by ensuring only one thread at a time can manipulate that shared invariant:
pthread_mutex_lock(&m);
counter = counter + 1;
pthread_mutex_unlock(&m);
The lock does not make the operation "faster." It makes the sequence appear indivisible with respect to competing threads.
That is the right mental model:
- a lock temporarily turns shared mutable state into a one-thread-at-a-time resource
Concept 2: Synchronization Also Means Waiting for a Condition, Not Just Excluding Others
Now return to the shared job queue.
If the queue is empty, a worker does not need mutual exclusion alone. It needs a way to wait until there is actually work.
That is the role of condition-style synchronization.
A common pattern is:
- lock the mutex protecting the queue
- check whether the queue is empty
- if empty, wait on a condition variable
- when woken, re-check the condition
- when work exists, remove a job and continue
ASCII sketch:
worker thread                  producer thread
-------------                  ---------------
lock(queue)
while empty:
    wait(cond, queue_lock)     lock(queue)
                               push(job)
                               signal(cond)
                               unlock(queue)
pop(job)
unlock(queue)
The important subtlety is that:
- the condition variable does not replace the lock
It works together with the lock because the shared predicate, such as "queue is non-empty," must be checked and updated under mutual exclusion.
This is why synchronization is a richer topic than "just use a mutex." Real shared-state systems need both:
- exclusivity while mutating invariants
- structured waiting while conditions are not yet true
Concept 3: Correct Synchronization Preserves Safety, but It Also Introduces Contention and Ordering Hazards
Locks and synchronization solve real problems, but they are not free.
Costs show up as:
- contention: many threads serialize behind the same lock
- convoying: one slow holder delays everyone behind it
- deadlock: two or more threads wait forever on incompatible lock ordering
- priority inversion: an important thread is blocked by a lower-priority thread holding a needed lock
This means synchronization design is always a balance.
If we lock too little:
- invariants break
If we lock too much:
- throughput collapses
- latency becomes noisy
- the system becomes harder to scale
That is why experienced designs try to:
- keep critical sections small
- define clear lock ordering
- avoid sleeping or calling slow I/O while holding locks
- use higher-level coordination primitives when they express intent better
So the trade-off is not "locks or no locks." It is:
- how do we preserve correctness with the minimum coordination needed?
That question leads directly into the next lesson on lock-free structures.
Troubleshooting
Issue: "The code looks correct line by line, so it should be thread-safe."
Why it happens / is confusing: Developers mentally execute one thread at a time.
Clarification / Fix: Reconstruct the possible interleavings. If shared state can be read or written concurrently without protection, correctness has to be justified against those interleavings, not against a single-thread reading.
Issue: "A condition variable is just a nicer lock."
Why it happens / is confusing: Both often appear in the same code block.
Clarification / Fix: The mutex protects shared state; the condition variable coordinates waiting for a shared predicate to become true. They solve related but different problems.
Issue: "If we add more locks, the program becomes safer."
Why it happens / is confusing: More coordination sounds like more protection.
Clarification / Fix: Extra locks can create lock-order complexity, contention, and deadlock risk. Good synchronization is not maximal locking; it is minimal locking that still preserves invariants.
Advanced Connections
Connection 1: Locks & Synchronization <-> OS Threads & Processes
The parallel: Threads are what create the shared-memory problem. Synchronization is the mechanism that turns that shared-memory model into something usable under scheduler interleaving.
Connection 2: Locks & Synchronization <-> Lock-Free Data Structures
The parallel: Both answer the same question: how to preserve correctness under concurrency. Locks centralize exclusion explicitly; lock-free designs preserve progress with atomic primitives and retries instead.
Resources
- [BOOK] Operating Systems: Three Easy Pieces
- [DOC] pthread_mutex_lock(3p)
- [DOC] pthread_cond_wait(3p)
- [DOC] futex(7)
Key Insights
- Locks protect shared invariants, not just variables - The real unit of protection is the consistency rule around shared state, not any individual line of code.
- Synchronization includes waiting as well as exclusion - Condition variables and similar primitives coordinate when threads should sleep and when they should resume.
- Correctness and throughput pull in opposite directions - Stronger coordination can preserve safety but also creates contention, deadlock risk, and scalability limits.
Knowledge Check
1. What is the primary purpose of a mutex?
- A) To make code run in parallel more often
- B) To ensure only one thread at a time enters a critical section protecting shared state
- C) To replace the scheduler
2. Why is a condition variable often used together with a mutex?
- A) Because the shared predicate being waited on must be checked and updated under mutual exclusion
- B) Because condition variables cannot work with shared memory
- C) Because mutexes are only for file I/O
3. What is a common cost of coarse-grained locking?
- A) More compile-time type safety
- B) Reduced contention and higher throughput by default
- C) More serialization and lower concurrency under load
Answers
1. B: A mutex protects a critical section so that shared-state updates cannot interleave in unsafe ways.
2. A: Waiting and state checking must coordinate around the same protected shared predicate, which is why the mutex and condition variable are used together.
3. C: Coarse-grained locks are often simpler, but they force more threads to wait behind the same exclusion boundary.