Day 233: OS Threads & Processes - Isolation, Sharing, and Scheduling

Before we can talk about locks, races, and synchronization, we need a precise answer to a simpler question: what exactly is executing, what memory does it share, and who decides when it runs? Processes and threads are the operating system's answer.


Today's "Aha!" Moment

At first, threads and processes can feel like two names for "things that run."

That is too shallow to be useful.

The real distinction is this: a process is an isolation boundary with its own address space, while a thread is an execution path that lives inside a process and shares that address space with its sibling threads.

That means the difference is not cosmetic. It is about what is shared and what is isolated.

If two tasks live in different processes:

  - they have separate address spaces, so neither can normally read or write the other's memory
  - a crash or corruption in one is contained and does not take down the other
  - any cooperation must go through an explicit communication mechanism

If two tasks live in different threads of the same process:

  - they share the heap, globals, and code, so data flows between them implicitly
  - each keeps only its own stack and register state
  - a bug in one thread can corrupt state that every other thread depends on
That is the aha: processes buy isolation at the price of explicit communication, while threads buy cheap sharing at the price of having to synchronize.
And the scheduler is what makes both of them feel concurrent by interleaving execution over time.

Why This Matters

Imagine a web server that needs to handle many client requests.

There are several broad ways to structure it, two of which we compare here:

  - one process per request, with the OS enforcing the boundary between handlers
  - one thread per request inside a single server process, with handlers sharing memory

Those are not just style choices. They affect:

  - what state the handlers can share, and how
  - how far a failure in one handler can spread
  - how much setup and communication each request costs
If we choose processes, each request handler is better isolated, but communication and setup can be more expensive.

If we choose threads, memory sharing becomes natural and cheap, but now the same process can corrupt itself through races unless synchronization is carefully designed.

This matters because many "concurrency bugs" are really misunderstandings about the execution model.

Before we can reason about:

  - locks and critical sections
  - race conditions
  - synchronization strategies
we need to know what exactly the OS is scheduling and what data those scheduled entities can both see.

Learning Objectives

By the end of this session, you will be able to:

  1. Explain the difference between processes and threads - Describe the isolation boundary of a process and the sharing model of threads.
  2. Trace how the OS creates concurrency - Explain how scheduling and context switching make multiple execution paths progress over time.
  3. Evaluate the design trade-off - Decide when stronger isolation is worth more than cheaper sharing, and vice versa.

Core Concepts Explained

Concept 1: A Process Is an Isolation Boundary, Not Just a Running Program

When the OS creates a process, it creates a protected execution context with things like:

  - its own virtual address space (code, heap, and stacks)
  - its own process ID and credentials
  - its own view of kernel-managed resources, such as a table of open file descriptors
That does not mean the process owns nothing in common with the rest of the system. It may still inherit file descriptors, environment variables, and other kernel-managed resources.

But the crucial property is: one process cannot normally read or write another process's memory. The hardware and the kernel enforce that boundary.
That is why process boundaries are so useful.

ASCII picture:

Process A
  code
  heap
  stack

Process B
  code
  heap
  stack

Even if both processes run the same program, they normally see separate memory.

This gives us fault containment: if Process A crashes or scribbles over its own heap, Process B keeps running with its memory intact.
It also means communication is no longer implicit. If two processes need to cooperate, they must cross an explicit mechanism such as:

  - pipes or sockets
  - shared memory segments set up on purpose
  - files or other kernel-mediated channels
That extra boundary is a cost, but it is also a protection.
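The isolation and the explicit channel can both be seen in a few lines of Python. This is an illustrative sketch, not from the lesson: the `worker`/`demo` names are invented, and `multiprocessing` is one of several IPC mechanisms we could have picked. The child process mutates its own copy of a module-level dict, and the only value the parent ever sees comes back over an explicit queue.

```python
from multiprocessing import Process, Queue

data = {"value": 0}  # lives in this process's private address space

def worker(q):
    # Runs in a child process with its own copy of `data`:
    # mutating it here is invisible to the parent.
    data["value"] = 99
    q.put(data["value"])  # explicit IPC: send the result back over a queue

def demo():
    q = Queue()
    p = Process(target=worker, args=(q,))
    p.start()
    child_view = q.get()  # 99, received via the explicit channel
    p.join()
    return child_view, data["value"]

if __name__ == "__main__":
    print(demo())  # (99, 0): the child's write never reached the parent
```

The parent's `data["value"]` stays 0 no matter what the child does to its copy; the queue is the boundary crossing made visible.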

Concept 2: A Thread Is a Separate Execution Path Inside One Shared Process

Now compare that with threads.

Threads in the same process each have:

  - their own stack
  - their own register state and program counter

But they share:

  - the heap and global variables
  - the program's code
  - process-wide resources such as open file descriptors
ASCII picture:

One process
  shared code
  shared heap
  shared globals

  Thread 1 -> own stack, own registers
  Thread 2 -> own stack, own registers
  Thread 3 -> own stack, own registers

This is why threads are attractive.

If Thread 1 computes something and stores it on the heap, Thread 2 can read it without any explicit IPC boundary. That is fast and convenient.
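Here is a small sketch of that convenience using Python's `threading` module (the `compute` helper and the `results` dict are illustrative names, not from the lesson): worker threads write into one shared heap object, and after they finish the main thread reads the results directly, with no IPC anywhere.

```python
import threading

results = {}  # one shared heap object, visible to every thread in the process

def compute(key):
    # Locals like `key` and `total` live on this thread's private stack,
    # but the write to `results` lands in memory every other thread can see.
    total = sum(range(1000))
    results[key] = total

threads = [threading.Thread(target=compute, args=(f"t{i}",)) for i in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()  # after join, the main thread can safely read what workers wrote

print(results)  # all three results, read without any explicit channel
```

Because each thread writes a distinct key and the main thread only reads after `join`, this particular sketch is safe; the danger discussed next appears when threads touch the *same* data.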

It is also why threads are dangerous.

If two threads update the same shared structure without coordination, the process can end up with:

  - lost updates
  - half-written, inconsistent data structures
  - broken invariants that no single thread's code ever intended
So threads are not "lightweight processes" in the only sense that matters. They are execution paths with cheap shared state, and cheap shared state is exactly what creates the need for synchronization.

Concept 3: Scheduling and Context Switching Turn Multiple Execution Paths Into Concurrency

Whether we use processes or threads, the OS scheduler decides when each runnable unit gets CPU time.

On a machine with fewer cores than runnable tasks, the OS creates the illusion of simultaneity by interleaving execution:

time -->

CPU:  T1   T2   T1   T3   T2   T1

Each switch requires the kernel to save one execution context and restore another.

That is the basis of context switching.

This matters because many concurrency bugs do not require true parallel hardware execution. They only require the OS to switch at an unlucky moment.

For example:

Thread 1: read counter
Thread 2: read counter
Thread 1: increment and write
Thread 2: increment and write

Even on one core, interleaving can lose an update.
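The interleaving above can be replayed deterministically. This sketch deliberately uses no real threads at all; it performs the four steps by hand in the unlucky order (the `read` helper and the `t1_view`/`t2_view` names are illustrative), which is exactly what an ill-timed context switch produces:

```python
# Replaying the lost-update interleaving by hand: no parallel hardware
# is needed, only a switch between a thread's "read" and its "write".
counter = 0

def read(c):
    return c  # models loading the current value into a thread's registers

t1_view = read(counter)  # Thread 1 reads 0 ... then the scheduler switches
t2_view = read(counter)  # Thread 2 also reads 0
counter = t1_view + 1    # Thread 1 writes 1
counter = t2_view + 1    # Thread 2 writes 1, clobbering Thread 1's update

print(counter)  # 1, not 2: one increment was lost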

That is the bridge to the next lesson.

Processes and threads tell us what is executing and what memory each execution path can see.

Scheduling tells us when each path runs, and where it can be interrupted.

Troubleshooting

Issue: "Threads are just smaller processes."

Why it happens / is confusing: Both are schedulable execution units.

Clarification / Fix: The important difference is not size but sharing. Threads share the process address space; separate processes usually do not.

Issue: "If there is only one CPU core, race conditions cannot happen."

Why it happens / is confusing: People equate concurrency only with true simultaneous execution.

Clarification / Fix: Interleaving is enough. A single core can still switch between threads at exactly the wrong point and violate invariants.

Issue: "Processes are always safer, so we should always prefer them."

Why it happens / is confusing: Isolation sounds strictly better.

Clarification / Fix: Isolation reduces some risks, but it raises communication and coordination costs. The right choice depends on how much shared state and fault containment the design really needs.

Advanced Connections

Connection 1: OS Threads & Processes <-> Locks & Synchronization

The parallel: Threads are what make locks necessary. Once multiple execution paths share memory inside one process, synchronization becomes the tool for preserving invariants under interleaving.

Connection 2: OS Threads & Processes <-> Distributed Systems

The parallel: Processes versus threads locally mirrors a familiar distributed trade-off: stronger boundaries improve isolation, while shared context can improve efficiency but increases coordination risk.

Key Insights

  1. Processes and threads differ mainly in isolation versus sharing - Processes isolate address spaces; threads share one process address space while keeping separate execution state.
  2. Scheduling is what turns multiple execution paths into concurrency - Even one CPU core can interleave tasks in ways that expose races and broken assumptions.
  3. The model you choose shapes the bugs you will get - Processes make communication more explicit; threads make communication cheaper but push more correctness burden onto synchronization.

Knowledge Check

  1. What do threads in the same process usually share?

    • A) Separate heaps and separate globals
    • B) Shared heap and process resources, but separate stacks and register state
    • C) Nothing at all
  2. Why can race conditions happen even on a single CPU core?

    • A) Because interleaving by the scheduler can still violate invariants
    • B) Because threads stop sharing memory on single-core systems
    • C) Because processes are faster than threads
  3. What is the main design advantage of using separate processes instead of threads?

    • A) They make locks unnecessary in all situations
    • B) They provide stronger isolation boundaries between tasks
    • C) They eliminate the need for scheduling

Answers

1. B: Threads share the process address space and many process resources, but each thread still has its own stack and execution context.

2. A: Simultaneous execution is not required; unlucky scheduling interleavings are enough to create races.

3. B: Processes usually offer better fault and memory isolation, though at a higher communication cost.


