CDN Fundamentals - Global-Scale Content Delivery

021 30 min intermediate

Day 249: CDN Fundamentals - Global-Scale Content Delivery

A CDN is what happens when cache locality stops being a per-process or per-region problem and becomes a planetary routing problem.


Today's "Aha!" Moment

The insight: A CDN is not just "a cache on the internet." It is a globally distributed system that decides where requests should terminate, which copies are good enough to serve, and when the origin should be protected from distance and traffic.

Why this matters: Teams often describe CDNs only in frontend terms, as if they existed to make images load faster. That misses the systems lesson underneath: a CDN is a strategy for moving content closer to users while controlling origin load, latency, and resilience across many PoPs and networks.

The universal pattern: distant authoritative origin -> globally distributed requesters -> strategically placed edge copies -> routing and freshness policies determine what gets served.

Concrete anchor: A user in Madrid requesting an image hosted in Virginia should not pay the full transatlantic round trip if a nearby edge location already holds a valid copy. The CDN exists to make "near enough and fresh enough" the common path.

How to recognize when this applies:

Common misconceptions:

Real-world examples:

  1. Web assets: JS, CSS, images, and video segments benefit immediately from edge locality.
  2. Dynamic platforms: HTML, API responses, and signed objects can also benefit, but only if cacheability and validation rules are explicit.

Why This Matters

The problem: Distance is latency, and origin infrastructure is finite. If every request must travel all the way to the authoritative source, the system pays unnecessary network cost and centralizes load in exactly the place that is hardest to scale globally.

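Before: every request travels to the authoritative origin, so users pay the full network distance and load concentrates where it is hardest to scale.

After: reusable responses are served from nearby edge locations, and the origin mainly handles cache misses and revalidation.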

Real-world impact: CDNs reduce user latency, protect origin capacity, smooth flash traffic, and create a new global control layer for performance and security. They often decide whether a product feels regional or worldwide.


Learning Objectives

By the end of this session, you will be able to:

  1. Explain why CDNs exist as a systems layer - Connect global distance and origin protection to edge caching.
  2. Describe how CDNs work operationally - Reason about PoPs, cache keys, origin fetches, validation, and shared cache behavior.
  3. Evaluate practical trade-offs - Decide what should be cached at the edge, what must remain origin-authoritative, and what failures the CDN changes rather than removes.

Core Concepts Explained

Concept 1: A CDN Solves Locality at Internet Scale

Up to this point in the month, locality has appeared in many forms:

A CDN pushes the same logic outward:

The reason is simple. If a request must traverse continents, multiple ISPs, congested interconnects, and the full origin stack every time, latency and origin cost become the default. A CDN changes that default.
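A back-of-the-envelope sketch of how that default shifts. The round-trip times, request rate, and hit ratios below are illustrative assumptions rather than measurements, and both helper functions are hypothetical names introduced only for this example.

```python
def expected_latency_ms(hit_ratio: float, edge_rtt_ms: float, origin_rtt_ms: float) -> float:
    """Mean latency: hits stop at the edge, misses also pay the edge-to-origin trip."""
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    miss_ratio = 1.0 - hit_ratio
    return hit_ratio * edge_rtt_ms + miss_ratio * (edge_rtt_ms + origin_rtt_ms)


def origin_requests_per_s(total_rps: float, hit_ratio: float) -> float:
    """In this simplified model only cache misses reach the origin."""
    return total_rps * (1.0 - hit_ratio)


# Assumed figures: 20 ms user-to-edge, 100 ms edge-to-origin, 1000 requests/s overall.
for ratio in (0.0, 0.5, 0.9, 0.99):
    latency = expected_latency_ms(ratio, edge_rtt_ms=20.0, origin_rtt_ms=100.0)
    load = origin_requests_per_s(1000.0, ratio)
    print(f"hit ratio {ratio:.2f}: ~{latency:.1f} ms mean latency, ~{load:.0f} req/s at origin")
```

The model ignores connection setup, tiered caches, and tail latency, but it shows why hit ratio is the number that decides whether the edge is the common path or an extra hop.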

The main elements are:

This makes a CDN more than just storage. It is:

The central trade-off is already visible:

That is why a CDN is one of the clearest examples of the "copies are useful until they become dangerous" pattern.

Concept 2: CDN Behavior Is Mostly About Cacheability, Keys, and Revalidation

From the user's perspective, the ideal path looks simple:

user -> nearby edge -> cache hit -> response

But operationally, the CDN has to answer several questions:

Can this object be shared safely?

Static assets usually can. Personalized or authorization-sensitive content often cannot, unless the cache key and policy are carefully constrained.
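One way to make that question concrete is a small check modeled on standard HTTP rules for shared caches. This is a minimal sketch, not any particular CDN's logic: `shared_cache_may_store` is a hypothetical helper, and the `Set-Cookie` rule is a deliberately conservative assumption rather than an HTTP requirement.

```python
from typing import Dict, Optional


def parse_cache_control(value: str) -> Dict[str, Optional[str]]:
    """Parse a Cache-Control header into {directive: argument or None}."""
    directives: Dict[str, Optional[str]] = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, _, arg = part.partition("=")
        directives[name.strip().lower()] = arg.strip().strip('"') if arg else None
    return directives


def shared_cache_may_store(request_headers: Dict[str, str],
                           response_headers: Dict[str, str]) -> bool:
    """Return True if a shared cache (such as a CDN edge) may store the response."""
    req = {k.lower(): v for k, v in request_headers.items()}
    resp = {k.lower(): v for k, v in response_headers.items()}
    cc = parse_cache_control(resp.get("cache-control", ""))

    # "no-store" forbids storing at all; "private" forbids shared caches.
    if "no-store" in cc or "private" in cc:
        return False
    # Authenticated requests need an explicit opt-in from the origin.
    if "authorization" in req and not any(
        d in cc for d in ("public", "s-maxage", "must-revalidate")
    ):
        return False
    # Assumption for this sketch: treat Set-Cookie as a sign of per-user content.
    if "set-cookie" in resp:
        return False
    return True


print(shared_cache_may_store({}, {"Cache-Control": "public, max-age=3600"}))
print(shared_cache_may_store({"Authorization": "Bearer x"}, {"Cache-Control": "max-age=60"}))
```

The first call allows storage and the second does not, because an authenticated request without an explicit shared-cache directive is treated as user-specific.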

What is the cache key?

This is critical. Two requests may look similar but differ by:

If the key is too broad, the CDN may serve the wrong content. If it is too narrow, hit rate collapses.
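The broad-versus-narrow tension can be sketched with a small key builder. Everything here is an assumed policy for illustration: the ignored tracking parameters, the choice to vary only on a normalized `Accept-Encoding`, and the `cache_key` function itself are hypothetical, not a real CDN's defaults.

```python
from typing import Dict
from urllib.parse import parse_qsl, urlencode, urlsplit

# Assumed policy: these parameters never change the response, so they are dropped.
IGNORED_PARAMS = frozenset({"utm_source", "utm_medium", "utm_campaign"})


def normalize_encoding(value: str) -> str:
    """Collapse Accept-Encoding into a few buckets (q-values are ignored in this sketch)."""
    offered = {token.split(";")[0].strip().lower() for token in value.split(",")}
    for preferred in ("br", "gzip"):
        if preferred in offered:
            return preferred
    return "identity"


def cache_key(method: str, url: str, headers: Dict[str, str]) -> str:
    """Build a key that is stable across equivalent requests and distinct otherwise."""
    parts = urlsplit(url)
    # Sorting makes ?a=1&b=2 and ?b=2&a=1 share one cache entry.
    params = sorted(
        (k, v)
        for k, v in parse_qsl(parts.query, keep_blank_values=True)
        if k not in IGNORED_PARAMS
    )
    lowered = {k.lower(): v for k, v in headers.items()}
    encoding = normalize_encoding(lowered.get("accept-encoding", ""))
    return (
        f"{method.upper()} {parts.scheme}://{parts.netloc.lower()}{parts.path}"
        f"?{urlencode(params)} enc={encoding}"
    )


a = cache_key("GET", "https://Example.com/img/logo.png?w=200&utm_source=mail&h=100",
              {"Accept-Encoding": "gzip, br"})
b = cache_key("GET", "https://example.com/img/logo.png?h=100&w=200",
              {"accept-encoding": "br;q=1.0, gzip"})
c = cache_key("GET", "https://example.com/img/logo.png?h=100&w=400",
              {"Accept-Encoding": "br"})
print(a == b)  # equivalent requests collapse to one entry
print(a == c)  # a different width must not share an entry
```

Dropping a parameter that does affect the response would make the key too broad, while keying on the raw `Accept-Encoding` string would make it too narrow. The normalization step is where that judgment gets encoded.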

How does the edge know whether the copy is still valid?

This is where HTTP cache semantics become central:

A CDN is therefore not "magic performance." It is an HTTP and cache-policy execution engine at global scale.
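To make the "execution engine" idea tangible, here is a toy simulation of one edge node applying a TTL and revalidating with an entity tag. It is a sketch under stated assumptions: `Origin` and `Edge` are in-memory stand-ins, the origin serves a single object regardless of path, and the status labels are invented for the example. Only the conditional-request idea (send the stored validator, accept a 304 without a body) comes from HTTP itself.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple


@dataclass
class Entry:
    body: bytes
    etag: str
    stored_at: float


class Origin:
    """In-memory stand-in for the authoritative origin (serves one object)."""

    def __init__(self, body: bytes) -> None:
        self.body = body
        self.version = 1
        self.full_responses = 0

    def publish(self, body: bytes) -> None:
        self.body = body
        self.version += 1

    def fetch(self, if_none_match: Optional[str] = None) -> Tuple[int, bytes, str]:
        etag = f'"v{self.version}"'
        if if_none_match == etag:
            return 304, b"", etag  # validator matches: no body is sent
        self.full_responses += 1
        return 200, self.body, etag


class Edge:
    """Toy edge cache: serve fresh copies, revalidate stale ones."""

    def __init__(self, origin: Origin, ttl_s: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.origin = origin
        self.ttl_s = ttl_s
        self.clock = clock
        self.store: Dict[str, Entry] = {}

    def get(self, path: str) -> Tuple[str, bytes]:
        now = self.clock()
        entry = self.store.get(path)
        if entry is not None and now - entry.stored_at < self.ttl_s:
            return "HIT", entry.body
        validator = entry.etag if entry is not None else None
        status, body, etag = self.origin.fetch(if_none_match=validator)
        if status == 304 and entry is not None:
            entry.stored_at = now  # copy confirmed, freshness clock restarts
            return "REVALIDATED", entry.body
        self.store[path] = Entry(body, etag, now)
        return "MISS", body


now = [0.0]  # fake clock so the example is deterministic
origin = Origin(b"hello")
edge = Edge(origin, ttl_s=60.0, clock=lambda: now[0])
print(edge.get("/a"))   # first request: full fetch from origin
print(edge.get("/a"))   # within the TTL: served from the edge
now[0] = 61.0
print(edge.get("/a"))   # stale but unchanged: origin answers 304
origin.publish(b"hello v2")
now[0] = 200.0
print(edge.get("/a"))   # stale and changed: full fetch again
```

Note that the origin is contacted on three of the four requests but sends the full body only twice. Revalidation trades a round trip for certainty without repeating the transfer.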

This is also why origin behavior matters so much. If the origin emits poor cache headers, unstable identifiers, or user-specific responses without the right controls, the CDN either becomes ineffective or unsafe.

So the operational mental model is:

Concept 3: A CDN Changes Failure Modes; It Does Not Remove Them

It is tempting to think of a CDN as a layer that only improves things. In reality it shifts where problems appear.

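What it improves: user latency, origin load, and resilience under flash traffic, because reusable responses are served close to users.

What it introduces or sharpens: global freshness, purge propagation, and cache-key correctness problems, because many shared copies now exist across PoPs.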

This is the mature CDN mental model:

That is why the next lessons fit naturally:

A CDN is the moment the cache chapter stops being local optimization and becomes global systems design.


Troubleshooting

Issue: "We put a CDN in front of the site, but latency did not improve much."

Why it happens / is confusing: Teams assume the presence of a CDN guarantees useful edge hits.

Clarification / Fix: Check cacheability, key design, and origin headers. A CDN only helps if reusable content can actually be cached and served from the edge.
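A first diagnostic step is often to measure where edge hits are and are not happening. The sketch below assumes log records with hypothetical `path` and `cache_status` fields. Real CDN logs name and encode these differently, so treat the field names and the sample data as placeholders.

```python
from collections import defaultdict
from typing import Dict, Iterable, Mapping


def hit_ratio_by_path(records: Iterable[Mapping[str, str]]) -> Dict[str, float]:
    """Fraction of requests per path that were answered from the edge cache."""
    hits: Dict[str, int] = defaultdict(int)
    total: Dict[str, int] = defaultdict(int)
    for record in records:
        path = record["path"]
        total[path] += 1
        if record["cache_status"].upper() == "HIT":
            hits[path] += 1
    return {path: hits[path] / total[path] for path in total}


# Made-up sample records for illustration only.
sample = [
    {"path": "/static/app.js", "cache_status": "HIT"},
    {"path": "/static/app.js", "cache_status": "HIT"},
    {"path": "/static/app.js", "cache_status": "MISS"},
    {"path": "/index.html", "cache_status": "MISS"},
    {"path": "/index.html", "cache_status": "MISS"},
]

# Lowest ratios first: these paths are where the edge is not helping.
for path, ratio in sorted(hit_ratio_by_path(sample).items(), key=lambda item: item[1]):
    print(f"{path}: {ratio:.0%} edge hits")
```

Paths with many requests and a low ratio are the ones worth checking for missing cache headers or an overly narrow cache key.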

Issue: "The CDN served stale content, so the CDN is broken."

Why it happens / is confusing: Freshness failures are blamed on the network layer alone.

Clarification / Fix: Staleness is usually a policy problem involving origin headers, validators, TTLs, or purge propagation. The CDN is enforcing some freshness contract, even if it is the wrong one.
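The "freshness contract" can be read directly from the origin's `Cache-Control` header. The sketch below covers only the explicit directives and follows the standard rule that a shared cache prefers `s-maxage` over `max-age`. `Expires`, heuristic freshness, and CDN-specific overrides are left out, and the function names are hypothetical.

```python
from typing import Dict, Optional


def _directives(cache_control: str) -> Dict[str, str]:
    """Parse Cache-Control into {directive: argument}, with '' for bare directives."""
    parsed: Dict[str, str] = {}
    for part in cache_control.split(","):
        name, _, arg = part.strip().partition("=")
        if name:
            parsed[name.strip().lower()] = arg.strip().strip('"')
    return parsed


def shared_freshness_lifetime(cache_control: str) -> Optional[int]:
    """Explicit freshness lifetime in seconds for a shared cache, or None if unstated."""
    parsed = _directives(cache_control)
    for name in ("s-maxage", "max-age"):  # s-maxage wins for shared caches
        if name in parsed:
            try:
                return int(parsed[name])
            except ValueError:
                continue  # malformed value: fall through to the next directive
    return None


def is_fresh(cache_control: str, age_s: int) -> bool:
    """True if a shared cache may serve the copy without contacting the origin."""
    if "no-cache" in _directives(cache_control):
        return False  # stored copies must be revalidated before reuse
    lifetime = shared_freshness_lifetime(cache_control)
    return lifetime is not None and age_s < lifetime


# A page that browsers keep for 60 s may still be served by the edge for 10 minutes.
header = "max-age=60, s-maxage=600"
print(shared_freshness_lifetime(header))
print(is_fresh(header, age_s=300))
```

In this example the edge is behaving exactly as instructed: a copy that is five minutes old is still fresh under `s-maxage=600`, even though `max-age=60` might suggest otherwise to someone reading only the first directive.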

Issue: "Dynamic content means the CDN is useless."

Why it happens / is confusing: CDN value is associated only with static files.

Clarification / Fix: Many dynamic responses still have cacheable components, revalidation paths, or partial edge behavior. The right question is not "dynamic or static?" but "what can be safely shared, for how long, and under what key?"


Advanced Connections

Connection 1: CDN Fundamentals <-> Cache Invalidation and Revalidation

The parallel: The invalidation lesson becomes global here. A CDN still lives or dies by freshness control, but now the copies are shared across geography and edge PoPs.

Real-world case: A product launch can look perfect at origin while stale HTML or assets remain at the edge because the purge or validator strategy was weak.

Connection 2: CDN Fundamentals <-> Edge Functions and Global Control Planes

The parallel: Once requests terminate at the edge, the CDN is no longer only a passive cache. It becomes a programmable policy surface for routing, security, and partial computation.

Real-world case: Modern edge platforms combine cache, WAF, request rewriting, and lightweight execution in the same PoP path.


Key Insights

  1. A CDN is a locality layer at global scale - It reduces distance and origin pressure by moving reusable responses toward users.
  2. CDN effectiveness is mostly policy, not magic - Cacheability, cache keys, validators, and origin behavior determine whether the edge can help safely.
  3. CDNs shift failure modes instead of eliminating them - They improve latency and resilience, but they also introduce global freshness, purge, and cache-key correctness problems.
