LESSON
Day 252: CDN Optimization Techniques - Performance at Scale
A CDN is optimized when it serves the right copies cheaply, keeps the origin cool, and does not destroy correctness to buy hit rate.
Today's "Aha!" Moment
The insight: Optimizing a CDN is not about turning random performance knobs. It is about improving three things together: how often the edge can reuse work, how expensive misses are when they do happen, and how much pressure still leaks back to origin.
Why this matters: Teams often say "our CDN is on, so performance should be good." That hides the real work. A CDN can sit in front of a system and still be badly optimized if the cache key space is too fragmented, the origin still sees repeated misses, compression is inconsistent, or edge layers are not shielding one another properly.
The universal pattern: reusable content + well-shaped cache keys + layered refill protection -> higher edge reuse -> lower origin load -> better tail latency for users.
Concrete anchor: Imagine a product page requested worldwide. If the cache key varies on every query string and every header, each PoP keeps missing. If the key is normalized, tiered caches collapse refill traffic, and assets are compressed correctly, the same workload suddenly becomes globally cheap.
How to recognize when this applies:
- The CDN is enabled but hit rate stays mediocre.
- Origin traffic is still high despite "cacheable" content.
- Latency looks acceptable on average, but tail latency and origin spikes remain ugly during traffic bursts or deploys.
Common misconceptions:
- [INCORRECT] "CDN optimization means just increasing TTL."
- [INCORRECT] "Higher hit ratio is always good, regardless of what the cache is serving."
- [CORRECT] The truth: Good CDN optimization improves reuse, refill efficiency, and byte delivery while preserving the correct response boundaries.
Real-world examples:
- Static assets: Versioned URLs, compression, and tiered caching create stable, cheap global delivery.
- Semi-dynamic pages: Normalized keys, validator-based revalidation, and origin shielding reduce load without pretending the content is immutable.
Why This Matters
The problem: A poorly optimized CDN still helps a little, but it leaves most of the system cost structure unchanged. Misses remain expensive, origins stay hot, and global traffic spikes still turn into backend incidents.
Before:
- Cache keys are too fragmented or inconsistent.
- Edge locations refill independently and repeatedly.
- Large objects or dynamic variants produce poor byte efficiency.
After:
- Shared edge reuse increases because the key space is better shaped.
- Refill traffic is collapsed or shielded before it hits origin.
- Users see lower latency and the backend sees less repetitive work.
Real-world impact: Strong CDN optimization improves user experience, lowers infrastructure cost, reduces deploy risk, and makes purge/revalidation events far less dangerous for the origin.
Learning Objectives
By the end of this session, you will be able to:
- Explain what a CDN should actually be optimized for - Distinguish hit ratio, byte hit ratio, origin offload, and tail latency.
- Describe the main CDN optimization levers - Reason about cache keys, tiered caching, shielding, request collapsing, compression, and variant control.
- Evaluate real trade-offs - Improve performance without creating stale leaks, cache fragmentation, or refill storms.
Core Concepts Explained
Concept 1: Optimize the Metric That Matches the Bottleneck
The first mistake in CDN tuning is chasing a single generic number like "cache hit ratio" as if it captured the whole system.
In practice, different metrics answer different questions:
- object hit ratio: how many requests are served from cache
- byte hit ratio: how much traffic volume is served from cache
- origin offload: how much repetitive work the CDN prevents from reaching origin
- tail latency: whether slow users and slow paths are improving, not just the average
These metrics matter differently depending on the workload.
For example:
- tiny cached icons may improve object hit ratio a lot while barely changing origin pain
- video segments, images, or large JS bundles may matter more to byte hit ratio and bandwidth cost
- HTML pages with expensive backend rendering may matter more to origin offload and tail latency
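The workload split above can be made concrete with a small sketch. This computes object hit ratio versus byte hit ratio from a simplified request log; the log fields and numbers are illustrative, not from any particular CDN.

```python
# Illustrative sketch: object hit ratio vs. byte hit ratio from a
# simplified request log. Field names ("hit", "bytes") are hypothetical.

def cache_metrics(log):
    """log: list of dicts with 'hit' (bool) and 'bytes' (int)."""
    total_reqs = len(log)
    hit_reqs = sum(1 for r in log if r["hit"])
    total_bytes = sum(r["bytes"] for r in log)
    hit_bytes = sum(r["bytes"] for r in log if r["hit"])
    return {
        "object_hit_ratio": hit_reqs / total_reqs,
        "byte_hit_ratio": hit_bytes / total_bytes,
    }

# Many tiny icon hits, a few large video-segment misses:
log = (
    [{"hit": True, "bytes": 2_000}] * 90          # small cached assets
    + [{"hit": False, "bytes": 50_000_000}] * 10  # large misses
)
m = cache_metrics(log)
# Object hit ratio is 0.9 and looks great; byte hit ratio is under 1%,
# so bandwidth cost and origin egress barely improved.
```

The same log can flip the other way for HTML: a few expensive rendered-page misses can dominate origin CPU even when byte hit ratio looks healthy, which is why the bottleneck question comes first.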
So CDN optimization starts with a blunt but important question:
- What are we trying to make cheaper: requests, bytes, backend work, or tail latency?
If that question is vague, optimization drifts into folklore:
- increase TTL
- cache more things
- vary on extra headers "just to be safe"
That often creates the wrong result:
- higher apparent hit rate
- worse correctness
- or unchanged backend stress
The mature mental model is:
- optimize the specific bottleneck, not the vanity metric
Concept 2: The Biggest Wins Usually Come From Key Shaping and Layered Refill Control
Once the goal is clear, the main CDN levers are usually these:
- cache key normalization
- tiered caching / shielding
- request collapsing
- compression and payload shaping
Cache key normalization matters because overly specific keys destroy reuse.
Common causes:
- query strings that do not change the content
- headers accidentally included in variation
- device or geo dimensions added too broadly
- edge functions multiplying variants unnecessarily
This is why many CDN wins are really about making more requests count as the same reusable object without crossing correctness boundaries.
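A minimal sketch of that idea: normalize the URL into a cache key by keeping only the query parameters known to change the response. The allowlist here ("page", "sort") and the tracking-parameter examples are assumptions for illustration, not a universal list.

```python
# Hypothetical sketch: key normalization via a query-parameter allowlist.
# CONTENT_PARAMS is an illustrative policy, not any CDN's default.
from urllib.parse import urlsplit, parse_qsl, urlencode

CONTENT_PARAMS = {"page", "sort"}  # params that actually change content

def cache_key(url):
    parts = urlsplit(url)
    kept = sorted(  # sort so parameter order cannot fragment the key
        (k, v) for k, v in parse_qsl(parts.query) if k in CONTENT_PARAMS
    )
    return f"{parts.path}?{urlencode(kept)}"

# All three requests collapse into one reusable cache entry:
a = cache_key("/product/42?utm_source=mail&page=1")
b = cache_key("/product/42?page=1&utm_campaign=x")
c = cache_key("/product/42?page=1")
# a == b == c == "/product/42?page=1"
```

Sorting the kept parameters matters as much as the allowlist: `?page=1&sort=price` and `?sort=price&page=1` are the same content and should be the same key.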
Tiered caching and origin shielding matter because even good cacheability is not enough if every edge PoP refills independently.
The idea is simple:
- edge PoPs do not all go straight to origin
- upper cache tiers or shield layers absorb refill traffic first
That improves:
- origin protection
- miss cost
- resilience during purge or traffic bursts
Request collapsing matters because many simultaneous misses for the same object should ideally become one origin fetch, not hundreds.
Compression and payload shaping matter because a CDN that serves fewer bytes is still doing real optimization even when object count is unchanged.
Examples:
- Brotli or gzip for text assets
- modern image formats where appropriate
- avoiding unnecessary variation that prevents large shared objects from being reused
The pattern across all of these is consistent:
- better optimization usually means fewer distinct objects and cheaper refill paths
Concept 3: Every CDN Optimization Trades Against Something
There is no free CDN optimization. Each gain moves some risk or cost elsewhere.
Examples:
- longer freshness windows improve reuse but increase stale risk
- broader cache keys improve hit rate but may cross correctness boundaries
- more variants improve tailoring but reduce sharing
- extra compression or edge transforms save bytes but spend edge CPU
- more shielding helps origin but can complicate debugging and metric interpretation
This is why CDN tuning has to stay connected to the previous lessons.
From 16/10.md:
- edge functions can improve policy placement
- but they can also explode the cache key space
From 16/11.md:
- purge strategy can restore freshness
- but optimization determines whether that purge triggers a manageable refill or a global origin surge
And this lesson also prepares the next block:
- once CDN optimization is mostly under control, the focus shifts from macro-system levers to profiling and finding where time is really spent inside the stack
The final takeaway is:
- a good CDN is not only fast on hits
- it is disciplined on misses
That is the difference between "cache in front" and genuine CDN engineering.
Troubleshooting
Issue: "We enabled caching, but hit rate is still disappointing."
Why it happens / is confusing: Teams assume the content simply is not cacheable.
Clarification / Fix: Check whether the cache key is fragmented by unnecessary query strings, headers, geo/device variation, or edge-generated variants. Many "low cacheability" problems are really key-shaping problems.
Issue: "Latency improved a bit, but origin is still overloaded."
Why it happens / is confusing: A modest edge hit rate can still make the user experience look somewhat better.
Clarification / Fix: Inspect origin offload, not just edge hits. Add shielding or tiered caching, collapse concurrent misses, and look for high-value objects that still refill too often.
Issue: "After purge, performance collapsed even though the CDN is optimized."
Why it happens / is confusing: Optimization is often measured only in steady state.
Clarification / Fix: Re-evaluate refill behavior under transition: shielding, stale handling, conditional revalidation, and purge scope are part of CDN optimization, not separate concerns.
Advanced Connections
Connection 1: CDN Optimization Techniques <-> Cache Purging Strategies
The parallel: Purging changes the system from steady state to refill state. A well-optimized CDN makes that transition survivable through better key discipline, request collapsing, and shielding.
Real-world case: The same purge on two CDNs can look radically different depending on whether refill traffic is coalesced and shielded before it reaches origin.
Connection 2: CDN Optimization Techniques <-> Performance Profiling
The parallel: CDN tuning works at the global request path level, while profiling works inside the service path. Together they answer both "why did this request reach origin?" and "where did origin spend the time once it did?"
Real-world case: A team may improve cache hit rate and still need profiling because the remaining misses are now concentrated in a small number of genuinely expensive code paths.
Resources
Optional Deepening Resources
- [DOCS] Amazon CloudFront Developer Guide: Origin Shield
- Link: https://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/origin-shield.html
- Focus: Use it to understand why an extra cache layer can improve origin offload and reduce repeated refill traffic.
- [DOCS] Cloudflare Docs: Cache keys
- Link: https://developers.cloudflare.com/cache/how-to/cache-keys/
- Focus: Read it to connect cache-key design directly to hit rate, correctness boundaries, and variant explosion.
- [DOCS] Fastly Documentation: Caching content with Fastly
- Link: https://www.fastly.com/documentation/guides/concepts/cache
- Focus: Treat it as a practical reference for read-through edge caching, reuse behavior, and how cache layers reduce backend work.
- [DOCS] MDN Web Docs: HTTP Caching
- Link: https://developer.mozilla.org/en-US/docs/Web/HTTP/Caching
- Focus: Revisit freshness, validators, and revalidation so optimization stays grounded in the semantics the CDN is executing.
Key Insights
- A CDN should be optimized for the real bottleneck - Hit ratio, byte hit ratio, origin offload, and tail latency are related but not interchangeable.
- The largest gains usually come from reuse and refill shape - Better cache keys, shielding, and collapsed misses often matter more than random tuning knobs.
- Good CDN optimization is disciplined under change - It preserves correctness and keeps misses cheap, especially during purges, bursts, and global traffic shifts.