Day 028: Edge Computing, CDNs, and Geographic Locality

Edge systems feel fast because they shorten the trip, not because they make the origin magically smarter.


Today's "Aha!" Moment

When a user in Madrid opens a product page whose origin lives in us-east-1, the request is already carrying a large latency bill before your application code does anything impressive or embarrassing. TLS setup, network round trips, and physical distance can cost more than the HTML generation itself. That is the first mental shift: many "backend latency" problems are really "distance and reuse" problems.

This is why CDNs are so powerful. They do not primarily win by inventing faster CPUs. They win by noticing that many responses are reusable and many users ask for the same things: images, JavaScript bundles, product descriptions, public API responses, even short-lived rendered pages. If a nearby edge location can answer those requests, the user avoids a long trip to the origin and the origin avoids doing repetitive work.

Edge compute extends that idea one step further. Some requests are not fully cacheable, but they still need only a small amount of logic near the user: choose locale, redirect to the right storefront, attach headers, reject obvious bots, or transform an image. The main backend still owns durable business state, but the edge can do the cheap, local, high-frequency work that would otherwise waste a long round trip.

The important pattern is simple: put reusable data and lightweight decisions close to demand; keep authoritative state and deep workflows where consistency and control are strongest. Once you see that boundary clearly, edge architecture stops looking like marketing and starts looking like basic systems design.


Why This Matters

The problem: A global product can spend months tuning origin services while users still feel the system is slow, because the dominant cost is geographic distance plus repeated requests for mostly reusable content.

Before: Every request, from every region, travels to a single origin region. Users far from it pay the full round-trip latency on each request, and the origin repeats the same work for the same content.

After: Reusable responses are served from edge locations near each user, and small per-request decisions run at the edge. Only authoritative, correctness-critical work travels to the origin.

Real-world impact: Better p95 latency for global users, lower origin load, reduced bandwidth cost, and fewer self-inflicted performance problems caused by sending every request through the longest possible path.


Learning Objectives

By the end of this session, you will be able to:

  1. Explain why geography changes performance - Describe when latency is dominated by distance rather than compute.
  2. Separate caching from edge execution - Identify when a response should be cached and when a small function should run near the user instead.
  3. Design origin-edge boundaries - Reason about freshness, invalidation, personalization, and which state must remain authoritative.

Core Concepts Explained

Concept 1: A CDN Turns Repeated Long Trips into Nearby Cache Hits

Consider our storefront again. Users in Tokyo, Madrid, and São Paulo are all requesting the same product images, JavaScript bundles, and most of the same product description page. If every request goes back to one origin region, the origin does repetitive work and each user pays the network distance every time.

The intuition behind a CDN is almost embarrassingly practical: if many users ask for the same bytes, keep copies near them. The edge location becomes the first place a request checks. On a cache hit, the user gets a nearby response. On a miss, the edge fetches from origin, stores the result according to cache policy, and future users benefit.

User ---> Nearby Edge POP --(miss)--> Origin
               |     ^                  |
             (hit)   +---(fill cache)---+
               v
            Response
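The hit/miss/fill flow above can be sketched as a tiny cache-aside loop. This is an illustrative model, not a real CDN implementation: the `Map` stands in for one POP's cache, and `fetchFromOrigin` is a hypothetical stand-in for the long-haul request to the origin region.

```javascript
// Toy model of an edge POP's cache-aside behavior.
const popCache = new Map();

async function handleAtEdge(url, fetchFromOrigin) {
  const entry = popCache.get(url);
  if (entry && entry.expires > Date.now()) {
    // Hit: the user gets a nearby response.
    return { body: entry.body, source: "edge-hit" };
  }
  // Miss: pay the long trip once, then fill for future users.
  const body = await fetchFromOrigin(url);
  popCache.set(url, { body, expires: Date.now() + 60_000 }); // 60s TTL
  return { body, source: "origin-miss" };
}
```

A second request for the same URL within the TTL is served locally, which is the whole point: the long trip is paid once per edge location, not once per user.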

This only works well when you are explicit about reuse. Cache keys, Cache-Control, TTLs, and purge behavior determine whether the edge is accelerating the right thing or serving the wrong thing very quickly. A good CDN design is not "cache everything"; it is "cache what is safe to reuse, for exactly as long as that is useful."

The trade-off is straightforward. You gain lower latency and lower origin load, but you accept more operational thinking about freshness and invalidation. That is a good trade when the content is widely reused and the acceptable staleness is understood.
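"Cache what is safe to reuse, for exactly as long as that is useful" can be made concrete as a per-content-class `Cache-Control` policy. The path prefixes and TTL values below are illustrative assumptions, not a recommendation for any particular product:

```javascript
// Sketch: choosing Cache-Control by content class instead of "cache everything".
function cachePolicy(pathname) {
  if (/\.(js|css|png|jpg|woff2)$/.test(pathname)) {
    // Fingerprinted static assets: safe to cache for a very long time.
    return "public, max-age=31536000, immutable";
  }
  if (pathname.startsWith("/products/")) {
    // Catalog pages: short TTL, serve stale while refilling in the background.
    return "public, max-age=60, stale-while-revalidate=300";
  }
  if (pathname.startsWith("/cart") || pathname.startsWith("/checkout")) {
    // Correctness-critical state: never reuse across users.
    return "private, no-store";
  }
  // Default: revalidate with the origin before reuse.
  return "public, max-age=0, must-revalidate";
}
```

Note that the default is conservative: anything not explicitly classified falls back to revalidation rather than silent reuse.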

Concept 2: Edge Compute Is for Cheap, Frequent, Per-Request Decisions

Now imagine the request is not fully cacheable. A user arrives from Spain and should land on the EU storefront, get prices in euros, and maybe receive an image variant optimized for their device. None of that requires a deep transaction or a long-lived workflow, but it is still valuable to decide it before the request travels all the way back to the core backend.

This is where edge compute helps. A small function runs close to the user, reads a few request attributes, and either returns a lightweight response or forwards the request after shaping it. The edge is useful precisely because the logic is small, frequent, and latency-sensitive.

export default {
  async fetch(request) {
    // CF-IPCountry is set by Cloudflare when IP geolocation is enabled;
    // it may be absent, in which case we default to the global storefront.
    const country = request.headers.get("cf-ipcountry");
    const region = ["ES", "FR", "DE"].includes(country)
      ? "eu-storefront"
      : "global-storefront";

    const url = new URL(request.url);
    // Avoid a redirect loop: pass through paths that are already routed.
    if (/^\/(eu|global)-storefront\//.test(url.pathname)) {
      return fetch(request);
    }
    url.pathname = `/${region}${url.pathname}`;
    return Response.redirect(url.toString(), 307);
  }
};

The point is not the redirect. The point is placement. Locale routing, header normalization, bot filtering, image resizing, A/B assignment, and token prechecks are often better at the edge because they are cheap and they happen on nearly every request. Payment authorization, cart ownership, and inventory reservation are not edge jobs because they depend on authoritative state and stricter correctness guarantees.
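A/B assignment is a good second example of the same placement rule. The sketch below picks a variant from a stable hash of a visitor id, so every POP assigns the same user to the same bucket without any origin call; the visitor-id cookie and experiment names are assumptions for illustration.

```javascript
// Sketch: deterministic A/B bucket assignment at the edge.
function abBucket(visitorId, experiment, variants) {
  const key = `${experiment}:${visitorId}`;
  let h = 0;
  for (let i = 0; i < key.length; i++) {
    // Simple 32-bit rolling hash; any stable hash works here.
    h = (h * 31 + key.charCodeAt(i)) >>> 0;
  }
  // Same (experiment, visitor) pair -> same variant, at every POP.
  return variants[h % variants.length];
}
```

Because the assignment is a pure function of the request, it needs no shared state and no coordination between edge locations.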

The trade-off is that distributed code placement raises coordination costs. You gain lower latency and origin protection, but you now need discipline about what data is available at the edge, how configuration rolls out globally, and how to debug behavior spread across many points of presence.

Concept 3: The Hard Part Is Deciding What Can Be Local, Stale, or Recomputed

The most important design question is not "can this run at the edge?" It is "what truth is this request allowed to depend on?" Product images can usually be cached aggressively. Product descriptions may tolerate short staleness. Inventory, carts, and checkout usually cannot be guessed or cached casually because they are correctness-critical and change under contention.

One useful way to classify requests is by freshness sensitivity and ownership:

Safe at edge:
- static assets
- public docs
- image transforms
- short-lived catalog pages

Usually origin-owned:
- carts
- inventory
- payment state
- order creation
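One way to keep this classification honest is to encode it as data rather than scattering it through handlers. The path prefixes and staleness budgets below are illustrative assumptions:

```javascript
// Sketch: the edge/origin boundary expressed as a routing table.
const routes = [
  { prefix: "/static/",   placement: "edge",   maxStalenessSec: 86400 },
  { prefix: "/docs/",     placement: "edge",   maxStalenessSec: 3600 },
  { prefix: "/products/", placement: "edge",   maxStalenessSec: 60 },
  { prefix: "/cart",      placement: "origin", maxStalenessSec: 0 },
  { prefix: "/checkout",  placement: "origin", maxStalenessSec: 0 },
];

function placementFor(pathname) {
  const route = routes.find((r) => pathname.startsWith(r.prefix));
  // Default to origin: unknown paths should never be casually cached.
  return route ?? { prefix: "", placement: "origin", maxStalenessSec: 0 };
}
```

The useful property is the default: a path nobody classified is treated as origin-owned, so a forgotten route costs latency instead of correctness.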

Once you frame the problem this way, edge architecture becomes a boundary exercise. Which data has one clear owner? Which responses are projections that can be cached or regenerated? Which personalizations are cheap enough to do near the user without dragging durable state out to the edge? Many bad designs come from mixing these categories and pretending everything can be both instant and authoritative.

The trade-off is between locality and control. More edge logic can make the system feel faster, but it can also multiply invalidation paths, cache key mistakes, stale personalization, and operational ambiguity. The right design pushes outward only the parts that benefit from proximity and can tolerate distributed execution.


Troubleshooting

Issue: "Dynamic" gets interpreted as "never cache anything."

Why it happens / is confusing: Teams think in binary terms: static pages are cacheable, dynamic pages are not. Real systems are messier. A "dynamic" page often contains a large reusable shell plus a small personalized fragment.

Clarification / Fix: Split the response mentally into reusable and authoritative pieces. Cache the stable parts, compute the lightweight request shaping near the edge, and keep correctness-critical state at the origin.
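The shell/fragment split can be sketched in a few lines. Here the shell is assumed to be the cached, user-independent HTML, and the placeholder token is a made-up convention for where the personalized fragment goes:

```javascript
// Sketch: a "dynamic" page as a cacheable shell plus a tiny per-request fragment.
function renderPage(cachedShell, user) {
  // The shell is the same bytes for every user, so it can live at the edge;
  // only this small greeting is computed per request.
  const fragment = user ? `Hello, ${user.name}` : "Hello, guest";
  return cachedShell.replace("<!--user-fragment-->", fragment);
}
```

The shell stays cacheable under a public policy while the fragment never enters the shared cache, which is exactly the split the fix above describes.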

Issue: Edge compute gets treated as a reason to move business logic outward.

Why it happens / is confusing: Vendor messaging highlights how much can run at the edge, so it is easy to confuse "possible" with "architecturally wise."

Clarification / Fix: Put logic at the edge only if it is cheap, frequent, latency-sensitive, and not the source of truth. If the logic needs strong consistency, durable workflows, or rich internal state, it probably belongs deeper in the system.


Advanced Connections

Connection 1: CPU Cache Locality ↔ Internet Edge Locality

The parallel: CPU caches reduce the cost of repeatedly fetching data from far-away memory. CDNs reduce the cost of repeatedly fetching content from far-away origins. Different scale, same core idea: locality is performance.

Real-world case: A hot product image served from an edge POP plays the same architectural role as a hot cache line served from L1 instead of main memory.

Connection 2: CQRS Read Models ↔ Edge Projections

The parallel: A read model exists because the authoritative write path is not always the best place to serve every query. Edge caches and edge-rendered fragments are similar: they are optimized read-side artifacts derived from a deeper source of truth.

Real-world case: A catalog page cached globally for 60 seconds is often a projection of product data owned by core services, just as a CQRS read model is a projection of authoritative events or writes.



Key Insights

  1. Edge systems win by changing placement - They reduce latency mainly by moving reusable content and lightweight logic closer to users.
  2. Caching and edge compute are related but different tools - One reuses responses; the other executes small request-time decisions near demand.
  3. The real design work is in boundaries - The hard question is what can safely be local or slightly stale, and what must remain authoritative at the origin.

Knowledge Check (Test Questions)

  1. Why can a CDN improve global performance even when the origin servers are already fast?

    • A) Because the CDN makes the origin CPU run faster.
    • B) Because many requests are dominated by geographic distance and can be served from a nearby cache.
    • C) Because the CDN removes the need for any origin region.
  2. Which task is the best fit for edge compute?

    • A) Reserving inventory during checkout.
    • B) Running a multi-step payment workflow.
    • C) Choosing locale, rewriting headers, or redirecting users to the right storefront.
  3. Why is invalidation such a central edge-design problem?

    • A) Because a fast response is only useful if it is still correct enough for the product's freshness requirements.
    • B) Because invalidation means the origin is no longer needed.
    • C) Because every cache miss is a sign the edge architecture failed.

Answers

1. B: Many requests are limited by network distance, not by origin CPU time. A nearby cache removes repeated long-haul trips.

2. C: Edge compute is strongest when the logic is lightweight, frequent, and latency-sensitive. Authoritative transactional work still belongs in the core backend.

3. A: Edge speed is only valuable if the cached result stays within acceptable freshness and correctness bounds.


