LESSON
Day 248: Cache Invalidation Patterns - Write Strategies & Consistency
A cache is easy to fill. The hard part is deciding when the copy is still trustworthy.
Today's "Aha!" Moment
The insight: Invalidation is not a cleanup task after caching. It is the central consistency policy of the cache. The moment the source of truth changes, the system must decide how quickly cached copies stop being acceptable and who is responsible for making that happen.
Why this matters: Teams often add a cache and only later realize the real design problem is not storage but synchronization with truth. That is why cache invalidation feels notoriously hard: it is where performance optimization meets distributed consistency.
The universal pattern: source changes -> copies become potentially stale -> system chooses how staleness is detected, bounded, and repaired.
Concrete anchor: A product price changes in the database. The API cache still holds the old price, the CDN may still serve an older response, and the client may already have read the stale value. The key question is no longer "Do we have a cache?" but "What staleness window are we willing to tolerate, and how do we control it?"
How to recognize when this applies:
- Cached data is derived from a mutable source of truth.
- Reads are cheap only if a copy is reused.
- Writes are rare enough to tempt caching but important enough that stale reads matter.
Common misconceptions:
- [INCORRECT] "Invalidation just means deleting a key after a write."
- [INCORRECT] "TTL alone is enough for any cache consistency problem."
- [CORRECT] The truth: Invalidation is a policy about authority, propagation speed, and acceptable staleness, and different write strategies make different guarantees.
Real-world examples:
- Application caches: Product metadata or user profiles often tolerate small staleness windows, but not arbitrary ones.
- CDN layers: Purge, revalidation, and cache-control policies all exist because copies keep living after origin changes.
Why This Matters
The problem: The whole value of a cache comes from serving a copy instead of consulting the source every time. But the moment the source changes, that optimization becomes a correctness risk unless the system has an explicit invalidation strategy.
Before:
- Caches are treated as neutral performance layers.
- Teams assume stale data will be "rare enough."
- Write paths and read paths are designed independently.
After:
- Freshness is treated as an explicit part of the cache contract.
- Write strategies and read semantics are chosen together.
- Teams can explain not only how the cache speeds things up, but how it returns to truth after change.
Real-world impact: Good invalidation design prevents stale reads from becoming silent business bugs, reduces refill storms, and keeps the source of truth protected even while the cache remains useful.
Learning Objectives
By the end of this session, you will be able to:
- Explain why invalidation is the heart of cache consistency - Connect mutable truth to stale-copy risk.
- Compare write and invalidation strategies - Understand cache-aside, write-through, write-behind, and explicit purge/revalidation patterns.
- Evaluate practical trade-offs - Decide what freshness guarantees are worth paying for in a real system.
Core Concepts Explained
Concept 1: Every Cache Must Choose Who Owns Freshness
Once a cached value can become stale, the system needs an answer to one basic question:
Who is responsible for turning stale copies back into acceptable copies?
That answer can take different forms:
- the writer updates or deletes the cache
- readers detect staleness and refill
- time limits (TTL) eventually retire old copies
- the source emits invalidation events
- intermediaries revalidate against the source
These are not implementation details. They are different consistency contracts.
The first thing to make explicit is the authority model:
- the database, origin, or primary store is authoritative
- the cache is a non-authoritative copy
From there, invalidation becomes a control problem:
- how long may the copy diverge?
- who notices divergence?
- how expensive is repair?
That is why invalidation sits at the center of cache design rather than at the edge. You cannot reason about a cache until you know what counts as "fresh enough."
The fundamental trade-off is:
- stronger freshness usually means more write-path complexity or more read-path coordination
- weaker freshness buys speed and decoupling but tolerates stale answers longer
Concept 2: Write Strategies Encode Different Freshness and Failure Contracts
Several classic patterns keep appearing because they package different answers to the same problem.
cache-aside
This is the common pattern:
- reads check the cache first
- on miss, read the source and populate the cache
- on write, update the source and then invalidate or refresh the cache entry
Why people like it:
- simple mental model
- cache only fills for actually requested data
- the source remains clearly authoritative
Its main risk:
- after a write, stale data may still be served until invalidation or refill completes
- many readers may stampede the source when the item expires or is purged
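The read and write paths above can be sketched in a few lines. This is a minimal illustration, not a specific library's API: `db` stands in for the source of truth, and the dict-based cache and key names are assumptions for the example.

```python
db = {"price:42": 100}   # authoritative source of truth
cache = {}               # non-authoritative copy

def read(key):
    # Read path: check the cache first, fall back to the source on a miss.
    if key in cache:
        return cache[key]
    value = db[key]      # consult the source of truth
    cache[key] = value   # populate the cache for later readers
    return value

def write(key, value):
    # Write path: update the source first, then invalidate the stale copy.
    db[key] = value
    cache.pop(key, None)  # delete so the next read refills from truth
```

Note the stale window this leaves open: between `db[key] = value` and the `pop`, a concurrent reader can still serve the old copy, which is exactly the risk described above.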
write-through
Here the write updates the cache as part of the write path, and the cache may propagate the write to the source or stay synchronized with it.
Why people like it:
- cached values become fresh immediately on successful writes
- reads are simpler afterward
Its main cost:
- writes now pay cache coordination cost
- if cache and source update are not handled carefully, failures become more subtle
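A write-through sketch makes the contrast visible: the write path touches both stores, so a successful write leaves the cached copy fresh immediately. The ordering shown (source first, then cache) is one common choice, not the only one.

```python
db = {}      # authoritative source of truth
cache = {}   # kept in step by the write path itself

def write_through(key, value):
    # The write updates the source and the cache as one logical operation.
    # If either step can fail independently, the failure handling here is
    # where the subtle divergence bugs live.
    db[key] = value
    cache[key] = value

def read(key):
    # Reads can trust the cache for any key that was written through it.
    return cache[key] if key in cache else db[key]
```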
write-behind / write-back
Here the cache accepts the write first and flushes to the source later.
Why people like it:
- very fast write path
- useful when batching or absorbing bursts matters
Its risk is obvious and serious:
- durability and ordering get harder
- source truth can lag cache state
- failure between cache acceptance and durable write can lose or reorder updates
This is why write-behind is a throughput optimization with a much sharper correctness envelope.
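The lag between cache acceptance and durable write is easy to see in a toy sketch. The `pending` queue and function names are illustrative assumptions; a real implementation would flush asynchronously with retry and ordering guarantees.

```python
from collections import deque

db = {}            # authoritative store, updated late
cache = {}         # accepts writes first
pending = deque()  # writes accepted by the cache but not yet durable

def write_behind(key, value):
    # Fast write path: the cache accepts the write immediately.
    cache[key] = value
    pending.append((key, value))

def flush():
    # Later, a batched flush makes the source catch up with cache state.
    # A crash before flush() runs loses every entry still in `pending`:
    # this is the correctness envelope the lesson warns about.
    while pending:
        key, value = pending.popleft()
        db[key] = value
```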
The important systems lesson is that these strategies are not interchangeable knobs. They choose:
- when truth is updated
- when the cache becomes fresh
- where failure can create divergence
Concept 3: TTL, Purge, and Revalidation Control Staleness in Different Ways
After write strategy, the next question is how stale copies stop being served.
There are three broad control styles:
1. Time-based invalidation (TTL)
The cache entry expires after some duration.
This is cheap and simple, but it only guarantees bounded staleness in a coarse way:
- shorter TTL -> fresher data, more misses
- longer TTL -> fewer misses, more stale risk
TTL is attractive when:
- some staleness is acceptable
- explicit invalidation is too expensive
- the source changes less often than it is read
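A TTL entry is just a value paired with a deadline. The sketch below stores an expiry timestamp per key; the 60-second TTL and the `now` parameter (injected to make expiry testable) are assumptions for illustration.

```python
import time

cache = {}          # key -> (value, expires_at)
TTL_SECONDS = 60.0  # the chosen staleness bound

def put(key, value, now=None):
    now = time.monotonic() if now is None else now
    cache[key] = (value, now + TTL_SECONDS)

def get(key, now=None):
    # A hit is only a hit while the deadline has not passed; after that
    # the copy is retired regardless of whether the source actually changed.
    now = time.monotonic() if now is None else now
    entry = cache.get(key)
    if entry is None:
        return None
    value, expires_at = entry
    if now >= expires_at:
        del cache[key]  # time-based invalidation
        return None
    return value
```

Shortening `TTL_SECONDS` trades misses for freshness, which is exactly the dial described above.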
2. Explicit purge / delete
Writers or control planes delete the stale key when the source changes.
This is stronger than TTL because the system does not wait for time to pass. But it creates its own challenges:
- purge propagation may lag
- many consumers may refill at once
- key derivation must be correct, or the wrong copies survive
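The key-derivation risk is concrete enough to sketch. Here one product row backs several cached representations, and the purge must find all of them; the key scheme is a made-up example.

```python
cache = {
    "product:42:json": {"price": 100},
    "product:42:html": "<span>100</span>",
}

def purge_product(product_id):
    # Explicit purge: the writer deletes every cached representation
    # derived from the changed source row. If the key derivation misses
    # a variant, that stale copy silently survives.
    prefix = f"product:{product_id}:"
    for key in [k for k in cache if k.startswith(prefix)]:
        del cache[key]
```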
3. Revalidation
Instead of blindly serving or deleting a copy, the cache checks whether the source has changed, often through validators such as ETags, versions, timestamps, or object generation numbers.
This is especially important for HTTP and CDN layers because it lets the system keep using a copy while asking: "Is my copy still valid?"
Revalidation is often a very practical middle ground:
- cheaper than always fetching the full object
- stronger than "just trust the TTL"
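An ETag-style validator check can be sketched as follows. The origin stores a version tag next to each body, and the cache asks a cheap "has this changed?" question instead of refetching; the structures and function names are assumptions, loosely modeled on HTTP conditional requests.

```python
origin = {"page": ("v1", "hello")}  # key -> (etag, body) at the source
cache = {}                          # key -> (etag, body) copies

def origin_check(key, etag):
    # Cheap validation call: returns None ("not modified") if the
    # validator still matches, otherwise the fresh (etag, body) pair.
    current_etag, body = origin[key]
    if etag == current_etag:
        return None
    return (current_etag, body)

def get(key):
    if key in cache:
        etag, body = cache[key]
        fresh = origin_check(key, etag)
        if fresh is None:
            return body       # reuse the copy; only the validator traveled
        cache[key] = fresh    # copy diverged: replace it
        return fresh[1]
    cache[key] = origin[key]  # first read: fetch and remember the validator
    return cache[key][1]
```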
Across all three, the real design question is:
- what stale window is acceptable?
- what refill pressure can the source survive?
- what coordination cost is worth paying on writes?
That is why invalidation strategy is really a consistency budget decision.
Troubleshooting
Issue: "We invalidate on write, but users still occasionally see stale data."
Why it happens / is confusing: Teams assume that once the write completes and the invalidation fires, every reader observes the new state instantly.
Clarification / Fix: Check propagation delay, multi-layer caches, and race windows between source update, cache delete, and subsequent reads. Invalidation reduces staleness; it does not magically erase all timing gaps.
Issue: "A TTL should solve the stale data problem."
Why it happens / is confusing: TTL feels like a universal freshness control.
Clarification / Fix: TTL only bounds how long stale data may survive. If the acceptable stale window is smaller than the TTL, or if the miss storm on expiry is too costly, TTL alone is not enough.
Issue: "Write-behind is better because writes become faster."
Why it happens / is confusing: Write latency is visible and attractive to optimize.
Clarification / Fix: Write-behind is a strong trade-off, not a free win. It shifts correctness risk into flush ordering, failure handling, and durability lag.
Advanced Connections
Connection 1: Cache Invalidation <-> CDN Purge and Revalidation
The parallel: Higher-level caches solve the same problem with different tools. A CDN purge, Cache-Control, ETag, and origin revalidation are all variations of the same freshness-control story.
Real-world case: A system may have correct invalidation in Redis but still serve stale data globally because the CDN layer follows a different freshness contract.
Connection 2: Cache Invalidation <-> Consistent Hashing and Fleet Churn
The parallel: When ownership moves between nodes, refilling behaves like a synthetic invalidation event. Good placement stability reduces how often invalidation turns into a fleet-wide warmup problem.
Real-world case: Scaling a cache fleet, expiring hot keys, and purging stale copies can all stress the same source-of-truth path if refill is not controlled.
Resources
Optional Deepening Resources
- [DOCS] MDN Web Docs: HTTP Caching
- Link: https://developer.mozilla.org/en-US/docs/Web/HTTP/Caching
- Focus: Use it as the clearest practical reference for TTLs, validators, and revalidation semantics in a real cache hierarchy.
- [DOCS] Redis key eviction reference
- Link: https://redis.io/docs/latest/develop/reference/eviction/
- Focus: Read it alongside this lesson to separate memory-pressure eviction from freshness invalidation; they interact but are not the same decision.
- [DOCS] Amazon CloudFront: Invalidate files to remove content
- Link: https://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/Invalidation.html
- Focus: Treat it as a concrete example of purge-based invalidation at CDN scale.
- [DOCS] Cloudflare Learning Center: What is cache invalidation?
- Link: https://www.cloudflare.com/learning/cdn/glossary/what-is-cache-invalidation/
- Focus: Use it to connect product-facing CDN invalidation language with the more general systems patterns from this lesson.
Key Insights
- Invalidation is the consistency policy of the cache - The real question is not just whether a copy exists, but how and when it stops being acceptable after the source changes.
- Write strategies choose different failure boundaries - Cache-aside, write-through, and write-behind all make different promises about freshness, speed, and what happens when writes fail.
- TTL, purge, and revalidation are different control tools - They all manage staleness, but they do so with different trade-offs in coordination cost, source load, and freshness strength.