Day 065: Cache Fundamentals and Core Patterns
A cache is useful when the system can safely reuse a recent answer instead of asking the slower, authoritative source the same question every time.
Today's "Aha!" Moment
Caching is often described as "store data in memory so reads are faster." That is technically true and pedagogically weak. The deeper idea is that a cache lets the system reuse a recent answer without re-paying the full cost of asking the source of truth again. The performance gain is a consequence of reuse, not the primary concept.
Take one concrete example: the learning platform's course details page. Thousands of users may read the same course title, description, instructor, and rating summary during the day. Those fields change occasionally, not on every request. If the backend rebuilds that response from the database each time, it is doing repeated work against an authoritative source that already answered the same question moments ago.
That is the aha. A cache is a copy with a policy. The copy is not authoritative. The source of truth still owns correctness. The policy decides when the copy is filled, how long it is trusted, and what happens when the source changes. Once you see those three parts together, cache patterns make much more sense: they are just different answers to "who updates the copy, and when?"
This is why caches are powerful and dangerous at the same time. They cut latency and protect dependencies, but they also create the possibility that the answer you serve is slightly behind reality. The engineering question is never "should we cache everything?" It is "which answers are worth reusing, and what freshness promise are we making when we do?"
Why This Matters
The problem: Without caching, a backend keeps paying full price for repeated reads even when the underlying answer changes much more slowly than it is requested.
Before:
- The database or dependency is asked the same question over and over.
- Scaling pressure appears first as repeated read load.
- Caches get added reactively, without a clear model of source of truth and freshness.
After:
- Repeated work is treated as a design opportunity.
- The team chooses a pattern based on how data is read and written.
- Freshness, cost, and load are designed together instead of separately.
Real-world impact: Lower latency, reduced database pressure, better tolerance to traffic spikes, and a clearer foundation for later topics like Redis, HTTP caching, and invalidation.
Learning Objectives
By the end of this session, you will be able to:
- Explain what a cache really is - Distinguish the cache copy from the source of truth and the policy that governs it.
- Compare core cache patterns - Understand cache-aside, write-through, and write-behind as different ownership rules for the copy.
- Reason about freshness as part of the design - Understand why every cache pattern is also a staleness policy.
Core Concepts Explained
Concept 1: A Cache Is a Non-Authoritative Copy Used to Reuse Recent Answers
Start with the most important mental model: the cache is not the truth. It is a copy of data or of a computed answer that the system is willing to trust for a while because reusing it is cheaper than recomputing it.
For course details, that copy may hold:
- a full serialized response
- a single course object
- a precomputed rating summary
- a feature flag lookup
The source of truth still matters because it decides what is actually correct. The cache only decides whether it can reuse a recent answer cheaply enough and safely enough.
```
request
  -> ask cache copy
  -> if trusted, return it
  -> otherwise ask source of truth
```
This simple distinction prevents a lot of confusion later. If you forget that the cache is non-authoritative, you start designing as if cache state is reality. That is where stale reads, invalidation bugs, and accidental consistency promises become surprising instead of expected.
The trade-off is immediate: fewer expensive reads in exchange for the possibility that the reused answer is slightly behind the source. Whether that is acceptable depends on the data, not on the cache technology.
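The copy-with-a-policy idea can be written down directly. This is a minimal sketch, not a production cache; the names `CacheEntry` and `trusted` are invented for this illustration.

```python
import time
from dataclasses import dataclass
from typing import Any, Optional

@dataclass
class CacheEntry:
    """A non-authoritative copy plus the policy that governs it."""
    value: Any           # the copied answer (never the source of truth)
    fetched_at: float    # when the copy was filled
    ttl_seconds: float   # how long the copy is trusted

    def trusted(self, now: Optional[float] = None) -> bool:
        # The policy: trust the copy only inside its freshness window.
        now = time.time() if now is None else now
        return now - self.fetched_at < self.ttl_seconds
```

Notice that correctness lives entirely outside this class: the entry can only say "I am still inside my trust window," never "I am right."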
Concept 2: Cache Patterns Are Really Rules About Who Maintains the Copy
Once you accept that the cache is a copy, the patterns become much easier to understand. They are not magic categories. They are different answers to one operational question: who updates the copy, and at what point in the read or write flow?
```
cache-aside:   read path fills the cache on demand
write-through: write path updates source and cache together
write-behind:  write path updates cache first, source later
```
Cache-aside is the most common starting point because it is explicit and simple. The application checks the cache first, then falls back to the source on a miss.
```python
def get_course_details(cache, db, course_id):
    key = f"course:{course_id}:details"
    cached = cache.get(key)
    if cached is not None:
        return cached                            # hit: reuse the recent answer
    course = db.fetch_course_details(course_id)  # miss: ask the source of truth
    cache.set(key, course, ttl_seconds=300)      # fill the copy on demand
    return course
```
Write-through shifts the responsibility to the write path. When the authoritative value changes, the cache is updated immediately as part of the write flow. That can simplify read freshness for some cases, but it also couples writes more tightly to cache behavior.
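A write-through update can be sketched against the same hypothetical `cache` and `db` interfaces used in the cache-aside example. The key point is structural: the copy is refreshed inside the write flow, not lazily on the next read.

```python
def update_course_details(cache, db, course_id, new_details):
    """Write-through sketch: the write path owns the copy.

    `db.save_course_details` and `cache.set` are illustrative interfaces,
    not a specific library's API.
    """
    key = f"course:{course_id}:details"
    db.save_course_details(course_id, new_details)  # source of truth first
    cache.set(key, new_details, ttl_seconds=300)    # copy updated in the same flow
    return new_details
```

The coupling is visible in the code: if the cache update fails, the write path now has to decide what that means, which is exactly the tighter coupling the text warns about.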
Write-behind goes further: the system updates the cache and delays writing to the source of truth. That can reduce write latency, but it changes durability and failure semantics dramatically. It should not be introduced just because it sounds more advanced.
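The changed durability semantics of write-behind show up clearly in even a toy sketch. In the illustration below (all names invented), `set` returns as soon as the copy is updated, and a background thread persists to a hypothetical `db.save` later; a crash before the flush silently loses writes, which is precisely the risk the pattern accepts.

```python
import queue
import threading
import time

class WriteBehindCache:
    """Write-behind sketch: update the copy now, persist later.

    A real implementation must also handle flush failures, write ordering,
    and data loss on crash; this sketch deliberately does not.
    """
    def __init__(self, db, flush_interval=1.0):
        self.db = db
        self.store = {}               # the cached copies
        self.pending = queue.Queue()  # writes not yet in the source of truth
        worker = threading.Thread(
            target=self._flush_loop, args=(flush_interval,), daemon=True
        )
        worker.start()

    def set(self, key, value):
        self.store[key] = value       # fast path: cache updated immediately
        self.pending.put((key, value))  # durable write deferred

    def _flush_loop(self, interval):
        while True:
            time.sleep(interval)
            while not self.pending.empty():
                key, value = self.pending.get()
                self.db.save(key, value)  # source of truth catches up later
```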
The trade-off across all three is control versus freshness versus failure complexity. The pattern is not chosen by taste. It is chosen by how much freshness you need, how your write path behaves, and what failure behavior the system can tolerate.
Concept 3: Every Cache Pattern Is Also a Freshness and Failure Policy
Students often think the hard part of caching is where to store the data. Usually the harder part is deciding what "fresh enough" means and what the system should do when the cache or source misbehaves.
If the course description is stale for 30 seconds, the user may never notice. If seat availability or purchase eligibility is stale for 30 seconds, the system may show behavior that is actively wrong. That is why cache design is inseparable from product semantics.
This can be stated as a simple rule:
cached answer = reused answer + freshness promise
That promise may be expressed through:
- TTLs
- explicit invalidation on writes
- refresh-on-read
- stale-while-revalidate behavior
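Two of those mechanisms, a TTL trust window and a stale-while-revalidate grace window, can be combined in one small read function. This is a sketch under invented names (`serve_with_freshness`, `fetch_fresh`) and illustrative windows; a real system would refresh stale entries in the background rather than inline.

```python
import time

def serve_with_freshness(entry, fetch_fresh, ttl=300, stale_grace=60, now=None):
    """Sketch of a freshness promise: TTL plus a stale-serving grace window.

    `entry` is a (value, fetched_at) tuple or None; `fetch_fresh` asks the
    source of truth. All names and window sizes are illustrative.
    """
    now = time.time() if now is None else now
    if entry is not None:
        value, fetched_at = entry
        age = now - fetched_at
        if age < ttl:
            return value, "fresh"             # inside the trust window
        if age < ttl + stale_grace:
            # Serve the stale copy now; a real system would also kick off
            # a background refresh here (stale-while-revalidate).
            return value, "stale-but-served"
    return fetch_fresh(), "fetched"           # miss or too stale: pay full price
```

The return tag makes the freshness promise explicit: every answer the function gives is labeled with how far behind reality it is allowed to be.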
And every one of those choices creates failure behavior too:
- what happens on a miss?
- what happens when many requests miss together?
- what happens if the cache is down?
- what happens if the source changes but invalidation is late?
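The "many requests miss together" question has a classic mitigation: collapse simultaneous misses so only one request asks the source of truth while the others wait for its answer. The sketch below (the name `SingleFlight` is borrowed from the idea's common nickname; the implementation is illustrative) shows the shape of that protection.

```python
import threading

class SingleFlight:
    """Sketch: collapse simultaneous cache misses into one source fetch.

    When many requests miss the same key together, only the first "leader"
    calls the source of truth; the rest wait and reuse its answer.
    """
    def __init__(self):
        self._lock = threading.Lock()
        self._inflight = {}  # key -> (done event, result slot)

    def do(self, key, fetch):
        with self._lock:
            if key in self._inflight:
                event, slot = self._inflight[key]
                leader = False
            else:
                event, slot = threading.Event(), {}
                self._inflight[key] = (event, slot)
                leader = True
        if leader:
            try:
                slot["value"] = fetch()   # only the leader hits the source
            finally:
                with self._lock:
                    del self._inflight[key]
                event.set()
        else:
            event.wait()                  # followers reuse the leader's answer
        return slot["value"]
```

Without something like this, a popular key expiring can translate one cache miss into a thundering herd of identical queries against the database.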
This is the foundation for the rest of the month. Redis, HTTP caching, CDN layers, and invalidation are all just more concrete versions of the same principle: you are managing a copy, its trust window, and its failure modes.
The trade-off is speed and load reduction versus semantic complexity. Caching is worth it when the reused answer is still good enough for the read path you are trying to accelerate.
Troubleshooting
Issue: Treating the cache as if it were the truth once it starts performing well.
Why it happens / is confusing: Once latency drops, it is easy to forget that the cache is only a reused answer with a bounded trust window.
Clarification / Fix: Keep the source-of-truth model explicit. Ask what stale behavior is acceptable on this path before trusting the cached answer too much.
Issue: Picking a cache pattern because it sounds standard, not because it matches the flow.
Why it happens / is confusing: Cache-aside, write-through, and write-behind are often taught as a menu of named patterns instead of as answers to ownership and freshness questions.
Clarification / Fix: Start from the read/write behavior and failure tolerance of the path. Then choose the pattern whose maintenance rule matches that behavior.
Advanced Connections
Connection 1: Caching ↔ Cost Control
The parallel: Reusing answers from cache is often as much about reducing dependency cost and pressure as it is about reducing response latency.
Real-world case: A high-read course catalog may scale economically because most requests never reach the database or aggregation path at all.
Connection 2: Caching ↔ Scalability Thinking
The parallel: Caching teaches a broader systems habit that will keep reappearing later: before scaling out, ask which work is being repeated unnecessarily.
Real-world case: Queueing, worker systems, and load balancing all become easier to reason about once you are already trained to spot repeated work and ownership of state copies.
Resources
Optional Deepening Resources
- These resources are optional and are not required for the core 30-minute path.
- [DOC] Redis Caching Guide
- Link: https://redis.io/learn/howtos/solutions/caching
- Focus: Review common cache patterns in a concrete shared-cache system.
- [DOC] HTTP Caching
- Link: https://developer.mozilla.org/en-US/docs/Web/HTTP/Caching
- Focus: Compare backend cache ideas with HTTP-level freshness and reuse rules.
- [ARTICLE] AWS Caching Best Practices
- Link: https://aws.amazon.com/caching/best-practices/
- Focus: See practical production trade-offs around hit rate, TTL, and invalidation.
- [BOOK] Designing Data-Intensive Applications
- Link: https://dataintensive.net/
- Focus: Connect caching to broader storage and consistency choices.
Key Insights
- A cache is a copy plus a policy - The copy is useful only because the system has rules for when to trust and refresh it.
- Cache patterns are ownership rules - They differ mainly in who updates the copy and at what point in the flow.
- Freshness is not a side detail - Every cache decision is also a promise about how far behind reality the answer may be.
Knowledge Check (Test Questions)
1. What is the most useful way to think about a cache?
- A) As a non-authoritative copy that the system may reuse under a freshness policy.
- B) As a faster replacement for the source of truth.
- C) As a place to store every object that is expensive once.
2. What do core cache patterns mostly differ on?
- A) Who is responsible for updating the cached copy and when that update happens.
- B) Whether the cache is in memory or on disk.
- C) Whether the backend uses Python or Java.
3. Why is caching always also a freshness decision?
- A) Because the reused answer may lag behind the source, so the system must decide what "fresh enough" means.
- B) Because caches can only store immutable values.
- C) Because shorter TTLs remove all trade-offs automatically.
Answers
1. A: The cache is useful precisely because it is a reusable copy governed by rules about trust and refresh, not because it replaces the source of truth.
2. A: Cache-aside, write-through, and write-behind are mainly different operational policies for maintaining the cached copy.
3. A: Any cache can serve an older answer than the source currently holds, so freshness must be chosen deliberately.