LESSON
Day 242: Cache Eviction Policies - LRU, LFU, ARC
Eviction is where a cache stops being just memory and becomes a prediction about the future.
Today's "Aha!" Moment
The insight: A cache is small by design, so it cannot keep everything. The eviction policy is the cache's bet about what will be useful next. LRU, LFU, and ARC are not just algorithms; they are different beliefs about what past access patterns mean.
Why this matters: Teams often talk about cache hits as if the cache were neutral. It is not. The cache keeps certain objects and throws others away according to a policy. If that policy mismatches the workload, a large cache can still behave badly.
The universal pattern: finite capacity -> some entries must leave -> the policy encodes what "valuable to keep" means.
Concrete anchor: Imagine an API cache serving product pages. If users repeatedly revisit a small popular set, LFU can shine. If the hot set changes rapidly with current navigation, LRU may do better. If the workload mixes both behaviors, a hybrid like ARC becomes attractive.
How to recognize when this applies:
- The cache is full and hit rate matters.
- Different traffic patterns produce very different cache behavior.
- Large scans or sudden popularity shifts make the cache feel "unstable."
Common misconceptions:
- [INCORRECT] "Eviction is just deleting the oldest item."
- [INCORRECT] "A bigger cache solves a bad eviction policy."
- [CORRECT] The truth: Eviction is workload prediction under limited capacity, and the wrong prediction can waste memory and hurt latency.
Real-world examples:
- Web/CDN caching: A one-time content sweep can pollute a recency-based cache if recent does not mean reusable.
- In-memory databases: Stable hot keys reward frequency-aware policies more than purely recent access.
Why This Matters
The problem: Once a cache reaches capacity, every new admission forces a choice. If the system evicts the wrong item, the next request pays the miss penalty even though the cache did have enough memory to help something else.
Before:
- Teams treat eviction as an implementation detail.
- Cache misses are blamed on capacity alone.
- Workload shape is ignored when choosing a policy.
After:
- Teams reason explicitly about recency, frequency, scans, and phase shifts.
- Cache behavior is understood as part of performance engineering, not just storage.
- Policy choice becomes tied to workload shape and miss cost.
Real-world impact: A good eviction policy improves hit rate without increasing memory, protects deeper systems from repeated misses, and keeps the cache useful under changing traffic instead of only under ideal steady-state loads.
Learning Objectives
By the end of this session, you will be able to:
- Explain why eviction exists - Connect finite capacity to the need for workload-informed replacement decisions.
- Compare LRU, LFU, and ARC - Understand what each policy assumes about future reuse.
- Evaluate policy fit in practice - Recognize when recency, frequency, or adaptive hybrids are more appropriate.
Core Concepts Explained
Concept 1: Eviction Is the Cache's Forecasting Layer
The first cache lesson established that a cache is a smaller, faster copy. This lesson adds the missing pressure: the cache is smaller than the working set, so not everything can stay.
That means a cache must answer one uncomfortable question repeatedly:
If I need room for one more item, which existing item is least worth keeping?
That question is harder than it looks because the cache does not know the future. It only knows the past. So every eviction policy turns past access patterns into a forecast:
- recently used probably means useful again soon
- frequently used probably means globally important
- a mix of both may be needed if the workload changes phase
This is why eviction policy matters even when cache size is fixed. Two caches with the same memory budget can produce very different hit rates if they throw away different items.
The trade-off begins here:
- simple policies are cheap to maintain and easy to explain
- but simple policies often fail badly on workloads they were not designed for
So eviction is not merely about deletion. It is about preserving the most future value per byte of scarce fast storage.
Concept 2: LRU and LFU Encode Different Assumptions About Reuse
LRU and LFU are the two classic baselines because they represent two strong but different intuitions.
LRU - Least Recently Used
LRU assumes that recency predicts future reuse. If an item was touched recently, it is more likely to be touched again soon.
This works well when workloads have temporal locality, such as:
- navigational sessions
- hot loops
- working sets that drift but stay clustered in time
Its strengths:
- intuitive
- usually effective for recency-heavy access
- relatively straightforward to reason about
Its failure mode is equally important:
- a large one-time scan can push out genuinely useful items simply because the scan touched many new objects recently
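As a concrete sketch, LRU maps naturally onto Python's OrderedDict, which preserves insertion order and can move entries cheaply. This is a minimal illustration, not a production implementation (no thread safety, no TTLs):

```python
from collections import OrderedDict

class LRUCache:
    """Minimal LRU sketch: evict the entry touched longest ago."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()  # order of entries = recency order

    def get(self, key):
        if key not in self.entries:
            return None  # miss
        self.entries.move_to_end(key)  # mark as most recently used
        return self.entries[key]

    def put(self, key, value):
        if key in self.entries:
            self.entries.move_to_end(key)
        self.entries[key] = value
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # evict least recently used

cache = LRUCache(capacity=2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")          # "a" is now most recently used
cache.put("c", 3)       # evicts "b", the least recently used
print(cache.get("b"))   # None: "b" was evicted
print(cache.get("a"))   # 1: recency protected "a"
```

The scan failure mode follows directly from this structure: every scanned key lands at the most-recently-used end, so a long enough one-pass scan flushes the entire cache.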
LFU - Least Frequently Used
LFU assumes long-term popularity predicts future value. Items accessed many times deserve to stay even if they were not touched in the last few moments.
This works well when the workload has stable hot items, such as:
- repeatedly requested product metadata
- popular configuration values
- skewed key popularity in caches and key-value stores
Its strengths:
- good at protecting truly popular items
- resists some recency noise better than LRU
Its failure mode:
- it can become sticky, keeping yesterday's winners even after the workload changes
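A minimal LFU sketch follows the same shape. This is illustrative only: real implementations use cheaper structures than a linear scan for the victim, and usually break frequency ties by recency rather than arbitrarily:

```python
from collections import Counter

class LFUCache:
    """Minimal LFU sketch: evict the entry with the fewest accesses.
    Ties are broken arbitrarily here; real implementations usually
    break them by recency."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.values = {}
        self.counts = Counter()

    def get(self, key):
        if key not in self.values:
            return None  # miss
        self.counts[key] += 1
        return self.values[key]

    def put(self, key, value):
        if key not in self.values and len(self.values) >= self.capacity:
            # evict the least frequently used entry
            victim, _ = min(self.counts.items(), key=lambda kv: kv[1])
            del self.values[victim]
            del self.counts[victim]
        self.values[key] = value
        self.counts[key] += 1

cache = LFUCache(capacity=2)
cache.put("hot", 1)
for _ in range(5):
    cache.get("hot")      # "hot" builds a high access count
cache.put("x", 2)
cache.put("y", 3)         # "x" (count 1) is evicted; "hot" survives
print(cache.get("hot"))   # 1: frequency protected "hot"
print(cache.get("x"))     # None
```

The stickiness failure mode is visible here too: once "hot" has a high count, nothing in this sketch ever lowers it, so "hot" survives even if the workload stops requesting it.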
So LRU tends to adapt faster but can be fooled by bursts and scans. LFU tends to remember long-term value but can adapt too slowly when the world changes.
That is the core contrast:
- LRU bets on what was hot recently
- LFU bets on what has been important over time
Neither is universally right. Each wins when its workload assumption is true.
Concept 3: ARC Tries to Adapt When Recency and Frequency Keep Taking Turns
ARC exists because real workloads rarely stay pure. Many systems mix:
- bursts of recent exploration
- stable popular objects
- shifts between one pattern and the other
ARC can be understood as an adaptive compromise. Instead of hard-coding one balance between recency and frequency, it keeps evidence about both and adjusts online.
At a high level:
- one structure tracks items that look valuable because they were seen recently
- another tracks items that look valuable because they have been reused
- "ghost" history tracks what was evicted, so the algorithm can learn whether it is throwing away the wrong kind of object
That last part matters. ARC does not just store live cache entries; it also stores metadata about recently evicted entries so it can infer whether the workload is currently rewarding recency or frequency more.
The intuition is:
If recently-evicted recent items are requested again,
lean more toward recency.
If recently-evicted frequent items are requested again,
lean more toward frequency.
That is why ARC is conceptually powerful. It turns eviction into a learning problem rather than a fixed one-parameter guess.
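The ghost-list feedback loop can be sketched in Python. This is a heavily simplified illustration of the idea, not the full ARC algorithm from the paper, which has more precise rules for sizing, trimming, and splitting the four lists:

```python
from collections import OrderedDict

class SimplifiedARC:
    """Simplified sketch of ARC's core idea: two LRU lists plus ghost
    histories, with a target size p that adapts on ghost hits.
    Not the exact algorithm from the ARC paper."""

    def __init__(self, capacity):
        self.c = capacity
        self.p = 0               # target size for the recency side (t1)
        self.t1 = OrderedDict()  # live entries seen once recently (recency)
        self.t2 = OrderedDict()  # live entries reused at least once (frequency)
        self.b1 = OrderedDict()  # ghost keys evicted from t1
        self.b2 = OrderedDict()  # ghost keys evicted from t2

    def _replace(self, in_b2):
        # Evict from t1 if it exceeds its target p (or t2 is empty), else from t2.
        if self.t1 and (len(self.t1) > self.p
                        or (in_b2 and len(self.t1) == self.p)
                        or not self.t2):
            key, _ = self.t1.popitem(last=False)
            self.b1[key] = None
        else:
            key, _ = self.t2.popitem(last=False)
            self.b2[key] = None

    def access(self, key):
        """Returns True on a cache hit."""
        if key in self.t1:       # hit: promote to the frequency list
            del self.t1[key]
            self.t2[key] = None
            return True
        if key in self.t2:       # hit: refresh recency within t2
            self.t2.move_to_end(key)
            return True
        if key in self.b1:       # ghost hit: recency was undervalued, grow p
            self.p = min(self.c, self.p + max(1, len(self.b2) // len(self.b1)))
            self._replace(in_b2=False)
            del self.b1[key]
            self.t2[key] = None
            return False
        if key in self.b2:       # ghost hit: frequency was undervalued, shrink p
            self.p = max(0, self.p - max(1, len(self.b1) // len(self.b2)))
            self._replace(in_b2=True)
            del self.b2[key]
            self.t2[key] = None
            return False
        # complete miss: evict if full, then admit into the recency list
        if len(self.t1) + len(self.t2) >= self.c:
            self._replace(in_b2=False)
        self.t1[key] = None
        while len(self.b1) > self.c:   # keep ghost history bounded
            self.b1.popitem(last=False)
        while len(self.b2) > self.c:
            self.b2.popitem(last=False)
        return False
```

A hit in t1 promotes the key to t2; a hit in a ghost list shifts the target p, moving capacity between the recency side and the frequency side without any manual tuning.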
But it is not free.
Costs include:
- more bookkeeping than plain LRU
- more implementation complexity
- more state to maintain and reason about
This gives us the practical lesson:
- adaptive policies can protect a cache from workload shifts
- but they cost more, so they make sense only when that adaptability is worth the added machinery
In practice, many production systems also use approximations rather than textbook-perfect policies. That is a sign of engineering realism: exact policy quality matters, but so does metadata overhead, concurrency cost, and simplicity of maintenance.
Troubleshooting
Issue: "The cache is big, but hit rate still collapses during scans."
Why it happens / is confusing: A recency-heavy policy like LRU may treat the scan as if all scanned objects suddenly mattered.
Clarification / Fix: Check whether the workload includes large one-time sweeps or bursty exploration. A frequency-aware or adaptive policy may fit better, or the scan path may need different admission logic.
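One illustrative form of "different admission logic" (similar in spirit to the doorkeeper idea used by some production caches) is to admit a key into the main cache only on its second recent sighting, so a one-pass scan never displaces resident entries. The class and parameter names below are invented for this sketch:

```python
from collections import OrderedDict

class TinyLRU:
    """Minimal LRU used as the main cache behind the admission filter."""
    def __init__(self, capacity):
        self.capacity, self.entries = capacity, OrderedDict()
    def get(self, key):
        if key not in self.entries:
            return None
        self.entries.move_to_end(key)
        return self.entries[key]
    def put(self, key, value):
        self.entries[key] = value
        self.entries.move_to_end(key)
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)

class SecondChanceAdmission:
    """Sketch of scan-resistant admission: admit a key into the main
    cache only after it has been seen twice recently."""

    def __init__(self, cache, recent_window=1024):
        self.cache = cache              # any cache exposing get/put
        self.seen_once = OrderedDict()  # bounded set of first-time keys
        self.window = recent_window

    def get(self, key, load):
        value = self.cache.get(key)
        if value is not None:
            return value
        value = load(key)               # fetch from the backing store
        if key in self.seen_once:       # second sighting: admit
            del self.seen_once[key]
            self.cache.put(key, value)
        else:                           # first sighting: remember only
            self.seen_once[key] = None
            if len(self.seen_once) > self.window:
                self.seen_once.popitem(last=False)
        return value

cache = TinyLRU(capacity=2)
front = SecondChanceAdmission(cache)
front.get("hot", load=lambda k: k.upper())  # first sighting: not admitted
front.get("hot", load=lambda k: k.upper())  # second sighting: admitted
for k in ("s1", "s2", "s3"):                # one-pass scan
    front.get(k, load=lambda k: k.upper())  # each seen once: never admitted
print(cache.get("hot"))                     # "HOT": survived the scan
```

The scan still pays its own misses, but it can no longer flush the resident working set.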
Issue: "Our LFU cache keeps stale popularity and adapts too slowly."
Why it happens / is confusing: Long-term frequency can preserve yesterday's hot items after demand shifts.
Clarification / Fix: Use aging/decay, approximate counters, or a more adaptive policy when phase changes matter. Frequency without forgetting becomes inertia.
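A minimal sketch of frequency with forgetting, assuming a simple "halve all counts every N accesses" decay scheme; the class and parameter names are illustrative, not a standard API:

```python
from collections import Counter

class DecayingLFU:
    """LFU sketch with aging: counts are periodically halved so stale
    popularity decays instead of pinning old entries forever."""

    def __init__(self, capacity, decay_every=100):
        self.capacity = capacity
        self.values = {}
        self.counts = Counter()
        self.accesses = 0
        self.decay_every = decay_every

    def _tick(self):
        self.accesses += 1
        if self.accesses % self.decay_every == 0:
            for key in self.counts:
                self.counts[key] //= 2   # halve every count: forgetting

    def get(self, key):
        self._tick()
        if key not in self.values:
            return None  # miss
        self.counts[key] += 1
        return self.values[key]

    def put(self, key, value):
        self._tick()
        if key not in self.values and len(self.values) >= self.capacity:
            victim, _ = min(self.counts.items(), key=lambda kv: kv[1])
            del self.values[victim]
            del self.counts[victim]
        self.values[key] = value
        self.counts[key] += 1

cache = DecayingLFU(capacity=2, decay_every=3)
cache.put("a", 1)          # access 1: count("a") = 1
cache.get("a")             # access 2: count("a") = 2
cache.get("a")             # access 3: decay halves 2 -> 1, then hit -> 2
print(cache.counts["a"])   # 2: still tracked, but decay is eroding it
```

With decay, an entry that stops being requested loses its count over time and eventually becomes evictable, which is exactly the forgetting that plain LFU lacks.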
Issue: "ARC is obviously better, so we should always choose it."
Why it happens / is confusing: Adaptive policies sound like universal winners.
Clarification / Fix: ARC is attractive when workloads alternate between recency-heavy and frequency-heavy behavior, but its extra bookkeeping and complexity must still be justified against the miss cost and system constraints.
Advanced Connections
Connection 1: Cache Eviction Policies <-> Allocators and Memory Pressure
The parallel: Both are scarce-resource policies. Allocators decide how scarce fast memory gets carved up; eviction decides which cached entries deserve to stay once that memory is full.
Real-world case: In in-memory systems, poor eviction and poor allocator fit can amplify each other: the cache keeps the wrong entries, while fragmentation makes the memory budget feel even smaller.
Connection 2: Cache Eviction Policies <-> CDN and Application Caches
The parallel: Even at higher layers, the same recency vs frequency tension appears. A CDN facing flash crowds and a local application cache facing hot keys are both solving "what do we keep?" under finite capacity.
Real-world case: A policy that looks fine in a lab can fail in production because real traffic includes scans, bursts, and popularity shifts that stress the wrong heuristic.
Resources
Optional Deepening Resources
- [DOCS] Redis eviction policies
- Link: https://redis.io/docs/latest/develop/reference/eviction/
  - Focus: Use it to connect textbook replacement policies with a production system that implements practical approximations of LRU and LFU.
- [PAPER] ARC: A Self-Tuning, Low Overhead Replacement Cache
- Link: https://www.cs.cmu.edu/~natassa/courses/15-721/papers/arcfast.pdf
- Focus: Read it for the original adaptive idea behind balancing recency and frequency online.
- [DOCS] Caffeine eviction wiki
- Link: https://github.com/ben-manes/caffeine/wiki/Eviction
- Focus: See how a modern production cache talks about eviction in terms of policy quality and operational trade-offs rather than textbook purity alone.
- [PAPER] LRU-K page replacement algorithm
- Link: https://sigmodrecord.org/?smd_process_download=1&download_id=9425
- Focus: Use it to deepen your intuition for why plain recency is sometimes too weak and why richer history can help.
Key Insights
- Eviction is prediction, not cleanup - Once a cache is full, every replacement decision encodes a guess about what future requests will value most.
- LRU and LFU are different bets - LRU trusts recent reuse, LFU trusts durable popularity, and each fails when its workload assumption stops matching reality.
- Adaptive policies exist because workloads shift - ARC is useful precisely because real systems alternate between recency-heavy and frequency-heavy phases instead of staying pure.