Day 242: Cache Eviction Policies - LRU, LFU, ARC

Caching, Workers, and Performance | Lesson 014 | 30 min | intermediate

Eviction is where a cache stops being just memory and becomes a prediction about the future.


Today's "Aha!" Moment

The insight: A cache is small by design, so it cannot keep everything. The eviction policy is the cache's bet about what will be useful next. LRU, LFU, and ARC are not just algorithms; they are different beliefs about what past access patterns mean.

Why this matters: Teams often talk about cache hits as if the cache were neutral. It is not. The cache keeps certain objects and throws others away according to a policy. If that policy mismatches the workload, a large cache can still behave badly.

The universal pattern: finite capacity -> some entries must leave -> the policy encodes what "valuable to keep" means.

Concrete anchor: Imagine an API cache serving product pages. If users repeatedly revisit a small popular set, LFU can shine. If the hot set changes rapidly with current navigation, LRU may do better. If the workload mixes both behaviors, a hybrid like ARC becomes attractive.

How to recognize when this applies:

  1. The cache is at capacity, so admitting anything new forces something else out.
  2. Hit rate shifts with traffic patterns even though the memory budget has not changed.
  3. The workload mixes revisits to a stable popular set with sweeps through new or one-off items.

Common misconceptions:

  1. "A bigger cache fixes everything" - if the policy mismatches the workload, a large cache can still behave badly.
  2. "The cache is neutral memory" - it keeps certain objects and throws others away according to a policy.
  3. "Adaptive policies always win" - their extra bookkeeping and complexity still have to be justified.

Real-world examples:

  1. Web/CDN caching: A one-time content sweep can pollute a recency-based cache if recent does not mean reusable.
  2. In-memory databases: Stable hot keys reward frequency-aware policies more than purely recent access.

Why This Matters

The problem: Once a cache reaches capacity, every new admission forces a choice. If the system evicts the wrong item, the next request pays the miss penalty even though the cache did have enough memory to help something else.

Before: Eviction is treated as housekeeping. The cache is assumed to be neutral memory, and a poor hit rate looks like a sizing problem.

After: Eviction is treated as a policy decision. Every replacement encodes a bet about future reuse, and the hit rate depends on how well that bet matches the workload.

Real-world impact: A good eviction policy improves hit rate without increasing memory, protects deeper systems from repeated misses, and keeps the cache useful under changing traffic instead of only under ideal steady-state loads.


Learning Objectives

By the end of this session, you will be able to:

  1. Explain why eviction exists - Connect finite capacity to the need for workload-informed replacement decisions.
  2. Compare LRU, LFU, and ARC - Understand what each policy assumes about future reuse.
  3. Evaluate policy fit in practice - Recognize when recency, frequency, or adaptive hybrids are more appropriate.

Core Concepts Explained

Concept 1: Eviction Is the Cache's Forecasting Layer

The first cache lesson established that a cache is a smaller, faster copy. This lesson adds the missing pressure: the cache is smaller than the working set, so not everything can stay.

That means a cache must answer one uncomfortable question repeatedly:

If I need room for one more item, which existing item is least worth keeping?

That question is harder than it looks because the cache does not know the future. It only knows the past. So every eviction policy turns past access patterns into a forecast:

  1. LRU forecasts that what was touched recently will be touched again soon.
  2. LFU forecasts that what has been touched often will keep being valuable.
  3. ARC adjusts its forecast as the workload rewards recency or frequency.

This is why eviction policy matters even when cache size is fixed. Two caches with the same memory budget can produce very different hit rates if they throw away different items.

The trade-off begins here:

  1. Smarter policies keep more bookkeeping (recency order, counters, eviction history) to make better forecasts.
  2. Simpler policies are cheap to maintain but easier to fool when the workload breaks their assumption.

So eviction is not merely about deletion. It is about preserving the most future value per byte of scarce fast storage.
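
To make that concrete, here is a minimal sketch (in Python) of a fixed-capacity cache whose eviction policy is a pluggable function. The names PolicyCache and choose_victim are illustrative, not a real library API; the point is that two caches with the same memory budget can differ only in the line that picks the victim.

```python
import time

class PolicyCache:
    def __init__(self, capacity, choose_victim):
        self.capacity = capacity
        self.choose_victim = choose_victim   # function(meta dict) -> key to evict
        self.data = {}                       # key -> cached value
        self.meta = {}                       # key -> {"last_used": ..., "hits": ...}

    def get(self, key):
        if key in self.data:
            self.meta[key]["last_used"] = time.monotonic()
            self.meta[key]["hits"] += 1
            return self.data[key]
        return None                          # miss: caller fetches from the slower tier

    def put(self, key, value):
        if key not in self.data and len(self.data) >= self.capacity:
            victim = self.choose_victim(self.meta)   # the policy is the forecast
            del self.data[victim]
            del self.meta[victim]
        self.data[key] = value
        entry = self.meta.setdefault(key, {"hits": 0})
        entry["last_used"] = time.monotonic()

# Same memory budget, different bets about the future:
lru_like = PolicyCache(100, lambda meta: min(meta, key=lambda k: meta[k]["last_used"]))
lfu_like = PolicyCache(100, lambda meta: min(meta, key=lambda k: meta[k]["hits"]))
```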

Concept 2: LRU and LFU Encode Different Assumptions About Reuse

LRU and LFU are the two classic baselines because they represent two strong but different intuitions.

LRU - Least Recently Used

LRU assumes that recency predicts future reuse. If an item was touched recently, it is more likely to be touched again soon.

This works well when workloads have temporal locality, such as:

  1. Users navigating related pages, where what was just viewed is likely to be viewed again soon.
  2. Hot sets that follow current activity, like product pages that track what shoppers are browsing right now.

Its strengths:

  1. Simple to implement and cheap to update on every access.
  2. Adapts quickly when the hot set shifts, because newly touched items immediately become the most recent.

Its failure mode is equally important:

  1. A one-time scan or sweep touches many objects once, and LRU treats each as if it suddenly mattered, flushing items that were genuinely reusable.
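
A common way to sketch LRU in Python is an ordered map that moves a key to the back on every touch and evicts from the front. The sketch below uses the standard library's OrderedDict and is illustrative only; a production cache would also need TTLs, locking, and size accounting.

```python
from collections import OrderedDict

class LRUCache:
    def __init__(self, capacity: int):
        self.capacity = capacity
        self.items = OrderedDict()          # oldest (least recently used) first

    def get(self, key):
        if key not in self.items:
            return None                     # miss
        self.items.move_to_end(key)         # a touch makes the key "most recent"
        return self.items[key]

    def put(self, key, value):
        if key in self.items:
            self.items.move_to_end(key)
        self.items[key] = value
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)  # evict the least recently used entry

cache = LRUCache(2)
cache.put("a", 1); cache.put("b", 2)
cache.get("a")                              # "a" is now most recent
cache.put("c", 3)                           # evicts "b", the least recently used
assert cache.get("b") is None and cache.get("a") == 1
```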

LFU - Least Frequently Used

LFU assumes long-term popularity predicts future value. Items accessed many times deserve to stay even if they were not touched in the last few moments.

This works well when the workload has stable hot items, such as:

  1. A small set of persistently popular keys in an in-memory database.
  2. Evergreen content that keeps being requested regardless of the current burst of traffic.

Its strengths:

  1. Remembers long-term value, so a burst of one-off traffic does not flush the durable hot set.
  2. Resists scan pollution, because items touched only once keep low counts.

Its failure mode:

  1. Stale popularity: yesterday's hot items keep their high counts and crowd out today's, so the cache adapts slowly when demand shifts.
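
A minimal LFU sketch works the same way, except it keeps a per-key access count and evicts the key with the smallest count. The linear scan in the eviction step is deliberately naive; real LFU implementations use more efficient or approximate counting.

```python
class LFUCache:
    def __init__(self, capacity: int):
        self.capacity = capacity
        self.values = {}                    # key -> value
        self.counts = {}                    # key -> access count

    def get(self, key):
        if key not in self.values:
            return None                     # miss
        self.counts[key] += 1               # every hit raises the key's "popularity"
        return self.values[key]

    def put(self, key, value):
        if key not in self.values and len(self.values) >= self.capacity:
            victim = min(self.counts, key=self.counts.get)   # least frequently used
            del self.values[victim]
            del self.counts[victim]
        self.values[key] = value
        self.counts.setdefault(key, 0)

cache = LFUCache(2)
cache.put("hot", 1); cache.put("cold", 2)
for _ in range(5):
    cache.get("hot")                        # "hot" accumulates a high count
cache.put("new", 3)                         # evicts "cold" (count 0), keeps "hot"
assert cache.get("hot") == 1 and cache.get("cold") is None
```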

So LRU tends to adapt faster but can be fooled by bursts and scans. LFU tends to remember long-term value but can adapt too slowly when the world changes.

That is the core contrast:

  1. LRU bets on recency: the recent past is the best guide to the near future.
  2. LFU bets on frequency: durable popularity is the best guide to future value.

Neither is universally right. Each wins when its workload assumption is true.
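
One way to see this is a toy simulation with a synthetic workload (the workload shape below is an assumption for illustration, not a measurement): a stable hot set interleaved with one-time sweeps. A sweep flushes a recency-based cache's hot keys, while frequency counts let them survive.

```python
from collections import OrderedDict, Counter

def hit_rate(workload, capacity, policy):
    """policy is 'lru' or 'lfu'; returns the fraction of requests served from cache."""
    order, counts, hits = OrderedDict(), Counter(), 0
    for key in workload:
        if key in order:
            hits += 1
            order.move_to_end(key)                       # recency bookkeeping
        else:
            if len(order) >= capacity:
                if policy == "lru":
                    victim, _ = order.popitem(last=False)
                else:                                    # lfu: evict the smallest count
                    victim = min(order, key=lambda k: counts[k])
                    del order[victim]
                del counts[victim]
            order[key] = True
        counts[key] += 1                                 # frequency bookkeeping
    return hits / len(workload)

hot = [f"hot{i}" for i in range(8)]                      # small, stable hot set
workload = []
for sweep in range(10):
    workload += hot * 3                                  # hot traffic...
    workload += [f"scan{sweep}_{i}" for i in range(20)]  # ...then a one-time sweep

print("LRU hit rate:", round(hit_rate(workload, 10, "lru"), 2))
print("LFU hit rate:", round(hit_rate(workload, 10, "lfu"), 2))
# Expect LFU to ride out the sweeps better here: the hot keys keep high counts,
# while LRU lets each sweep flush them and pays the misses when they return.
```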

Concept 3: ARC Tries to Adapt When Recency and Frequency Keep Taking Turns

ARC exists because real workloads rarely stay pure. Many systems mix:

  1. Recency-heavy phases, where the hot set follows current activity and shifts quickly.
  2. Frequency-heavy phases, where a stable set of popular items dominates requests.

ARC can be understood as an adaptive compromise. Instead of hard-coding one balance between recency and frequency, it keeps evidence about both and adjusts online.

At a high level:

  1. ARC splits the cache between a recency-favoring portion and a frequency-favoring portion.
  2. It keeps "ghost" lists: metadata about entries recently evicted from each portion.
  3. When a request hits a ghost entry, ARC shifts the split toward the side that would have kept it.

That last part matters. ARC does not just store live cache entries; it also stores metadata about recently evicted entries so it can infer whether the workload is currently rewarding recency or frequency more.

The intuition is:

If recently-evicted recent items are requested again,
lean more toward recency.

If recently-evicted frequent items are requested again,
lean more toward frequency.

That is why ARC is conceptually powerful. It turns eviction into a learning problem rather than a fixed one-parameter guess.
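
The sketch below illustrates only that adjustment idea, not the full ARC algorithm: it keeps two "ghost" lists of recently evicted keys and nudges a recency/frequency target whenever a miss lands on a ghost entry. The class name AdaptiveSplit and the step size are illustrative assumptions, and the management of live cache entries is omitted.

```python
from collections import OrderedDict

class AdaptiveSplit:
    """Simplified ghost-list bookkeeping in the spirit of ARC (not full ARC)."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.target_recency = capacity // 2   # how much room recency currently "deserves"
        self.ghost_recent = OrderedDict()     # keys recently evicted from the recency side
        self.ghost_frequent = OrderedDict()   # keys recently evicted from the frequency side

    def on_miss(self, key):
        """Called when a requested key is not in the live cache."""
        if key in self.ghost_recent:
            # The recency side would have kept this key: give recency more room.
            self.target_recency = min(self.capacity, self.target_recency + 1)
            del self.ghost_recent[key]
        elif key in self.ghost_frequent:
            # The frequency side would have kept this key: give frequency more room.
            self.target_recency = max(0, self.target_recency - 1)
            del self.ghost_frequent[key]

    def remember_eviction(self, key, from_recency_side: bool):
        """Record an evicted key so later requests can teach us which side to favor."""
        ghost = self.ghost_recent if from_recency_side else self.ghost_frequent
        ghost[key] = True
        if len(ghost) > self.capacity:        # ghosts are bounded metadata, not cached data
            ghost.popitem(last=False)
```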

But it is not free.

Costs include:

  1. Extra metadata: ghost lists track keys that are no longer in the cache.
  2. More bookkeeping on every access, which adds complexity and concurrency cost.
  3. Behavior that is harder to reason about and tune than plain LRU or LFU.

This gives us the practical lesson:

Adaptive policies buy robustness against shifting workloads, but their extra bookkeeping has to be justified by the cost of misses and the system's complexity budget.

In practice, many production systems also use approximations rather than textbook-perfect policies. That is a sign of engineering realism: exact policy quality matters, but so does metadata overhead, concurrency cost, and simplicity of maintenance.


Troubleshooting

Issue: "The cache is big, but hit rate still collapses during scans."

Why it happens / is confusing: A recency-heavy policy like LRU may treat the scan as if all scanned objects suddenly mattered.

Clarification / Fix: Check whether the workload includes large one-time sweeps or bursty exploration. A frequency-aware or adaptive policy may fit better, or the scan path may need different admission logic.

Issue: "Our LFU cache keeps stale popularity and adapts too slowly."

Why it happens / is confusing: Long-term frequency can preserve yesterday's hot items after demand shifts.

Clarification / Fix: Use aging/decay, approximate counters, or a more adaptive policy when phase changes matter. Frequency without forgetting becomes inertia.
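
As a sketch of "frequency with forgetting", access counts can be periodically halved so stale popularity fades. The halving rule and decay period below are assumptions for illustration; production systems often prefer approximate, aging counters.

```python
class DecayingCounts:
    """Access counters that forget: old popularity must keep earning its place."""

    def __init__(self, decay_every: int = 1000):
        self.counts = {}                # key -> access count
        self.decay_every = decay_every  # how many accesses between decay passes
        self.accesses = 0

    def touch(self, key):
        self.counts[key] = self.counts.get(key, 0) + 1
        self.accesses += 1
        if self.accesses % self.decay_every == 0:
            self._decay()

    def _decay(self):
        # Halve every count so yesterday's hot keys fade unless they stay hot.
        for key in list(self.counts):
            self.counts[key] //= 2
            if self.counts[key] == 0:
                del self.counts[key]

    def coldest(self):
        """Candidate victim: the currently least popular tracked key."""
        return min(self.counts, key=self.counts.get) if self.counts else None
```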

Issue: "ARC is obviously better, so we should always choose it."

Why it happens / is confusing: Adaptive policies sound like universal winners.

Clarification / Fix: ARC is attractive when workloads alternate between recency-heavy and frequency-heavy behavior, but its extra bookkeeping and complexity must still be justified against the miss cost and system constraints.


Advanced Connections

Connection 1: Cache Eviction Policies <-> Allocators and Memory Pressure

The parallel: Both are scarce-resource policies. Allocators decide how scarce fast memory gets carved up; eviction decides which cached entries deserve to stay once that memory is full.

Real-world case: In in-memory systems, poor eviction and poor allocator fit can amplify each other: the cache keeps the wrong entries, while fragmentation makes the memory budget feel even smaller.

Connection 2: Cache Eviction Policies <-> CDN and Application Caches

The parallel: Even at higher layers, the same recency vs frequency tension appears. A CDN facing flash crowds and a local application cache facing hot keys are both solving "what do we keep?" under finite capacity.

Real-world case: A policy that looks fine in a lab can fail in production because real traffic includes scans, bursts, and popularity shifts that stress the wrong heuristic.


Key Insights

  1. Eviction is prediction, not cleanup - Once a cache is full, every replacement decision encodes a guess about what future requests will value most.
  2. LRU and LFU are different bets - LRU trusts recent reuse, LFU trusts durable popularity, and each fails when its workload assumption stops matching reality.
  3. Adaptive policies exist because workloads shift - ARC is useful precisely because real systems alternate between recency-heavy and frequency-heavy phases instead of staying pure.
