Replication Models: Primary-Backup, Multi-Leader, and Leaderless

The core idea: Replication models are coordination choices because each one decides where writes become authoritative and where the system pays to reconcile disagreement.

Core Insight

Imagine a product with three kinds of data: account balances, shopping carts, and user preference toggles. Copying each kind of data to several machines is easy to say, but the hard question is different: where is a write allowed to become real?

For account balances, a single strongly controlled write path may be worth the latency because conflicting updates are expensive. For shopping carts, local writes from multiple devices or regions might be acceptable if the merge behavior is clear. For preferences, a leaderless or eventually repaired model may be enough if "last update wins" or another simple rule is acceptable.

Primary-backup, multi-leader, and leaderless replication are three answers to that authority question. They are not just deployment diagrams. They move coordination cost to different moments: before the write, after the write, or across read/write quorum and repair paths.

The trade-off is the whole point. Primary-backup buys clarity by centralizing authority. Multi-leader buys locality by accepting reconciliation. Leaderless buys availability and flexibility by pushing authority into quorums, version metadata, and repair semantics.

Authority, Not Just Copies

A system can have many physical copies while still having one logical write authority. Another system can have many write authorities and therefore more conflict risk. A third system can avoid a permanent leader and instead decide whether enough replicas have participated.

The useful comparison is:

Model           Where writes become authoritative
--------------  ---------------------------------------------
Primary-backup  one primary path
Multi-leader    several regional or shard-local leaders
Leaderless      replica quorums plus version/repair semantics

This framing helps avoid a common mistake: choosing a replication model by how scalable it sounds rather than by how the data behaves under conflict.

A balance transfer, a collaborative document edit, and a preference toggle do not have the same conflict cost. A replication model that is excellent for one may be dangerous or needlessly expensive for another.

Primary-Backup: Pay Early for a Clear Writer

Primary-backup centralizes normal write authority. Clients send writes to the primary; backups learn from it.

client -> primary -> backups

That model is attractive when the system needs a simple answer to "what order did writes happen in?" The primary becomes the normal sequencer for a shard, log, or piece of metadata.

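It often fits data whose conflicts are expensive, such as account balances, and coordination state such as the metadata for a shard or log.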

The benefit is a cleaner correctness story. The system can make most write conflicts disappear by not allowing several normal writers for the same authority domain.

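The cost is concentration: every write for that authority domain passes through one node, which makes the primary a potential bottleneck and puts pressure on failover whenever it becomes unreachable.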

Primary-backup is therefore not "simple and free." It pays coordination cost before or during the write so the rest of the system has less reconciliation to do later.
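
A minimal sketch of the sequencing idea, assuming synchronous replication and an in-memory log. The Primary and Backup classes and their methods are hypothetical names for illustration, not any particular system's API. The point is only that the primary alone assigns the order and backups refuse anything out of sequence.

```python
from dataclasses import dataclass, field


@dataclass
class Backup:
    """Hypothetical backup: applies log entries strictly in sequence order."""
    log: list = field(default_factory=list)

    def apply(self, seq: int, key: str, value: object) -> None:
        # Reject gaps or reordering: the primary's sequence is the only order.
        if seq != len(self.log):
            raise ValueError(f"expected seq {len(self.log)}, got {seq}")
        self.log.append((seq, key, value))


@dataclass
class Primary:
    """Hypothetical primary: the single normal sequencer for one authority domain."""
    backups: list
    log: list = field(default_factory=list)

    def write(self, key: str, value: object) -> int:
        seq = len(self.log)                 # the primary alone assigns the order
        self.log.append((seq, key, value))
        for backup in self.backups:         # synchronous replication, for simplicity
            backup.apply(seq, key, value)
        return seq


backups = [Backup(), Backup()]
primary = Primary(backups)
primary.write("balance:alice", 100)
primary.write("balance:alice", 70)
```

After the two writes, every backup holds the same log in the same order as the primary. The sketch leaves out failover entirely, which is where much of the real cost of this model sits.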

Multi-Leader: Buy Local Writes, Own Conflicts

Multi-leader replication lets several leaders accept writes, often in different regions or sites:

region A leader <----async----> region B leader
region B leader <----async----> region C leader

This can be the right shape when local write latency and regional autonomy matter. A user in Europe can write to a European leader while a user in Asia writes to an Asian leader, even if the inter-region link is slow or temporarily impaired.

The price is that several writes can be legitimate at the same time. That means global order is no longer simple. Conflicts are not protocol accidents; they are part of the design surface.

Multi-leader works best when conflicting writes are rare or when the merge behavior for them is clear, as with a shopping cart edited from several devices.

The practical question is not "can we run more leaders?" It is:

Can this product survive several valid writers before reconciliation finishes?

If the answer is no, multi-leader can look fast in the happy path and become expensive during every edge case.
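
One common way to notice that two leaders accepted writes concurrently is to attach a version vector to each value. The sketch below is an illustration under that assumption, since the lesson does not prescribe a detection mechanism. The compare function and the leader names "eu" and "asia" are hypothetical.

```python
def compare(a: dict, b: dict) -> str:
    """Compare two version vectors (leader name -> write counter)."""
    keys = set(a) | set(b)
    a_ge = all(a.get(k, 0) >= b.get(k, 0) for k in keys)
    b_ge = all(b.get(k, 0) >= a.get(k, 0) for k in keys)
    if a_ge and b_ge:
        return "equal"
    if a_ge:
        return "a_newer"      # a has seen everything b has seen
    if b_ge:
        return "b_newer"
    return "concurrent"       # neither side has seen the other's write


# Both leaders start from the same state, then each accepts a local write
# before replication between them has caught up.
base = {"eu": 1}
eu_write = {"eu": 2}                  # European leader increments its own counter
asia_write = {"eu": 1, "asia": 1}     # Asian leader increments its own counter

result = compare(eu_write, asia_write)
```

Here result is "concurrent": both writes are legitimate, and neither supersedes the other. Detection only tells the system that a conflict exists. Deciding what the merged value should be remains an application-level choice.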

Leaderless: Move Coordination into Quorums and Repair

Leaderless replication removes the fixed primary role. Clients can write to a replica set, and the system uses quorum choices, version metadata, read repair, anti-entropy, and merge policy to converge.

client -> replicas
          write quorum
          read quorum
          versioning / repair / merge

This model can improve availability and locality because no single always-current leader has to be reachable for every operation. But leaderless does not mean "no coordination." It means coordination has moved.

Instead of asking one leader for the authoritative write path, the system asks:

Did enough replicas participate?
Can reads observe enough evidence?
Can divergent versions be detected and repaired?
Does the application know how to merge conflicting versions?

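The hidden cost is real: quorum math, version tracking, read repair, anti-entropy, and merge policy all have to be designed, tested, and operated.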

Leaderless systems are strongest when availability and partition tolerance matter, and when the data model can absorb reconciliation without surprising users.
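
The first three questions above can be sketched with a toy quorum store. This is a simplified illustration, not a real system's protocol. It assumes N replicas, a write quorum W, and a read quorum R chosen so that R + W > N, which guarantees that every read quorum overlaps every write quorum. It also assumes a caller-supplied integer version per write, where real systems typically use timestamps or version vectors. All names are hypothetical.

```python
N, W, R = 3, 2, 2          # replicas, write quorum, read quorum (illustrative values)
assert R + W > N           # overlap: every read quorum meets every write quorum

# Each replica maps key -> (version, value). Integer versions are a simplification.
replicas = [dict() for _ in range(N)]


def quorum_write(key, value, version, reachable):
    """Store the value on the reachable replicas; fail without a write quorum."""
    if len(reachable) < W:
        raise RuntimeError("write quorum not reached")
    for i in reachable:
        current_version = replicas[i].get(key, (0, None))[0]
        if version > current_version:       # never overwrite a newer version
            replicas[i][key] = (version, value)


def quorum_read(key, reachable):
    """Return the newest value seen by a read quorum and repair stale replicas."""
    if len(reachable) < R:
        raise RuntimeError("read quorum not reached")
    responses = [(i, replicas[i].get(key, (0, None))) for i in reachable]
    newest = max((r for _, r in responses), key=lambda r: r[0])
    for i, r in responses:
        if r[0] < newest[0]:                # read repair: bring stale replicas forward
            replicas[i][key] = newest
    return newest[1]


# Replica 2 is unreachable during the write, then takes part in the read.
quorum_write("theme", "dark", version=1, reachable=[0, 1])
value = quorum_read("theme", reachable=[1, 2])
```

The read returns "dark" because replica 1 belongs to both quorums, and replica 2 is repaired as a side effect. The sketch deliberately does not answer the fourth question: when two versions are concurrent rather than ordered, a merge rule is still needed.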

Worked Example: Three Data Types

Take three fields in the same product.

account_balance
shopping_cart
theme_preference

For account_balance, conflicting writes can lose money or create double-spend behavior. A primary-backup or consensus-backed model may be worth the coordination cost.

For shopping_cart, two devices may add different items while disconnected. A multi-leader or leaderless model can work if the merge rule is "union added items, preserve removals carefully."

For theme_preference, the system may accept a much weaker rule, such as the most recent timestamped preference wins. A leaderless or eventually repaired path may be acceptable because the cost of conflict is low.

The same replication model across all three may be either too strict or too loose. The right design follows the data's conflict semantics.
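
The two weaker merge rules can be sketched in a few lines. This is one possible reading of "union added items, preserve removals carefully" and of "most recent timestamp wins", under stated assumptions: the cart keeps a set of added items plus a set of removal markers, and the preference carries a timestamp and a replica id used as a tiebreaker. The function names and data shapes are hypothetical.

```python
def merge_cart(a: dict, b: dict) -> dict:
    """Merge two cart replicas by unioning both additions and removal markers."""
    return {
        "added": a["added"] | b["added"],
        "removed": a["removed"] | b["removed"],   # removals survive the merge
    }


def visible_items(cart: dict) -> set:
    """Items the user should see: everything added and not removed."""
    return cart["added"] - cart["removed"]


def merge_preference(a: tuple, b: tuple) -> tuple:
    """Last-writer-wins over (timestamp, replica_id, value) tuples."""
    return max(a, b, key=lambda p: (p[0], p[1]))  # replica_id breaks timestamp ties


# Two devices edit the same cart while disconnected.
phone = {"added": {"book", "lamp"}, "removed": {"lamp"}}
laptop = {"added": {"book", "mug"}, "removed": set()}
merged = merge_cart(phone, laptop)

# Two regions record different theme preferences.
theme = merge_preference((10, "eu", "dark"), (12, "asia", "light"))
```

The merged cart shows "book" and "mug", and the lamp stays removed even though the laptop never saw the removal. In this simple design a removed item cannot be added again, which is one reason removals need care. The preference rule silently discards the older write, which is acceptable only because the cost of that loss is low.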

Comparison and Misreadings

A compact way to remember the trade-off:

Model           Main gain                 Main cost
--------------  ------------------------  -------------------------------
Primary-backup  simple authority/order    bottleneck and failover pressure
Multi-leader    local writes/autonomy     conflict handling later
Leaderless      availability/flexibility  quorum, repair, and merge logic

"Multi-leader is just faster primary-backup" is wrong. It creates several legitimate write authorities, which means reconciliation becomes a core responsibility.

"Leaderless removes coordination" is also wrong. It removes one coordination shape and replaces it with quorum math, version tracking, and repair.

"Primary-backup is always safest" is too broad. It may be safest for a particular invariant, but it can be too expensive for workloads that need regional locality or high availability under partitions.

Connections

The previous lesson showed ZAB as a strong leader-based design for ordered coordination state. Primary-backup lives near that world: one normal authority makes ordering easier.

The next lesson on distributed logs sharpens the same question. Logs give ordered histories, but the scope of that order depends on partitioning, authority, and the replication model underneath.
