Day 211: Paxos Fundamentals: Single-Decree Consensus
Single-decree Paxos is not a ritual of extra messages. It is a safety device: every new attempt to choose a value must first learn enough history that it cannot accidentally choose a conflicting one.
Today's "Aha!" Moment
Paxos often feels harder than it is because people first see the message names (prepare, promise, accept) before they see the fear driving them. The fear is simple:
- what if one proposer almost got a value chosen
- then another proposer appears later
- and, without realizing what already happened, chooses a different value?
If that were allowed, safety would be gone.
That is the aha for single-decree Paxos. The protocol is built around one central rule:
- any later attempt to decide a value must inherit enough information from earlier attempts that a conflicting value cannot be chosen
Once we see Paxos through that lens, the phases stop looking ceremonial.
- prepare says: "before I try to lead, tell me whether an earlier attempt already left history I must respect"
- promise says: "I will not help older proposals after hearing this newer ballot"
- accept says: "given what we now know, here is the value we are trying to get chosen"
In other words, Paxos is really a protocol for transferring the obligation to respect history from one leader attempt to the next.
Why This Matters
Suppose three replicas are trying to agree on one configuration value. Proposer P1 gets partway through choosing value A, but some messages are delayed. Then proposer P2 wakes up later and tries to choose B.
If P2 could ignore the partial history of A, the system might end up with two quorums, each supporting a different outcome. That is exactly the kind of split decision consensus exists to prevent.
Paxos matters because it solves that problem without assuming perfect failure detection or perfect timing. It preserves safety even when:
- messages are delayed
- proposers race
- leaders are suspected incorrectly
- nodes crash and recover
The price is that the protocol is built for correctness first, not immediate readability. If we do not understand the safety idea underneath, the message flow can feel arbitrary. If we do understand it, Paxos becomes a very crisp answer to the question:
- how do later proposal attempts avoid breaking decisions that earlier attempts may already have constrained?
That is also why this lesson is important before Multi-Paxos and Raft. If the learner misses the single-decree safety story, the optimized protocols will look like magic performance hacks instead of principled simplifications.
Learning Objectives
By the end of this session, you will be able to:
- Explain what single-decree Paxos is trying to protect: Describe why later proposal attempts must learn the highest accepted history from intersecting quorums.
- Trace the two main phases: Understand what prepare/promise and accept/accepted each contribute to safety.
- Reason about its main trade-off: Recognize why Paxos is elegant for safety but awkward for repeated decisions without optimization.
Core Concepts Explained
Concept 1: Paxos Exists to Preserve Safety Across Competing Proposal Attempts
Concrete example / mini-scenario: A cluster of three acceptors is trying to agree on one value. Proposer P1 starts an attempt with ballot number 10 and value A, but only some nodes hear from it before the network shifts. Later proposer P2 starts with ballot 11 and wants value B.
The protocol's central problem is not just choosing a value once. It is choosing a value safely even when several attempts overlap in time.
The key structural fact Paxos relies on is quorum intersection. If a value could already have been chosen by a majority, then any later majority must overlap with that earlier majority in at least one acceptor.
That overlap is the safety bridge.
ASCII intuition:
majority 1: [A1, A2]
majority 2: [A2, A3]
overlap: A2
If A2 remembers enough about the earlier attempt, then a later proposer can be forced to respect that history.
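The overlap claim can be checked by brute force. The following is a minimal sketch, assuming three acceptors named A1 to A3 as in the diagram; the helper name `majorities` is made up for illustration. It enumerates every majority and confirms that any two of them share at least one acceptor.

```python
from itertools import combinations

def majorities(acceptors):
    """Return every subset of `acceptors` that forms a strict majority."""
    need = len(acceptors) // 2 + 1
    return [set(q)
            for size in range(need, len(acceptors) + 1)
            for q in combinations(acceptors, size)]

# Acceptor names are illustrative, taken from the ASCII diagram above.
acceptors = ["A1", "A2", "A3"]
quorums = majorities(acceptors)

# Every pair of majorities shares at least one acceptor.
all_overlap = all(q1 & q2 for q1, q2 in combinations(quorums, 2))
print(all_overlap)                   # True
print({"A1", "A2"} & {"A2", "A3"})   # {'A2'}
```

The same check passes for any cluster size, because two sets that each contain more than half of the acceptors cannot be disjoint.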
That is why Paxos revolves around numbered ballots. Ballot numbers are not about wall-clock time. They are a way to say:
- this newer attempt has priority over older ones
- but it must first discover whether older attempts already constrained the value that can safely continue
So the first mental model for Paxos should be:
ballots order proposal attempts
quorum overlap transfers safety-relevant history
Everything else in the protocol is built on top of that.
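One way to make "ballots order proposal attempts" concrete is sketched below. This lesson uses plain integers such as 10 and 11; the `(counter, proposer_id)` pair here is an assumption, a common convention for keeping ballots unique across proposers, and the names `Ballot` and `next_ballot` are hypothetical.

```python
from typing import NamedTuple

class Ballot(NamedTuple):
    counter: int      # bumped by a proposer for each new attempt
    proposer_id: int  # tie-breaker so two proposers never share a ballot

def next_ballot(highest_seen: Ballot, proposer_id: int) -> Ballot:
    """Pick a ballot strictly greater than any ballot seen so far."""
    return Ballot(highest_seen.counter + 1, proposer_id)

b_p1 = Ballot(10, 1)
b_p2 = next_ballot(b_p1, proposer_id=2)

# Tuples compare field by field, so ordering needs no clock at all.
print(b_p2 > b_p1)                   # True
print(Ballot(10, 2) > Ballot(10, 1)) # True
```

Nothing in the comparison refers to wall-clock time: a ballot is "newer" only in the sense that it sorts higher.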
Concept 2: Prepare/Promise Learns the History; Accept Tries to Finish the Decision
Concrete example / mini-scenario: Proposer P2 wants to use ballot 11. Before proposing a value freely, it contacts a quorum with a prepare(11) request.
This first phase does two jobs.
Job 1: claim leadership priority for this ballot
When an acceptor receives prepare(11), it can promise:
- "I will not accept any proposal with ballot lower than 11 from now on."
That prevents older attempts from continuing as if nothing changed.
Job 2: reveal relevant accepted history
The acceptor also returns the highest-numbered proposal it has already accepted, if any. This is the crucial step. It means the new proposer does not start from ignorance.
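The two jobs map onto a few lines of acceptor state. This is a minimal sketch, not a reference implementation: the class name `Acceptor`, the field names, and the tuple-shaped replies are assumptions made for illustration, and durable storage is left out.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass
class Acceptor:
    promised: int = -1                          # highest ballot promised so far
    accepted: Optional[Tuple[int, str]] = None  # (ballot, value) last accepted

    def on_prepare(self, ballot: int):
        """Handle prepare(ballot); reply with a promise or a rejection."""
        if ballot <= self.promised:
            # Already promised an equal or higher ballot: refuse.
            return ("reject", self.promised)
        self.promised = ballot                      # Job 1: shut out lower ballots
        return ("promise", ballot, self.accepted)   # Job 2: reveal accepted history

# An acceptor that already accepted value "A" under ballot 10.
a2 = Acceptor(promised=10, accepted=(10, "A"))
print(a2.on_prepare(11))  # ('promise', 11, (10, 'A'))
print(a2.on_prepare(9))   # ('reject', 11)
```

A real acceptor would persist `promised` and `accepted` before replying, so that a crash and recovery cannot make it forget a promise.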
If any acceptor in the quorum reports an already-accepted value, the proposer must adopt the value with the highest ballot among those reports.
That is the core safety rule.
Why? Because if some value might already be on the path to being chosen, the overlapping quorum ensures that path leaves traces. The new proposer must continue the safest compatible history rather than invent a conflicting one.
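The value-selection rule is small enough to write as a single function. The sketch below assumes each promise carries either `None` or an `(accepted_ballot, value)` pair; the name `choose_value` is hypothetical.

```python
from typing import Optional, Sequence, Tuple

Accepted = Optional[Tuple[int, str]]  # (ballot, value), or None if nothing accepted

def choose_value(promises: Sequence[Accepted], own_value: str) -> str:
    """Return the value a proposer may safely send in phase two."""
    reported = [p for p in promises if p is not None]
    if not reported:
        return own_value                      # no history: free to propose own value
    _, value = max(reported, key=lambda p: p[0])
    return value                              # must adopt the highest-ballot report

print(choose_value([None, (10, "A")], own_value="B"))      # A
print(choose_value([None, None], own_value="B"))           # B
print(choose_value([(7, "C"), (10, "A")], own_value="B"))  # A
```

Note that the proposer adopts the reported value even when it cannot tell whether that value was actually chosen; adopting it is safe in both cases.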
Then comes phase two.
Once the proposer has either:
- learned that no relevant value was previously accepted, or
- adopted the highest accepted value it learned
it sends accept(ballot, value) to a quorum.
If a majority of acceptors accept that pair, the value is chosen.
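Phase two on the acceptor side, and the majority test that makes a value chosen, might look like the sketch below. The class and field names are assumptions for illustration; the third acceptor is set up as having already promised ballot 12 to show a rejection.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass
class Acceptor:
    promised: int = -1
    accepted: Optional[Tuple[int, str]] = None

    def on_accept(self, ballot: int, value: str) -> bool:
        """Accept unless a higher ballot has already been promised."""
        if ballot < self.promised:
            return False
        self.promised = ballot
        self.accepted = (ballot, value)
        return True

# Two acceptors promised ballot 11; one has since promised ballot 12.
acceptors = [Acceptor(promised=11), Acceptor(promised=11), Acceptor(promised=12)]
acks = sum(a.on_accept(11, "A") for a in acceptors)
chosen = acks > len(acceptors) // 2

print(acks)    # 2
print(chosen)  # True
```

Two acceptances out of three are a majority, so the value is chosen even though one acceptor refused.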
A minimal flow looks like this:
Phase 1:
proposer -> quorum: prepare(11)
quorum -> proposer: promise(11, highest_accepted_if_any)
Phase 2:
proposer chooses safe value
proposer -> quorum: accept(11, value)
quorum -> proposer: accepted(11, value)
What often confuses learners is the rule:
- "why can't the proposer keep its own value after prepare?"
Answer: because phase one is not only about leadership. It is about inheriting safety obligations from earlier overlapping attempts.
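To see the whole flow end to end, here is a small in-memory sketch of the P1/P2 scenario. It assumes synchronous method calls instead of a network, and the names `Acceptor` and `run_round` are hypothetical. P1 gets value A accepted on only one acceptor, then P2 runs both phases with ballot 11 and its own preferred value B.

```python
class Acceptor:
    def __init__(self):
        self.promised = -1
        self.accepted = None  # (ballot, value) or None

    def prepare(self, ballot):
        if ballot <= self.promised:
            return None                       # rejection
        self.promised = ballot
        return ("promise", self.accepted)

    def accept(self, ballot, value):
        if ballot < self.promised:
            return False
        self.promised = ballot
        self.accepted = (ballot, value)
        return True

def run_round(quorum, ballot, own_value, total):
    """Run both phases against `quorum`; return the chosen value or None."""
    majority = total // 2 + 1
    promises = [r for r in (a.prepare(ballot) for a in quorum) if r is not None]
    if len(promises) < majority:
        return None
    history = [acc for _, acc in promises if acc is not None]
    value = max(history, key=lambda acc: acc[0])[1] if history else own_value
    acks = sum(a.accept(ballot, value) for a in quorum)
    return value if acks >= majority else None

a1, a2, a3 = Acceptor(), Acceptor(), Acceptor()

# P1 (ballot 10, value "A") finishes phase 1 on a1 and a2,
# but only a2 receives its accept before the network shifts.
a1.prepare(10)
a2.prepare(10)
a2.accept(10, "A")

# P2 arrives with ballot 11 and wants "B", talking to a2 and a3.
result = run_round([a2, a3], ballot=11, own_value="B", total=3)
print(result)  # A
```

P2 ends up finishing P1's value. A was never actually chosen here, since only one acceptor held it, but P2 cannot distinguish that from the case where it was, so it carries A forward.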
Concept 3: Single-Decree Paxos Is Safety-Centric, So It Is Correct but Operationally Awkward for Repeated Decisions
Concrete example / mini-scenario: A system needs to agree on not one value, but a long sequence of log entries. Running the full two-phase process from scratch for every slot would be expensive and hard to operate.
This is where the trade-off becomes visible.
Single-decree Paxos is beautifully focused on one decision:
- one slot
- one chosen value
- safety under competing proposers and delayed messages
That makes it a powerful teaching foundation, but not yet a practical replicated log by itself.
Its main characteristics are:
- strong safety story under crash faults
- no dependence on perfect failure detection
- awkward repeated costs if every log entry requires a fresh contested leadership dance
This is why later systems introduce leader-based optimization:
- Multi-Paxos tries to reuse a stable leader/ballot across many entries
- Raft makes the leader model and log structure more explicit for humans and implementers
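A rough message count shows why reusing a leader matters. The sketch below is back-of-the-envelope arithmetic under simplifying assumptions: no contention or retries, the proposer contacts every acceptor, learner traffic is ignored, and in the stable-leader case the leader never changes. The function names are made up for illustration.

```python
def single_decree_messages(slots: int, acceptors: int) -> int:
    """Every slot pays both phases: prepare, promise, accept, accepted."""
    return slots * 4 * acceptors

def stable_leader_messages(slots: int, acceptors: int) -> int:
    """Phase 1 runs once for the ballot; each slot then pays only phase 2."""
    return 2 * acceptors + slots * 2 * acceptors

print(single_decree_messages(1000, 3))  # 12000
print(stable_leader_messages(1000, 3))  # 6006
```

Under these assumptions the steady-state cost per entry is roughly halved, and the saving in round trips matters even more than the message count.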
So the right way to think about single-decree Paxos is:
Paxos (single decree)
= the minimal crash-fault safety core for choosing one value
not yet
= the final convenient architecture for a production replicated log
That is not a weakness. It is exactly why the protocol is such a good foundation lesson.
Troubleshooting
Issue: "Why does the proposer sometimes have to abandon its own value?"
Why it happens / is confusing: It feels natural to think a leader should just propose what it wants.
Clarification / Fix: In Paxos, leadership is constrained by history. If the quorum reveals that some earlier accepted value may already be on the path to being chosen, the new proposer must carry that value forward to preserve safety.
Issue: "Does the highest ballot always win because it is newer?"
Why it happens / is confusing: Ballot numbers can look like timestamps.
Clarification / Fix: Higher ballots do not win because they are newer in a physical-time sense. They win because acceptors promise not to help lower ballots afterward, and because later proposers must respect accepted history revealed by the quorum.
Issue: "If a proposer got a majority, does that mean every node immediately knows the chosen value?"
Why it happens / is confusing: "Chosen" sounds like "universally learned."
Clarification / Fix: A value is chosen as soon as a quorum has accepted it under the same ballot, even if not every participant has learned that fact yet. Learning/commit dissemination is a separate concern from the safety argument about chosen-ness.
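The gap between "chosen" and "learned" can be illustrated with a hypothetical learner that tallies accepted notifications. The `Learner` class and its method names are assumptions for this sketch; the lesson itself does not define a learner role in detail.

```python
from collections import defaultdict

class Learner:
    """Hypothetical learner that tallies accepted(ballot, value) notifications."""

    def __init__(self, total_acceptors: int):
        self.majority = total_acceptors // 2 + 1
        self.votes = defaultdict(set)  # (ballot, value) -> ids of acceptors seen
        self.learned = None

    def on_accepted(self, acceptor_id: str, ballot: int, value: str):
        self.votes[(ballot, value)].add(acceptor_id)
        if len(self.votes[(ballot, value)]) >= self.majority:
            self.learned = value
        return self.learned

# A1 and A2 both accepted (11, "A"), so "A" is already chosen.
fast, slow = Learner(3), Learner(3)
fast.on_accepted("A1", 11, "A")
fast.on_accepted("A2", 11, "A")
slow.on_accepted("A1", 11, "A")  # A2's notification to `slow` is still in flight

print(fast.learned)  # A
print(slow.learned)  # None
```

Both learners observe the same system at the same moment. The value is chosen in either view; only one of them knows it yet.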
Advanced Connections
Connection 1: Single-Decree Paxos <-> FLP
The parallel: Paxos keeps safety regardless of timing, but progress depends on the environment eventually letting one proposer run without endless interference.
Real-world case: Under unstable timing or proposer contention, Paxos may stall without violating correctness, which is exactly the kind of behavior FLP tells us to expect.
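The stall can be reproduced with a deliberately adversarial schedule. The sketch below is an illustration under stated assumptions: two proposers alternate, each rival prepare reaches all acceptors just before the previous proposer's accept messages do, and promise contents are not inspected because no acceptor ever accepts anything during the duel. All class and variable names are hypothetical.

```python
class Acceptor:
    def __init__(self):
        self.promised = -1
        self.accepted = None  # (ballot, value) or None

    def prepare(self, ballot):
        if ballot <= self.promised:
            return False
        self.promised = ballot
        return True

    def accept(self, ballot, value):
        if ballot < self.promised:
            return False
        self.promised = ballot
        self.accepted = (ballot, value)
        return True

acceptors = [Acceptor() for _ in range(3)]
majority = len(acceptors) // 2 + 1
proposers = [("P1", "A"), ("P2", "B")]

chosen_during_duel = []
pending = None  # (ballot, value) whose accept messages are still in flight
for step in range(6):
    _, value = proposers[step % 2]
    ballot = step + 1
    # The rival's prepare reaches every acceptor first ...
    for a in acceptors:
        a.prepare(ballot)
    # ... so the delayed accept from the previous ballot is rejected everywhere.
    if pending is not None:
        acks = sum(a.accept(*pending) for a in acceptors)
        if acks >= majority:
            chosen_during_duel.append(pending[1])
    pending = (ballot, value)

print(chosen_during_duel)  # []

# Interference stops: the last proposer's accept arrives unopposed.
acks = sum(a.accept(*pending) for a in acceptors)
final = pending[1] if acks >= majority else None
print(final)  # B
```

Six ballots pass without any value being chosen, yet nothing unsafe happens. As soon as one proposer is left alone for a full round, the decision completes, which is why practical deployments add randomized backoff or a stable leader.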
Connection 2: Single-Decree Paxos <-> Multi-Paxos
The parallel: Multi-Paxos does not replace the safety core. It amortizes the leadership-establishment cost across many decisions once one leader/ballot is stable enough.
Real-world case: Systems that need a whole replicated log usually keep Paxos's quorum logic but avoid re-running the full competition for every single slot.
Resources
Optional Deepening Resources
- [PAPER] Paxos Made Simple
- Link: https://lamport.azurewebsites.net/pubs/paxos-simple.pdf
- Focus: Read this now that you have the safety story in mind; the role of Phase 1 becomes much easier to follow.
- [PAPER] The Part-Time Parliament
- Link: https://lamport.azurewebsites.net/pubs/lamport-paxos.pdf
- Focus: Useful if you want the original framing and can tolerate a more narrative style.
- [ARTICLE] Paxos Made Moderately Complex
- Link: https://paxos.systems/paper/
- Focus: Good follow-up when you want a more implementation-minded discussion after the conceptual core is clear.
Key Insights
- Paxos is about carrying safety across competing attempts: New proposers must learn and respect the highest accepted history revealed by an overlapping quorum.
- Phase 1 is not ceremony: prepare/promise exists to transfer safety obligations before a proposer is allowed to push phase two.
- Single-decree Paxos is a foundation, not the whole product: It is the minimal crash-fault safety core for one decision, which later protocols optimize for repeated log entries.
Knowledge Check (Test Questions)
1. Why does a new proposer run Phase 1 before freely choosing a value?
- A) To compress messages for the network.
- B) To learn whether earlier accepted history constrains the value it may safely propose.
- C) To guarantee every node is online first.
2. What is the key role of quorum intersection in Paxos?
- A) It guarantees every node stores the same value immediately.
- B) It ensures a later quorum overlaps with earlier accepted history so safety information can be carried forward.
- C) It removes the need for ballot numbers.
3. Why is single-decree Paxos not yet the most convenient architecture for a replicated log?
- A) Because it cannot preserve safety.
- B) Because it only works with Byzantine faults.
- C) Because repeating the full competition for every decision is expensive, which motivates Multi-Paxos and related optimizations.
Answers
1. B: Phase 1 exists so the proposer learns any prior accepted value that safety now forces it to continue.
2. B: Overlapping quorums are how later attempts inherit enough history to avoid choosing a conflicting value.
3. C: The single-decree protocol is the right safety core for one decision, but repeated decisions motivate leader-based amortization.