Replicated Data Service Capstone

LESSON

Consistency and Replication

024 30 min advanced CAPSTONE

Replicated Data Service Capstone

The core idea: A production consistency design works when it narrows expensive coordination to the business decisions that must be final, then serves everything else from replayable logs and bounded-stale projections with explicit lag budgets.

Core Insight

Harbor Point is ready to ship the replicated reservation service only if it can explain one difficult moment. Two desks, one in Madrid and one in New York, try to reserve exposure for issuer MUNI-77 during market open. At the same time, compliance search is catching up from a CDC stream, a remote replica is a few seconds behind, and one client times out while waiting for a confirmation response.

The useful question is not "should the database be strongly consistent or eventually consistent?" That phrasing is too blunt. The real question is where Harbor Point is willing to pay for coordination. If every read, dashboard, and search query waits on the same strict path as the exposure update, latency and availability suffer everywhere. If the exposure update does not coordinate tightly enough, two desks can both be told "confirmed" and the business invariant is already broken.

The design becomes defensible once Harbor Point names its invariants first and assigns a mechanism to each one. Issuer exposure and live reservations are decided by one shard leader. Confirmation uses a durable reservation token so retries and failover do not replay the business action blindly. Search, dashboards, and compliance views are derived from committed logs and allowed to be stale only within declared budgets. The trade-off is intentional: the system pays coordination cost where ambiguity is unacceptable and exposes lag everywhere else.

Architecture Contract

Harbor Point writes the contract before it chooses the topology:

User-facing action Required guarantee Chosen mechanism What Harbor Point pays
POST /reservations for issuer MUNI-77 One issuer exposure decision at a time Route to the shard leader that owns issuer_exposure and live reservations Leader-routed write plus synchronous local follower acknowledgment
GET /reservation-tokens/{token} after timeout Durable idempotent outcome Store token status in the same commit as the reservation Extra write state, but safe retry handling
GET /issuers/{issuer_id}/open-reservations Read-your-write or bounded staleness Use observed commit tokens and follower freshness checks Occasional leader reads when followers lag
Compliance search across issuers Freshness published, usually within seconds CDC-fed search and compliance index Derived-state infrastructure and lag monitoring
Regional failover RPO at most 5s, RTO under 10m Promote only the remote replica's durable replayed prefix Explicit write gating when remote lag exceeds budget

This table is the real design artifact. It prevents Harbor Point from pretending that every endpoint deserves the same consistency cost. It also prevents the opposite mistake: hiding irreversible financial decisions behind "eventual" language when the product promise requires a final answer.

Decisive Writes Stay Narrow

Harbor Point chooses issuer_id as the authority boundary for the strict path. Each issuer shard has one leader at a time, a same-region synchronous follower for normal durability, and a remote asynchronous follower for disaster recovery:

regional API gateway
       |
       v
shard map: MUNI-77 -> shard 184 -> home region New York
       |
       v
shard 184 leader
       |
       +--> sync follower in same region
       |
       +--> async follower in remote region
       |
       +--> WAL / outbox -> CDC views

The confirmation path for reservation R-88421 is deliberately small:

  1. The gateway resolves issuer_id=MUNI-77 to shard 184.
  2. The shard leader locks or otherwise serializes the issuer_exposure record.
  3. The leader checks remaining headroom and the request's reservation_token.
  4. In one durable commit, it inserts the reservation row, updates exposure, stores the token outcome, and emits an outbox event.
  5. The leader waits for the same-region follower required by the commit rule.
  6. The response includes a commit token such as shard184@982441.

In code-shaped terms, the boundary is this:

def create_reservation(cmd):
    return issuer_shard(cmd.issuer_id).commit(
        token=cmd.reservation_token,
        check=lambda state: state.remaining_exposure >= cmd.notional,
        writes=[
            ("issuer_exposure", cmd.issuer_id, {"reserve": cmd.notional}),
            ("reservation", cmd.reservation_id, {"status": "confirmed"}),
            ("token", cmd.reservation_token, {"reservation_id": cmd.reservation_id}),
            ("outbox", next_event(), {"type": "reservation-confirmed"}),
        ],
    )

The mechanism matters more than the syntax. Harbor Point is not trying to make every region a concurrent writer for the same issuer. It is not putting compliance search, dashboards, or external analytics into the write quorum. The irreversible decision is made once, under the shard that owns the invariant, with a token record that survives timeout, retry, restore, and failover.

The trade-off is visible. Cross-region desks sometimes pay a WAN hop to the issuer's home leader. Harbor Point accepts that cost because it is cheaper than reconciling over-allocated exposure after two users have already seen success.

Logs And Projections Serve The Rest

Once R-88421 commits, Harbor Point stops treating every consumer as part of the write path. The committed log becomes the boundary between authoritative state and derived views:

issuer shard commit
       |
       v
WAL / outbox / CDC stream
       |
       +--> issuer dashboard projection
       +--> compliance search index
       +--> risk reporting view
       +--> backup and replay pipeline

The issuer dashboard can read from a follower or projection if it can prove freshness relative to the caller's commit token. Compliance search can lag by a few seconds because investigators can see the freshness watermark and fall back to an authoritative shard-backed lookup during an incident. Risk reporting can be replayed from the committed log rather than patched by hand.

This is how Harbor Point avoids over-coordination without becoming sloppy. Derived views are allowed to be weaker only because the decisive write path is stronger and because every derived path has a freshness contract. The log is not just plumbing. It is the proof trail that lets the team rebuild indexes, explain token outcomes, restore to a point in time, and run failure tests against client-visible history.

The control plane follows the same rule. Shard ownership changes only through versioned metadata. Rebalancing copies, catches up, and cuts over before routers accept the new owner. Membership changes are committed before nodes vote or serve traffic. Failover promotes only the durable remote prefix and resolves ambiguous writes through token status. Routing, replication, and recovery all answer the same question: who is the authority for this shard at this generation?

Readiness Review

Harbor Point is ready only when the design can survive the uncomfortable cases.

If a client times out after submitting reservation_token=tok-77, support must be able to ask the issuer shard whether that token committed. The answer cannot depend on a best-effort log search or a support engineer re-running the business action.

If the remote replica for shard 184 falls more than five seconds behind, the service must stop advertising a five-second RPO for that shard. It can throttle writes, pause topology changes, or degrade the region's failover claim, but it cannot keep the risk hidden behind a green API health check.

If compliance search is behind, the index must expose its watermark. A stale search result is acceptable only when the endpoint's contract admits staleness and the workflow has a stronger path for authoritative decisions.

If a leader, member, or partition owner changes, every old route must be fenced by term, epoch, or generation. A returning node with stale data is not useful until it rejoins through the membership and catch-up workflow.

Those checks turn a collection of mechanisms into a service contract. The design is not complete because it uses replication. It is complete when every important promise has an authority, a serving path, a lag budget, an operational fallback, and a test that can falsify it.

Operational Failure Modes

"A confirmation timeout means the reservation probably failed, so retry the whole action." Timeouts create uncertainty, not rollback. Persist the reservation token with the authoritative commit and resolve retries against the stored outcome.

"Compliance search should be strongly consistent so nobody sees stale results." That spends global coordination on a derived lookup path. Keep search bounded-stale, publish the freshness watermark, and route final decisions through the issuer shard.

"The remote replica is only for disasters, so lag is just an operations metric." Remote lag is the distance between the advertised RPO and reality. Gate failover claims, planned rebalancing, and write admission on the replay budget.

"A projection can be patched directly during an incident." Direct edits create drift from the committed log. Emit a corrective event, rebuild the projection, or replay from the authoritative source so derived state remains explainable.

"Cross-region failover is just leader election." Promotion is safe only for the durable prefix that has actually reached the remote region, and old owners must be fenced by generation before traffic resumes.

Connections

Resources

Key Takeaways

  1. A consistency design is not one global mode; it maps user-visible invariants to specific coordination points, lag budgets, and recovery rules.
  2. Harbor Point keeps issuer exposure and confirmed reservations on one authoritative shard, then exports committed truth through logs and projections for everything else.
  3. Bounded-stale reads are safe only when the decisive write path remains authoritative and each projection has explicit replay and lag semantics.
  4. Failover, rebalancing, and retries are part of the consistency contract because a design is only real if it can explain ambiguous outcomes after something breaks.
PREVIOUS Replication Failure Mode Check