Sagas and Outbox for Cross-Service Consistency

LESSON

Consistency and Replication

025 30 min advanced

Day 424: Sagas and Outbox for Cross-Service Consistency

The core idea: A saga keeps each service commit local, and the transactional outbox makes the resulting message part of that same local commit; together they replace one global atomic commit with a replayable workflow that can advance, compensate, and recover under failure.

Today's "Aha!" Moment

In 07.md, Harbor Point used two-phase commit to keep one reservation atomic across two shards inside a tightly controlled data system. That worked because every participant could enter prepared state, hold locks, and wait for one durable global decision. The platform now has to solve a different problem. A reservation for CA-MUNI-77 begins in the Inventory service, changes desk exposure in the Risk service, and eventually triggers downstream settlement preparation. Those systems do not share one storage engine, one lock table, or one recovery protocol.

The naive implementation is the classic dual write. Inventory decrements available quantity in its own database and then publishes InventoryReserved to the broker. If the process crashes after the database commit but before the publish, Risk never hears about the reservation. If it publishes first and the database commit fails, Risk reacts to an event for a reservation that does not exist. That gap is small in time and huge in consequence. A few milliseconds of bad ordering is enough to create phantom risk, orphaned inventory holds, or settlement work for trades that were never truly accepted.
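The crash-after-commit case can be reproduced with a minimal in-memory sketch. FakeDB, FakeBroker, ProcessCrash, and the crash_after_commit flag are hypothetical stand-ins invented for illustration; they are not Harbor Point code or any real client library.

```python
class FakeDB:
    """Hypothetical stand-in for Inventory's database."""

    def __init__(self):
        self.holds = []

    def commit_hold(self, saga_id, qty):
        self.holds.append((saga_id, qty))


class FakeBroker:
    """Hypothetical stand-in for the message broker."""

    def __init__(self):
        self.published = []

    def publish(self, topic, payload):
        self.published.append((topic, payload))


class ProcessCrash(Exception):
    """Simulates the process dying between the two writes."""


def naive_reserve(db, broker, saga_id, qty, crash_after_commit=False):
    # Write 1: the local database commit succeeds.
    db.commit_hold(saga_id, qty)
    if crash_after_commit:
        # The process dies here; nothing records that a publish is still owed.
        raise ProcessCrash("died after commit, before publish")
    # Write 2: the broker publish, a separate system with no shared commit.
    broker.publish("inventory.reserved", {"saga_id": saga_id, "qty": qty})


db, broker = FakeDB(), FakeBroker()
try:
    naive_reserve(db, broker, saga_id=8841, qty=2_000_000, crash_after_commit=True)
except ProcessCrash:
    pass

print(db.holds)          # [(8841, 2000000)]  the hold is committed
print(broker.published)  # []                 but no event was ever sent
```

After the simulated crash, no durable record says that an event is still owed, so a restarted process has nothing to replay.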

The outbox pattern closes that first gap. Inventory writes its reservation row and an outbox row in the same local transaction, so "state changed" and "message must eventually be published" become one durable fact. A relay can publish later, retry later, or even fail over to another process, but it is replaying committed intent instead of guessing what the service meant to publish. The saga pattern then uses those durable events to move the workflow from one service to the next. If Risk rejects the desk because it would exceed its intraday limit, the workflow does not magically roll back the past. It issues a compensating action that releases the inventory hold through another local transaction.

That is the mental shift. Sagas are not "distributed transactions, but easier." They are explicit workflow state machines for systems that cannot or should not hold one global prepared state. The next lesson, 09.md, follows directly from this point: once the outbox is the authoritative source of what must be published, change data capture from the write-ahead log becomes a stronger way to ship those rows than an ad hoc polling loop.

Why This Matters

Harbor Point's reservation flow is latency-sensitive, but its bigger constraint is cross-service correctness. At 09:31, desk ALPHA submits a $2M reservation for CA-MUNI-77. Inventory must hold the bonds immediately so another desk cannot oversell them. Risk must decide whether that extra exposure still fits the desk's limit. Settlement preparation must only start if the reservation is actually accepted. Those are separate responsibilities owned by separate services, and the product does not get to pretend they are one database transaction just because the API call looks singular from the trader's screen.

This is exactly where many production systems fail in subtle ways. A broker publish that is "usually immediate" gets delayed. One consumer retries after a timeout and replays the same step twice. A service deploy lands during the handoff between local commit and message emission. The result is not a dramatic crash. It is a workflow that stalls in an ambiguous state: inventory held, risk unchanged, customer told "pending," operators grepping logs to reconstruct what should happen next.

Sagas and outbox patterns matter because they make those states first-class instead of accidental. The workflow becomes inspectable. Each service owns one local transaction at a time. Every inter-service step becomes durable, replayable, and idempotent by design. The trade-off is equally real: Harbor Point gives up the simplicity of one atomic outcome in exchange for better fit across service boundaries and external side effects.

Learning Objectives

By the end of this session, you will be able to:

  1. Explain why the dual-write gap appears at service boundaries - Show why "commit the database, then publish the event" is not a safe cross-service protocol.
  2. Trace a reservation saga end to end - Follow Harbor Point's inventory hold, risk approval or rejection, and compensation path through durable local steps.
  3. Evaluate when saga plus outbox is the right tool - Distinguish workflows that need local commits plus compensation from cases that should stay inside one database or one stricter commit domain.

Core Concepts Explained

Concept 1: The transactional outbox turns "publish this event" into durable local state

Start with the first service in Harbor Point's reservation flow. Inventory receives Reserve(CA-MUNI-77, desk=ALPHA, qty=2_000_000, saga_id=8841). It has exactly one local invariant it can enforce by itself: either the hold row exists and available quantity is reduced, or nothing changed. What it cannot safely do is promise "and the message definitely reached Kafka" as part of that same database transaction if the broker is outside the database's commit protocol.

The outbox pattern fixes that by narrowing the atomic boundary. Inventory opens one local transaction, inserts or updates the domain rows for the hold, and also inserts an outbox row such as event_id=8841:inventory-reserved, topic=inventory.reserved, payload=..., aggregate_id=CA-MUNI-77, saga_id=8841. When the transaction commits, Harbor Point knows two things are true together: the inventory hold exists, and there is durable evidence that this hold must be published. If the process crashes immediately afterward, a relay can still discover the outbox row and publish it later.

Inventory request
   |
   v
+-----------------------------+
| local DB transaction        |
| 1. decrement available qty  |
| 2. insert reservation hold  |
| 3. insert outbox row        |
+-----------------------------+
   |
   v
commit
   |
   v
relay publishes inventory.reserved

One compact implementation looks like this:

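def reserve_inventory(cmd):
    # `db` is the Inventory service's database handle. The transaction is assumed
    # to commit on clean exit and roll back if an exception is raised.
    with db.transaction() as tx:
        # Domain change: record the hold (and reduce available quantity).
        hold = tx.create_hold(cmd.instrument, cmd.qty, cmd.desk, cmd.saga_id)
        # Same transaction: the outbox row commits or rolls back with the hold.
        tx.insert_outbox(
            event_id=f"{cmd.saga_id}:inventory-reserved",  # stable key for consumer dedup
            topic="inventory.reserved",
            aggregate_id=cmd.instrument,
            saga_id=cmd.saga_id,
            payload={"hold_id": hold.id, "desk": cmd.desk, "qty": cmd.qty},
        )
    return hold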

The mechanism is small, but the operational implications are large. The relay is now allowed to be at-least-once because the event record itself is durable and consumers can deduplicate on event_id. Harbor Point can also monitor outbox lag as a direct measure of "how long committed business intent waits before the rest of the platform sees it." The trade-off is that publication is no longer synchronous with the request path. The workflow gains durability and replayability, but it must tolerate a short window where the Inventory database is ahead of the rest of the system.
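The relay side can be sketched as a small polling loop. This is a minimal illustration using Python's sqlite3 module and an assumed outbox schema (id, event_id, topic, payload, published_at); the publish callback stands in for a real broker client, and make_db and relay_once are hypothetical names.

```python
import json
import sqlite3


def make_db():
    """Create an in-memory database with an assumed outbox schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE outbox ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " event_id TEXT UNIQUE NOT NULL,"
        " topic TEXT NOT NULL,"
        " payload TEXT NOT NULL,"
        " published_at TEXT)"
    )
    return conn


def relay_once(conn, publish, batch_size=100):
    """Publish unpublished rows in insertion order; return how many were sent."""
    rows = conn.execute(
        "SELECT id, event_id, topic, payload FROM outbox"
        " WHERE published_at IS NULL ORDER BY id LIMIT ?",
        (batch_size,),
    ).fetchall()
    sent = 0
    for row_id, event_id, topic, payload in rows:
        # Publish first, mark second. A crash between the two steps means the
        # row is published again on the next pass: at-least-once, never lost.
        publish(topic, event_id, json.loads(payload))
        with conn:  # commits the UPDATE, or rolls back on error
            conn.execute(
                "UPDATE outbox SET published_at = datetime('now') WHERE id = ?",
                (row_id,),
            )
        sent += 1
    return sent
```

Marking the row only after a successful publish is what makes the relay at-least-once: a failure between the two steps produces a duplicate rather than a lost event, which is why consumers deduplicate on event_id.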

Ordering matters here too. If multiple events for CA-MUNI-77 are published out of order, Risk may process a release before the matching hold. Production systems therefore usually partition by aggregate or saga identifier and keep the outbox schema explicit about event ordering, deduplication keys, and publish status. The outbox solves the local database-plus-broker race. It does not remove the need to design consumer semantics carefully.
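Partitioning by aggregate can be illustrated with a stable key-to-partition mapping. This is only a sketch of the idea: partition_for is a hypothetical helper, and a real broker client normally applies its own partitioner when given a message key.

```python
import hashlib


def partition_for(key: str, num_partitions: int) -> int:
    """Map a key to a partition with a hash that is stable across processes."""
    # Python's built-in hash() is randomized per process for strings, so a
    # cryptographic digest is used here to get a repeatable result.
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


# Every event keyed by the same instrument lands on the same partition,
# so a hold and its later release keep their relative order.
hold_partition = partition_for("CA-MUNI-77", 12)
release_partition = partition_for("CA-MUNI-77", 12)
```

Ordering is then only guaranteed within one key. Events for different instruments may still interleave, which is usually acceptable because they do not share an invariant.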

Concept 2: A saga coordinates cross-service progress through durable steps and compensations

Once Inventory publishes inventory.reserved, Harbor Point still does not have a confirmed reservation. It has one successful local step in a larger workflow. The saga is the mechanism that tracks that workflow across services. In Harbor Point's case, an orchestrated saga is the clearest design: a Reservation Workflow service records the current state for saga_id=8841, listens for step outcomes, and decides what command comes next.

For one reservation, the path looks like this:

1. Inventory: hold bonds for CA-MUNI-77
2. Workflow: wait for inventory.reserved
3. Risk: apply provisional exposure for desk ALPHA
4a. If accepted: emit reservation.confirmed and continue to settlement prep
4b. If rejected: command Inventory to release the hold
5. Inventory (rejection path only): release the hold and publish inventory.released to mark compensation complete

The important point is that compensation is a new business action, not an undo button inside someone else's database. If Risk rejects the reservation because desk ALPHA would exceed its limit, Inventory runs a fresh local transaction that releases the hold and writes another outbox event. That means other systems may briefly observe inventory.reserved before they observe inventory.released. Harbor Point is not promising invisible rollback. It is promising that the workflow will converge through explicit next steps.

That requirement changes the data model. Every participating service needs a stable idempotency key such as saga_id + step_name, plus a record of whether it has already applied that step. Risk cannot increment exposure twice just because inventory.reserved was redelivered after a consumer restart. The workflow service also needs timeouts and a durable saga state table. If the response from Risk never arrives, Harbor Point must know whether to retry the command, mark the workflow for operator review, or expire the reservation hold after a deadline.
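A consumer-side sketch of that idempotency record, using Python's sqlite3 module. The table names processed_steps and exposure, the step name, and the function names are assumptions made for illustration; the point is that the deduplication row and the exposure update share one local transaction.

```python
import sqlite3


def make_risk_db():
    """Create an in-memory stand-in for the Risk service's database."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE processed_steps (step_key TEXT PRIMARY KEY)")
    conn.execute(
        "CREATE TABLE exposure (desk TEXT PRIMARY KEY, amount INTEGER NOT NULL)"
    )
    return conn


def apply_provisional_exposure(conn, saga_id, desk, qty):
    """Apply exposure at most once per saga step; return True if applied."""
    step_key = f"{saga_id}:risk-provisional-exposure"
    try:
        with conn:  # one local transaction: dedup record plus exposure update
            conn.execute(
                "INSERT INTO processed_steps (step_key) VALUES (?)", (step_key,)
            )
            conn.execute(
                "INSERT OR IGNORE INTO exposure (desk, amount) VALUES (?, 0)",
                (desk,),
            )
            conn.execute(
                "UPDATE exposure SET amount = amount + ? WHERE desk = ?",
                (qty, desk),
            )
        return True
    except sqlite3.IntegrityError:
        # Primary-key conflict on step_key: this step was already applied,
        # and the transaction was rolled back without touching exposure.
        return False
```

A redelivered inventory.reserved event then returns False and leaves exposure unchanged instead of incrementing it a second time.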

The gain is that each service keeps ownership of its own storage and invariants. Inventory knows inventory. Risk knows exposure. Settlement knows downstream handoff rules. No service has to hold a prepared lock while waiting on every other service. The cost is more explicit state. Harbor Point now carries statuses like PENDING_RISK, REJECTED_COMPENSATING, and CONFIRMED, and those states are part of the real production contract rather than temporary implementation detail.
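The workflow service's decision logic can be sketched as a transition table. PENDING_RISK, REJECTED_COMPENSATING, and CONFIRMED come from the text above; PENDING_INVENTORY, COMPENSATED, the risk.accepted and risk.rejected event names, and the command names are assumptions added to make the sketch complete.

```python
# (current_state, incoming_event) -> (next_state, command to send next or None)
TRANSITIONS = {
    ("PENDING_INVENTORY", "inventory.reserved"): ("PENDING_RISK", "risk.apply_exposure"),
    ("PENDING_RISK", "risk.accepted"): ("CONFIRMED", "settlement.prepare"),
    ("PENDING_RISK", "risk.rejected"): ("REJECTED_COMPENSATING", "inventory.release_hold"),
    ("REJECTED_COMPENSATING", "inventory.released"): ("COMPENSATED", None),
}


def advance(state, event):
    """Return (next_state, next_command); ignore events that do not apply."""
    # A redelivered or late event has no matching transition, so the saga
    # stays where it is and no command is re-issued.
    return TRANSITIONS.get((state, event), (state, None))


# Rejection path for saga 8841.
state = "PENDING_INVENTORY"
state, command = advance(state, "inventory.reserved")   # PENDING_RISK
state, command = advance(state, "risk.rejected")        # REJECTED_COMPENSATING
state, command = advance(state, "inventory.released")   # COMPENSATED
```

In a real workflow service the new state and the outgoing command would presumably be persisted together, for example through the workflow's own outbox, so that a crash cannot separate them.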

Concept 3: The hard part is not starting a saga, but choosing safe boundaries for compensation and visibility

The biggest design mistake with sagas is to think every workflow can be "compensated later" without changing the business contract. That is false. Releasing an inventory hold is compensable because the original hold is still under Harbor Point's control. Sending an irreversible instruction to an external settlement network is different. Once the clearing rail accepts a message, Harbor Point may need a cancel message, a reversal, or a manual exception process. There may be no true rollback.

That is why production saga design usually orders steps from easiest to compensate toward hardest to compensate. Harbor Point should reserve inventory first, because releasing it is straightforward. It should apply provisional risk next, because removing provisional exposure is still local and reversible. It should send the hardest-to-reverse external instruction only after the earlier business conditions are satisfied. In other words, sagas are as much about workflow sequencing as about messaging.
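The sequencing rule can be illustrated with a simplified in-process runner. This is not how Harbor Point's event-driven workflow would actually execute; run_saga and the step names are hypothetical, and the status strings are reused only for readability.

```python
def run_saga(steps):
    """Run (name, action, compensation) steps in order; return (status, log)."""
    done, log = [], []
    for name, action, compensate in steps:
        try:
            action()
            log.append(f"did {name}")
            done.append((name, compensate))
        except Exception as exc:
            log.append(f"failed {name}: {exc}")
            # Compensate completed steps newest-first with fresh actions.
            for prev_name, prev_compensate in reversed(done):
                prev_compensate()
                log.append(f"compensated {prev_name}")
            return "COMPENSATED", log
    return "CONFIRMED", log


def reject_risk():
    raise RuntimeError("desk ALPHA over intraday limit")


steps = [
    ("inventory hold", lambda: None, lambda: None),          # easiest to reverse
    ("provisional risk", reject_risk, lambda: None),         # still local
    ("settlement instruction", lambda: None, lambda: None),  # hardest, so last
]
status, log = run_saga(steps)
```

Because the settlement instruction is last, a risk rejection stops the saga before any hard-to-reverse effect is attempted, and only the inventory hold needs compensation. The sketch does not handle a compensation that itself fails; that case needs the operator path discussed later in this lesson.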

This is also where the comparison with 07.md becomes sharp. Two-phase commit resolves uncertainty before state becomes visible by forcing everyone to wait on one commit decision. Saga plus outbox allows intermediate committed states, then resolves uncertainty by driving the workflow forward or by compensating after failure. One approach spends lock time and availability to buy atomicity. The other spends workflow complexity and observability to fit across service boundaries.

The outbox relay choice is part of that trade-off. A simple polling process can work for moderate throughput, but as Harbor Point's volume rises the team will care about publish latency, ordering, and relay load on the primary database. That pressure leads directly to 09.md: change data capture from the write-ahead log is often the cleaner way to externalize committed outbox rows without building a brittle polling system around published_at IS NULL.

The practical lesson is that "eventual consistency" is not a permission slip for vague behavior. Harbor Point still needs concrete guarantees: which states can be observed, how duplicates are handled, how long a reservation may sit in PENDING_RISK, and what operator path exists when compensation itself fails. A saga is production-ready only when those boundaries are explicit.

Troubleshooting

Issue: Risk exposure is incremented twice for the same reservation after a consumer restart.

Why it happens / is confusing: The outbox relay and broker are typically at-least-once, so a redelivered inventory.reserved event is normal. If Risk assumes each message arrives exactly once, a redelivery becomes a duplicate business action.

Clarification / Fix: Make the consumer idempotent on a stable business key such as saga_id + step_name or event_id. Store that deduplication decision in the same local transaction as the exposure update.

Issue: Inventory looks correct in its own database, but downstream services are minutes behind during peak traffic.

Why it happens / is confusing: The outbox closes correctness gaps, but it does not guarantee immediate publication. A slow poller, poor indexing, or relay outage turns the outbox into a growing backlog.

Clarification / Fix: Track outbox depth and oldest-unpublished age as first-class lag metrics. Index the unpublished rows well, archive sent rows aggressively, and decide whether a WAL-based CDC relay is warranted.
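Both lag metrics can be read from the outbox table with one query. The sketch below uses Python's sqlite3 module with an assumed schema in which created_at holds epoch seconds; the partial index is one possible way to keep the unpublished scan cheap, and all names are hypothetical.

```python
import sqlite3


def make_outbox_db():
    """Create an in-memory outbox with an index over unpublished rows only."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE outbox ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " event_id TEXT UNIQUE NOT NULL,"
        " created_at REAL NOT NULL,"   # epoch seconds (assumed convention)
        " published_at REAL)"
    )
    # Partial index: covers only rows still waiting to be published.
    conn.execute(
        "CREATE INDEX outbox_unpublished ON outbox (created_at)"
        " WHERE published_at IS NULL"
    )
    return conn


def outbox_lag(conn, now_epoch):
    """Return (unpublished_depth, oldest_unpublished_age_seconds)."""
    depth, oldest = conn.execute(
        "SELECT COUNT(*), MIN(created_at) FROM outbox WHERE published_at IS NULL"
    ).fetchone()
    age = 0.0 if oldest is None else now_epoch - oldest
    return depth, age
```

Depth shows how much work is queued, while oldest-unpublished age shows how stale the rest of the platform is; alerting on age tends to catch a stuck relay even when traffic is low.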

Issue: A reservation reaches the external settlement adapter, then Risk later rejects the workflow and compensation cannot fully undo the side effect.

Why it happens / is confusing: The workflow put a hard-to-reverse side effect too early in the saga, assuming all failures could be repaired with a simple compensating transaction.

Clarification / Fix: Move irreversible actions to the end of the saga, or model them as a separate workflow with explicit cancel or reversal semantics. If no safe compensation exists, do not treat the step as casually reversible.

Issue: Operators see many reservations stuck in PENDING_RISK after a deploy, but no service reports a hard error.

Why it happens / is confusing: A saga can stall quietly when a consumer group is paused, a topic partition is misrouted, or the workflow service missed a timeout transition.

Clarification / Fix: Give each saga state an age budget, alert on overdue transitions, and persist timeout deadlines in the workflow store so another worker can resume or compensate stalled reservations.
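Persisted deadlines make stalled sagas queryable by any worker. The following is a minimal sketch with Python's sqlite3 module; the saga table layout, the convention that terminal states carry a NULL deadline, and the function names are assumptions.

```python
import sqlite3


def make_workflow_db():
    """Create an in-memory stand-in for the workflow store."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE saga ("
        " saga_id INTEGER PRIMARY KEY,"
        " state TEXT NOT NULL,"
        " deadline_at REAL)"  # epoch seconds; NULL once the saga is terminal
    )
    return conn


def overdue_sagas(conn, now_epoch):
    """Return (saga_id, state) for sagas past their deadline, oldest first."""
    return conn.execute(
        "SELECT saga_id, state FROM saga"
        " WHERE deadline_at IS NOT NULL AND deadline_at < ?"
        " ORDER BY deadline_at",
        (now_epoch,),
    ).fetchall()
```

A periodic worker could call overdue_sagas and, for each result, retry the pending command, start compensation, or flag the saga for operator review. Because the deadline lives in the store rather than in process memory, a deploy or crash does not lose it.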

Advanced Connections

Connection 1: 07.md and this lesson solve the same cross-boundary correctness problem with opposite failure strategies

Two-phase commit tries to prevent any partial outcome from becoming visible before one global decision is durable. Saga plus outbox accepts that partial local outcomes will exist, then makes the workflow durable enough to continue or compensate afterward. Harbor Point should prefer 2PC only where participants can actually support prepared state and the business invariant cannot tolerate visible intermediate commits.

Connection 2: 09.md is the transport refinement for the outbox side of this design

An outbox row is useful only if the platform can externalize it reliably and in order. Polling the outbox table is the simplest relay, but WAL-based change data capture often becomes the better production mechanism because it follows the database's own commit stream and reduces custom publish logic.

Key Insights

  1. The outbox solves the local dual-write problem, not the whole workflow problem - It guarantees that a committed state change and the intent to publish travel together, but downstream consumers still need ordering and idempotency.
  2. A saga is a durable state machine, not a hidden rollback layer - Failures are repaired by explicit compensating actions that create new committed facts.
  3. Compensation safety depends on step ordering - Put the easiest-to-reverse local actions early and the hardest-to-reverse external effects late, then measure lag and stuck-state age as production signals.