Day 023: Saga Pattern - Distributed Transactions Without Transactions
Topic: Saga pattern fundamentals
Today's "Aha!" Moment
The insight: Two-Phase Commit (2PC) distributed transactions fit microservices poorly because they are slow, hold locks across services, and hurt availability. Sagas instead provide eventual consistency through compensating transactions, without locking.
Why this matters:
Traditional distributed transaction (2PC - Two-Phase Commit):
Coordinator: "Everyone ready to commit?"
Services: "Yes!" (LOCKED, waiting...)
Coordinator: "Commit!" or "Rollback!"
Services: Finally unlock
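Problem: If the coordinator crashes after services vote "Yes", they stay LOCKED until it recovers
2PC sacrifices availability for consistency (the CP side of the CAP trade-off): during a coordinator failure or network partition, participants block instead of proceeding. One slow service blocks everyone.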
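The blocking behaviour can be illustrated with a toy simulation. This is only a sketch: `Participant`, `State`, and `two_phase_commit` are hypothetical names invented for illustration, not the API of any real transaction manager, and the "crash" is modelled simply by never sending the decision.

```python
from enum import Enum


class State(Enum):
    IDLE = "idle"
    PREPARED = "prepared"    # voted yes, holding locks
    COMMITTED = "committed"
    ABORTED = "aborted"


class Participant:
    """Toy 2PC participant (hypothetical; not a real transaction manager)."""

    def __init__(self, name):
        self.name = name
        self.state = State.IDLE

    def prepare(self):
        # Phase 1: vote yes and keep locks until the coordinator decides.
        self.state = State.PREPARED
        return True

    def finish(self, commit):
        # Phase 2: only the coordinator's decision releases the locks.
        self.state = State.COMMITTED if commit else State.ABORTED

    def holds_locks(self):
        return self.state is State.PREPARED


def two_phase_commit(participants, coordinator_crashes=False):
    votes = [p.prepare() for p in participants]
    if coordinator_crashes:
        return  # decision never sent: participants stay PREPARED
    decision = all(votes)
    for p in participants:
        p.finish(decision)


services = [Participant("inventory"), Participant("payment")]
two_phase_commit(services, coordinator_crashes=True)
blocked = [p.name for p in services if p.holds_locks()]
print(blocked)  # ['inventory', 'payment']
```

Both participants remain in the prepared state, which is the situation the "Problem" line describes.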
Saga pattern: A chain of local transactions, each with a compensating action:
Order Saga:
1. Reserve inventory → (Compensate: Release inventory)
2. Charge payment → (Compensate: Refund payment)
3. Ship order → (Compensate: Cancel shipment)
If a step fails: Execute the compensations of the completed steps in reverse order
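The reverse-order rule for the Order Saga above can be sketched as a small pure function. The step names and `compensation_plan` are illustrative assumptions, and the sketch assumes the failed step itself committed nothing and therefore needs no compensation.

```python
ORDER_SAGA = [
    # (action, compensation)
    ("reserve_inventory", "release_inventory"),
    ("charge_payment", "refund_payment"),
    ("ship_order", "cancel_shipment"),
]


def compensation_plan(saga, failed_index):
    """Return the compensations to run when step `failed_index` (0-based) fails.

    Only steps that completed before the failure are compensated,
    newest first.
    """
    completed = saga[:failed_index]
    return [compensation for _action, compensation in reversed(completed)]


print(compensation_plan(ORDER_SAGA, 2))  # ['refund_payment', 'release_inventory']
```

If the first step fails (`failed_index=0`), the plan is empty because nothing has been committed yet.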
Two approaches: choreography (event-driven, decentralized) and orchestration (central coordinator). Both are described under Topics Covered.
Why This Matters
The Saga pattern enables reliable distributed transactions in microservices architectures. Companies like Amazon, Uber, and Netflix rely on this pattern to coordinate complex business processes across hundreds of services while maintaining high availability.
Real-world impact:
In traditional monolithic systems, database transactions provide ACID guarantees across the entire application. When you split into microservices, each service owns its data, so you can't use a single database transaction spanning multiple services. Two-Phase Commit (2PC) theoretically solves this but fails in practice due to:
- Availability issues: If the coordinator crashes, all participants stay locked until it recovers
- Performance bottlenecks: Synchronous coordination scales poorly
- Network partitions: The CAP theorem forces a choice between consistency and availability, and 2PC chooses consistency
Sagas provide a pragmatic alternative: accept eventual consistency, design for compensation. This mindset shift enables:
- Horizontal scaling: Services don't wait for distributed locks
- Independent deployment: Services evolve without coordinating transaction protocols
- Failure resilience: Compensations handle partial failures gracefully
- Observable workflows: Each step publishes events for monitoring
Modern e-commerce platforms process millions of orders daily using sagas, a volume that would be impractical under 2PC locking.
Daily Objective
Master the Saga pattern: learn how to coordinate long-running business transactions across multiple microservices without distributed transactions. Understand choreography vs orchestration and implement both approaches.
Topics Covered
The Saga Pattern fundamentals:
Sagas solve the distributed transaction problem by decomposing long-running business transactions into a sequence of local transactions. Each local transaction updates the database and publishes an event or message to trigger the next step. If a step fails, the saga executes compensating transactions to undo the changes made by preceding steps.
Key characteristics of sagas:
- Local transactions only: Each service manages its own database with local ACID transactions
- Eventual consistency: The saga completes over time, not atomically
- Compensating transactions: Semantic undo operations (not physical rollback)
- Isolation challenges: Intermediate states are visible to other transactions
- No distributed locks: Saga doesn't lock resources across services
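To make "semantic undo, not physical rollback" concrete, here is a minimal sketch assuming a hypothetical append-only `Ledger` owned by a single payment service. The compensation appends an offsetting entry instead of deleting the original charge, and amounts are integers (for example cents) to keep the arithmetic exact.

```python
from dataclasses import dataclass, field


@dataclass
class Ledger:
    """Hypothetical append-only payment ledger owned by one service."""
    entries: list = field(default_factory=list)

    def _total(self, kind, order_id):
        return sum(amount for k, oid, amount in self.entries
                   if k == kind and oid == order_id)

    def charge(self, order_id, amount):
        self.entries.append(("charge", order_id, amount))

    def refund(self, order_id):
        # Compensation: append an offsetting entry; the charge stays on record.
        outstanding = self._total("charge", order_id) - self._total("refund", order_id)
        if outstanding > 0:
            self.entries.append(("refund", order_id, outstanding))

    def balance(self, order_id):
        return self._total("charge", order_id) - self._total("refund", order_id)


ledger = Ledger()
ledger.charge("order-1", 5000)
ledger.refund("order-1")
ledger.refund("order-1")  # repeating the compensation adds nothing
print(ledger.entries)
print(ledger.balance("order-1"))  # 0
```

The history still shows that a charge happened, which is the difference from a database rollback that would erase it.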
Two implementation patterns:
1. Choreography (Event-Driven Coordination):
Each service listens for events, performs its local transaction, and publishes new events. No central coordinator exists; services react to events autonomously.
Example flow:
OrderService: Creates order → publishes OrderCreated
InventoryService: Hears event → reserves stock → publishes InventoryReserved
PaymentService: Hears event → charges card → publishes PaymentCharged
ShippingService: Hears event → ships → publishes OrderShipped
PaymentService: If the charge fails → publishes PaymentFailed instead
InventoryService: Hears PaymentFailed → releases stock (compensates)
Pros: No single point of failure, services loosely coupled
Cons: Hard to trace flow, implicit dependencies
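The example flow above can be sketched with a tiny in-memory event bus. Everything here is an assumption made for illustration: `EventBus` stands in for a real message broker, `wire_services` and its handlers are hypothetical, and delivery is synchronous and in order, which a real broker does not guarantee.

```python
from collections import defaultdict, deque


class EventBus:
    """Tiny synchronous in-memory bus (a stand-in for a real message broker)."""

    def __init__(self):
        self.handlers = defaultdict(list)
        self.queue = deque()
        self.log = []

    def subscribe(self, event_type, handler):
        self.handlers[event_type].append(handler)

    def publish(self, event_type, **data):
        self.queue.append((event_type, data))

    def run(self):
        # Deliver events in publication order until none are left.
        while self.queue:
            event_type, data = self.queue.popleft()
            self.log.append(event_type)
            for handler in self.handlers[event_type]:
                handler(data)


def wire_services(bus, stock, card_ok):
    def reserve_stock(event):
        stock[event["sku"]] -= 1
        bus.publish("InventoryReserved", **event)

    def charge_card(event):
        bus.publish("PaymentCharged" if card_ok else "PaymentFailed", **event)

    def release_stock(event):  # compensation for reserve_stock
        stock[event["sku"]] += 1

    bus.subscribe("OrderCreated", reserve_stock)
    bus.subscribe("InventoryReserved", charge_card)
    bus.subscribe("PaymentFailed", release_stock)


stock = {"book": 3}
bus = EventBus()
wire_services(bus, stock, card_ok=False)
bus.publish("OrderCreated", order_id="o-1", sku="book")
bus.run()
print(bus.log)  # ['OrderCreated', 'InventoryReserved', 'PaymentFailed']
print(stock)    # {'book': 3}
```

No handler knows the whole flow; the order of steps only emerges from the subscriptions, which is both the loose coupling and the traceability problem listed above.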
2. Orchestration (central coordinator):
SagaOrchestrator:
1. Call InventoryService.reserve()
2. If success: Call PaymentService.charge()
3. If success: Call ShippingService.ship()
4. If any fails: Execute compensations backward
Orchestrator knows full saga, handles rollback
Pros: Clear flow, easy to trace, centralized logic
Cons: Single point of failure (mitigated by making orchestrator reliable)
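A minimal orchestrator can be sketched as one function that owns the whole flow. `run_saga` and `SagaFailed` are hypothetical names, the services are replaced by lambdas that record calls, and the sketch keeps state only in memory, so it does not address the reliability concern noted in the cons.

```python
class SagaFailed(Exception):
    pass


def run_saga(steps):
    """Run (name, action, compensation) steps in order.

    If an action raises, compensate the steps that already completed,
    newest first, then report the failure.
    """
    completed = []
    for name, action, compensation in steps:
        try:
            action()
        except Exception as exc:
            for _name, undo in reversed(completed):
                undo()
            raise SagaFailed(f"{name} failed: {exc}") from exc
        completed.append((name, compensation))


calls = []


def fail_shipping():
    raise RuntimeError("no carrier available")


steps = [
    ("reserve", lambda: calls.append("reserve"), lambda: calls.append("release")),
    ("charge", lambda: calls.append("charge"), lambda: calls.append("refund")),
    ("ship", fail_shipping, lambda: calls.append("cancel_shipment")),
]

try:
    run_saga(steps)
except SagaFailed as err:
    print(err)  # ship failed: no carrier available
print(calls)    # ['reserve', 'charge', 'refund', 'release']
```

The failed shipping step is not compensated, because its action never completed; only the two earlier steps are undone.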
The pattern: Long-running transactions via local transactions + compensations
Common misconceptions:
Myth: "Sagas provide ACID transactions"
Truth: Sagas provide BASE (Basically Available, Soft state, Eventual consistency). Intermediate states are visible (e.g., inventory reserved but payment pending).
Myth: "Compensations undo everything"
Truth: Some actions can't be undone (email sent, API called). Compensating means mitigating the effects (send an apology email, call a cancellation API).
Myth: "Always use choreography"
Truth: Orchestration is better for complex flows with many steps. Choreography is better for simple flows with few participants.
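The missing isolation behind the first myth can be shown in a few lines. This is a sketch with a hypothetical single-SKU `Inventory` class: the reservation commits locally, so another reader observes it before the saga has finished or been compensated.

```python
class Inventory:
    """Hypothetical single-SKU inventory owned by InventoryService."""

    def __init__(self, on_hand):
        self.on_hand = on_hand
        self.reserved = {}  # order_id -> quantity

    def available(self):
        return self.on_hand - sum(self.reserved.values())

    def reserve(self, order_id, quantity):
        if quantity > self.available():
            raise ValueError("insufficient stock")
        self.reserved[order_id] = quantity  # committed locally, visible at once

    def release(self, order_id):
        # Compensation; safe to call even if nothing was reserved.
        self.reserved.pop(order_id, None)


inventory = Inventory(on_hand=5)
inventory.reserve("order-1", 5)          # saga step 1 commits
seen_mid_saga = inventory.available()    # another customer reads now
inventory.release("order-1")             # payment failed, compensate
seen_after = inventory.available()
print(seen_mid_saga, seen_after)  # 0 5
```

A customer who looked mid-saga saw "sold out" for stock that was never actually sold, which an ACID transaction would have hidden.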
Real-world examples:
- Uber ride booking:
  - Reserve driver → Charge rider → Notify driver → Start trip
  - If charge fails: Release driver (compensation)
- Flight booking:
  - Reserve seat → Charge card → Issue ticket
  - If card declined: Release seat (automatic compensation)
- E-commerce order:
  - Reserve inventory → Process payment → Create shipment
  - If shipment fails (out of stock): Refund payment + release inventory
- AWS Step Functions: Orchestration service for sagas (visual workflows, automatic retries, compensations)
- Netflix Conductor: Orchestration engine for microservices workflows
Meta-insight:
Sagas embrace the reality of distributed systems: You can't prevent failures, only handle them gracefully. Instead of locking everything (2PC), progress optimistically and compensate if needed.
This mirrors real life: Hotels overbook (optimistic), then compensate with upgrades if needed. Banks process checks immediately, reverse if bounced (compensating transaction).
The trade-off: Availability > Strong Consistency. Better to complete 99% of orders with occasional visible inconsistency than block everything for perfect consistency.
Detailed Curriculum (Optimized for 60 min)
- Video Introduction (15 min) - START HERE
  - Chris Richardson: "Managing Data in Microservices"
  - Video Link
  - Focus on Saga pattern section (skip CQRS if needed)
- Core Reading (15 min)
  - Microsoft: "Saga Pattern"
    - Article
  - Chris Richardson: "Pattern: Saga"
    - Microservices.io
  - Focus: Both choreography and orchestration approaches
- Quick Synthesis (5 min)
  - Draw: Choreography vs Orchestration comparison
  - Write: "When to use each approach"
  - List: 3 compensating transaction examples
Resources
Core Resources (Use Today)
- Video: Chris Richardson - Managing Data in Microservices (15 min)
  Why it matters: Richardson is the authority on microservices patterns. This explains why 2PC fails and how sagas solve it.
- Article: Microsoft - Saga Pattern
  Why it matters: Comprehensive Azure reference architecture with choreography vs orchestration comparisons.
- Article: Microservices.io - Pattern: Saga
  Why it matters: Canonical pattern catalog with detailed examples and trade-offs.
- Diagram: Saga Workflow Examples
  Why it matters: Visual examples showing choreography and orchestration in action.
Bonus Resources (If Extra Time)
- Book: Microservices Patterns - Chris Richardson (Chapter 4)
  Why it matters: Deep dive into saga implementations with production patterns.
- Article: Saga Pattern: How to Implement Business Transactions - Bernd Ruecker
  Why it matters: Camunda creator's practical guide with workflow examples.
- Video: Saga Pattern | Design Patterns - Defog Tech
  Why it matters: Visual explanation with animation showing compensating transactions.
- Case Study: Uber's Approach to Sagas
  Why it matters: Production implementation handling millions of transactions daily.
- Tool: Temporal.io for Orchestration
  Why it matters: Modern orchestration platform with built-in saga support and fault tolerance.
Practical Activities (25 min total)
1. Implement Both Saga Patterns (18 min)
Scenario: E-commerce order placement
Services involved:
- Order Service
- Payment Service
- Inventory Service
- Delivery Service
A. Choreography-Based Saga (8 min)
// Event-driven coordination (no central coordinator)
// ORDER SERVICE
class OrderService {
    createOrder(orderId, items, payment, userId) {
        // Save order
        database.save(Order { id: orderId, status: "pending" })
        // Publish event
        eventBus.publish(OrderCreated {
            orderId, items, payment, userId
        })
    }
    onPaymentSucceeded(event) {
        database.update(event.orderId, { status: "paid" })
        // Next step happens automatically via event
    }
    onPaymentFailed(event) {
        // Compensate: cancel order
        cancelOrder(event.orderId)
    }
    onInventoryReservationFailed(event) {
        // Compensate: cancel order; OrderCancelled then triggers the refund
        cancelOrder(event.orderId)
    }
    cancelOrder(orderId) {
        database.update(orderId, { status: "cancelled" })
        eventBus.publish(OrderCancelled { orderId })
    }
}
// PAYMENT SERVICE
class PaymentService {
    onOrderCreated(event) {
        try {
            // Attempt payment
            paymentId = processPayment(event.payment)
            database.save(Payment {
                id: paymentId,
                orderId: event.orderId,
                status: "succeeded"
            })
            // Publish success event (carries items for the inventory step)
            eventBus.publish(PaymentSucceeded {
                orderId: event.orderId,
                paymentId,
                items: event.items
            })
        } catch (error) {
            // Publish failure event
            eventBus.publish(PaymentFailed {
                orderId: event.orderId,
                reason: error
            })
        }
    }
    onOrderCancelled(event) {
        // Compensate: refund payment
        // (no-op if nothing was charged or it was already refunded)
        payment = database.findByOrderId(event.orderId)
        if (payment && payment.status == "succeeded") {
            refund(payment.id)
            database.update(payment.id, { status: "refunded" })
        }
    }
}
// INVENTORY SERVICE
class InventoryService {
    onPaymentSucceeded(event) {
        try {
            // Reserve items under this order's id
            for item in event.items {
                reserveItem(event.orderId, item.id, item.quantity)
            }
            eventBus.publish(InventoryReserved {
                orderId: event.orderId
            })
        } catch (error) {
            // Can't reserve: undo any partial reservation, then trigger compensation
            releaseReservation(event.orderId)
            eventBus.publish(InventoryReservationFailed {
                orderId: event.orderId
            })
        }
    }
    onOrderCancelled(event) {
        // Compensate: release reserved items (no-op if none are reserved)
        releaseReservation(event.orderId)
    }
}
B. Orchestration-Based Saga (10 min)
// Central coordinator manages the workflow
class OrderSagaOrchestrator {
    eventStore
    async createOrder(orderId, items, payment, userId) {
        saga = new SagaInstance(orderId, eventStore)
        try {
            // Step 1: Create Order
            saga.addStep({
                action: () => orderService.createOrder(orderId, items, userId),
                compensation: () => orderService.cancelOrder(orderId)
            })
            await saga.executeStep(1)
            // Step 2: Process Payment
            saga.addStep({
                action: () => paymentService.processPayment(orderId, payment),
                compensation: () => paymentService.refund(orderId)
            })
            paymentId = await saga.executeStep(2)
            // Step 3: Reserve Inventory
            saga.addStep({
                action: () => inventoryService.reserve(orderId, items),
                compensation: () => inventoryService.release(orderId)
            })
            await saga.executeStep(3)
            // Step 4: Schedule Delivery
            saga.addStep({
                action: () => deliveryService.schedule(orderId),
                compensation: () => deliveryService.cancel(orderId)
            })
            await saga.executeStep(4)
            // Success!
            saga.complete()
            return { success: true, orderId }
        } catch (error) {
            // Failure - run compensating transactions for the completed steps
            await saga.compensate()
            return { success: false, reason: error }
        }
    }
}
class SagaInstance {
    orderId
    eventStore
    steps = []
    completedSteps = 0  // number of steps whose action finished successfully
    constructor(orderId, eventStore) {
        this.orderId = orderId
        this.eventStore = eventStore
    }
    addStep(step) {
        steps.push(step)
    }
    async executeStep(stepNumber) {
        step = steps[stepNumber - 1]
        // Save saga state
        eventStore.append(SagaStepStarted {
            sagaId: orderId,
            step: stepNumber
        })
        // Execute action (throws on failure, so the step is not counted as completed)
        result = await step.action()
        // Save completion
        eventStore.append(SagaStepCompleted {
            sagaId: orderId,
            step: stepNumber
        })
        completedSteps = stepNumber
        return result
    }
    async compensate() {
        // Run compensations of completed steps in reverse order;
        // the failed step committed nothing, so it is skipped
        for (i = completedSteps - 1; i >= 0; i--) {
            step = steps[i]
            await step.compensation()
            eventStore.append(SagaStepCompensated {
                sagaId: orderId,
                step: i + 1
            })
        }
    }
    complete() {
        eventStore.append(SagaCompleted { sagaId: orderId })
    }
}
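The step records appended to the event store are what make recovery possible after an orchestrator crash. A minimal sketch, assuming the event names from the pseudocode above, a log simplified to `(event_name, step_number)` tuples, and a recovery policy of rolling back rather than resuming forward; `steps_to_compensate` is a hypothetical helper.

```python
def steps_to_compensate(log):
    """Return step numbers whose compensation still has to run, newest first.

    A step that started but never completed is included, because after a
    crash the orchestrator cannot tell whether its action took effect.
    This requires compensations to be safe when there is nothing to undo.
    """
    if any(event == "SagaCompleted" for event, _step in log):
        return []
    started = {step for event, step in log if event == "SagaStepStarted"}
    compensated = {step for event, step in log if event == "SagaStepCompensated"}
    return sorted(started - compensated, reverse=True)


crashed_log = [
    ("SagaStepStarted", 1), ("SagaStepCompleted", 1),
    ("SagaStepStarted", 2), ("SagaStepCompleted", 2),
    ("SagaStepStarted", 3),  # crash happened during step 3
]
print(steps_to_compensate(crashed_log))  # [3, 2, 1]
```

On restart, the orchestrator would run the compensations for these steps in the returned order and append a `SagaStepCompensated` record after each, so a second crash resumes where the first recovery stopped.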
2. Comparison Diagram (5 min)
Draw both patterns side-by-side showing the flow:
Choreography:
OrderService → Event → PaymentService → Event → InventoryService
     ↓                      ↓                        ↓
   Events                 Events                   Events
Orchestration:
        Orchestrator
           ↓   ↑
Order → Payment → Inventory → Delivery
← (on failure, compensate)
3. Quick Analysis (2 min)
Fill in:
| Aspect | Choreography | Orchestration |
|--------|--------------|---------------|
| Coupling | Loose | Tighter |
| Visibility | Distributed | Central |
| Complexity | Per service | In orchestrator |
| Best for | Simple flows | Complex flows |
Creativity - Quick Mental Reset (5 min)
Quick Exercise: "The Dance vs The Conductor"
Draw two scenarios:
- Dance (Choreography): Dancers responding to each other's moves
- Orchestra (Orchestration): Conductor directing musicians
Label each with:
- When it works best
- When it becomes chaotic
Purpose: Visualize the core difference between saga approaches.
Take 3 min to draw, 2 min to reflect on which pattern fits which scenarios
Connections to Previous Learning
From This Week:
- Event Sourcing (Day 1): Saga events are stored in event store
- CQRS (Day 2): Saga commands trigger business operations
From Month 1:
- Distributed Systems (M1W1): Sagas handle distributed coordination
- Consensus (M1W2): Sagas maintain consistency without distributed transactions
- Two-Phase Commit (M1W2): Sagas are an alternative to 2PC
Building Forward:
- Tomorrow (Streams): Saga events flow through event streams
- Week 2 (Tracing): Tracking saga execution across services
- Week 3 (Microservices): Sagas essential for microservices data
Daily Deliverables (Must Complete)
- [ ] Watch Chris Richardson's Saga pattern video
- [ ] Read Microsoft and Microservices.io Saga articles
- [ ] Implement choreography-based saga with 3 services
- [ ] Implement orchestration-based saga with coordinator
- [ ] Demonstrate compensating transactions (rollback)
- [ ] Draw comparison diagram of both approaches
- [ ] Fill in choreography vs orchestration comparison table
Bonus (If Extra Time)
- [ ] Add saga persistence and recovery
- [ ] Implement saga timeout handling
- [ ] Handle parallel saga steps
- [ ] Add saga monitoring and observability
- [ ] Explore Temporal or Camunda for orchestration
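As a starting point for the timeout bonus item, here is one possible sketch using the standard library's `asyncio.wait_for`. The step functions (`slow_charge`, `refund`) and the helper `run_step_with_timeout` are hypothetical; a production saga would also need to persist the timeout decision.

```python
import asyncio


async def run_step_with_timeout(action, compensation, timeout_s):
    """Run one saga step; compensate if it does not finish in time.

    A timeout does not prove the remote action failed, so the
    compensation must tolerate both outcomes.
    """
    try:
        return await asyncio.wait_for(action(), timeout=timeout_s)
    except asyncio.TimeoutError:
        await compensation()
        raise


async def demo():
    events = []

    async def slow_charge():
        await asyncio.sleep(1.0)  # simulated slow payment provider
        events.append("charged")

    async def refund():
        events.append("refund_requested")

    try:
        await run_step_with_timeout(slow_charge, refund, timeout_s=0.05)
    except asyncio.TimeoutError:
        events.append("saga_aborted")
    return events


result = asyncio.run(demo())
print(result)  # ['refund_requested', 'saga_aborted']
```

In this local simulation the slow step is cancelled before it records "charged". Against a real remote service the charge may still have gone through, which is why the refund has to be safe in both cases.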
Success Criteria
By the end of today, you should be able to:
- Explain what sagas are and why they're needed
- Implement choreography-based saga
- Implement orchestration-based saga
- Write compensating transactions
- Choose between choreography and orchestration
- Handle saga failures gracefully
Total Estimated Time (OPTIMIZED)
- Core Learning: 35 min (video 15 + reading 15 + synthesis 5)
- Practical Activities: 25 min (implementations 18 + diagram 5 + analysis 2)
- Mental Reset: 5 min (dance vs conductor drawing)
- Total: 65 min (about 1 hour)
Note: Pseudocode is perfect for demonstrating saga patterns. Focus on the coordination logic, not implementation details!
Today's Big Ideas
- No Distributed Transactions: Sagas maintain consistency without 2PC
- Compensating Transactions: Undo operations when saga fails
- Eventual Consistency: Saga completes over time, not atomically
- Two Approaches: Choreography (events) vs Orchestration (coordinator)
- Production Pattern: Used by Uber, Netflix, Amazon for critical flows
Saga in the Real World
- Uber: Trip booking saga (rider → driver → payment → trip)
- Amazon: Order placement saga (cart → payment → inventory → shipping)
- Netflix: Account signup saga (user → payment → subscription → content access)
- Airbnb: Booking saga (listing → payment → host → confirmation)
Tomorrow's Preview
Stream Processing: Continuous Data Flows
You'll learn how to process continuous streams of events in real-time. Apache Kafka, stream processing patterns, and building reactive systems!
"Sagas are a mechanism for maintaining consistency in a distributed system without requiring distributed transactions." - Chris Richardson
You're building microservices-ready patterns!