CRDTs and Coordination Avoidance: Capstone: Design a Mergeable Multi-Region Application
Core Insight
Imagine you are asked to design a multi-region maintenance platform for hospitals. Technicians must keep working inside buildings with weak connectivity. Regional teams need to see notes, photos, assignments, parts usage, and safety warnings. Auditors need one final close record per incident. Security needs a stolen device to lose its ability to change sensitive state once its access is revoked.
The tempting answer is to say, "Use CRDTs everywhere." The mature answer is better: make the common collaborative work local and mergeable, then draw hard boundaries around uniqueness, rights, trust, and legal closure. The application should feel fast for notes, checklists, photos, and comments. It should be honest when an action must wait for inventory rights, permission authority, or an audit allocator.
This capstone is the full-track synthesis. The goal is not to pick one magic data type. The goal is to produce an architecture where every field has a reason: which facts merge locally, which facts are derived, which facts need authority, which metadata must be retained, and which tests prove that replicas converge without violating the domain.
The trade-off is visible from the start. Coordination avoidance buys local progress and regional resilience. It costs design discipline: operation identity, causal context, admission control, compaction boundaries, compatibility plans, observability, repair workflows, and a user experience that can explain pending authority instead of hiding it.
Capstone Scenario
Design FieldCare, a hospital maintenance platform used across Madrid, Dublin, and Virginia.
The product manages incidents:
incident:
title
notes
checklist
photos
warning_tags
assigned_owner
parts_used
status
audit_number
activity_feed
The system must support:
offline-first clients:
phones and tablets can work for hours without network
edge replicas:
each hospital has a nearby edge node
regional replicas:
Madrid, Dublin, and Virginia exchange operations
authorities:
inventory rights allocator
safety closure authority
permission authority
audit number allocator
The product promises:
1. Technicians can add notes, photos, and checklist observations offline.
2. Warning tags must not disappear because a stale client removed them.
3. Parts usage must not exceed allocated rights.
4. Closing an incident creates exactly one audit number.
5. Permission changes must respect revocation and epochs.
6. Activity feeds and search may lag, but must rebuild from source facts.
7. Old clients must not silently corrupt new merge semantics.
Those promises are the architecture. The storage model follows from them.
Proposed Architecture
Use an operation log plus materialized CRDT views.
client:
local store
durable outbox
sync cursor
conflict/pending-authority UI
hospital edge:
validates common operations
accepts mergeable local work
syncs with regional replicas
buffers dependency waits
regional replicas:
store accepted operations
exchange anti-entropy deltas
build query/read models
authorities:
allocate scarce rights
decide sensitive transitions
emit authoritative operations
Every operation has an envelope:
operation:
op_id: actor:counter
actor_id
object_id
field
schema_version
epoch
dependency_context
capability or authority reference
effect
signature
The envelope lets a replica decide:
authenticate:
did this actor sign the operation?
authorize:
may this actor change this field on this object in this epoch?
validate:
is the operation well formed and within limits?
apply:
can this effect merge locally?
route:
does this effect need rights or authority?
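The decision steps above can be sketched as a small admission function. This is a minimal illustration, not the platform's actual validator: the `Operation` fields mirror the envelope, while `signature_valid`, `may_write`, the field sets, and the returned strings are assumptions made for the example. Real signature checks and capability lookups are replaced by stand-ins.

```python
from dataclasses import dataclass
from typing import Callable

# Illustrative field classes; a real system would derive these from policy.
LOCAL_FIELDS = {"notes", "checklist", "photos"}
RIGHTS_FIELDS = {"parts_used"}
AUTHORITY_FIELDS = {"audit_number", "status", "permissions"}


@dataclass(frozen=True)
class Operation:
    op_id: str             # "actor:counter", for example "phone:101"
    actor_id: str
    object_id: str
    field: str
    schema_version: int
    epoch: str
    effect: dict
    signature_valid: bool  # stand-in for real signature verification


def decide(op: Operation, current_epoch: str,
           may_write: Callable[[str, str], bool]) -> str:
    """Walk the envelope checks in order and return a decision label."""
    if not op.signature_valid:                  # authenticate
        return "reject: unauthenticated"
    if op.epoch != current_epoch:               # authorize: epoch check
        return "quarantine: stale epoch"
    if not may_write(op.actor_id, op.field):    # authorize: capability check
        return "reject: unauthorized"
    if not op.effect:                           # validate
        return "reject: malformed"
    if op.field in LOCAL_FIELDS:                # apply
        return "apply locally"
    if op.field in RIGHTS_FIELDS:               # route: needs allocated rights
        return "apply if rights available"
    if op.field in AUTHORITY_FIELDS:            # route: needs authority
        return "route to authority"
    return "reject: unknown field"
```

The order matters: authentication and authorization run before any field-specific logic, so an unsigned or stale operation never reaches a merge function.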
The key split:
local CRDT:
notes
checklist observations
photo metadata
non-safety tags
rich text comments
local with allocated rights:
parts_used
bounded counters for inventory consumption
authority-routed:
audit_number
incident closure
permission grants
safety warning removal
derived and repairable:
activity_feed
search index
counts and dashboards
The system is not coordination-free. It is coordination-aware. It spends coordination where local merge cannot protect the promise.
Data Model Walkthrough
Notes And Photos
Notes are source facts with stable IDs:
notes:
OR-map note_id -> note_body
note_body:
rich text sequence CRDT
A phone can add a note offline:
operation phone:101
field: notes
effect:
add note N101 with body "Filter cracked"
The edge accepts it if the actor can add notes:
validate:
signature valid
note capability valid
body size within limit
merge:
notes[N101] = body CRDT
Photos use metadata in the CRDT and bytes in object storage:
photos:
OR-set photo_id
photo metadata:
hash
content type
storage pointer
upload status
The metadata can merge before the bytes finish uploading. The UI can show "photo metadata synced, upload pending." That is better than pretending binary upload and replicated state are one atomic fact.
Checklist
Checklist observations are mergeable:
checklist_done:
OR-set task_id
If a technician marks "inspect valve" done offline, the operation can merge:
operation phone:108
checklist_done.add(task:inspect-valve)
If the product needs "exactly one certified inspector must sign this task," that is a different promise:
certification:
authority-routed signature or delegated right
The review rule is: observation can merge; certification may need authority.
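A state-based OR-set for checklist observations might look like the sketch below. It is a textbook-style illustration under simple assumptions: each add is tagged with the operation ID that produced it, removals record only the tagged adds they have observed, and nothing is ever compacted. The class and method names are invented for this example.

```python
from dataclasses import dataclass, field


@dataclass
class ORSet:
    adds: set = field(default_factory=set)      # (element, op_id) pairs
    removed: set = field(default_factory=set)   # pairs observed by a remove

    def add(self, element: str, op_id: str) -> None:
        self.adds.add((element, op_id))

    def remove(self, element: str) -> None:
        # Remove only the adds this replica has seen; unseen adds survive.
        self.removed |= {pair for pair in self.adds if pair[0] == element}

    def merge(self, other: "ORSet") -> "ORSet":
        # Union of both components: commutative, associative, idempotent.
        return ORSet(self.adds | other.adds, self.removed | other.removed)

    def elements(self) -> set:
        return {element for (element, _tag) in self.adds - self.removed}


phone, edge = ORSet(), ORSet()
phone.add("task:inspect-valve", "phone:108")   # marked done offline
merged = edge.merge(phone)
print(merged.elements())
```

Certification is deliberately absent here: the set records that someone observed the task as done, not that a certified inspector signed it.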
Warning Tags
Warning tags are safety-sensitive:
warning_tags:
observed-remove set
policy:
add warning tag can be local
remove warning tag requires safety authority
Concurrent add and unauthorized remove:
Madrid:
add gas-risk
stale tablet:
remove gas-risk without authority
merge:
gas-risk remains visible
unauthorized remove is rejected
This is not merely an OR-set choice. It is a domain decision: losing a safety warning is worse than showing a warning for a little too long.
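One way to express that policy is to gate the remove path before it ever touches the set. The sketch below assumes a single hypothetical actor ID, `safety-authority`, stands in for a real capability check; the class name and the rejection log are also illustrative.

```python
SAFETY_AUTHORITY = "safety-authority"  # hypothetical actor id for the example


class WarningTags:
    def __init__(self):
        self.adds = set()       # (tag, op_id) pairs
        self.removed = set()    # pairs removed by an authorized operation
        self.rejected = []      # (op_id, reason) for the support tooling

    def apply_add(self, tag: str, op_id: str) -> None:
        # Adding a warning is always allowed locally.
        self.adds.add((tag, op_id))

    def apply_remove(self, tag: str, actor_id: str, op_id: str) -> bool:
        if actor_id != SAFETY_AUTHORITY:
            # Unauthorized removes never reach the set; they are recorded.
            self.rejected.append((op_id, "remove requires safety authority"))
            return False
        self.removed |= {pair for pair in self.adds if pair[0] == tag}
        return True

    def visible(self) -> set:
        return {tag for (tag, _op) in self.adds - self.removed}


tags = WarningTags()
tags.apply_add("gas-risk", "edge-madrid:331")
tags.apply_remove("gas-risk", "stale-tablet", "tablet:77")
print(tags.visible())   # gas-risk is still visible
```

The rejected operation is kept with a reason rather than dropped, so operators can explain to the tablet user why the warning did not disappear.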
Parts Used
Inventory is bounded. A technician may consume parts locally only if the hospital edge has rights.
inventory_rights:
hospital-edge:Madrid has 40 filter cartridges
A local use consumes a right:
operation edge-madrid:330
parts_used.add(part:filter-cartridge, qty:2)
rights.decrement(2)
If the edge has no rights left, the UI must not fake success:
status:
parts request pending
waiting for inventory rights
Escrow makes some bounded decisions local. It does not make all scarcity disappear.
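A rough sketch of escrow at the edge follows. It assumes rights are plain integers per part type and that transfers between edges happen through some trusted path not shown here; `EdgeRights` and `transfer` are hypothetical names.

```python
class EdgeRights:
    """Rights one hospital edge holds for one part type."""

    def __init__(self, granted: int):
        self.remaining = granted
        self.used = 0

    def consume(self, qty: int) -> str:
        if qty <= 0:
            raise ValueError("qty must be positive")
        if qty > self.remaining:
            # No fake success: the request waits for more rights.
            return "pending: waiting for inventory rights"
        self.remaining -= qty
        self.used += qty
        return "applied"


def transfer(src: EdgeRights, dst: EdgeRights, qty: int) -> bool:
    """Move unused rights between edges; total rights never change."""
    if qty <= 0 or qty > src.remaining:
        return False
    src.remaining -= qty
    dst.remaining += qty
    return True


madrid, dublin = EdgeRights(40), EdgeRights(10)
print(madrid.consume(2))    # applied
print(dublin.consume(11))   # pending: waiting for inventory rights
transfer(madrid, dublin, 5)
print(dublin.consume(11))   # applied
```

Because consumption only spends rights an edge already holds, and transfers only move rights, total usage across edges cannot exceed the total allocation regardless of message ordering.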
Status And Audit Number
Closing an incident is authority-routed:
close request:
actor: supervisor-4
object: incident:I9
dependencies:
checklist complete
warning tags reviewed
parts reconciled
The safety closure authority validates the request, obtains one audit number from the audit number allocator, and emits:
operation region:9001
status.write(closed)
audit_number.assign(A-2026-0042)
freeze_epoch = incident:I9:closed:9001
Late local edits after the freeze are not silently applied:
operation phone:140
epoch: open
effect: notes.add(...)
current epoch:
closed:9001
decision:
reject, quarantine, or route for post-close amendment
The honest user experience is:
Close requested.
Waiting for safety authority and audit number.
Local notes and photos can continue syncing until the close is accepted.
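The closure path can be sketched with a single in-memory authority. This is a simplified illustration: the audit number allocator is folded into the authority for brevity, the three dependency checks are passed in as booleans, and `ClosureAuthority` and `admit` are invented names. The audit number and epoch formats follow the examples above.

```python
class ClosureAuthority:
    def __init__(self, year: int, first_seq: int, first_counter: int):
        self.year = year
        self.next_seq = first_seq          # next audit sequence number
        self.next_counter = first_counter  # next "region:<n>" op counter
        self.closed = {}                   # incident_id -> decision

    def close(self, incident_id: str, checklist_complete: bool,
              warnings_reviewed: bool, parts_reconciled: bool):
        if incident_id in self.closed:
            # A second close request gets the original decision back.
            return self.closed[incident_id]
        if not (checklist_complete and warnings_reviewed and parts_reconciled):
            return None                    # the request stays pending
        decision = {
            "op_id": f"region:{self.next_counter}",
            "status": "closed",
            "audit_number": f"A-{self.year}-{self.next_seq:04d}",
            "freeze_epoch": f"{incident_id}:closed:{self.next_counter}",
        }
        self.next_counter += 1
        self.next_seq += 1
        self.closed[incident_id] = decision
        return decision


def admit(op_epoch: str, current_epoch: str) -> str:
    """Late edits carrying an older epoch are not silently applied."""
    return "apply" if op_epoch == current_epoch else "quarantine"


authority = ClosureAuthority(year=2026, first_seq=42, first_counter=9001)
first = authority.close("incident:I9", True, True, True)
second = authority.close("incident:I9", True, True, True)
print(first["audit_number"], first is second)
```

Returning the stored decision on a repeated request is what keeps two regions closing the same incident from producing two audit numbers.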
Sync And Failure Behavior
The sync protocol exchanges missing operations by frontier:
client frontier:
phone:140
edge-madrid:330
region:9000
edge sends:
operations after that frontier
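A frontier comparison can be as small as the sketch below. It assumes each actor numbers its operations with a strictly increasing counter and that a frontier entry means "everything from this actor up to this counter has been received"; the function names are illustrative.

```python
def split_op_id(op_id: str):
    """Split "actor:counter" on the last colon, e.g. "edge-madrid:330"."""
    actor, counter = op_id.rsplit(":", 1)
    return actor, int(counter)


def operations_after(log: list, frontier: dict) -> list:
    """Return the op_ids in log that the frontier has not yet covered."""
    missing = []
    for op_id in log:
        actor, counter = split_op_id(op_id)
        if counter > frontier.get(actor, 0):
            missing.append(op_id)
    return missing


edge_log = ["phone:140", "edge-madrid:330", "edge-madrid:331",
            "region:9000", "region:9001"]
client_frontier = {"phone": 140, "edge-madrid": 330, "region": 9000}
print(operations_after(edge_log, client_frontier))
# ['edge-madrid:331', 'region:9001']
```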
Operations are idempotent:
if op_id already seen:
ignore duplicate
else:
validate and apply
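In code, the duplicate check sits in front of validation and application. The sketch below uses a hypothetical `NoteReplica` with a trivial non-empty-body check standing in for real validation.

```python
class NoteReplica:
    def __init__(self):
        self.seen = set()    # op_ids already applied
        self.notes = {}      # note_id -> body

    def deliver(self, op_id: str, note_id: str, body: str) -> str:
        if op_id in self.seen:
            return "duplicate ignored"
        if not body:                 # stand-in for real validation
            return "rejected"
        self.seen.add(op_id)
        self.notes[note_id] = body
        return "applied"


replica = NoteReplica()
print(replica.deliver("phone:101", "N101", "Filter cracked"))  # applied
print(replica.deliver("phone:101", "N101", "Filter cracked"))  # duplicate ignored
```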
Dependencies are explicit:
reply operation:
depends_on: comment C44
close operation:
depends_on: checklist frontier and authority decision
If dependencies are missing, the replica buffers or hides the operation:
receive reply before parent:
buffer reply
request parent
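Dependency buffering can be sketched as follows. The example assumes dependencies are named by operation ID and that "request parent" simply appends to a list some other component would act on; `BufferingReplica` is an invented name.

```python
class BufferingReplica:
    def __init__(self):
        self.applied = {}     # op_id -> dependencies
        self.buffered = {}    # op_id -> dependencies still waiting
        self.requested = []   # parents asked for from peers

    def receive(self, op_id: str, depends_on=()) -> None:
        if op_id in self.applied or op_id in self.buffered:
            return                               # duplicate delivery
        deps = tuple(depends_on)
        missing = [d for d in deps if d not in self.applied]
        if missing:
            self.buffered[op_id] = deps          # hide until parents arrive
            self.requested.extend(missing)
            return
        self.applied[op_id] = deps
        self._drain()

    def _drain(self) -> None:
        # Keep applying buffered operations until no more become ready.
        progress = True
        while progress:
            progress = False
            for op_id, deps in list(self.buffered.items()):
                if all(d in self.applied for d in deps):
                    del self.buffered[op_id]
                    self.applied[op_id] = deps
                    progress = True


replica = BufferingReplica()
replica.receive("reply:R1", depends_on=["comment:C44"])  # arrives first
replica.receive("comment:C44")                           # parent arrives
print(sorted(replica.applied))
```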
Snapshots keep rejoin practical:
snapshot:
state at causal frontier F
old operation log compacted before F
clients resume by fetching changes after F
Compaction is allowed only with a stability signal:
safe tombstone cleanup:
all active replicas past frontier F
old clients forced through snapshot rejoin
repair tools understand compacted form
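The first condition, all active replicas past frontier F, can be checked mechanically. The sketch below assumes each active replica reports a frontier as a map from actor to highest contiguous counter; it does not model forced snapshot rejoin or repair tooling, and the function names are illustrative.

```python
def stable_frontier(replica_frontiers: dict) -> dict:
    """Per-actor minimum across all active replicas."""
    actors = set().union(*replica_frontiers.values())
    return {
        actor: min(f.get(actor, 0) for f in replica_frontiers.values())
        for actor in actors
    }


def can_compact(candidate: dict, replica_frontiers: dict) -> bool:
    """Compaction up to candidate is safe only if every replica is past it."""
    stable = stable_frontier(replica_frontiers)
    return all(counter <= stable.get(actor, 0)
               for actor, counter in candidate.items())


frontiers = {
    "madrid":   {"phone": 140, "region": 9001},
    "dublin":   {"phone": 138, "region": 9001},
    "virginia": {"phone": 140, "region": 9000},
}
print(stable_frontier(frontiers) == {"phone": 138, "region": 9000})  # True
print(can_compact({"phone": 140}, frontiers))                        # False
```

A replica that has never reported an actor counts as zero for that actor, which errs on the side of keeping metadata.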
This turns failure behavior into a contract. Dropped messages, duplicate delivery, temporary partitions, and long offline periods are expected, not exceptional.
Observability And Repair
The system needs CRDT-specific operational signals:
frontier lag by replica
outbox age
dependency wait count
rejected operations by reason
quarantined operations
rights exhaustion
conflict registers visible to users
repair operations emitted
snapshot rejoin failures
Debugging starts with facts:
symptom:
tablet shows incident open
web app shows incident closed
fact:
close operation region:9001
tablet:
has region:9000 only
web app:
has region:9001
decision:
expected lag; sync tablet
If the tablet has region:9001 but still shows open, investigate validation, schema, epoch, or projection bugs.
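That triage can be written down as a small helper for a support tool. The sketch assumes the same actor-to-counter frontier shape used in the sync examples; `diagnose` and its return strings are hypothetical.

```python
def diagnose(close_op: str, tablet_frontier: dict,
             tablet_shows_closed: bool) -> str:
    """Separate expected replication lag from a real bug."""
    actor, counter = close_op.rsplit(":", 1)
    has_close_op = tablet_frontier.get(actor, 0) >= int(counter)
    if tablet_shows_closed:
        return "consistent"
    if not has_close_op:
        return "expected lag; sync tablet"
    return "investigate validation, schema, epoch, or projection"


print(diagnose("region:9001", {"region": 9000}, False))
print(diagnose("region:9001", {"region": 9001}, False))
```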
Repairs are operations:
repair region:9100
supersedes: phone:144
reason: operation accepted under expired capability
effect: revoke invalid note visibility
authorized_by: incident-commander-2
Manual edits outside the replicated path are forbidden in the runbook except as last-resort recovery with a follow-up repair operation.
Tests And Evidence
The design is not ready until it has evidence.
Merge laws:
state-based merge is:
commutative
associative
idempotent
Generated histories:
duplicate delivery
message reorder
partition and heal
offline client rejoin
edge accepts local writes during regional outage
snapshot and compaction
old client after migration epoch
stolen device after revocation
Invariant histories:
two regions try to close same incident:
one audit number assigned
two edges consume parts:
total consumption never exceeds allocated rights
stale client removes safety warning:
warning remains visible unless authority removes it
old client writes old tag format after tags-v2:
write is rejected, translated, or quarantined
Operational readiness:
dashboard shows frontier lag
support tool shows operation history
runbook explains missing, rejected, buffered, quarantined, and repaired updates
rollback plan preserves validators after semantic migration
You should be able to build a toy version of this architecture with three in-memory replicas, a few operation types, duplicate delivery, and one authority for closure. That toy system is enough to feel the core lesson: most work can be local, but not all correctness can be merged.
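As one piece of that evidence, the three merge laws can be checked against randomly generated states. The sketch below uses only the standard library and a simplified OR-set state, a pair of frozensets for adds and removed pairs; a property-testing library would be a natural upgrade. The state shape and function names are assumptions for this example.

```python
import random


def merge(a, b):
    """State-based merge: component-wise union of (adds, removed)."""
    return (a[0] | b[0], a[1] | b[1])


def random_state(rng: random.Random):
    # A few (element, tag) adds, some of which have been removed.
    adds = frozenset(
        (rng.choice("abc"), rng.randrange(5))
        for _ in range(rng.randrange(4))
    )
    removed = frozenset(pair for pair in adds if rng.random() < 0.3)
    return (adds, removed)


def check_merge_laws(trials: int = 200, seed: int = 7) -> bool:
    rng = random.Random(seed)
    for _ in range(trials):
        a, b, c = (random_state(rng) for _ in range(3))
        assert merge(a, b) == merge(b, a)                        # commutative
        assert merge(merge(a, b), c) == merge(a, merge(b, c))    # associative
        assert merge(a, a) == a                                  # idempotent
    return True


print(check_merge_laws())
```

Passing these checks shows that replicas converge; it says nothing about domain invariants such as one audit number per incident, which need the invariant histories listed above.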
Capstone Exercise
Design your own mergeable multi-region application.
Choose one:
field maintenance board
collaborative support desk
multi-region project tracker
hospital incident system
warehouse repair workflow
Produce these artifacts:
1. Domain promises
List 5-8 promises users and operators rely on.
2. Field classification
local CRDT
local with allocated rights
authority-routed
derived and repairable
explicit conflict state
3. Operation envelope
op_id, actor, object, field, version, epoch, dependencies, capability, signature.
4. Sync protocol
frontiers, outbox, idempotence, buffering, snapshots.
5. Coordination boundaries
uniqueness, inventory, permissions, safety, closure.
6. Failure behavior
offline, duplicate delivery, stale client, malicious update, old client, compaction.
7. Evidence
property tests, generated histories, dashboards, repair runbooks.
8. User experience
pending authority, visible conflict, rejected update, repaired state.
Then answer the hard question:
Which coordination point would you remove if latency were the only goal,
and what invariant would break if you removed it?
If you can answer that honestly, you understand the track.
Connections
- 010.md supplied invariant confluence, the main test for whether coordination can be avoided.
- 011.md supplied escrow and rights transfer for bounded local decisions.
- 016.md separated atomic visibility from global transactions.
- 018.md supplied offline-first clients and edge replication.
- 020.md supplied observability and repair as production requirements.
- 021.md supplied trust boundaries and malicious-update handling.
- 023.md supplied the review checklist this capstone now applies end to end.
Resources
- [PAPER] Coordination Avoidance in Database Systems
- Focus: Use invariant confluence to justify each local-vs-authority decision.
- [PAPER] Highly Available Transactions: Virtues and Limitations
- Focus: Recheck which transactional guarantees can remain highly available.
- [ARTICLE] Local-first software
- Focus: Connect the architecture to user-facing local responsiveness and collaboration.
- [PAPER] A comprehensive study of Convergent and Commutative Replicated Data Types
- Focus: Revisit CRDT families, merge laws, and metadata trade-offs.
- [BOOK] Designing Data-Intensive Applications
- Focus: Tie replication, transactions, derived data, and operability into one system design.
Key Takeaways
- A mergeable multi-region application is designed field by field, promise by promise, not by applying CRDTs everywhere.
- Coordination avoidance keeps mergeable work local while routing uniqueness, rights, trust, and safety decisions to explicit boundaries.
- Production readiness requires more than convergence: admission control, compatibility, compaction, observability, repair, and user-visible pending states.
- The final design should include evidence: generated histories, invariant tests, dashboards, and runbooks that prove the trade-offs are intentional.