CRDTs and Coordination Avoidance: Topology, Placement, and Locality Trade-Offs
LESSON
CRDTs and Coordination Avoidance: Topology, Placement, and Locality Trade-Offs
Core Insight
Imagine the project board from the previous lesson. Madrid users edit documents all day, Dublin owns many workspace slugs, Lisbon holds spare storage-quota rights, and some mobile clients work offline for hours. The model may be built from good CRDT fields, but the system still has to answer a physical question: where does each piece of authority and replicated state live?
Topology is the shape of replication: which clients, edge nodes, regional replicas, and authorities exchange state with each other. Placement is the decision of where a specific object, shard, right, or authority should live. Locality is the result: whether the operation the user wants can be handled nearby or has to cross the network.
The non-obvious point is that CRDTs make more topologies possible, but they do not make topology irrelevant. If every replica holds every object, convergence is simple to explain and expensive to operate. If every object has one home region, storage is cheaper but offline and remote writes may get slower. If rights and authorities are placed far from demand, the slow path becomes the normal path.
The trade-off is locality versus control surface. More local replicas can keep users moving through partitions, but they increase metadata, fanout, privacy exposure, and repair work. Fewer replicas are easier to govern, but they make coordination boundaries more visible to users.
The Naive Full-Mesh Design
A first topology is often full replication:
Madrid replica <----> Dublin replica
^ ^
| |
v v
Lisbon replica <----> Mobile clients
Every replica eventually exchanges state with every other replica. Every object can be edited anywhere. Anti-entropy repairs missed messages by repeatedly comparing and merging state.
This is attractive for a small system:
good:
local reads and writes
simple mental model
no single required route for mergeable updates
strong offline story
cost:
every replica may store too much data
every update may fan out widely
privacy and residency boundaries get harder
metadata grows at every participant
slow or abandoned replicas complicate cleanup
The design is not wrong. It is just a placement decision. Full replication is strongest when the shared object is small, the team is small, the data is allowed to travel, and low-latency local editing matters more than storage or governance cost.
It becomes painful when a workspace has millions of documents, region-specific compliance rules, or clients that disappear for months while still holding causal metadata.
Placement Questions
Placement starts by separating four things that are easy to blur:
data replica:
stores mergeable state for reads and writes
authority:
decides a non-mergeable promise, such as slug uniqueness
rights holder:
owns bounded capacity, such as storage quota or seats
repair path:
receives enough state to reconcile, compact, or rebuild
The same machine can play several roles, but the roles are different.
Ask these questions for each domain object:
1. Where do writes usually originate?
2. Which operations must be local for the product to feel usable?
3. Which fields have authorities, rights, or workflow gates?
4. Which replicas are allowed to store the data?
5. How quickly must remote replicas converge?
6. What happens when a replica is offline for a day, a month, or forever?
7. Who can compact metadata safely?
These questions turn topology from an infrastructure diagram into a correctness and product decision.
Worked Example: Workspace Placement
Suppose a workspace has most users in Madrid, a few collaborators in Dublin, and occasional mobile edits from Lisbon. The domain model from the previous lesson contains:
documents:
CRDT JSON objects
workspace_slug:
unique human-readable name
storage_quota:
bounded counter rights
members:
safety-sensitive membership set
search_index:
derived view
A reasonable placement could be:
Madrid:
primary data replica for workspace documents
most document editing happens here
holds most storage quota rights
Dublin:
replica for active shared documents
authority for slugs whose hash maps to Dublin
small quota reserve for local uploads
Lisbon mobile clients:
partial replicas for documents the user opened
can edit cached documents offline
do not hold global slug authority
Background repair service:
receives durable deltas
rebuilds search
helps compact metadata after safe boundaries
Now test three operations.
Madrid user edits a document:
local CRDT update
sync to Dublin and repair service later
Madrid user claims a slug owned by Dublin:
route to Dublin authority
if Dublin is unreachable, expose pending/retry
Lisbon mobile user uploads a large file:
allowed only if the client or nearby region has quota rights
otherwise wait, fail, or request rights
The same application has three locality stories. Document edits are local. Slug claims are local only at the authority. Quota-consuming writes are local only while rights are nearby.
That is the point of placement: the topology should match the promise of each operation, not the average shape of the database.
Topology Patterns
Here are common patterns and the pressure each one handles.
Home Region With Remote Replicas
home region:
authoritative durable copy
remote regions:
cache or partial replica
This fits objects with a natural owner or tenant home. Most writes are fast near the owner. Remote users may still read and sometimes write locally if the operation is mergeable.
The trade-off is remote latency and failover complexity. If the home is down, the system must know which operations can continue and which must wait.
Sharded Authority
authority_for(value) = shard(value)
This fits uniqueness, leases, and workflow gates. Each value has one decision point, so the whole system does not coordinate for every claim.
The trade-off is hot spots and routing. Popular values, tenants, or regions can overload the authority that owns them.
Rights Near Demand
Madrid quota rights: 700 GB
Dublin quota rights: 200 GB
Lisbon quota rights: 100 GB
This fits bounded counters and escrow. Put rights where consumption usually happens.
The trade-off is rebalancing. Demand shifts, and a region can block while another region holds unused rights.
Partial Replication
client stores:
documents recently opened by this user
causal metadata needed for those documents
not the whole workspace
This fits mobile and edge clients. Users get offline behavior without storing the whole organization on every device.
The trade-off is selective sync complexity. The system must define what a client is allowed to forget, how it catches up, and whether missing data changes what the user can safely do.
Hub-And-Spoke Anti-Entropy
clients -> regional hub -> durable repair service -> other hubs
This reduces fanout. Clients sync with a nearby hub, and hubs take responsibility for durable dissemination.
The trade-off is hub dependency. If the hub is unavailable, clients need another route or a clear offline mode.
Locality Is Per Operation
It is tempting to say "this application is local-first" or "this service is multi-region active-active." Those phrases are too coarse.
A better description is per operation:
edit document text:
local on any replica with the document
reserve workspace slug:
local only at slug authority
consume quota:
local only where quota rights exist
remove compromised member:
route through safety authority or epoch change
search documents:
served from derived index that may lag
compact tombstones:
only after the cleanup boundary is known
This list is more useful than a topology diagram alone. It tells product, operations, and engineering what each user action does during a partition.
Failure Modes
- Replicating everything everywhere: Easy to reason about at first, but expensive for storage, privacy, metadata, and cleanup.
- Putting authorities far from demand: The slow path becomes the common path, and users feel coordination latency constantly.
- Treating caches as replicas: A cache can forget data. A replica must preserve enough state to merge correctly.
- Ignoring data residency: A mergeable object can still be illegal or unacceptable to store in every region.
- No repair route: Local writes need a durable path to reach the rest of the system eventually.
- Static rights allocation: Rights placed for last month's traffic can block this month's traffic.
- Partial replication without clear limits: A client that only has half the object must know which operations are safe.
- Compaction before reachability is known: Deleting tombstones or causal metadata too early can allow removed data to reappear.
Practice
Design topology for the replicated workspace from the previous lesson.
regions:
Madrid: 80 percent of edits
Dublin: slug authority for many workspace names
Lisbon: mostly mobile/offline use
objects:
CRDT documents
unique workspace slug
storage quota
member removals
search index
Fill this table:
operation -> local where? -> slow path -> convergence target -> failure behavior
Use these operations:
1. Edit document text.
2. Claim a workspace slug.
3. Upload a file that consumes quota.
4. Remove a compromised member.
5. Search for recently edited documents.
6. Compact old tombstones.
The useful answer is not "use CRDTs everywhere." The useful answer names where each operation is local, what has to be routed, and what can safely lag.
Connections
005.mdintroduced anti-entropy and delta-state synchronization; topology decides who exchanges those deltas with whom.011.mdshowed rights placement for bounded counters; this lesson generalizes placement across objects, authorities, and repair paths.015.mdcontinues with metadata growth and compaction, where topology determines when it is safe to discard causal information.
Resources
- [PAPER] Dynamo: Amazon's Highly Available Key-value Store
- Focus: Study how placement, replication, sloppy quorums, hinted handoff, and anti-entropy shape availability and convergence.
- [PAPER] A comprehensive study of Convergent and Commutative Replicated Data Types
- Focus: Revisit the CRDT merge contract, then ask where each replica and merge path should live.
- [PAPER] Delta State Replicated Data Types
- Focus: Connect delta propagation to practical anti-entropy topology and bandwidth costs.
- [ARTICLE] Local-first software
- Focus: Use the local-first frame to think about client replicas, offline work, and user ownership boundaries.
- [BOOK] Designing Data-Intensive Applications
- Focus: Connect topology choices to replication, partitioning, derived data, and operational trade-offs.
Key Takeaways
- CRDT topology is a design choice about where replicas, authorities, rights, and repair paths live.
- Locality is per operation: a document edit, slug claim, quota-consuming upload, and member removal can have different fast paths.
- Placement should follow the domain promise, traffic shape, data residency, failure behavior, and metadata cleanup needs.
- The best topology makes slow paths explicit instead of pretending every operation can be local everywhere.