CRDTs and Coordination Avoidance: Invariant Confluence, CALM, and Monotonic Programs

LESSON

CRDTs and Coordination Avoidance

010 30 min intermediate

CRDTs and Coordination Avoidance: Invariant Confluence, CALM, and Monotonic Programs

Core Insight

Consider a replicated commerce service with two very different promises. It lets customers add items to wish lists from any device, and it also promises not to sell more units of a limited product than it owns. Both features can use replicated state. Only one is obviously safe to run everywhere without asking another replica first.

The wish list is mostly additive. If Madrid adds camera and Dublin adds tripod, merging the two states gives a larger, still-valid list. The limited-stock promise is different. If there is one unit left and two regions each sell one unit during a partition, each local region may look valid, but the merged state oversells.

CRDTs give deterministic convergence. They do not automatically preserve every application invariant. An invariant is a rule that must stay true, such as "stock sold must not exceed stock available" or "a username maps to at most one account." The central question is whether independently produced valid states can merge into another valid state.

Two related ideas help answer that question. CALM connects coordination-free execution to monotonic programs: programs whose conclusions only grow as more input arrives. Invariant confluence, often shortened to I-confluence, asks whether all independently reachable states that satisfy an invariant can be merged while still satisfying it. The trade-off is direct: the more a rule depends on absence, uniqueness, or global limits, the more likely the system needs coordination, rights allocation, or a redesigned invariant.

The Promise Is The Unit Of Analysis

Coordination avoidance starts by naming the promise, not by choosing a data type.

This field is easy:

wish_list_items:
  add item
  merge by observed-remove set

The user-visible promise might be:

Every item added on any device eventually appears unless it was explicitly removed.

That promise fits mergeable state. The set carries enough causal metadata to distinguish observed removes from concurrent adds. More information generally makes the visible state better, not invalid.

This rule is harder:

stock:
  available = 1
  sold <= available

The invariant is:

The system must not sell more units than exist.

That invariant is not about one replica's local view. It is about the merged global state. A local write that looks safe can become unsafe when merged with another local write.

The practical habit is to write down the invariant before arguing about implementation:

good:
  "Each product has sold_count <= stock_count."

too vague:
  "Inventory should be consistent."

The first sentence can be tested against candidate states. The second cannot.

Monotonic Programs And CALM

A program is monotonic when adding more input cannot retract an output it has already produced. In set terms, more facts may add more conclusions, but they do not invalidate previous conclusions.

A monotonic rule:

if user added item X:
  show item X in the activity feed

More events can add more feed entries. They do not make an earlier feed entry false.

A non-monotonic rule:

if there are no open blockers:
  mark release ready

More information can invalidate the conclusion. A delayed blocker event can arrive after a replica has already marked the release ready. The rule depends on absence: "no blockers." In an asynchronous system, absence is hard to know without a boundary, a quorum, a closed epoch, or another form of coordination.

CALM is the principle that logically monotonic programs can be eventually consistent without coordination, while non-monotonic programs require coordination or some equivalent boundary to be correct under delay and reordering.

This does not mean non-monotonic logic is bad. It means the system must make the point of coordination explicit. For example:

monotonic:
  collect approvals as a grow-only set

non-monotonic:
  decide "no blockers remain"

coordination boundary:
  close the review epoch, then evaluate readiness

The coordination boundary can be a leader decision, a quorum read, a workflow gate, a lease, a finalized epoch, or an allocation protocol. What matters is that the design admits where the non-monotonic decision happens.

Invariant Confluence

Invariant confluence asks a state-centered question:

If two replicas each start from a valid state,
and each independently performs valid operations,
is their merged state always valid?

If the answer is yes, the invariant is compatible with coordination-free execution under that merge and operation set. If the answer is no, there is some pair of independently valid states whose merge violates the invariant.

A simple safe example is a grow-only audit log:

Invariant:
  every record has a valid record id

A adds:
  record r1

B adds:
  record r2

merge:
  {r1, r2}

If each added record has a valid id, the union still satisfies the invariant. The merge does not create an invalid record by combining valid records.

A simple unsafe example is uniqueness:

Invariant:
  username "ana" belongs to at most one account

Replica A:
  create account 101 with username "ana"

Replica B:
  create account 202 with username "ana"

Each replica can locally satisfy the invariant because it sees only one ana. After merge, there are two. The invariant is not preserved by merge. The system needs coordination, preallocated namespaces, a deterministic conflict policy that changes the user-visible promise, or a workflow that repairs the violation after the fact.

The important point is that invariant confluence is about the combination of:

state representation
operations
merge function
invariant

Changing any one of those can change the answer.

Worked Example: Selling The Last Unit

Start with one unit in stock:

stock_count = 1
sold_counter = 0

invariant:
  sold_counter <= stock_count

Two replicas are partitioned. Each sees one unit available.

Replica A:
  sell 1
  sold_counter = {A:1, B:0}
  visible sold = 1
  invariant holds: 1 <= 1

Replica B:
  sell 1
  sold_counter = {A:0, B:1}
  visible sold = 1
  invariant holds: 1 <= 1

Now merge the PN-counter or grow-only sold components:

merge:
  sold_counter = {A:1, B:1}
  visible sold = 2

check:
  2 <= 1 is false

Both local states were valid. The merged state is invalid. That is the counterexample that proves this design is not invariant confluent for the no-oversell invariant.

There are several honest responses:

coordinate every sale:
  preserve the invariant, pay latency and availability cost

weaken the invariant:
  allow oversell, compensate later

allocate rights:
  give each replica a bounded number of local sale rights

change placement:
  route all writes for a product to one authority

The next lesson focuses on the rights-allocation option. Escrow and bounded counters do not make the invariant disappear. They reshape the state so each replica can spend only rights it already owns.

CALM And I-Confluence Together

CALM and invariant confluence look at different levels of the same coordination problem.

CALM is useful when reasoning about program logic:

Does this output only grow as more input arrives?
Does it depend on knowing that something is absent?
Can delayed information retract a decision?

Invariant confluence is useful when reasoning about state and correctness rules:

Can independently valid states merge into an invalid state?
Which invariant fails under which pair of operations?
Would a different merge policy or state decomposition preserve the invariant?

Use both. A program can use CRDTs internally and still make a non-monotonic decision at the edge. A state representation can converge perfectly and still violate a business invariant after merge. The design goal is not "avoid coordination everywhere." The goal is to spend coordination only where the promise requires it.

Operational Failure Modes

Practice

Classify each rule as likely coordination-free, likely coordination-required, or redesignable with rights allocation:

1. Add reactions to a post.
2. Ensure each username is unique.
3. Show every log entry produced by any replica.
4. Sell at most 10 seats for a workshop.
5. Mark a release ready when no blockers remain.

For each answer, name the invariant or monotonicity question you used. Avoid saying only "CRDT" or "not CRDT." The useful answer names the promise, the merge behavior, and where coordination would enter if needed.

Connections

Resources

Key Takeaways

PREVIOUS CRDTs and Coordination Avoidance: Maps, Nested CRDTs, and Schema Evolution NEXT CRDTs and Coordination Avoidance: Escrow, Bounded Counters, and Rights Transfer