CRDTs and Coordination Avoidance: Collaborative Editing, Sequences, and Rich Text

LESSON

CRDTs and Coordination Avoidance

017 30 min intermediate

CRDTs and Coordination Avoidance: Collaborative Editing, Sequences, and Rich Text

Core Insight

Imagine two people editing the same paragraph while one laptop is offline. Ana places the cursor between c and a in cat and types h. Bruno, on another replica, places the cursor between a and t and types r. Both edits feel local and immediate.

If the document is represented as a plain array of characters, both edits may say "insert at index 1" or "insert at index 2" depending on what each user had already seen. After the replicas exchange updates, the system cannot safely replay those indexes against a document that has changed underneath them. Indexes describe a position in one replica's momentary view, not a durable fact that other replicas can merge later.

Collaborative text CRDTs solve this by making order itself mergeable. Characters, embedded objects, or formatting markers get stable identifiers. An insertion says "place this new element after that known element" or "place it at this generated position identifier." A deletion usually marks an element as removed rather than pretending it never existed. Replicas may receive updates in different orders, but they eventually compute the same visible sequence.

The trade-off is that convergence is not the same as perfect user intention. The CRDT can guarantee that replicas agree without a central lock. It still needs careful ordering rules, tombstone or compaction strategy, and rich text semantics that make formatting behave like formatting rather than like accidental character data.

Why Index-Based Editing Breaks

Start with a tiny document:

visible text:
  cat

positions:
  0 c
  1 a
  2 t

Ana sees cat and inserts h after c:

Ana operation:
  insert "h" at index 1

Ana sees:
  chat

Bruno sees the original cat and inserts r after a:

Bruno operation:
  insert "r" at index 2

Bruno sees:
  cart

Now Ana receives Bruno's operation. Should index 2 mean after a, as Bruno intended, or should it mean the third slot in Ana's current text chat?

replay by raw index:
  chat
  insert r at index 2
  chart

maybe acceptable here, but accidental

Try a deletion and the problem becomes sharper:

Ana:
  delete character at index 1  # deletes a

Bruno:
  insert "r" at index 2        # after a, before t

After Ana's deletion, index 2 no longer names the same place. The operation was written in coordinates that expired.

A collaborative editor needs operations that name stable context:

insert "r" after element id A
delete element id A

The visible index can be computed for the user's screen. The replicated operation should refer to identifiers that survive lag, concurrency, and retry.

Sequence CRDTs

A sequence CRDT is a replicated data type for ordered elements. The elements may be characters, paragraphs, list items, image embeds, or formatting boundaries. The core requirement is simple to state and hard to engineer well:

if replicas receive the same set of insert/delete operations,
they must render the same sequence,
even if they received those operations in different orders.

One common family, often called linked-list or RGA-style sequence CRDTs, gives each inserted element a unique identifier:

element:
  id: Madrid:42
  value: "h"
  after: C:1
  visible: true

The document is not "array index 0, 1, 2." It is a set of elements plus order constraints:

HEAD
  C:1 -> "c"
  A:1 -> "a"
  T:1 -> "t"

Ana inserts h after C:1:

H:7 -> "h", after C:1

Bruno inserts r after A:1:

R:3 -> "r", after A:1

When all operations arrive, both replicas can render:

HEAD
  C:1 -> "c"
  H:7 -> "h"
  A:1 -> "a"
  R:3 -> "r"
  T:1 -> "t"

visible:
  chart

What if two users insert at the same place concurrently?

Ana:
  X:9 -> "x", after A:1

Bruno:
  Y:4 -> "y", after A:1

The CRDT needs a deterministic tie-breaker. It may order by actor ID, Lamport timestamp, generated position identifier, or a more careful rule that preserves causal insertion order. The exact choice affects how natural the result feels.

both replicas eventually choose:
  A:1, X:9, Y:4

or:
  A:1, Y:4, X:9

but not one order on Madrid and another on Dublin

Another family, used by Logoot/LSEQ-like designs, gives each element a position identifier drawn from a dense ordered space:

before:
  c at [10]
  a at [20]
  t at [30]

insert h between c and a:
  h at [15]

If two replicas need a position between the same neighbors, the identifier includes enough extra digits and replica identity to stay unique and ordered.

Both families turn "where is the cursor?" into "which stable element or position does this operation reference?" That shift is the small win: local editing can stay fast while merge behavior becomes a data structure problem instead of a global-lock problem.

Deletion, Tombstones, and Compaction

Deletion in a text CRDT is usually not "remove the element from history." It is closer to:

element A:1:
  value: "a"
  visible: false

That hidden element may still be needed because later operations refer to it:

Bruno operation:
  insert "r" after A:1

If Ana deletes A:1 before receiving Bruno's insert, the tombstone lets her place r in the intended neighborhood once the insert arrives.

history:
  C:1 -> "c"
  A:1 -> "a" (deleted)
  R:3 -> "r"
  T:1 -> "t"

visible:
  crt

This is the same theme as earlier lessons on causal context and compaction. Metadata that looks useless to the visible document may still be structural evidence for future merges. Deleting it safely requires a stability signal:

safe to compact tombstone A:1 only when:
  every replica that can send old operations has seen the delete
  or the system has an epoch/snapshot boundary
  or old clients must resync from a canonical snapshot

The trade-off is operational. Tombstones make merges easier and preserve references, but they grow with editing history. Compaction reduces cost, but it needs membership assumptions, causal stability, or a repair path for clients that return from long offline periods.

Rich Text Is More Than Characters

Plain text already has difficult ordering cases. Rich text adds another layer: formatting has structure.

Suppose Ana selects cat and turns it bold:

Ana:
  bold range from C:1 to T:1

At the same time, Bruno inserts s after a:

Bruno:
  insert "s" after A:1

Should the inserted s be bold?

possible renderings:
  cats  where s is bold
  cats  where s is not bold

Neither answer is universally correct. It depends on the user's intention and the editor's model. If Bruno typed inside a bold word, users often expect the new character to inherit bold. If Bruno inserted a comment marker or pasted unformatted text, inheritance may be wrong.

That means rich text cannot be treated as "characters plus CSS after the fact." The formatting model needs merge semantics too.

One practical approach represents formatting as annotations anchored to stable positions:

annotation:
  type: bold
  start: before C:1
  end: after T:1
  behavior_at_edges:
    include_insertions_inside
    exclude_insertions_after_end

Another approach represents rich text as tree-like or JSON-like replicated state:

paragraph:
  children:
    - text run "ca"
    - text run "t" with bold=true
    - mention embed user:42

Now the editor must merge both sequence edits and attribute edits. Concurrent formatting changes need rules:

Ana:
  set bold=true on word

Bruno:
  set bold=false on overlapping range

merge policy:
  keep both as spans and normalize
  or use multi-value conflict state
  or apply a deterministic attribute resolver

This is where coordination avoidance becomes a product decision. A CRDT can make replicas converge, but the application must decide what a meaningful document is. For code, legal contracts, access-control policies, or structured documents, "both edits converged" may be less important than "the resulting document still respects the domain."

Worked Example: Offline Comment Edit

Consider a collaborative issue tracker with rich text comments. A comment body is edited offline on a phone and in a browser.

Initial text:

Ship the cache fix

The underlying sequence uses element IDs:

S:1 h:1 i:1 p:1 SPACE:1 t:1 h:2 e:1 ...

Phone edit:

insert "safe " after "the "

Browser edit:

bold "cache"

The phone does not need to ask the browser for permission. It creates local sequence operations:

P:77 -> "s", after SPACE:2
P:78 -> "a", after P:77
P:79 -> "f", after P:78
P:80 -> "e", after P:79
P:81 -> " ", after P:80

The browser creates a formatting annotation:

B:31:
  bold=true
  range:
    start before c:1
    end after e:2

After synchronization, both replicas can render:

Ship the safe cache fix
         ---- *****

The sequence merge placed the phone's words. The annotation merge kept the browser's formatting attached to stable document positions, not to fragile indexes.

Now change the browser edit:

delete "cache"

When the phone's insertion and browser's deletion merge, the editor may render:

Ship the safe fix

The deleted elements remain as tombstones until compaction is safe. If the bold annotation was attached to the deleted range, the editor must decide whether to drop it, keep it hidden for undo/history, or transform it into a normalized no-op. That is not just storage detail; it affects undo, comments, selections, and user trust.

Failure Modes

Practice

Take this text:

red fox

Represent it as sequence elements:

R:1 e:1 d:1 SPACE:1 f:1 o:1 x:1

Two offline edits happen:

Ana:
  insert "quick " after SPACE:1

Bruno:
  bold "fox"

Sketch:

1. The sequence operations Ana emits.
2. The annotation Bruno emits.
3. The final visible text after both merge.
4. Whether "quick" should inherit any formatting, and why.
5. Which tombstones or anchors must remain if Bruno deletes "fox" instead.

Then answer the coordination question: which part of this design can remain coordination-free, and which part would need an authority if the document were a contract whose clause numbers must be unique and legally stable?

Connections

Resources

Key Takeaways

PREVIOUS CRDTs and Coordination Avoidance: Transactions, Causal Consistency, and Atomic Updates NEXT CRDTs and Coordination Avoidance: Offline-First Clients and Edge Replication