Delivery Semantics: At-Most-Once, At-Least-Once, Exactly-Once

LESSON

Event-Driven and Streaming Systems

Lesson 021 · 30 min · intermediate

Day 265: Delivery Semantics: At-Most-Once, At-Least-Once, Exactly-Once

Delivery semantics are not broker slogans. They are statements about what can be lost, what can repeat, and at which boundary the system is actually making that promise.


Today's "Aha!" Moment

The insight: The difference between at-most-once, at-least-once, and exactly-once is mostly about when a system decides work is done relative to crashes and retries.

Why this matters: Teams often talk about these semantics as if they were product labels attached to a queue or broker. That is how expensive misunderstandings start. The real question is always: at which boundary is the promise made, and what happens if the process crashes or retries right at that boundary?

The universal pattern: the guarantee is created by whether "mark it done" happens before or after the side effect, and by what the system does when it restarts.

Concrete anchor: An order event is consumed from Kafka and used to send a confirmation email. If the consumer marks the offset as done before sending the email, a crash can lose the email forever. If it marks the offset only after sending, a crash can cause the email to be sent twice. The semantics are created by that timing.
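The timing in that anchor can be made concrete with a small, self-contained Python sketch. This is an in-memory simulation, not a real Kafka API: `consume`, the order id, and the crash flag are all illustrative.

```python
class Crash(Exception):
    """Simulated process crash between two steps of the consumer."""

def consume(order, commit_first, crash_once):
    """Process one message; optionally crash once mid-way through."""
    emails = []              # the side effect: confirmation emails actually sent
    committed = False        # has the offset been marked done?
    crash_pending = crash_once
    while not committed:
        try:
            if commit_first:
                committed = True            # 1. mark offset done
                if crash_pending:
                    crash_pending = False
                    raise Crash()           # 2. crash before the side effect
                emails.append(order)        # 3. send the email
            else:
                emails.append(order)        # 1. send the email
                if crash_pending:
                    crash_pending = False
                    raise Crash()           # 2. crash before the commit
                committed = True            # 3. mark offset done
        except Crash:
            if committed:
                break          # offset already committed: never redelivered
            # otherwise the loop continues: the broker redelivers the message
    return emails

# Commit-first (at-most-once): a crash loses the email forever.
assert consume("order-42", commit_first=True, crash_once=True) == []
# Commit-last (at-least-once): a crash causes the email to be sent twice.
assert consume("order-42", commit_first=False, crash_once=True) == ["order-42"] * 2
```

Nothing here is broker-specific: swapping the order of two lines is the entire difference between the two guarantees.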

How to recognize when this applies: any time a component reads a message, performs a side effect, and records progress as separate steps, there is a crash window between those steps.

Common misconceptions: that delivery semantics are a property of the broker alone, and that enabling "exactly-once" on a broker or stream processor automatically covers external side effects such as emails or payments.

Real-world examples:

  1. Metrics pipeline: Duplicate increments may be acceptable, so at-least-once plus aggregation tolerance is often enough.
  2. Billing or email delivery: Reprocessing may be expensive or user-visible, so idempotency keys or transactional boundaries become much more important.

Why This Matters

The problem: Delivery semantics are where nice diagrams meet crash reality. Messages can be read, processed, retried, rebalanced, re-sent, or committed at awkward moments. If the system boundary is vague, teams think they bought stronger guarantees than they actually have.

Before: the team treats "exactly-once" as a broker feature, assumes the whole pipeline inherits it, and cannot say which boundary a given guarantee actually covers.

After: the team names the exact boundary each guarantee covers and adds idempotency or deduplication wherever the guarantee stops.

Real-world impact: This avoids lost work, reduces duplicate side effects, and makes incident response much faster because the team can say exactly which boundary was guaranteed and which one was not.


Learning Objectives

By the end of this session, you will be able to:

  1. Explain what delivery semantics actually describe - Understand them as crash and commit contracts, not marketing labels.
  2. Describe how at-most-once and at-least-once are created mechanically - Reason from ack and commit timing to loss or duplication outcomes.
  3. Evaluate what exactly-once really means in practice - Distinguish bounded transactional guarantees from true end-to-end business effects.

Core Concepts Explained

Concept 1: Delivery Semantics Are About the "Done" Boundary

The key question is not:

"Does the broker deliver each message exactly once?"

The key question is:

"At what point does this system declare the work done, and what happens if it crashes just before or just after that point?"
That declaration may happen at several different layers:

  1. the broker acknowledging a producer's write
  2. the consumer committing an offset
  3. a state store or database applying an update
  4. the final external side effect, such as an email, a charge, or an API call

Those are not the same boundary.

This is why delivery semantics are tricky. A pipeline can be:

  1. exactly-once inside a transactional Kafka-to-Kafka loop
  2. at-least-once with respect to an external API it calls
  3. effectively at-most-once wherever a commit lands before the work

all at the same time.

So the mature mental model is: every guarantee is scoped to a boundary, and the boundary must be named before the guarantee means anything.

Whenever someone says "this is exactly-once," the immediate follow-up should be: "exactly-once with respect to which boundary?"

If the answer is vague, the guarantee is probably being overstated.

Concept 2: At-Most-Once and At-Least-Once Come From Commit Timing

The cleanest way to understand these semantics is to imagine one consumer processing one message.

At-most-once

The consumer acknowledges or commits first, then does the work.

Shape:

  1. read message
  2. mark it done
  3. process side effect

If the process crashes after step 2 but before step 3:

the message is already marked done, so it is never redelivered, and the side effect simply never happens.

So at-most-once means:

each message's side effect happens zero or one times: work can be lost, but it is never duplicated.

This is appropriate when: occasional loss is acceptable and a duplicate would be worse than a gap, for example best-effort notifications or low-value telemetry.

At-least-once

The consumer does the work first, then acknowledges or commits.

Shape:

  1. read message
  2. perform side effect
  3. mark it done

If the process crashes after step 2 but before step 3:

the side effect has already happened, but the message is still marked unfinished, so it will be redelivered and processed again.

So at-least-once means:

each message's side effect happens one or more times: nothing is lost, but duplicates are possible.

This is often the default practical choice because losing data is usually worse than replaying it. But it only works cleanly when downstream processing is:

idempotent, or protected by deduplication, so that a replay does not produce a second visible effect.

That is why at-least-once is not "safe by itself." It pushes responsibility onto the consumer boundary.
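Because at-least-once pushes responsibility onto the consumer, the usual fix is an idempotency key checked before the side effect. Here is a minimal in-memory sketch; a real system would keep `processed_ids` in a durable store, and the names are illustrative, not a library API:

```python
processed_ids = set()   # in production: a durable dedup store (DB table, etc.)
emails_sent = []        # the non-idempotent side effect we want to protect

def handle(message_id, payload):
    """Process a possibly-redelivered message exactly one visible time."""
    if message_id in processed_ids:
        return "skipped-duplicate"     # replay after a crash or rebalance
    emails_sent.append(payload)        # perform the side effect
    processed_ids.add(message_id)      # then record completion
    return "processed"

assert handle("msg-1", "confirm order 42") == "processed"
# The broker redelivers msg-1 after a crash between side effect and commit:
assert handle("msg-1", "confirm order 42") == "skipped-duplicate"
assert emails_sent == ["confirm order 42"]
```

Note the remaining crash window between performing the side effect and recording its id; closing it completely requires making those two steps atomic, which is exactly the transactional-boundary idea in the next concept.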

Concept 3: Exactly-Once Is Usually a Bounded Transactional Guarantee

Exactly-once sounds absolute, but in practice it usually means: exactly-once processing within a bounded transactional scope, typically a consume-process-produce loop in which state updates, output writes, and offset commits succeed or fail together.

In Kafka-style systems, this commonly involves some combination of:

  1. idempotent producers that suppress broker-side duplicates caused by retries
  2. transactions that write output records and consumer offsets atomically
  3. consumers configured to read only committed transactional data

That is powerful, but it is not magic.

If your pipeline consumes from Kafka and writes back to Kafka transactionally, you can get a strong guarantee inside that loop.

But if the consumer also:

  1. sends an email
  2. calls an external HTTP API
  3. writes to a datastore that is not part of the transaction

then that external side effect is usually outside the broker's transactional boundary.
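This boundary can be sketched with a pure-Python simulation (no real Kafka APIs; `process`, the topic list, and the crash flag are illustrative). The "transaction" atomically applies the state update, the output write, and the offset commit, while the email call sits outside it:

```python
state, output_topic = {}, []
committed_offsets, emails = set(), []

def process(offset, order, crash_after_email=False):
    """Consume one record; the email is external, the rest commits atomically."""
    if offset in committed_offsets:
        return                                  # transaction already covered it
    emails.append(order)                        # external side effect: NOT in txn
    if crash_after_email:
        raise RuntimeError("simulated crash")   # replay will repeat the email
    # Atomic commit: all three become visible together, or none of them do.
    state[order] = "confirmed"
    output_topic.append(f"confirmed:{order}")
    committed_offsets.add(offset)

try:
    process(0, "order-42", crash_after_email=True)
except RuntimeError:
    pass                                        # the process restarts
process(0, "order-42")                          # the broker replays offset 0

assert emails == ["order-42", "order-42"]       # duplicated: outside the boundary
assert output_topic == ["confirmed:order-42"]   # exactly once: inside the boundary
```

The replay is handled perfectly inside the transactional scope and badly outside it, which is the whole point: the guarantee is real, but bounded.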

So the real lesson is: exactly-once is a scoped guarantee, and the scope has to be stated explicitly: exactly-once between which read and which write?

This is why strong event systems still rely heavily on: idempotency keys, deduplication stores, and delivery records that external side effects can check before acting.

And it sets up the next lessons naturally: stateful stream processing, where state updates, outputs, and progress tracking can share a single, tighter boundary.


Troubleshooting

Issue: "We enabled exactly-once, but users still received duplicate emails."

Why it happens / is confusing: The team assumed the broker or stream processor's transactional guarantee extended to the email provider.

Clarification / Fix: Treat the external call as a separate side-effect boundary. Use idempotency keys, deduplication, or a delivery record the email layer can check safely.

Issue: "We never see duplicates, but sometimes events seem to disappear."

Why it happens / is confusing: Offsets or acknowledgements are being recorded before the real work is durably finished.

Clarification / Fix: Check whether the consumer is committing too early. That usually means you are operating in at-most-once territory, whether you intended to or not.

Issue: "After a rebalance or crash, some records are processed again."

Why it happens / is confusing: Teams interpret replay as broker failure rather than normal recovery behavior.

Clarification / Fix: This is standard at-least-once behavior when commit and processing are separated by a crash window. Make the consumer idempotent or narrow the transactional boundary.


Advanced Connections

Connection 1: Delivery Semantics <-> Consumer Groups and Rebalancing

The parallel: The previous lesson showed that consumer groups constantly renegotiate partition ownership. Delivery semantics determine what happens when ownership changes while some work is in flight but not yet committed.

Real-world case: Rebalances expose the gap between "message read" and "message durably completed," which is exactly where duplicates or losses appear.

Connection 2: Delivery Semantics <-> End-to-End Stream Processing

The parallel: Later lessons on state stores, event time, and exactly-once pipelines depend on this one. Stateful stream processors are valuable partly because they can coordinate state updates, output writes, and progress tracking more tightly than ad hoc consumers.

Real-world case: A stream job that atomically updates state and output topic records can provide much stronger behavior than a hand-rolled consumer calling arbitrary external services.



Key Insights

  1. Delivery semantics are scoped promises - Always ask which boundary is actually covered: broker write, consumer commit, state update, or final external side effect.
  2. Crash timing creates the guarantee - At-most-once and at-least-once differ mostly in whether you commit before or after work becomes durable.
  3. Exactly-once is usually bounded, not universal - Strong transactional pipelines exist, but external side effects still usually need idempotency and deduplication.

PREVIOUS: Kafka Consumer Groups and Rebalancing Internals | NEXT: Schema Evolution and Data Contracts in Event Streams
