LESSON

021 30 min intermediate

Day 265: Delivery Semantics: At-Most-Once, At-Least-Once, Exactly-Once

Delivery semantics are not broker slogans. They are statements about what can be lost, what can repeat, and at which boundary the system is actually making that promise.

Today's "Aha!" Moment

The insight: The difference between at-most-once, at-least-once, and exactly-once is mostly about when a system decides work is done relative to crashes and retries.

Why this matters: Teams often talk about these semantics as if they were product labels attached to a queue or broker. That is how expensive misunderstandings start. The real question is always:

when do we acknowledge or commit?
what happens if we crash right before or right after that point?
does the guarantee cover only the broker, the consumer pipeline, or the final side effect too?

The universal pattern:

acknowledge early -> risk loss, avoid duplicates
acknowledge late -> avoid loss, risk duplicates
coordinate processing and commit atomically -> stronger guarantees, but only inside a bounded system

Concrete anchor: An order event is consumed from Kafka and used to send a confirmation email. If the consumer commits the offset before sending the email, a crash between those two steps loses the email for good, because the message is never redelivered. If it commits the offset only after sending, a crash between sending and committing triggers a redelivery, and the email can be sent twice. The semantics are created by that timing.

How to recognize when this applies:

A broker or queue promises reliability, but downstream side effects still duplicate or disappear.
Crashes during processing create arguments about whether the message was "already handled."
Rebalances or retries are exposing gaps between commit timing and real business completion.

Common misconceptions:

[INCORRECT] "Exactly-once means no duplicate side effect can ever happen anywhere."
[INCORRECT] "At-least-once is always safer, so it is always better."
[CORRECT] The truth: Each semantic is a different trade-off between loss, duplication, coordination cost, and scope of guarantee.

Real-world examples:

Metrics pipeline: A small overcount from duplicate increments may be acceptable, so at-least-once delivery feeding aggregations that tolerate repeats is often enough.
Billing or email delivery: Reprocessing may be expensive or user-visible, so idempotency keys or transactional boundaries become much more important.

Why This Matters

The problem: Delivery semantics are where nice diagrams meet crash reality. Messages can be read, processed, retried, rebalanced, re-sent, or committed at awkward moments. If the system boundary is vague, teams think they bought stronger guarantees than they actually have.

Before:

Acknowledgements and offset commits are treated as routine plumbing.
Consumers are called "exactly-once" because the broker supports transactions.
Duplicate side effects appear in production and nobody knows which layer lied.

After:

Delivery semantics are treated as crash-boundary design decisions.
Teams distinguish broker guarantees from end-to-end business guarantees.
Idempotency, transactional writes, and commit timing are chosen deliberately.

Real-world impact: This avoids lost work, reduces duplicate side effects, and makes incident response much faster because the team can say exactly which boundary was guaranteed and which one was not.

Learning Objectives

By the end of this session, you will be able to:

Explain what delivery semantics actually describe - Understand them as crash and commit contracts, not marketing labels.
Describe how at-most-once and at-least-once are created mechanically - Reason from ack and commit timing to loss or duplication outcomes.
Evaluate what exactly-once really means in practice - Distinguish bounded transactional guarantees from true end-to-end business effects.

Core Concepts Explained

Concept 1: Delivery Semantics Are About the "Done" Boundary

The key question is not:

"does the broker store messages durably?"

The key question is:

"when does the system declare this message fully handled?"

That declaration may happen at several different layers:

broker acknowledges producer write
consumer commits offset
business logic updates a database
external side effect happens, such as charging a card or sending an email

Those are not the same boundary.

This is why delivery semantics are tricky. A pipeline can be:

durable at the broker layer
replayable at the consumer layer
still duplicate or lose effects at the business layer

So the mature mental model is:

delivery semantics are scoped promises

Whenever someone says "this is exactly-once," the immediate follow-up should be:

exactly once between which boundaries?

If the answer is vague, the guarantee is probably being overstated.

Concept 2: `At-Most-Once` and `At-Least-Once` Come From Commit Timing

The cleanest way to understand these semantics is to imagine one consumer processing one message.

`At-most-once`

The consumer acknowledges or commits first, then does the work.

Shape:

1. read message
2. mark it done
3. process side effect

If the process crashes after step 2 but before step 3:

the message is considered consumed
the side effect may never happen

So at-most-once means:

a message will not be processed more than once
but it may be processed zero times in reality

This is appropriate when:

occasional loss is acceptable
duplicates are more dangerous than drops
the consumer work is cheap or non-critical
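
The commit-first shape can be sketched as a small in-memory simulation. This is a toy model, not a broker client: `run_at_most_once`, `SimulatedCrash`, and the `crash_on` parameter are hypothetical names introduced for illustration, and the `committed` counter stands in for an offset that survives a restart.

```python
class SimulatedCrash(Exception):
    """Stands in for the consumer process dying mid-message."""


def run_at_most_once(messages, crash_on=None):
    """Commit each offset BEFORE processing; crash_on names the message that kills the worker once."""
    committed = 0          # next offset to read; survives restarts like a stored offset
    effects = []           # side effects that actually happened
    crashed_once = False

    while committed < len(messages):
        try:
            while committed < len(messages):
                msg = messages[committed]
                committed += 1                     # mark it done first
                if msg == crash_on and not crashed_once:
                    crashed_once = True
                    raise SimulatedCrash(msg)      # dies between commit and side effect
                effects.append(msg)                # perform the side effect
        except SimulatedCrash:
            pass  # restart: resume from the committed offset, which already skipped msg
    return effects


effects = run_at_most_once(["order-1", "order-2", "order-3"], crash_on="order-2")
print(effects)  # ['order-1', 'order-3']
```

No message is handled twice, but `order-2` is never handled at all: the restart resumes from an offset that already claims it was done.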

`At-least-once`

The consumer does the work first, then acknowledges or commits.

Shape:

1. read message
2. perform side effect
3. mark it done

If the process crashes after step 2 but before step 3:

the system may retry the message
the side effect may happen again

So at-least-once means:

the work should eventually happen
but duplicates are part of the contract

This is often the default practical choice because losing data is usually worse than replaying it. But it only works cleanly when downstream processing is:

idempotent
deduplicated
or tolerant of repetition

That is why at-least-once is not "safe by itself." It pushes responsibility onto the consumer boundary.
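
A minimal sketch of that responsibility, again as an in-memory toy: `run_at_least_once`, `send_once`, and the `crash_after` parameter are hypothetical names, and the `seen` set is an assumption standing in for a durable record of handled message IDs. The same crash produces a duplicate with a naive handler and no duplicate with an idempotent one.

```python
class SimulatedCrash(Exception):
    """Stands in for the consumer process dying mid-message."""


def run_at_least_once(messages, handler, crash_after=None):
    """Process each message BEFORE committing its offset."""
    committed = 0
    crashed_once = False
    while committed < len(messages):
        try:
            while committed < len(messages):
                msg = messages[committed]
                handler(msg)                       # perform the side effect first
                if msg == crash_after and not crashed_once:
                    crashed_once = True
                    raise SimulatedCrash(msg)      # dies after the effect, before the commit
                committed += 1                     # mark it done last
        except SimulatedCrash:
            pass  # restart: the uncommitted message is read again


messages = ["order-1", "order-2", "order-3"]

# Naive handler: every delivery produces a side effect.
sent_naive = []
run_at_least_once(messages, sent_naive.append, crash_after="order-2")

# Idempotent handler: remembers which message IDs it has already handled.
sent_idempotent = []
seen = set()


def send_once(msg):
    if msg in seen:
        return            # duplicate delivery, skip the side effect
    seen.add(msg)
    sent_idempotent.append(msg)


run_at_least_once(messages, send_once, crash_after="order-2")

print(sent_naive)       # ['order-1', 'order-2', 'order-2', 'order-3']
print(sent_idempotent)  # ['order-1', 'order-2', 'order-3']
```

In a real consumer the `seen` record would have to survive the crash too, which is why deduplication state usually lives in a database rather than in process memory.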

Concept 3: `Exactly-Once` Is Usually a Bounded Transactional Guarantee

Exactly-once sounds absolute, but in practice it usually means:

inside a specific pipeline boundary, the system coordinates writes and progress markers so replays do not produce duplicate logical output

In Kafka-style systems, this commonly involves some combination of:

idempotent producers
transactions
atomic commit of output records and consumed offsets
stateful processing that can roll forward consistently after recovery

That is powerful, but it is not magic.

If your pipeline consumes from Kafka and writes back to Kafka transactionally, you can get a strong guarantee inside that loop.

But if the consumer also:

sends an email
calls a payment gateway
triggers a webhook
invokes a third-party API

then that external side effect is usually outside the broker's transactional boundary.

So the real lesson is:

exactly-once is strongest when the whole workflow participates in the same atomic boundary
once you cross into external systems, you often fall back to at-least-once + idempotency
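
The idea of committing output records and consumed offsets together can be sketched without a broker. This is a toy model under stated assumptions: `ToyLog` and `process_batch` are hypothetical names, and the single `commit` method only imitates atomicity inside one Python process. It is not Kafka's transaction protocol.

```python
class SimulatedCrash(Exception):
    """Stands in for the processor dying mid-transaction."""


class ToyLog:
    """Hypothetical in-memory stand-in for an output topic plus a committed input offset."""

    def __init__(self):
        self.output = []
        self.offset = 0

    def commit(self, pending_output, new_offset):
        # In this toy, both changes become visible together or not at all.
        self.output = self.output + pending_output
        self.offset = new_offset


def process_batch(log, source, batch_size, crash_before_commit=False):
    """Consume-transform-produce for one batch inside one 'transaction'."""
    start = log.offset
    batch = source[start:start + batch_size]
    pending = [record.upper() for record in batch]   # buffered, not yet visible
    if crash_before_commit:
        raise SimulatedCrash()                        # pending output is discarded
    log.commit(pending, start + len(batch))


source = ["a", "b", "c", "d"]
log = ToyLog()

process_batch(log, source, batch_size=2)
try:
    process_batch(log, source, batch_size=2, crash_before_commit=True)
except SimulatedCrash:
    pass                                              # nothing from this attempt was committed
process_batch(log, source, batch_size=2)              # retry from the committed offset

print(log.output)  # ['A', 'B', 'C', 'D']
print(log.offset)  # 4
```

The retry reprocesses `c` and `d`, yet the output contains each record once, because the failed attempt left neither output nor offset behind. An email sent inside `process_batch` would have no such rollback.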

This is why strong event systems still rely heavily on:

idempotency keys
deduplication tables
transactional outbox patterns
carefully designed side-effect boundaries
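
As one illustration of a deduplication table, the sketch below records the message ID and applies the business update in the same database transaction, so a redelivered message cannot be applied twice. The schema, the `apply_credit` function, and the IDs are assumptions made up for the example; SQLite is used only because it ships with Python.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE processed (message_id TEXT PRIMARY KEY)")
conn.execute("CREATE TABLE balances (account TEXT PRIMARY KEY, cents INTEGER NOT NULL)")
conn.execute("INSERT INTO balances VALUES ('acct-1', 0)")
conn.commit()


def apply_credit(conn, message_id, account, cents):
    """Apply a credit once per message_id; return True if applied, False if duplicate."""
    try:
        with conn:  # one transaction: commits on success, rolls back on exception
            conn.execute("INSERT INTO processed (message_id) VALUES (?)", (message_id,))
            conn.execute(
                "UPDATE balances SET cents = cents + ? WHERE account = ?",
                (cents, account),
            )
        return True
    except sqlite3.IntegrityError:
        return False  # primary key violation: this message was already applied


first = apply_credit(conn, "msg-42", "acct-1", 500)
second = apply_credit(conn, "msg-42", "acct-1", 500)   # redelivery of the same message
balance = conn.execute("SELECT cents FROM balances WHERE account = 'acct-1'").fetchone()[0]
print(first, second, balance)  # True False 500
```

The design choice that matters is that the dedup record and the effect share one atomic boundary. Checking the table in one transaction and applying the update in another would reopen the crash window.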

This also sets up the next lessons:

schema contracts define what the stream means
later stream processing lessons explain how stronger semantics are maintained through state, windows, and transactional coordination

Troubleshooting

Issue: "We enabled exactly-once, but users still received duplicate emails."

Why it happens / is confusing: The team assumed that the broker's or stream processor's transactional guarantee extended to the email provider.

Clarification / Fix: Treat the external call as a separate side-effect boundary. Use idempotency keys, deduplication, or a delivery record the email layer can check safely.

Issue: "We never see duplicates, but sometimes events seem to disappear."

Why it happens / is confusing: Offsets or acknowledgements are being recorded before the real work is durably finished.

Clarification / Fix: Check whether the consumer is committing too early. That usually means you are operating in at-most-once territory, whether you intended to or not.
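
A hedged sketch of the corrected ordering follows. The loop takes a duck-typed `consumer` whose method names mirror confluent-kafka's `Consumer.poll()` and `Consumer.commit(message=..., asynchronous=False)`; `FakeConsumer`, `FakeMessage`, and `consume_commit_after_processing` are hypothetical stand-ins so the example runs without a broker. It also assumes automatic offset commits are turned off (`enable.auto.commit` set to false in the consumer configuration), otherwise offsets may be committed independently of this loop.

```python
def consume_commit_after_processing(consumer, handle, max_messages):
    """Poll, process, then commit: the at-least-once ordering."""
    handled = 0
    while handled < max_messages:
        msg = consumer.poll(1.0)
        if msg is None:
            break                      # nothing available; a real loop would keep polling
        if msg.error():
            continue                   # a real consumer would log or raise here
        handle(msg.value())            # do the work first
        consumer.commit(message=msg, asynchronous=False)  # then record progress
        handled += 1
    return handled


class FakeMessage:
    """Hypothetical message with the two accessors the loop uses."""

    def __init__(self, value):
        self._value = value

    def value(self):
        return self._value

    def error(self):
        return None


class FakeConsumer:
    """Hypothetical in-memory consumer that records the order of events."""

    def __init__(self, values):
        self._pending = [FakeMessage(v) for v in values]
        self.events = []

    def poll(self, timeout):
        return self._pending.pop(0) if self._pending else None

    def commit(self, message=None, asynchronous=True):
        self.events.append(("commit", message.value()))


consumer = FakeConsumer([b"a", b"b"])
count = consume_commit_after_processing(
    consumer,
    handle=lambda value: consumer.events.append(("handle", value)),
    max_messages=10,
)
print(consumer.events)
# [('handle', b'a'), ('commit', b'a'), ('handle', b'b'), ('commit', b'b')]
```

Every commit follows its own handle call, so a crash between the two replays the message instead of dropping it. That trades the disappearing events for possible duplicates, which the handler then has to tolerate.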

Issue: "After a rebalance or crash, some records are processed again."

Why it happens / is confusing: Teams interpret replay as broker failure rather than normal recovery behavior.

Clarification / Fix: This is standard at-least-once behavior when commit and processing are separated by a crash window. Make the consumer idempotent, or bring the processing and the commit inside the same transactional boundary so they succeed or fail together.

Advanced Connections

Connection 1: Delivery Semantics <-> Consumer Groups and Rebalancing

The parallel: The previous lesson showed that consumer groups constantly renegotiate partition ownership. Delivery semantics determine what happens when ownership changes while some work is in flight but not yet committed.

Real-world case: Rebalances expose the gap between "message read" and "message durably completed," which is exactly where duplicates or losses appear.

Connection 2: Delivery Semantics <-> End-to-End Stream Processing

The parallel: Later lessons on state stores, event time, and exactly-once pipelines depend on this one. Stateful stream processors are valuable partly because they can coordinate state updates, output writes, and progress tracking more tightly than ad hoc consumers.

Real-world case: A stream job that atomically updates state and output topic records can provide much stronger behavior than a hand-rolled consumer calling arbitrary external services.

Resources

Optional Deepening Resources

[DOCS] Apache Kafka Documentation
- Link: https://kafka.apache.org/documentation/
- Focus: Use it as the main official reference for producer guarantees, consumer commits, transactions, and delivery semantics.
[DOCS] Apache Kafka Documentation: Semantics
- Link: https://kafka.apache.org/documentation/#semantics
- Focus: Read it for the official project framing of producer and consumer delivery guarantees.
[DOCS] Confluent Documentation: Kafka Message Delivery Guarantees
- Link: https://docs.confluent.io/kafka/design/delivery-semantics.html
- Focus: Use it for a practical explanation of at-most-once, at-least-once, and transactional exactly-once in Kafka ecosystems.
[DOCS] Confluent Documentation: Exactly-Once Semantics
- Link: https://docs.confluent.io/platform/current/streams/concepts.html
- Focus: Read the transactions and exactly-once sections to understand where stronger guarantees are real and where they stop.

Key Insights

Delivery semantics are scoped promises - Always ask which boundary is actually covered: broker write, consumer commit, state update, or final external side effect.
Commit timing creates the guarantee - At-most-once and at-least-once differ mostly in whether you commit before or after the work becomes durable, which decides whether a crash in between costs you a lost message or a repeated one.
Exactly-once is usually bounded, not universal - Strong transactional pipelines exist, but external side effects still usually need idempotency and deduplication.
