WebSockets, Server-Sent Events, and Long Polling

HTTP Protocol and Content Delivery

021 25 min intermediate

The core idea: realtime HTTP delivery is a connection-management choice. WebSockets, Server-Sent Events, and long polling all move updates across request boundaries, but they differ in directionality, recovery model, intermediary behavior, and connection cost.

Core Insight

Imagine the shop from the previous lessons now has live order updates. A customer places an order and keeps the confirmation page open. The page should show "paid", then "packed", then "out for delivery" without making the user refresh. Warehouse staff also need a dashboard where they can acknowledge tasks and send changes back immediately. Both screens want freshness, but they do not need the same transport.

The customer page mostly needs server-to-client updates. The browser sends normal HTTP requests for actions, and the server pushes status events down to the page. The warehouse dashboard is more interactive: the client and server both send messages frequently, and latency matters in both directions. A single "realtime" label hides two different shapes of traffic.

This is where WebSockets, Server-Sent Events, and long polling fit. A WebSocket upgrades an HTTP connection into a long-lived, bidirectional message channel. Server-Sent Events, usually called SSE, keeps a normal HTTP response open so the server can stream text events to the browser. Long polling repeatedly sends HTTP requests that the server holds until an event is available or a timeout expires.

The trade-off is freshness and interactivity versus connection cost and failure recovery. A long-lived connection can remove repeated request overhead and reduce update latency. It also has to survive proxies, load balancers, mobile network changes, server restarts, heartbeats, backpressure, and reconnects. The right design starts with what must move, in which direction, and how the client catches up after a disconnect.

The Three Shapes

WebSockets start as HTTP and then switch protocols. In the common HTTP/1.1 path, the client sends an upgrade request:

GET /ws/orders HTTP/1.1
Host: api.shop.test
Connection: Upgrade
Upgrade: websocket
Sec-WebSocket-Key: ...
Sec-WebSocket-Version: 13

If the server accepts, it returns 101 Switching Protocols. After that, the connection carries WebSocket frames rather than ordinary HTTP request-response messages. Both sides can send messages whenever the application protocol allows. This is useful for chat, multiplayer presence, collaborative editing, dashboards with commands, and other bidirectional workflows.
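The server proves it understood the upgrade by echoing a value derived from Sec-WebSocket-Key. The following is a minimal sketch of that derivation as defined in RFC 6455; the function names are illustrative, and a real server would normally leave this step to its WebSocket library.

```python
import base64
import hashlib

# Fixed GUID that RFC 6455 defines for the opening handshake.
WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def websocket_accept(sec_websocket_key: str) -> str:
    """Return the Sec-WebSocket-Accept value for a client's Sec-WebSocket-Key."""
    digest = hashlib.sha1((sec_websocket_key + WS_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


def handshake_response(sec_websocket_key: str) -> str:
    """Build the 101 response head sent when the server accepts the upgrade."""
    return (
        "HTTP/1.1 101 Switching Protocols\r\n"
        "Upgrade: websocket\r\n"
        "Connection: Upgrade\r\n"
        f"Sec-WebSocket-Accept: {websocket_accept(sec_websocket_key)}\r\n"
        "\r\n"
    )
```

A client that receives a different Sec-WebSocket-Accept value treats the handshake as failed, which is one reason a proxy that does not understand the upgrade breaks the connection early rather than silently.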

SSE does not switch away from HTTP in the same way. The client opens a request, and the server returns a response with Content-Type: text/event-stream. The response body stays open:

HTTP/1.1 200 OK
Content-Type: text/event-stream
Cache-Control: no-cache

id: 1042
event: order-status
data: {"orderId":"A123","state":"paid"}

The browser's EventSource API handles reconnects and can send Last-Event-ID when reconnecting. SSE is one-way from server to client, though the client can still use ordinary POST requests for commands. It works well for notifications, feeds, progress updates, and dashboards where server updates dominate.
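The wire format above is simple enough to produce by hand. Here is a small sketch of a formatter, assuming the server builds each event as a string before writing it to the open response; format_sse and heartbeat_comment are hypothetical helper names, not part of any framework.

```python
from typing import Optional


def format_sse(data: str, event: Optional[str] = None, event_id: Optional[str] = None) -> str:
    """Serialize one event in text/event-stream format."""
    lines = []
    if event_id is not None:
        lines.append(f"id: {event_id}")
    if event is not None:
        lines.append(f"event: {event}")
    # A payload containing newlines needs one "data:" field per line.
    for part in data.split("\n"):
        lines.append(f"data: {part}")
    # A blank line terminates the event and tells the browser to dispatch it.
    return "\n".join(lines) + "\n\n"


def heartbeat_comment() -> str:
    """Lines starting with ':' are comments; EventSource ignores them."""
    return ": heartbeat\n\n"
```

Calling format_sse('{"orderId":"A123","state":"paid"}', event="order-status", event_id="1042") yields the three field lines shown in the response above, followed by the terminating blank line.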

Long polling is the most conservative shape. The client asks for updates:

GET /orders/A123/events?after=1041

If the server has an event, it responds immediately. If not, it holds the request open for a bounded time. When the response arrives or times out, the client immediately sends another request with the latest known cursor. Long polling uses ordinary HTTP request-response behavior, so it works through many intermediaries, but it creates more request churn and needs careful timeout handling.
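The "hold until an event or a timeout" behavior can be sketched with a condition variable. This is an in-memory illustration only, assuming integer event ids that increase over time; EventLog and its methods are hypothetical names, and a production server would more likely use an async framework and a shared event store.

```python
import threading


class EventLog:
    """In-memory event log that lets a request wait for events after a cursor."""

    def __init__(self):
        self._events = []  # list of (event_id, payload), ids strictly increasing
        self._cond = threading.Condition()

    def append(self, event_id, payload):
        with self._cond:
            self._events.append((event_id, payload))
            self._cond.notify_all()  # wake any held long-poll requests

    def _after(self, cursor):
        return [event for event in self._events if event[0] > cursor]

    def wait_for_events(self, cursor, timeout):
        """Return events newer than cursor, holding for up to timeout seconds."""
        with self._cond:
            # Returns at once if events already exist, otherwise blocks
            # until append() notifies or the timeout expires.
            self._cond.wait_for(lambda: bool(self._after(cursor)), timeout=timeout)
            return self._after(cursor)
```

An empty list is the timeout case: the handler responds with no events and the client re-issues the request with the same cursor.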

The useful first question is not "which is more modern?" It is a short set of design questions:

Do clients need to send frequent low-latency messages, or mostly receive updates?
How many concurrent clients will hold connections?
What cursor lets a client recover missed events?
Which proxies and load balancers sit between client and server?

The Durable Update Loop

All three options need the same durable loop. The transport differs, but the state problem is stable:

connect -> identify last seen event -> receive updates -> acknowledge or store cursor
-> detect silence or close -> reconnect -> resume from cursor

The cursor is the important part. It may be an event id, sequence number, timestamp plus tie-breaker, stream offset, or database version. Without a cursor, reconnect means "start somewhere and hope." With a cursor, reconnect means "send me everything after event 1042."
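On the client side, the cursor also makes replay safe: an event at or below the last seen id can be ignored. The sketch below assumes integer event ids and a single order per view; OrderView is a hypothetical name used only for illustration.

```python
class OrderView:
    """Client-side state for one order, guarded by an event cursor."""

    def __init__(self, last_seen: int, state: str):
        self.last_seen = last_seen
        self.state = state

    def apply(self, event_id: int, state: str) -> bool:
        """Apply an event once; return False for duplicates or stale replays."""
        if event_id <= self.last_seen:
            return False
        self.state = state
        # Advance the cursor only after the event has been applied.
        self.last_seen = event_id
        return True

    def resume_query(self) -> str:
        """Query string to send on reconnect."""
        return f"after={self.last_seen}"
```

Because duplicates are dropped by id, the server can err on the side of replaying slightly too much after a reconnect without corrupting the view.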

Heartbeats solve a different problem. Many networks fail silently. A phone moves between networks. A corporate proxy drops idle connections. A load balancer closes a connection after an idle timeout. If neither side sends anything, the application may not know the path is dead. A heartbeat is a small periodic message that proves the connection is still usable or causes the client to reconnect when it is not.
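A heartbeat is only useful if the receiver acts on its absence. The following sketch shows one way a client might decide a connection is dead, assuming the server sends something at least every heartbeat_interval seconds; the class name and the two-missed-beats threshold are assumptions for illustration.

```python
import time


class SilenceDetector:
    """Flags a connection as dead after too many missed heartbeats."""

    def __init__(self, heartbeat_interval: float, missed_allowed: int = 2,
                 clock=time.monotonic):
        self._clock = clock  # injectable so the logic can be tested without sleeping
        self._max_silence = heartbeat_interval * missed_allowed
        self._last_seen = clock()

    def on_message(self) -> None:
        """Call for every received message, including heartbeats."""
        self._last_seen = self._clock()

    def is_dead(self) -> bool:
        """True once the silence exceeds the allowed window; caller should reconnect."""
        return self._clock() - self._last_seen > self._max_silence
```

Any traffic counts as proof of life, so real events reset the timer just as heartbeats do.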

Backpressure matters too. If the server produces updates faster than the client can read them, buffers grow. WebSockets make this visible because a single connection can accumulate unsent messages. SSE streams can also back up if the client or network is slow. Long polling limits each response, but the client can still fall behind if it cannot process events. A realtime design needs a policy: drop low-value events, compact state, slow producers, or disconnect clients that cannot keep up.
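One of the policies named above, compacting state, can be sketched as a per-client buffer that keeps only the newest pending event per key. This assumes later events fully supersede earlier ones for the same order, which holds for status updates but not for every event type; CompactingBuffer and max_keys are hypothetical names.

```python
class CompactingBuffer:
    """Per-client outbound buffer that keeps only the latest event per key."""

    def __init__(self, max_keys: int):
        self.max_keys = max_keys
        self._latest = {}  # key -> (event_id, state), in insertion order

    def offer(self, key, event_id, state) -> bool:
        """Queue an event; return False when the client is too far behind."""
        # Drop any older pending event for this key, then re-insert at the end.
        self._latest.pop(key, None)
        self._latest[key] = (event_id, state)
        return len(self._latest) <= self.max_keys

    def drain(self):
        """Return and clear pending events, called when the socket is writable."""
        pending = [(key, *value) for key, value in self._latest.items()]
        self._latest.clear()
        return pending
```

A False return is the signal to apply the harsher policy: disconnect the client and let it resume from its cursor.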

A more precise description is stateful delivery over an unreliable path. The connection gives the illusion of a live pipe. The durable mechanism is the reconnectable event stream behind it.

Worked Path: Order Updates with SSE

The customer order page only needs server-to-client updates, so the team chooses SSE.

At page load, the browser has already fetched the order:

orderId: A123
last event seen: 1041
current state: created

The browser opens:

GET /orders/A123/events?after=1041 HTTP/1.1
Accept: text/event-stream

The server subscribes the request to order A123 and starts streaming:

: heartbeat

id: 1042
event: order-status
data: {"state":"paid"}

id: 1043
event: order-status
data: {"state":"packed"}

The browser records the latest event id after applying each event. If the Wi-Fi drops after 1043, the connection closes. The browser reconnects and includes:

Last-Event-ID: 1043

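The server can now resume from the header value, which takes precedence over the older after=1041 that is still present in the reconnect URL: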

look up events after 1043 for order A123
send 1044 if it exists
otherwise keep stream open

This design handles a normal disconnect because the event id is the recovery boundary. It also gives operators useful signals: connected SSE clients, reconnect rate, heartbeat failures, stream duration, bytes buffered, last event lag, and events replayed after reconnect.
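The server-side resume step can be sketched as two small functions. EventSource reconnects to the original URL, so a reconnect may carry both after=1041 and Last-Event-ID: 1043, and the helper prefers the header. The names are hypothetical, headers are assumed to be already normalized to this exact casing, and falling back to 0 for an unparseable cursor is an assumption rather than a rule.

```python
from typing import List, Mapping, Tuple

Event = Tuple[int, str]  # (event id, state)


def resume_cursor(headers: Mapping[str, str], query: Mapping[str, str]) -> int:
    """Pick the replay cursor, preferring Last-Event-ID over the after parameter."""
    raw = headers.get("Last-Event-ID") or query.get("after") or "0"
    try:
        return int(raw)
    except ValueError:
        # Assumption: an unparseable cursor means replay everything still stored.
        return 0


def replay(events: List[Event], cursor: int) -> List[Event]:
    """Return stored events strictly after the cursor, in order."""
    return [event for event in events if event[0] > cursor]
```

If replay returns nothing, the handler keeps the stream open and waits for event 1044, exactly as in the steps above.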

Now compare the same feature with long polling. The client sends:

GET /orders/A123/events?after=1043

If no event is ready, the server waits up to, say, 25 seconds. If 1044 arrives, it responds:

[
  {"id":1044,"type":"order-status","state":"out_for_delivery"}
]

The client immediately asks again with after=1044. Recovery still works because the cursor is explicit. The cost is repeated request setup and timeout churn. For a small system or old proxy environment, that may be acceptable. For hundreds of thousands of clients, it can become expensive.
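The client half of long polling is a loop that advances the cursor after each response. In this sketch, fetch stands in for the HTTP call to /orders/A123/events?after=<cursor> and is a hypothetical callable supplied by the caller; it is assumed to return a list of event dictionaries, or an empty list when the server's hold time expires.

```python
from typing import Callable, Dict, List


def long_poll(fetch: Callable[[int], List[Dict]],
              apply: Callable[[Dict], None],
              cursor: int,
              max_rounds: int) -> int:
    """Poll repeatedly, applying events and advancing the cursor; return the final cursor."""
    for _ in range(max_rounds):
        # Blocks for up to the server's hold time; [] means "timed out, ask again".
        events = fetch(cursor)
        for event in events:
            apply(event)
            # Advance only after the event is applied, so a crash replays it.
            cursor = max(cursor, event["id"])
    return cursor
```

A real client would also add backoff when fetch raises, so a failing server is not hit with an immediate retry storm.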

Now compare WebSockets. The warehouse dashboard opens one WebSocket and sends messages both ways:

client -> server: subscribe order A123
server -> client: event 1042 paid
client -> server: acknowledge task picked
server -> client: event 1043 packed

This fits bidirectional interaction. It also means the team now owns an application message protocol: message types, authentication lifetime, heartbeats, reconnect behavior, idempotent client commands, and fanout from backend workers to whichever server holds each socket.

Intermediaries Change the Design

Realtime delivery crosses the same edge systems as the rest of HTTP: TLS termination, load balancers, reverse proxies, CDNs, and application gateways. Those intermediaries have timeouts and buffering behavior.

For WebSockets, the proxy must support upgrade and keep the upgraded connection open. It should not buffer an infinite stream or apply normal request body assumptions. Load balancers need a policy for long-lived connections. If one server holds a socket, backend events must reach that server. That often requires a pub/sub layer, stream processor, or shared event bus behind the web tier.

For SSE, proxies should not buffer the response until it is "complete", because the response is intentionally long-lived. Idle timeouts still matter, so heartbeats are common. SSE is text-oriented and server-to-client, which keeps the browser API simple, but it is not the right tool for arbitrary binary bidirectional traffic.

For long polling, ordinary HTTP infrastructure works more naturally, but timeouts must be aligned. If the proxy times out at 30 seconds and the app waits 60 seconds before responding, clients will see artificial failures. The server should hold requests for slightly less than the shortest relevant intermediary timeout, then let the client re-issue.
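The alignment rule reduces to a one-line calculation. This sketch assumes the team knows the idle or response timeout of each hop; the 5-second margin and 1-second floor are illustrative defaults, not recommendations.

```python
from typing import Iterable


def long_poll_hold_seconds(intermediary_timeouts: Iterable[float],
                           margin: float = 5.0,
                           floor: float = 1.0) -> float:
    """Hold time that stays under the shortest timeout on the path."""
    shortest = min(intermediary_timeouts)
    # Never return a non-positive hold time if the margin exceeds the timeout.
    return max(floor, shortest - margin)
```

With a 30-second proxy and a 60-second load balancer, long_poll_hold_seconds([30, 60]) gives 25 seconds, so the server answers before either hop gives up.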

Authentication also changes. A normal HTTP request has a clear start and finish. A long-lived connection may outlive a token refresh window or a permission change. The system needs to decide whether to close connections when auth changes, require periodic re-authentication, or validate permissions per message/event.

Operational Failure Modes

Failure: no replay cursor. Reconnects happen. Without Last-Event-ID, an offset, or another cursor, clients can miss updates or receive duplicates with no clean way to recover.

Failure: idle timeout mismatch. A proxy closes streams after 60 seconds while the server expects to keep them open silently. Heartbeats and aligned timeouts turn silent drops into predictable reconnects.

Failure: WebSocket server affinity without event routing. If events are produced on worker A but the user's socket is held by server B, messages disappear unless there is a shared routing or pub/sub mechanism.
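The routing fix can be illustrated with an in-process stand-in for a pub/sub layer. Broker and WebServer are hypothetical names, and a list plays the role of each socket's outbound queue; a real deployment would put a networked broker or event bus where Broker sits.

```python
from collections import defaultdict


class Broker:
    """In-process stand-in for a shared pub/sub layer."""

    def __init__(self):
        self._subscribers = defaultdict(list)  # topic -> list of callbacks

    def subscribe(self, topic, callback):
        self._subscribers[topic].append(callback)

    def publish(self, topic, event):
        for callback in list(self._subscribers.get(topic, [])):
            callback(event)


class WebServer:
    """Holds sockets and subscribes to the topics its sockets care about."""

    def __init__(self, broker):
        self._broker = broker
        self._outboxes = defaultdict(list)  # order id -> per-socket outboxes

    def attach_socket(self, order_id, outbox):
        if not self._outboxes[order_id]:
            # First socket for this order on this server: subscribe once.
            self._broker.subscribe(order_id, lambda event: self._deliver(order_id, event))
        self._outboxes[order_id].append(outbox)

    def _deliver(self, order_id, event):
        for outbox in self._outboxes[order_id]:
            outbox.append(event)
```

The producer publishes to the broker without knowing which server holds the socket, which is the property that keeps messages from disappearing.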

Failure: treating WebSockets as reliable writes. A client command sent over a socket can be duplicated across reconnects or lost around close. Commands with side effects still need ids, acknowledgements, and idempotency rules.
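A minimal sketch of the id-plus-acknowledgement rule, assuming each client command carries a client-generated command id that is reused on retry. CommandProcessor and the in-memory result table are illustrative; a real system would persist processed ids and expire them.

```python
class CommandProcessor:
    """Applies each command at most once, keyed by a client-supplied command id."""

    def __init__(self):
        self._acks = {}   # command id -> acknowledgement already sent
        self.picked = []  # side effect: tasks marked as picked

    def handle(self, command_id: str, task_id: str) -> dict:
        if command_id in self._acks:
            # Retry after a reconnect: resend the ack, skip the side effect.
            return self._acks[command_id]
        self.picked.append(task_id)
        ack = {"ack": command_id, "status": "ok"}
        self._acks[command_id] = ack
        return ack
```

The client keeps resending a command with the same id until it sees the matching ack, so a command lost around a close is retried and a duplicated one is harmless.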

Failure: unbounded slow clients. Long-lived connections can hide memory growth. Measure output buffers, event lag, dropped clients, and per-connection queue size.

Useful signals include open connections, connection duration, reconnect rate, heartbeat misses, event lag by client, replay count, long-poll timeout ratio, WebSocket close codes, bytes buffered, messages dropped, auth failures on live connections, and load balancer timeout errors.

Design Check

Close the lesson, choose one feature that needs live updates, and answer:

Who sends messages: server only, or both sides?
What is the event cursor?
How does a client reconnect without losing updates?
What is the heartbeat interval?
Which proxy or load-balancer timeout is shortest?
Where do backend events go if the socket is on another server?
Which messages are safe to retry?
What metric proves clients are fresh, not just connected?

If the feature only needs server-to-client notifications, start by considering SSE or long polling before jumping to WebSockets. If the feature needs low-latency bidirectional messages, WebSockets may be right, but the message protocol and recovery path are part of the design.

Connections

The cache invalidation lesson focused on freshness of stored HTTP responses. This lesson focuses on freshness of live state: events keep moving after the initial page or API response has already been delivered.

The next lesson on redirects and edge policy returns to visible HTTP behavior at the edge. Realtime endpoints need that policy too: upgrade paths, stream routes, timeout exceptions, and cache bypass rules should be deliberate rather than accidental.
