HTTP/2 Multiplexing, Flow Control, and HPACK

The core idea: HTTP/2 keeps HTTP semantics but changes the wire shape to frames and streams, trading fewer connections and less HTTP-layer blocking for shared connection fate, flow-control state, and header-compression complexity.

Core Insight

Return to the slow product page from the HTTP/1.1 lesson. The page needs HTML, CSS, JavaScript, a hero image, recommendations, inventory, and a cart summary. Under HTTP/1.1, the browser often opens several connections because one connection has one ordered response path. A slow image body can occupy a lane while small API responses wait for another available lane.

HTTP/2 changes that shape. Instead of treating one connection as one ordered sequence of complete HTTP responses, it splits requests and responses into binary frames and labels those frames with stream IDs. Frames from many streams can be interleaved on one connection:

connection:
  stream 1 HEADERS, DATA
  stream 3 HEADERS
  stream 5 HEADERS, DATA
  stream 3 DATA
  stream 1 DATA

The browser can keep one connection to www.shop.test and still have several HTTP requests in flight. A large image response no longer has to finish before a small cart JSON response can make progress at the HTTP message layer. That is the main practical win: HTTP/2 removes a major HTTP/1.1 head-of-line problem without changing the meaning of GET, status codes, headers, caching, cookies, or authorization.

The trade-off is connection efficiency versus shared fate and stateful control. One connection is easier to reuse and often cheaper than many connections. But now many requests share flow-control windows, header compression state, TCP loss behavior, connection limits, and failure handling. When the connection is healthy, multiplexing feels elegant. When the connection is congested, flow-controlled, or reset, many streams can be affected together.

Same Semantics, Different Wire Shape

HTTP/2 does not replace HTTP's application semantics. A request is still a request. A response is still a response. Cache-Control, ETag, Content-Type, Authorization, Cookie, Range, and status codes still mean what earlier lessons taught. What changes is how those messages are represented on the connection.

HTTP/1.1 sends textual start lines and headers followed by a body:

GET /api/cart-summary HTTP/1.1
Host: www.shop.test
Accept: application/json

HTTP/2 represents the same idea as frames. The request metadata is carried in a HEADERS frame. The body, when present, is carried in one or more DATA frames. The method, scheme, authority, and path become pseudo-headers:

stream 3 HEADERS:
  :method = GET
  :scheme = https
  :authority = www.shop.test
  :path = /api/cart-summary
  accept = application/json
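
To make the mapping concrete, here is a minimal sketch that turns the HTTP/1.1 request text above into an HTTP/2-style header list. The function name `to_http2_headers` and the `scheme` parameter are assumptions made for illustration; a real HTTP/2 library does this internally and then HPACK-encodes the result.

```python
# Headers that describe one HTTP/1.1 connection and are not forwarded in HTTP/2.
CONNECTION_SPECIFIC = {"connection", "keep-alive", "proxy-connection",
                       "transfer-encoding", "upgrade"}

def to_http2_headers(request_text, scheme="https"):
    """Translate an HTTP/1.1 request head into (name, value) pairs.

    Hypothetical helper: the scheme is not present in the HTTP/1.1 text,
    so the caller has to supply it.
    """
    lines = request_text.strip().splitlines()
    method, path, _version = lines[0].split()
    pseudo = [(":method", method), (":scheme", scheme), (":path", path)]
    regular = []
    for line in lines[1:]:
        if not line:
            break  # blank line ends the header section
        name, _, value = line.partition(":")
        name, value = name.strip().lower(), value.strip()
        if name == "host":
            # Host becomes the :authority pseudo-header.
            pseudo.insert(2, (":authority", value))
        elif name in CONNECTION_SPECIFIC:
            continue
        else:
            regular.append((name, value))
    # Pseudo-headers must come before regular header fields.
    return pseudo + regular

request = (
    "GET /api/cart-summary HTTP/1.1\r\n"
    "Host: www.shop.test\r\n"
    "Accept: application/json\r\n"
    "Connection: keep-alive\r\n"
    "\r\n"
)
for name, value in to_http2_headers(request):
    print(f"{name} = {value}")
```

Header names are lowercased, and `Connection: keep-alive` is dropped because connection management belongs to the HTTP/2 connection itself, not to an individual request.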

Each frame belongs either to the connection as a whole or to a stream. A stream is one bidirectional exchange within the connection. A client opens a stream for a request. The server sends frames back on that same stream for the response. When both sides are done, the stream closes, but the connection can remain open for other streams.

This gives you three levels to keep separate:

connection -> the shared transport relationship
stream     -> one HTTP request/response exchange
frame      -> a piece of control data, headers, or body bytes

That separation is the core mechanism. HTTP/2 can interleave frames from different streams because the receiver can use the stream ID on each frame to put the pieces back into the right exchange. The cart response does not have to wait for the image response to complete, as long as the connection and flow-control windows allow its frames to move.
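
The stream ID lives in a fixed 9-byte header at the front of every frame: a 24-bit payload length, an 8-bit type, an 8-bit flags field, and a reserved bit followed by a 31-bit stream identifier. The sketch below parses that header. The helper names are illustrative, and only a few frame types are listed.

```python
import struct
from typing import NamedTuple

# A subset of frame type codes, enough for this lesson.
FRAME_TYPES = {0x0: "DATA", 0x1: "HEADERS", 0x3: "RST_STREAM",
               0x4: "SETTINGS", 0x8: "WINDOW_UPDATE"}

class FrameHeader(NamedTuple):
    length: int      # payload length in bytes
    type_name: str
    flags: int
    stream_id: int   # 0 means the frame applies to the whole connection

def parse_frame_header(raw: bytes) -> FrameHeader:
    if len(raw) < 9:
        raise ValueError("an HTTP/2 frame header is 9 bytes")
    # 24-bit length is split into 8 + 16 bits because struct has no 3-byte type.
    hi, lo, frame_type, flags, stream_word = struct.unpack(">BHBBI", raw[:9])
    length = (hi << 16) | lo
    stream_id = stream_word & 0x7FFFFFFF  # mask off the reserved bit
    name = FRAME_TYPES.get(frame_type, hex(frame_type))
    return FrameHeader(length, name, flags, stream_id)

def build_frame_header(length: int, frame_type: int, flags: int, stream_id: int) -> bytes:
    return struct.pack(">BHBBI", length >> 16, length & 0xFFFF,
                       frame_type, flags, stream_id & 0x7FFFFFFF)

# An 11-byte DATA frame on stream 7 with the END_STREAM flag (0x1) set.
print(parse_frame_header(build_frame_header(11, 0x0, 0x1, 7)))
```

A frame with stream ID 0, such as SETTINGS or a connection-level WINDOW_UPDATE, belongs to the connection; any other ID routes the frame to one exchange.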

Multiplexing in a Product Page

Consider these requests:

S1: GET /product/42.html
S3: GET /assets/app.js
S5: GET /images/hero.jpg
S7: GET /api/cart-summary
S9: GET /api/recommendations?product=42

In HTTP/2, the browser can send HEADERS frames for all of them on one connection:

client -> server:
  HEADERS stream 1  /product/42.html
  HEADERS stream 3  /assets/app.js
  HEADERS stream 5  /images/hero.jpg
  HEADERS stream 7  /api/cart-summary
  HEADERS stream 9  /api/recommendations?product=42

The server can respond as work completes:

server -> client:
  HEADERS stream 1  200 text/html
  DATA    stream 1  first HTML bytes
  HEADERS stream 7  200 application/json
  DATA    stream 7  {"items":3}
  HEADERS stream 5  200 image/jpeg
  DATA    stream 5  image chunk
  DATA    stream 1  more HTML
  DATA    stream 5  image chunk

The small cart response on stream 7 can finish while the image on stream 5 is still transferring. That is what HTTP/1.1 could not do on one connection. The receiver does not get confused because every frame carries the stream identity.
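
A minimal sketch of that reassembly, assuming frames have already been parsed into simple tuples (the tuple layout and the payloads are invented for illustration):

```python
from collections import defaultdict

# (stream_id, frame_type, payload, end_stream) in arrival order.
frames = [
    (1, "DATA", b"<html>", False),
    (7, "DATA", b'{"items":3}', True),
    (5, "DATA", b"JPEG-1", False),
    (1, "DATA", b"</html>", True),
    (5, "DATA", b"JPEG-2", True),
]

def demultiplex(frames):
    """Group interleaved DATA payloads by stream and record completion order."""
    bodies = defaultdict(bytearray)
    finished = []
    for stream_id, frame_type, payload, end_stream in frames:
        if frame_type == "DATA":
            bodies[stream_id] += payload
        if end_stream:
            finished.append(stream_id)
    return {sid: bytes(body) for sid, body in bodies.items()}, finished

bodies, finished = demultiplex(frames)
print(finished)   # order in which streams completed
print(bodies[7])  # the cart body, complete before the image
```

Stream 7 appears first in `finished` even though streams 1 and 5 started sending earlier, which is the behavior the interleaved trace describes.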

This does not mean requests are now independent in every sense. They still share the same underlying connection. If that connection is closed, all active streams are affected. If the TCP connection loses a packet, later bytes on the TCP stream cannot be delivered to the HTTP/2 layer until the missing packet is recovered. The next lesson explains how HTTP/3 and QUIC change that transport-level behavior. For this lesson, the key distinction is: HTTP/2 removes HTTP/1.1 response-order blocking, but not all forms of shared fate.

Flow Control: Backpressure for DATA Frames

Multiplexing creates a new risk. If many streams can send body bytes at once, a fast sender can overwhelm a slow receiver or cause one large response to crowd out memory needed by other streams. HTTP/2 flow control exists to make body transfer explicit and bounded.

Flow control applies to DATA frames. Each stream has a flow-control window, and the connection also has a connection-level window. Sending DATA consumes window credit. Receiving and processing DATA lets the receiver send WINDOW_UPDATE frames to grant more credit.

The rough loop is:

receiver grants window credit
-> sender sends DATA frames
-> DATA consumes stream window and connection window
-> receiver reads/processes bytes
-> receiver sends WINDOW_UPDATE
-> sender can send more DATA

Suppose the image stream has plenty of data to send, but the cart stream is tiny:

stream 5 image: large DATA, consumes window
stream 7 cart: small DATA, should finish quickly

If the implementation schedules frames well and connection-level credit is available, stream 7 can complete quickly. If the receiver stops reading data, or the connection-level window is exhausted, all DATA-bearing streams can stall. A per-stream window problem affects one stream. A connection window problem affects every stream sharing that connection.
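
The difference between the two window levels can be sketched with a small credit-accounting model. This is a simplified illustration, not a real HTTP/2 stack: the class name and the tiny window sizes are assumptions chosen so the stall is easy to see.

```python
class FlowControlledSender:
    """Tracks the credit a sender may spend on DATA frames."""

    def __init__(self, connection_window=65535, initial_stream_window=65535):
        self.connection_window = connection_window
        self.initial_stream_window = initial_stream_window
        self.stream_windows = {}

    def open_stream(self, stream_id):
        self.stream_windows[stream_id] = self.initial_stream_window

    def send_data(self, stream_id, wanted):
        """Send as many bytes as both windows allow; return the count sent."""
        allowed = min(wanted, self.stream_windows[stream_id], self.connection_window)
        allowed = max(0, allowed)
        # DATA consumes credit at both levels.
        self.stream_windows[stream_id] -= allowed
        self.connection_window -= allowed
        return allowed

    def window_update(self, stream_id, increment):
        """Apply a WINDOW_UPDATE; stream 0 means the connection window."""
        if stream_id == 0:
            self.connection_window += increment
        else:
            self.stream_windows[stream_id] += increment

sender = FlowControlledSender(connection_window=100, initial_stream_window=80)
sender.open_stream(5)  # image
sender.open_stream(7)  # cart
print(sender.send_data(5, 1000))  # image takes its whole stream window
print(sender.send_data(7, 30))    # cart is capped by what the connection has left
print(sender.send_data(7, 10))    # connection window is empty: stalled
sender.window_update(0, 50)       # receiver grants connection credit
print(sender.send_data(7, 10))    # cart can finish
print(sender.send_data(5, 10))    # image still stalled on its own stream window
```

After the connection-level update, the cart stream moves again while the image stream stays blocked until it receives its own stream-level WINDOW_UPDATE.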

This is the operational meaning of shared fate. A single slow consumer, mis-sized window, or unread response body can become a connection-level bottleneck. HTTP/2 gives you more concurrency over fewer sockets, but it also gives you more state to observe: active streams, stream windows, connection windows, queued frames, resets, and connection errors.

HPACK: Header Compression with State

HTTP requests repeat many headers. On a product page, every request may include the same authority, user agent, accepted encodings, cookies, and authentication context. Sending those headers in full on every stream wastes bytes. HTTP/2 uses HPACK to compress header fields.

HPACK uses a static table for common header names and values, plus a dynamic table maintained by the connection endpoints. Instead of sending the full text every time, a header block can refer to entries in these tables. That is why repeated headers get cheaper after the connection has seen them.

Plainly:

first request:
  send header values, maybe add entries to dynamic table

later request:
  refer to table indexes instead of sending all bytes again

This is useful, but it is not just a zip file over each request. HPACK is stateful per connection, with a separate dynamic table for each direction. The encoder and decoder have to agree about the contents of that table at every step. Table size limits matter. Memory matters. If sensitive values such as Authorization or session cookies are indexed carelessly, an attacker who can observe compressed sizes may be able to learn something about them, so compression behavior can become a security concern. Implementations can mark fields as never indexed, and operators should be careful with header compression around secrets.
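
A toy model of the dynamic table shows where the state, the size limit, and the never-indexed choice come in. This sketch does not produce real HPACK bytes: it ignores the static table lookup and Huffman coding, and the `never_index` policy is an assumption for illustration. The 32-byte per-entry overhead and the first dynamic index of 62 (after the 61 static entries) follow the HPACK specification.

```python
from collections import deque

ENTRY_OVERHEAD = 32       # per-entry accounting overhead defined by HPACK
FIRST_DYNAMIC_INDEX = 62  # indexes 1..61 belong to the static table

class DynamicTable:
    def __init__(self, max_size=4096):
        self.max_size = max_size
        self.size = 0
        self.entries = deque()  # newest entry on the left

    @staticmethod
    def entry_size(name, value):
        return len(name.encode()) + len(value.encode()) + ENTRY_OVERHEAD

    def add(self, name, value):
        needed = self.entry_size(name, value)
        # Evict the oldest entries until the new one fits.
        while self.entries and self.size + needed > self.max_size:
            self.size -= self.entry_size(*self.entries.pop())
        # An entry larger than the whole table empties it and is not stored.
        if needed <= self.max_size:
            self.entries.appendleft((name, value))
            self.size += needed

    def index_of(self, name, value):
        for offset, entry in enumerate(self.entries):
            if entry == (name, value):
                return FIRST_DYNAMIC_INDEX + offset
        return None

def encode(table, headers, never_index=("authorization", "cookie")):
    """Return symbolic instructions instead of wire bytes (illustrative only)."""
    out = []
    for name, value in headers:
        index = table.index_of(name, value)
        if index is not None:
            out.append(("indexed", index))
        elif name in never_index:
            out.append(("literal-never-indexed", name, value))
        else:
            table.add(name, value)
            out.append(("literal-indexed", name, value))
    return out

table = DynamicTable()
headers = [("user-agent", "demo/1.0"), ("authorization", "Bearer secret")]
print(encode(table, headers))  # first request: literals
print(encode(table, headers))  # later request: user-agent is now an index
```

On the second request the user agent shrinks to a table reference, while the authorization value is still sent as a literal because it was never added to the shared table.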

HPACK also changes debugging. A packet capture of HTTP/2 bytes is not as readable as HTTP/1.1 text. You need tooling that understands frames, stream IDs, and HPACK decoding. When a proxy terminates HTTP/2 and forwards HTTP/1.1 upstream, the frontend connection behavior and backend connection behavior may be different. The request may be multiplexed from browser to edge, then serialized or pooled differently from edge to origin.

The lesson is not "HPACK is dangerous." The lesson is that header compression is shared connection state. It saves bandwidth, especially on repeated headers, but it creates a table that has size, security, and observability consequences.

Worked Path: One Connection, Many Streams

Trace the product page under HTTP/2.

The browser negotiates HTTP/2 with the server and receives settings:

SETTINGS_MAX_CONCURRENT_STREAMS = 100
SETTINGS_INITIAL_WINDOW_SIZE    = 65535
SETTINGS_HEADER_TABLE_SIZE      = 4096

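Do not memorize these numbers as universal values. 65535 and 4096 happen to be the initial values the protocol assumes when a peer sends nothing, while the stream limit of 100 is only illustrative; real deployments advertise different values. The point is that the connection begins with policy. The peer announces how many concurrent streams it is willing to handle, how much DATA it will accept on each new stream before it sends a WINDOW_UPDATE, and how much dynamic header table space it allows.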
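
On the wire, a SETTINGS frame payload is a sequence of 6-byte entries, each a 16-bit identifier followed by a 32-bit value. The following sketch decodes such a payload; the function name is illustrative, and the sample payload is built by hand from the values shown above.

```python
import struct

SETTING_NAMES = {
    0x1: "SETTINGS_HEADER_TABLE_SIZE",
    0x2: "SETTINGS_ENABLE_PUSH",
    0x3: "SETTINGS_MAX_CONCURRENT_STREAMS",
    0x4: "SETTINGS_INITIAL_WINDOW_SIZE",
    0x5: "SETTINGS_MAX_FRAME_SIZE",
    0x6: "SETTINGS_MAX_HEADER_LIST_SIZE",
}

def parse_settings_payload(payload: bytes) -> dict:
    if len(payload) % 6 != 0:
        raise ValueError("SETTINGS payload must be a multiple of 6 bytes")
    settings = {}
    for identifier, value in struct.iter_unpack(">HI", payload):
        # Unknown identifiers are kept under a placeholder name here;
        # a real endpoint ignores settings it does not understand.
        name = SETTING_NAMES.get(identifier, f"unknown(0x{identifier:x})")
        settings[name] = value
    return settings

payload = (struct.pack(">HI", 0x3, 100)
           + struct.pack(">HI", 0x4, 65535)
           + struct.pack(">HI", 0x1, 4096))
for name, value in parse_settings_payload(payload).items():
    print(f"{name} = {value}")
```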

The browser opens streams:

stream 1: /product/42.html
stream 3: /assets/app.js
stream 5: /images/hero.jpg
stream 7: /api/cart-summary

The server interleaves responses:

stream 1: HEADERS 200, DATA html
stream 7: HEADERS 200, DATA cart json, END_STREAM
stream 5: HEADERS 200, DATA image chunk
stream 3: HEADERS 200, DATA js chunk
stream 5: DATA image chunk

The cart summary is no longer blocked behind the full image body at the HTTP layer. The app becomes more responsive without opening many parallel TCP connections. That is the success path.

Now add pressure. The JavaScript bundle is large, the image stream is large, and the client is on a weak network. DATA frames consume connection-level window credit. TCP loss delays bytes for every stream on the connection. The browser may hit the server's advertised concurrent stream limit and have to queue additional requests. A proxy may terminate HTTP/2 at the edge and use a small HTTP/1.1 pool upstream, moving the queueing point rather than eliminating it.

The debugging question is:

Is this request waiting for stream capacity,
flow-control credit, TCP recovery, proxy upstream capacity,
or application work?

That question is more useful than "is HTTP/2 enabled?" Enabled protocol support is not the same as healthy multiplexing behavior.
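
The first wait in that question, stream capacity, can be sketched as a small client-side admission queue. The class and method names are hypothetical; real clients implement this inside their connection pool. The one protocol fact used here is that client-initiated stream IDs are odd and only increase.

```python
from collections import deque

class StreamAdmission:
    """Queues requests once the peer's concurrent stream limit is reached."""

    def __init__(self, max_concurrent_streams):
        self.max_concurrent = max_concurrent_streams
        self.next_stream_id = 1   # client streams use odd IDs: 1, 3, 5, ...
        self.active = {}          # stream_id -> path
        self.waiting = deque()    # requests with no stream yet

    def request(self, path):
        """Open a stream if capacity allows; otherwise queue and return None."""
        if len(self.active) >= self.max_concurrent:
            self.waiting.append(path)
            return None
        return self._open(path)

    def _open(self, path):
        stream_id = self.next_stream_id
        self.next_stream_id += 2
        self.active[stream_id] = path
        return stream_id

    def close(self, stream_id):
        """Finish a stream and promote the oldest queued request, if any."""
        del self.active[stream_id]
        if self.waiting:
            return self._open(self.waiting.popleft())
        return None

pool = StreamAdmission(max_concurrent_streams=2)
print(pool.request("/product/42.html"))   # gets stream 1
print(pool.request("/assets/app.js"))     # gets stream 3
print(pool.request("/api/cart-summary"))  # None: waiting for stream capacity
print(pool.close(1))                      # cart request is promoted to stream 5
```

A request that sits in `waiting` has not reached the server at all, so server-side latency metrics will not show the time it spent there. Measuring queue depth and wait time on the client side is what separates this cause from the others.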

Operational Failure Modes

Failure: treating one HTTP/2 connection as unlimited concurrency. Multiplexing is bounded by peer settings, memory, flow-control windows, CPU, and application capacity. MAX_CONCURRENT_STREAMS and server-side admission still matter.

Failure: ignoring connection-level stalls. If the connection flow-control window is exhausted, or the TCP connection is recovering lost bytes, many streams can slow together. Per-request logs may show many independent slow requests, but the shared cause is the connection.

Failure: not reading response bodies. A client that starts many requests but does not consume some bodies can block flow-control progress. In service clients, always close or drain bodies according to the library's rules so streams and connection credit are released.

Failure: assuming priority will save important requests. HTTP/2 has prioritization machinery, but real-world support varies across clients, servers, and intermediaries, and the original priority scheme was deprecated in the revised HTTP/2 specification (RFC 9113). Design critical endpoints and assets so they do not depend on perfect priority behavior.

Failure: unsafe or oversized header compression state. Large dynamic tables consume memory. Indexing sensitive headers can create avoidable risk. Limit table sizes and rely on implementations that handle sensitive fields carefully.

Useful signals include negotiated protocol, active streams per connection, streams queued by MAX_CONCURRENT_STREAMS, flow-control window sizes, WINDOW_UPDATE rates, RST_STREAM counts, connection error codes, header table sizes, time to first byte by stream, and whether the edge-to-origin hop is also HTTP/2. Without those signals, teams often blame the application for latency that is actually in the connection layer.

Connections

The HTTP/1.1 lesson showed why clients opened several connections: one connection could not carry many independent response streams. HTTP/2 changes that by adding stream IDs and frames.

The next lesson explains why HTTP/3 exists even after HTTP/2. HTTP/2 multiplexes at the HTTP layer, but it runs over TCP, where packet loss can still block delivery of later bytes for every stream on the connection. QUIC changes that transport layer.

Close the lesson and reconstruct a single HTTP/2 request from memory: connection settings, stream ID, HEADERS frame, optional DATA frames, flow-control windows, HPACK state, and what metric would reveal a shared connection stall.
