Content Negotiation, Media Types, and Representation Metadata

LESSON

HTTP Protocol and Content Delivery

006 25 min intermediate

The core idea: Content negotiation lets a client and server agree on which representation of a resource is being exchanged, while representation metadata tells every parser, cache, proxy, and debugger how to interpret the bytes safely.

Core Insight

Imagine the checkout system from the previous lessons. After a payment request returns 202 Accepted, the mobile app follows GET /payment-attempts/pay_901 to check progress. An old mobile client expects a small JSON shape. A new web dashboard expects richer JSON with risk signals. A finance export tool wants CSV. A Spanish browser prefers Spanish human-readable messages. All of them are looking at the same payment attempt, but they are not asking for the same representation.

If the server always returns "some JSON" and hopes every caller can cope, the first demo works. The problem appears when an old app receives a new field layout, a cache serves a Spanish response to an English client, a CSV export is returned with application/json, or a client sends a body that the server parses under the wrong assumptions. The bytes may arrive successfully and the status may be 200 OK, but the system can still fail because the receiver does not know what the bytes mean.

HTTP separates the resource from the representation. The resource is the thing being addressed: payment attempt pay_901. A representation is one particular set of bytes that describes the current state of that resource for a particular purpose: JSON version 1, JSON version 2, CSV, HTML, Spanish, English, compressed, or uncompressed. Content negotiation is the conversation that chooses the representation. Representation metadata is the label on the result.

The trade-off is flexibility versus cache and test complexity. Negotiation lets one URI serve different clients cleanly, but every negotiable dimension can multiply variants. If Accept, Accept-Language, and Accept-Encoding all change the response, caches and tests must know those headers matter. The lesson is not "always negotiate everything." It is "when bytes can vary, make the variation explicit."

The Visible Pieces

There are two directions to keep separate.

For a request body, Content-Type says what the client is sending:

PATCH /payment-attempts/pay_901 HTTP/1.1
Content-Type: application/json

{"customer_note":"Please email the receipt"}

The server uses Content-Type to choose a parser. If the client sends JSON but labels it as text/plain, the server may reject it with 415 Unsupported Media Type. If the client omits the header and the server guesses, the happy path may work until one caller sends a body that looks almost like JSON but is not valid for the expected parser. Guessing at protocol boundaries is a future incident.
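As a rough sketch of "choose a parser from the label, never from the bytes", the handler logic could look like the following. The PARSERS table and the parse_request_body function are hypothetical names, and rejecting an unlabeled body with 415 is one strict policy among several a server might adopt.

```python
import json

# Hypothetical parser table: media type -> function that turns bytes into data.
PARSERS = {
    "application/json": lambda raw: json.loads(raw.decode("utf-8")),
}


def parse_request_body(content_type, raw):
    """Return (status, parsed_body). The format is never guessed from the bytes."""
    if not content_type:
        return 415, None  # unlabeled body: refuse rather than sniff (assumed policy)
    # Drop parameters such as "; charset=utf-8" and normalize case.
    media_type = content_type.split(";", 1)[0].strip().lower()
    parser = PARSERS.get(media_type)
    if parser is None:
        return 415, None  # labeled, but not a type this endpoint consumes
    try:
        return 200, parser(raw)
    except ValueError:
        return 400, None  # correct label, malformed content


body = b'{"customer_note":"Please email the receipt"}'
print(parse_request_body("application/json", body))
print(parse_request_body("text/plain", body))
print(parse_request_body(None, body))
```

The same JSON bytes are accepted under application/json and rejected under text/plain, which is the point: the label, not the appearance of the bytes, decides the parser.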

For a response body, Accept says what the client can use:

GET /payment-attempts/pay_901 HTTP/1.1
Accept: application/vnd.shop.payment.v1+json, application/json;q=0.8
Accept-Language: es, en;q=0.7
Accept-Encoding: gzip, br

The client is not commanding the server. It is listing acceptable options, sometimes with quality values (q ranges from 0 to 1 and defaults to 1 when omitted) such as q=0.8 to express relative preference. The server compares those preferences with the representations it can produce. Then the response says what was actually chosen:

HTTP/1.1 200 OK
Content-Type: application/vnd.shop.payment.v1+json; charset=utf-8
Content-Language: es
Content-Encoding: gzip
Vary: Accept, Accept-Language, Accept-Encoding

Content-Type labels the media type of the representation. Content-Language names the natural language of the intended audience. Content-Encoding says that the representation was encoded, commonly compressed, and must be decoded before the media type is interpreted. Vary tells caches which request headers affected the selected response.
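The ordering matters on the receiving side: undo Content-Encoding first, then interpret the result according to Content-Type. A minimal sketch follows, assuming headers arrive as a plain dict with exactly these capitalizations and that only gzip needs to be supported; read_representation is a hypothetical helper.

```python
import gzip
import json


def read_representation(headers, raw):
    """Decode the content coding first, then parse according to the media type."""
    # Step 1: undo the coding named by Content-Encoding, if any.
    encoding = headers.get("Content-Encoding", "identity").lower()
    if encoding == "gzip":
        raw = gzip.decompress(raw)
    elif encoding != "identity":
        raise ValueError(f"unsupported content coding: {encoding}")
    # Step 2: only now interpret the bytes according to Content-Type.
    media_type = headers.get("Content-Type", "").split(";", 1)[0].strip().lower()
    if media_type == "application/json" or media_type.endswith("+json"):
        return json.loads(raw.decode("utf-8"))
    raise ValueError(f"no parser for media type: {media_type!r}")


headers = {
    "Content-Type": "application/vnd.shop.payment.v1+json; charset=utf-8",
    "Content-Encoding": "gzip",
}
wire_bytes = gzip.compress(b'{"id":"pay_901","status":"pending"}')
print(read_representation(headers, wire_bytes))
```

Most HTTP client libraries perform the first step automatically; the sketch only makes the order explicit.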

The important relationship is this:

request preferences -> server variant selection -> response metadata -> parser/cache/debug decision

If any part is missing, a later component has to guess.
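The "server variant selection" step can be sketched as a small function that weighs the Accept header against the media types the server can produce. This is a simplified illustration, not a complete implementation of the HTTP rules: it ignores media type parameters other than q, and the function names are hypothetical.

```python
def parse_accept(header):
    """Split an Accept header into (media_range, q) pairs; q defaults to 1.0."""
    prefs = []
    for item in header.split(","):
        parts = [part.strip() for part in item.split(";")]
        media_range = parts[0].lower()
        if not media_range:
            continue
        q = 1.0
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # unreadable weight: treat the range as not acceptable
        prefs.append((media_range, q))
    return prefs


def specificity(media_range):
    """Rank exact types above type/* above */*."""
    if media_range == "*/*":
        return 0
    return 1 if media_range.endswith("/*") else 2


def matches(media_range, media_type):
    if media_range == "*/*":
        return True
    if media_range.endswith("/*"):
        return media_type.startswith(media_range[:-1])
    return media_range == media_type


def select_variant(accept_header, available):
    """Return the best available media type, or None when nothing is acceptable."""
    prefs = parse_accept(accept_header)
    best, best_q = None, 0.0
    for media_type in available:  # server order breaks ties
        hits = [(specificity(r), q) for r, q in prefs if matches(r, media_type)]
        if not hits:
            continue
        q = max(hits)[1]  # the most specific matching range decides the weight
        if q > best_q:
            best, best_q = media_type, q
    return best


AVAILABLE = [
    "application/vnd.shop.payment.v1+json",
    "application/vnd.shop.payment.v2+json",
    "text/csv",
]
print(select_variant("application/vnd.shop.payment.v1+json, application/json;q=0.8", AVAILABLE))
print(select_variant("application/xml", AVAILABLE))  # None: nothing acceptable
```

Whatever select_variant returns is what the response should then declare in Content-Type, so the selection and the label cannot drift apart.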

Media Types Are Meaning, Not File Extensions

A media type is a protocol-level label such as application/json, text/html, text/csv, or application/problem+json. It is more than a file extension. It tells a recipient which parser and semantic expectations are appropriate.

Consider these two responses:

HTTP/1.1 200 OK
Content-Type: application/json

{"status":"pending","retry_after_seconds":5}

HTTP/1.1 200 OK
Content-Type: text/csv

payment_id,status
pay_901,pending

Both can describe the same resource. They are different representations. A generic JSON client cannot parse the CSV just because the URI is familiar. A spreadsheet import cannot use the JSON without a different parser. A cache may store either representation, but it can reuse one safely only if its cache key includes the request headers that selected the variant.

Custom media types are often used for API versioning:

Accept: application/vnd.shop.payment.v2+json

This can be cleaner than putting every version in the URL when the resource identity is stable but the representation shape evolves. It also has a cost. Every client, gateway, mock, contract test, cache rule, and API document must understand the versioned media type. If most callers cannot set custom Accept headers easily, URL versioning may be more practical. The design decision is not moral. The contract must match the clients and infrastructure that will actually use it.
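A handler that versions through media types needs some way to read the version out of the negotiated type. The sketch below assumes this lesson's naming pattern (application/vnd.shop.payment.vN+json); the regular expression and the requested_version function are illustrative, not part of any standard.

```python
import re

# Assumed naming convention for this lesson's vendor media types.
VERSIONED = re.compile(r"^application/vnd\.shop\.payment\.v(\d+)\+json$")


def requested_version(media_type):
    """Return the integer version in a vendor media type, or None if unversioned."""
    match = VERSIONED.match(media_type.strip().lower())
    return int(match.group(1)) if match else None


print(requested_version("application/vnd.shop.payment.v2+json"))  # 2
print(requested_version("application/json"))                      # None
```

Every gateway, mock, and test that touches these requests needs an equivalent of this parsing step, which is the cost the paragraph above describes.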

The safest APIs make the selected representation obvious in both directions: clients state what they can send and receive, and servers label what they accepted or returned.

Worked Path: One Payment Attempt, Three Clients

Start with one resource:

/payment-attempts/pay_901

Its authoritative state is stored once:

payment_id = pay_901
order_id = 842
status = pending
next_check_after = 2026-06-18T12:00:05Z
risk_review = required

Client A is an old mobile app. It only understands version 1 JSON:

GET /payment-attempts/pay_901 HTTP/1.1
Accept: application/vnd.shop.payment.v1+json
Accept-Language: es
Accept-Encoding: gzip

The server chooses the v1 Spanish JSON representation and returns it gzip-encoded (the body is shown decoded here for readability):

HTTP/1.1 200 OK
Content-Type: application/vnd.shop.payment.v1+json; charset=utf-8
Content-Language: es
Content-Encoding: gzip
Vary: Accept, Accept-Language, Accept-Encoding

{"id":"pay_901","status":"pending","message":"Pago en revisión"}

Client B is the new dashboard. It can use v2 JSON:

GET /payment-attempts/pay_901 HTTP/1.1
Accept: application/vnd.shop.payment.v2+json
Accept-Language: en

The response can include richer fields:

HTTP/1.1 200 OK
Content-Type: application/vnd.shop.payment.v2+json; charset=utf-8
Content-Language: en
Vary: Accept, Accept-Language, Accept-Encoding

{
  "id": "pay_901",
  "status": "pending",
  "next_check_after": "2026-06-18T12:00:05Z",
  "risk_review": "required"
}

Client C is a finance export job:

GET /payment-attempts/pay_901 HTTP/1.1
Accept: text/csv

The same resource can become a CSV representation:

HTTP/1.1 200 OK
Content-Type: text/csv; charset=utf-8
Vary: Accept

payment_id,status,risk_review
pay_901,pending,required

Nothing about this requires three resources. It requires the server to know which variants it can produce, the client to say what it can consume, and the response to label the selected variant accurately.
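To make "one stored state, several representations" concrete, here is a minimal sketch of a renderer that produces the three variants from the worked path. The render function, the constant names, and the English message string are assumptions made for illustration; language selection and compression are left out.

```python
import csv
import io
import json

V1 = "application/vnd.shop.payment.v1+json"
V2 = "application/vnd.shop.payment.v2+json"
CSV = "text/csv"

# The authoritative state, stored once.
STATE = {
    "payment_id": "pay_901",
    "order_id": 842,
    "status": "pending",
    "next_check_after": "2026-06-18T12:00:05Z",
    "risk_review": "required",
}


def render(state, media_type, message=""):
    """Return (Content-Type value, body bytes) for one representation."""
    if media_type == V1:
        doc = {"id": state["payment_id"], "status": state["status"], "message": message}
        return f"{V1}; charset=utf-8", json.dumps(doc).encode("utf-8")
    if media_type == V2:
        doc = {
            "id": state["payment_id"],
            "status": state["status"],
            "next_check_after": state["next_check_after"],
            "risk_review": state["risk_review"],
        }
        return f"{V2}; charset=utf-8", json.dumps(doc).encode("utf-8")
    if media_type == CSV:
        out = io.StringIO()
        writer = csv.writer(out)
        writer.writerow(["payment_id", "status", "risk_review"])
        writer.writerow([state["payment_id"], state["status"], state["risk_review"]])
        return "text/csv; charset=utf-8", out.getvalue().encode("utf-8")
    raise LookupError(f"no renderer for {media_type}")


for chosen in (V1, V2, CSV):
    content_type, body = render(STATE, chosen, message="Payment under review")
    print(content_type)
    print(body.decode("utf-8"))
```

Because each branch returns the label together with the bytes, a handler built this way cannot serialize one shape while announcing another.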

What Happens When Negotiation Fails

Negotiation should fail explicitly when the boundary cannot produce or consume the requested representation.

If a client asks only for XML and the API has no XML representation, the server can return 406 Not Acceptable:

GET /payment-attempts/pay_901 HTTP/1.1
Accept: application/xml

HTTP/1.1 406 Not Acceptable
Content-Type: application/problem+json

{
  "type": "https://api.shop.test/problems/not-acceptable",
  "title": "No available representation matches the Accept header",
  "status": 406,
  "available_types": [
    "application/vnd.shop.payment.v1+json",
    "application/vnd.shop.payment.v2+json",
    "text/csv"
  ]
}

If the client sends a request body with a media type the server does not support, use 415 Unsupported Media Type:

PATCH /payment-attempts/pay_901 HTTP/1.1
Content-Type: application/xml

The corrective action is different. 406 means "I cannot send you any representation you said you accept." 415 means "I cannot understand the representation you sent me." Mixing them weakens client behavior and support diagnostics.
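The two checks can be kept apart in code as well. The following sketch is deliberately simplified (it ignores q-values and type/* ranges), and the set names and negotiation_error function are hypothetical; it only shows that 415 is decided from what was sent and 406 from what was asked for.

```python
# Hypothetical capability sets for the payment-attempt endpoint.
PRODUCIBLE = {
    "application/vnd.shop.payment.v1+json",
    "application/vnd.shop.payment.v2+json",
    "text/csv",
}
CONSUMABLE = {"application/json"}


def base_type(value):
    """Strip parameters such as q or charset and normalize case."""
    return value.split(";", 1)[0].strip().lower()


def negotiation_error(headers, has_body):
    """Return 415, 406, or None when both directions are fine."""
    # 415 is about the request body the client sent.
    if has_body and base_type(headers.get("Content-Type", "")) not in CONSUMABLE:
        return 415
    # 406 is about the response body the client is willing to receive.
    accepted = {base_type(item) for item in headers.get("Accept", "*/*").split(",")}
    if "*/*" not in accepted and not (PRODUCIBLE & accepted):
        return 406
    return None


print(negotiation_error({"Accept": "application/xml"}, has_body=False))       # 406
print(negotiation_error({"Content-Type": "application/xml"}, has_body=True))  # 415
```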

Many real APIs are lenient with Accept: */* or missing Accept, often defaulting to JSON. That is fine if it is documented. The risky path is silent negotiation that changes over time. If version 2 suddenly becomes the default for clients that never opted in, old callers may break even though their status code remains 200.
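One way to keep the default from drifting is to pin it in a single named constant that is documented and changed only deliberately. In this sketch, choosing v1 as the default is an assumption made for illustration, as are the names DEFAULT_MEDIA_TYPE and effective_accept.

```python
# Assumed documented default: callers that never opted in keep receiving v1.
DEFAULT_MEDIA_TYPE = "application/vnd.shop.payment.v1+json"


def effective_accept(accept_header):
    """Map a missing, empty, or wildcard-only Accept to the pinned default."""
    if accept_header is None or accept_header.strip() in ("", "*/*"):
        return DEFAULT_MEDIA_TYPE
    return accept_header


print(effective_accept(None))
print(effective_accept("*/*"))
print(effective_accept("text/csv"))
```

Shipping v2 then does not change what these callers receive unless someone edits the constant, which makes the change visible in review instead of silent in production.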

Vary Is the Cache Contract

The Vary header is where content negotiation meets caching. It tells a cache, "Do not key this response only by URI; these request headers also influenced the representation."

Suppose the first request through a CDN is from the v2 dashboard:

GET /payment-attempts/pay_901
Accept: application/vnd.shop.payment.v2+json

The origin returns v2 JSON but forgets Vary: Accept. The CDN stores a response under the URI alone:

cache key = /payment-attempts/pay_901
stored body = v2 JSON

Now an old mobile client requests:

GET /payment-attempts/pay_901
Accept: application/vnd.shop.payment.v1+json

If the cache key ignores Accept, the mobile app may receive v2 JSON. The bug is not in the JSON parser, the mobile app, or the resource identity. The bug is that the representation varied but the cache was not told which request header caused the variation.
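The failure can be reproduced with a toy cache. This is a deliberately small model, not how any particular CDN works: TinyCache is a hypothetical class, request header names are assumed to be lowercase, and header values are compared as raw strings.

```python
class TinyCache:
    """Toy shared cache that keys on the URI plus the headers named in Vary."""

    def __init__(self):
        self.vary_by_uri = {}  # uri -> tuple of request header names
        self.entries = {}      # cache key -> stored body

    def _key(self, uri, request_headers):
        names = self.vary_by_uri.get(uri, ())
        return (uri,) + tuple(request_headers.get(name, "") for name in names)

    def store(self, uri, request_headers, response_headers, body):
        vary = response_headers.get("Vary", "")
        names = tuple(sorted(n.strip().lower() for n in vary.split(",") if n.strip()))
        self.vary_by_uri[uri] = names
        self.entries[self._key(uri, request_headers)] = body

    def lookup(self, uri, request_headers):
        return self.entries.get(self._key(uri, request_headers))


URI = "/payment-attempts/pay_901"
V1_REQUEST = {"accept": "application/vnd.shop.payment.v1+json"}
V2_REQUEST = {"accept": "application/vnd.shop.payment.v2+json"}

broken = TinyCache()
broken.store(URI, V2_REQUEST, {}, "v2 JSON")  # origin forgot Vary
print(broken.lookup(URI, V1_REQUEST))         # prints "v2 JSON": wrong variant served

fixed = TinyCache()
fixed.store(URI, V2_REQUEST, {"Vary": "Accept"}, "v2 JSON")
print(fixed.lookup(URI, V1_REQUEST))          # prints None: a miss, so the origin is asked
```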

Vary also has a cost. Vary: Accept-Language can multiply cached variants by language. Vary: User-Agent can destroy cache efficiency because user agents are highly variable. A precise Vary makes negotiation safe; an overly broad Vary can make caching nearly useless. That is the core trade-off in operational form.
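The multiplication is easy to underestimate, so a quick count helps. The dimension values below are hypothetical, and the result is a lower bound in practice: caches that compare raw header strings will see more distinct values than the normalized ones listed here.

```python
from math import prod

# Hypothetical negotiable dimensions for one URI.
dimensions = {
    "Accept": ["v1+json", "v2+json", "csv"],
    "Accept-Language": ["en", "es"],
    "Accept-Encoding": ["identity", "gzip", "br"],
}


def variant_count(dims):
    """Number of distinct variants a cache may have to hold for one URI."""
    return prod(len(values) for values in dims.values())


print(variant_count(dimensions))  # 3 * 2 * 3 = 18

dimensions["Accept-Language"].append("fr")
print(variant_count(dimensions))  # 3 * 3 * 3 = 27
```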

The operational signal is usually indirect. You may see parse errors rising only for one mobile version, cache hit rate dropping after adding language negotiation, support tickets where users see the right resource in the wrong language, or dashboards where 200 OK stays healthy while client-side failures increase. Those symptoms point back to the same boundary question: did the response metadata describe the actual representation, and did every shared cache know which request headers mattered?

Failure Modes to Look For

Confusing Accept and Content-Type. Accept describes what the client wants back. Content-Type describes what the sender is actually sending in this message. A request can have both: Content-Type for the request body and Accept for the response body.

Returning bytes without metadata. A response body that "looks like JSON" is still underspecified without Content-Type. Clients may guess correctly for months, then fail when a proxy, browser, SDK, or security policy becomes stricter.

Negotiating versions without contract tests. Versioned media types only help if old and new representations are tested separately. A handler that always serializes the newest object shape while returning a v1 media type is a compatibility bug.

Forgetting Vary at the edge. If a CDN or shared proxy does not know which request headers affect the response, it can serve a valid representation to the wrong client. That is often harder to debug than a simple 500 because every individual component appears to be "working."
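The "newest shape under a v1 label" failure described above is the kind of bug a small contract check can catch. The sketch below takes its field sets from this lesson's example bodies; treating any extra field as a violation is an assumed, fairly strict policy, and contract_violations is a hypothetical helper.

```python
import json

# Field sets taken from the example v1 and v2 bodies in this lesson.
EXPECTED_FIELDS = {
    "application/vnd.shop.payment.v1+json": {"id", "status", "message"},
    "application/vnd.shop.payment.v2+json": {"id", "status", "next_check_after", "risk_review"},
}


def contract_violations(content_type, body):
    """Compare the declared media type with the fields actually serialized."""
    media_type = content_type.split(";", 1)[0].strip().lower()
    expected = EXPECTED_FIELDS.get(media_type)
    if expected is None:
        return [f"unknown media type: {media_type}"]
    actual = set(json.loads(body))
    problems = []
    for name in sorted(expected - actual):
        problems.append(f"missing field: {name}")
    for name in sorted(actual - expected):
        problems.append(f"unexpected field: {name}")
    return problems


# A handler that serializes the v2 shape but labels it v1.
mislabeled = b'{"id":"pay_901","status":"pending","risk_review":"required"}'
print(contract_violations("application/vnd.shop.payment.v1+json; charset=utf-8", mislabeled))
```

Running one such check per supported media type in the test suite keeps old and new representations tested separately.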

Review one endpoint that returns JSON today. Name the resource, list the representations it supports, write the Accept headers two different clients would send, and write the exact Content-Type and Vary headers the response should include. If you cannot write those headers confidently, the representation contract is not finished.

Connections

Status codes explain what outcome the server claims. Representation metadata explains how to interpret the bytes that carry that outcome. A 409 Conflict body, a 202 Accepted status resource, and a successful 200 OK response all still need correct media types.

The next lesson on conditional requests adds freshness evidence. Once the client knows which representation it received, it can use validators such as ETag and Last-Modified to ask whether that representation is still current or whether a write is safe.
