Reverse Proxies, Load Balancers, and Header Trust

LESSON

017 25 min intermediate

The core idea: a proxy or load balancer is not just a pipe in front of an app; it becomes a decision point that chooses an upstream, rewrites connection facts into headers, and creates a trust boundary the backend must handle deliberately.

Core Insight

Imagine a payments API behind a CDN, an edge load balancer, and an internal reverse proxy. A customer reports that checkout sometimes redirects from HTTPS back to HTTP. At the same time, fraud detection says thousands of requests appear to come from the same private IP address. The application code did not suddenly forget how redirects or IP addresses work. It is reading the wrong facts from the wrong boundary.

The browser connected to the public edge. The edge terminated TLS, selected an internal upstream, and forwarded an HTTP request to the application. To preserve facts about the original request, intermediaries added headers such as X-Forwarded-For, X-Forwarded-Proto, and X-Forwarded-Host. The application then used those headers to build redirects, enforce allowlists, log client addresses, and decide whether a request was secure.

That is useful, but dangerous. Headers are ordinary HTTP fields. A client can send a fake X-Forwarded-For unless the trusted proxy removes or overwrites it. A backend can safely trust forwarding headers only when it knows which proxy inserted them, which previous values were discarded, and which network path prevents a public client from reaching the backend without passing through that proxy.

The trade-off is centralized traffic control versus hidden trust coupling. Proxies and load balancers make rollout, routing, failover, TLS termination, compression, and observability easier to centralize. They also split the truth about a request across multiple hops. The app sees a local peer address, a set of headers, and maybe a trace ID. The user's real path lives at the edge. Good HTTP delivery makes that translation explicit instead of hoping every layer interprets headers the same way.

The Pieces in the Path

A reverse proxy is a server that accepts a client request and makes a new request to an upstream service on the client's behalf. The client thinks it is talking to api.shop.test; the proxy may talk to checkout-v7.internal:8080. The proxy can terminate TLS, normalize headers, compress responses, buffer bodies, enforce size limits, and retry or fail over before the application sees anything.

A load balancer is the decision mechanism that chooses where traffic goes. It may operate at the transport level, where it mostly sees connections, or at the HTTP level, where it can inspect hostnames, paths, headers, cookies, and health signals. In real platforms the same component often does both jobs, and a request usually passes through several such components in sequence:

browser
  -> CDN edge
  -> public load balancer
  -> ingress reverse proxy
  -> application instance

Each hop has two views of the request. It sees the peer that connected to it, and it may know something about the original client from a previous hop. For the application instance, the direct peer may be the ingress proxy. That peer address is real, but it is not the user's address. The user's address has to be carried as metadata.

Forwarding headers are that metadata. Common examples are:

X-Forwarded-For: 203.0.113.17, 10.0.4.8
X-Forwarded-Proto: https
X-Forwarded-Host: api.shop.test
Forwarded: for=203.0.113.17;proto=https;host=api.shop.test

X-Forwarded-For usually grows as a comma-separated chain. The leftmost value is often the original client, and later values are proxies that forwarded the request. "Often" is doing important work. This convention is not enough by itself for security. If the internet-facing edge appends to a value supplied by the client instead of replacing it, a fake address can appear at the left of the list.

X-Forwarded-Proto tells the app whether the original client used https or http at the public edge. This matters because the app may build absolute redirects, mark cookies as secure, or reject insecure requests. If TLS terminates before the app, the local connection from proxy to app may be plain HTTP even though the user used HTTPS. The app needs the edge's fact, not the local socket's scheme.

X-Forwarded-Host carries the host requested by the client before internal routing changed it. This matters for canonical URLs and multi-tenant routing. It is also dangerous if used carelessly, because host-derived values can influence links, redirects, password reset URLs, cache keys, and tenant selection.
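One way to keep host-derived values safe is to compare the forwarded host against an explicit allowlist before using it in a link. The following is a minimal sketch, not a framework API: the names ALLOWED_PUBLIC_HOSTS, DEFAULT_HOST, canonical_host, and reset_link are hypothetical, and the second allowlist entry is invented for illustration.

```python
from typing import Optional

# Hypothetical allowlist of public hostnames this service may emit in links.
ALLOWED_PUBLIC_HOSTS = {"api.shop.test", "shop.test"}
DEFAULT_HOST = "api.shop.test"


def canonical_host(forwarded_host: Optional[str], peer_is_trusted: bool) -> str:
    """Return a host that is safe to embed in links and redirects."""
    # Forwarded host from an untrusted peer is plain user input: ignore it.
    if not peer_is_trusted or not forwarded_host:
        return DEFAULT_HOST
    # Normalize case and drop an optional ":port" suffix before comparing.
    # (Assumes hostnames, not bracketed IPv6 literals.)
    host = forwarded_host.strip().lower().split(":", 1)[0]
    # Never reflect an unknown host; fall back to the configured default.
    return host if host in ALLOWED_PUBLIC_HOSTS else DEFAULT_HOST


def reset_link(forwarded_host: Optional[str], peer_is_trusted: bool, token: str) -> str:
    """Build a password reset URL from a validated host only."""
    host = canonical_host(forwarded_host, peer_is_trusted)
    return f"https://{host}/reset?token={token}"


print(reset_link("evil.example", True, "abc"))
# https://api.shop.test/reset?token=abc
```

The design choice here is to fall back to a known host rather than reject the request; a stricter service might return an error instead.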

A useful name for this is the trusted proxy boundary. It is the point where the system decides, "from here inward, forwarding metadata is authoritative because only known infrastructure can set it." Outside that boundary, those same headers are just user input.

What Actually Happens on a Request

Return to the checkout API. A browser sends:

GET /pay HTTP/2
Host: api.shop.test

The public edge sees a TLS connection from 203.0.113.17. It terminates TLS and forwards to the internal ingress. A careful edge policy does three things:

1. remove any incoming X-Forwarded-* headers from the public request
2. set X-Forwarded-For to the observed client IP
3. set X-Forwarded-Proto and X-Forwarded-Host from the public request facts

The forwarded request might become:

GET /pay HTTP/1.1
Host: checkout.internal
X-Forwarded-For: 203.0.113.17
X-Forwarded-Proto: https
X-Forwarded-Host: api.shop.test
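The three-step edge policy above can be sketched as a small header-rewriting function. This is an illustration under stated assumptions, not any particular proxy's configuration: sanitize_at_edge and its parameters are hypothetical names, and headers are modeled as a plain dictionary.

```python
from typing import Dict


def sanitize_at_edge(
    incoming: Dict[str, str],
    client_ip: str,
    scheme: str,
    upstream_host: str,
) -> Dict[str, str]:
    """Build the headers the edge forwards inward for one public request."""
    outgoing: Dict[str, str] = {}
    public_host = ""
    for name, value in incoming.items():
        lower = name.lower()
        # Step 1: drop forwarding headers supplied by the public client.
        if lower.startswith("x-forwarded-") or lower == "forwarded":
            continue
        if lower == "host":
            public_host = value  # remember the host the client asked for
            continue
        outgoing[name] = value
    outgoing["Host"] = upstream_host
    # Step 2: the only client address is the one the edge observed itself.
    outgoing["X-Forwarded-For"] = client_ip
    # Step 3: scheme and host come from the public connection's facts.
    outgoing["X-Forwarded-Proto"] = scheme
    outgoing["X-Forwarded-Host"] = public_host
    return outgoing


forwarded = sanitize_at_edge(
    {"Host": "api.shop.test", "X-Forwarded-For": "127.0.0.1"},
    client_ip="203.0.113.17",
    scheme="https",
    upstream_host="checkout.internal",
)
print(forwarded["X-Forwarded-For"])  # 203.0.113.17
```

The spoofed 127.0.0.1 never reaches the backend because the edge overwrites the header instead of appending to it.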

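The internal ingress then selects an application instance. By convention, a proxy appends the address of the peer it received the request from, not its own address. If the ingress appends the edge's address (10.0.4.8 here), the application may see: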

X-Forwarded-For: 203.0.113.17, 10.0.4.8
X-Forwarded-Proto: https
X-Forwarded-Host: api.shop.test

The application should not simply take "the first IP in X-Forwarded-For" in every environment. It should apply a configured trust rule. A common rule is: trust forwarding headers only when the direct peer is a known proxy subnet; then parse the chain according to the number of trusted hops or trusted CIDR ranges. If the direct peer is not trusted, ignore the headers and use the socket peer address.

That decision turns a vague header into an inspectable mechanism:

direct peer: 10.0.9.12
is direct peer trusted proxy? yes
X-Forwarded-For chain: 203.0.113.17, 10.0.4.8
trusted internal proxy: 10.0.4.8
client address used by app: 203.0.113.17
scheme used by app: https
host used by app: api.shop.test

If the same request arrives directly from the internet to the app port, the decision changes:

direct peer: 198.51.100.44
is direct peer trusted proxy? no
ignore X-Forwarded-* headers
client address used by app: 198.51.100.44

The mechanism is not "read these headers." The mechanism is "convert transport facts observed by a trusted intermediary into application metadata, and reject that metadata when it did not arrive through the trusted path."
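The trust rule described above can be sketched with the standard ipaddress module. This is one possible implementation under assumptions, not a framework's built-in behavior: TRUSTED_PROXIES, is_trusted, and select_client_ip are hypothetical names, and the 10.0.0.0/8 range is assumed to contain only known proxies.

```python
import ipaddress
from typing import Optional

# Assumption: every address in this range is infrastructure we control.
TRUSTED_PROXIES = [ipaddress.ip_network("10.0.0.0/8")]


def is_trusted(addr: str) -> bool:
    """True when addr parses as an IP inside a trusted proxy range."""
    try:
        ip = ipaddress.ip_address(addr)
    except ValueError:
        return False
    return any(ip in net for net in TRUSTED_PROXIES)


def select_client_ip(peer: str, x_forwarded_for: Optional[str]) -> str:
    """Pick the client address the application should act on."""
    # Headers from an untrusted peer are user input: use the socket peer.
    if not is_trusted(peer) or not x_forwarded_for:
        return peer
    hops = [h.strip() for h in x_forwarded_for.split(",") if h.strip()]
    if not hops:
        return peer
    # Walk from the right (nearest to us) and skip known proxies.
    for hop in reversed(hops):
        if is_trusted(hop):
            continue
        try:
            return str(ipaddress.ip_address(hop))
        except ValueError:
            return peer  # malformed entry: fall back to the socket peer
    # Every entry was a trusted address, so the client is internal.
    return hops[0]


print(select_client_ip("10.0.9.12", "203.0.113.17, 10.0.4.8"))  # 203.0.113.17
print(select_client_ip("198.51.100.44", "127.0.0.1"))           # 198.51.100.44
```

Walking from the right means a value the client placed at the left of the chain is never reached before the first address that a trusted proxy observed directly.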

Load Balancing Is a State Decision

Once traffic is inside the proxy layer, the load balancer decides which upstream handles the request. The simplest mental model is round-robin: request one goes to instance A, request two to B, request three to C. Real systems add more state.

A load balancer may consider active health checks, passive error rates, connection counts, latency, locality, request path, tenant, canary weight, sticky cookies, or consistent hashing keys. These choices change user-visible behavior. For example:

/pay     -> checkout stable pool (remaining 95%)
/pay     -> checkout canary pool (5%)
/static  -> asset service
/admin   -> admin service, stricter auth gateway

Health checks are especially easy to misunderstand. A health check does not prove the application can handle every user request. It proves a specific probe succeeded recently. If the probe is GET /healthz and checkout depends on a database, the pool can look healthy while payment writes fail. If the probe is too strict, one slow dependency can remove all instances at once, so the protective mechanism itself causes the outage.
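One common way to avoid both failure shapes is to expose two probes with different meanings. The sketch below assumes the dependency checks are supplied as callables; liveness, readiness, and the dependency names are hypothetical and not tied to any specific framework or orchestrator.

```python
from typing import Callable, Dict, Tuple


def liveness() -> Tuple[int, str]:
    """Answers only 'is the process up?' and touches no dependencies."""
    return 200, "alive"


def readiness(dependencies: Dict[str, Callable[[], bool]]) -> Tuple[int, str]:
    """Answers 'can this instance serve user traffic right now?'"""
    failed = []
    for name, probe in dependencies.items():
        try:
            ok = probe()
        except Exception:
            ok = False  # a probe that raises counts as a failed dependency
        if not ok:
            failed.append(name)
    if failed:
        return 503, "not ready: " + ", ".join(sorted(failed))
    return 200, "ready"


print(liveness())                                 # (200, 'alive')
print(readiness({"database": lambda: False}))     # (503, 'not ready: database')
```

Which probe the load balancer uses for rotation decisions is a policy choice; the point of the split is that a dependency outage does not also look like a dead process.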

Connection reuse adds another wrinkle. A proxy may keep persistent upstream connections open and send many client requests through them. That improves latency and reduces handshake cost, but it means application logs may show a small set of proxy IPs and long-lived connections. The real client identity must come from trusted metadata, and request-level trace IDs become more important than connection-level assumptions.

Retries also belong at this boundary. A proxy may retry a failed upstream request if no response was received, or if a configured status code appears. That can improve availability for safe reads. It can duplicate side effects for unsafe operations if the application and proxy do not agree on idempotency. A load balancer retrying POST /pay is not a harmless transport detail.
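A retry rule that respects idempotency might look like the following sketch. It assumes the application and the proxy have agreed that a request carrying an Idempotency-Key header is deduplicated by the application; that agreement, the header convention, and the may_retry function are assumptions for illustration, not behavior any proxy provides by default.

```python
from typing import Dict

# Methods that RFC 9110 defines as idempotent (TRACE omitted for brevity).
IDEMPOTENT_METHODS = {"GET", "HEAD", "OPTIONS", "PUT", "DELETE"}


def may_retry(
    method: str,
    headers: Dict[str, str],
    attempt: int,
    max_attempts: int = 2,
) -> bool:
    """Decide whether the proxy may send this request to another upstream."""
    if attempt >= max_attempts:
        return False
    if method.upper() in IDEMPOTENT_METHODS:
        return True
    # Assumption: the app deduplicates requests that carry this key, so a
    # repeated POST with the same key cannot create a second payment.
    for name, value in headers.items():
        if name.lower() == "idempotency-key" and value:
            return True
    return False


print(may_retry("GET", {}, attempt=0))                            # True
print(may_retry("POST", {}, attempt=0))                           # False
print(may_retry("POST", {"Idempotency-Key": "k-1"}, attempt=0))   # True
```

Without the deduplication guarantee on the application side, the last case would be exactly the duplicate-charge risk the paragraph above describes.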

The design question is therefore:

Which layer chooses the upstream, which state does it use,
which request facts does it rewrite, and which operations may it retry?

If that question is answered only by default settings, incidents will reveal the contract at the worst possible time.

Worked Path: The Fake Client IP Incident

The shop team rate-limits password reset requests by client IP. During an attack, the limit fails. Many requests include this header:

X-Forwarded-For: 127.0.0.1

The application picks the first value in X-Forwarded-For and treats it as the client IP. The attacker supplied the value. The public edge appended the observed address instead of replacing the header, so the backend saw:

X-Forwarded-For: 127.0.0.1, 203.0.113.66

The naive application selected 127.0.0.1. The rate limiter grouped many attackers into a trusted-looking loopback address, and a separate admin allowlist almost accepted the same fake identity.
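The failure can be reproduced in a few lines. This sketch models the appending edge and the naive backend as two hypothetical functions, edge_append and naive_client_ip; the second and third attacker addresses are invented for illustration.

```python
from collections import Counter


def edge_append(incoming_xff: str, observed_ip: str) -> str:
    """The flawed edge: keeps the client's value and appends what it saw."""
    return f"{incoming_xff}, {observed_ip}" if incoming_xff else observed_ip


def naive_client_ip(x_forwarded_for: str) -> str:
    """The flawed backend: trusts the leftmost entry unconditionally."""
    return x_forwarded_for.split(",")[0].strip()


# Three distinct attackers all send the same spoofed header.
attackers = ["203.0.113.66", "203.0.113.67", "203.0.113.68"]
rate_limit_keys = Counter(
    naive_client_ip(edge_append("127.0.0.1", ip)) for ip in attackers
)
print(rate_limit_keys)  # Counter({'127.0.0.1': 3})
```

Every request is keyed on the value the attacker chose, so the limiter never sees the three real addresses.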

The fix has two sides. At the public edge, strip incoming forwarding headers from untrusted clients and set fresh values:

public request headers -> remove X-Forwarded-*
edge observed client: 203.0.113.66
new X-Forwarded-For: 203.0.113.66

At the application, configure trusted proxies explicitly:

trusted proxy CIDRs: 10.0.0.0/8
direct peer: 10.0.9.12
header chain: 203.0.113.66, 10.0.4.8
selected client: 203.0.113.66

Now test the negative case:

direct peer: 198.51.100.44
header chain: 127.0.0.1
selected client: 198.51.100.44
security log: untrusted forwarding header ignored

The important result is not just a correct IP address. The system now has a boundary rule that can be tested: public clients cannot assert forwarding facts; only known proxies can.

The same pattern fixes the HTTPS redirect bug. If the app sees the direct upstream connection as http, it may redirect to an insecure URL or fail to set secure cookies. Instead, the app should use X-Forwarded-Proto: https only when it came from a trusted proxy. If an untrusted client sends that header directly, it should be ignored.
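The scheme decision can follow the same shape as the client IP decision. In this minimal sketch, effective_scheme and https_redirect_target are hypothetical helpers, and the caller is assumed to have already decided whether the direct peer is a trusted proxy.

```python
from typing import Optional


def effective_scheme(
    peer_is_trusted: bool,
    x_forwarded_proto: Optional[str],
    socket_scheme: str = "http",
) -> str:
    """Return the scheme the original client used, as far as we can trust."""
    if peer_is_trusted and x_forwarded_proto:
        proto = x_forwarded_proto.strip().lower()
        if proto in ("http", "https"):
            return proto
    # Untrusted peer, missing header, or unexpected value: use the socket.
    return socket_scheme


def https_redirect_target(
    peer_is_trusted: bool,
    x_forwarded_proto: Optional[str],
    host: str,
    path: str,
) -> Optional[str]:
    """Return an HTTPS URL to redirect to, or None if already secure."""
    if effective_scheme(peer_is_trusted, x_forwarded_proto) == "https":
        return None
    return f"https://{host}{path}"


print(https_redirect_target(True, "https", "api.shop.test", "/pay"))
# None
print(https_redirect_target(False, "https", "api.shop.test", "/pay"))
# https://api.shop.test/pay
```

The host argument is assumed to be already validated; reflecting an unvalidated host here would reintroduce the host-trust problem in a different header.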

Operational Failure Modes

Failure: trusting headers from the internet. Any X-Forwarded-* or Forwarded value that arrives from an untrusted peer is user-controlled input. Use edge stripping, trusted proxy lists, and framework settings that define the number or range of trusted hops.

Failure: preserving the wrong host. Host and forwarded host values affect redirects, absolute links, tenant routing, and sometimes cache keys. Validate allowed public hosts instead of reflecting arbitrary host headers into links or password reset emails.

Failure: health checks that prove the wrong thing. A shallow check can keep broken instances in rotation. An overly deep check can remove good instances during a dependency incident. Separate "process is alive" from "this dependency path is ready for user traffic."

Failure: retrying unsafe operations. Proxy retries can turn partial network failures into duplicate writes. Restrict automatic retries to safe or explicitly idempotent operations, and make idempotency keys visible to the layer that may retry.

Failure: losing request identity at the boundary. If the proxy generates a trace ID but the app logs a different one, debugging requires joining partial evidence by time. Standardize request IDs and propagate them through every proxy hop.

Useful signals include direct peer address, selected client address, trusted proxy decision, forwarded host and proto, upstream chosen, load-balancer pool, health-check result, retry count, response flags, request ID, trace ID, and whether the request used a canary or stable route.
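Several of these signals can be captured in one structured log line per request. The sketch below is illustrative only: the field names, resolve_request_id, and boundary_log_line are hypothetical, and the rule of reusing an incoming request ID only from a trusted peer is an assumption that mirrors the header trust rule.

```python
import json
import uuid
from typing import Optional


def resolve_request_id(incoming_id: Optional[str], peer_is_trusted: bool) -> str:
    """Reuse the proxy's request ID when trusted, otherwise mint a new one."""
    if peer_is_trusted and incoming_id:
        return incoming_id
    return uuid.uuid4().hex


def boundary_log_line(
    *,
    peer: str,
    selected_client: str,
    peer_is_trusted: bool,
    scheme: str,
    host: str,
    upstream: str,
    pool: str,
    retry_count: int,
    request_id: str,
) -> str:
    """Serialize the boundary decision so it can be searched and joined."""
    record = {
        "direct_peer": peer,
        "selected_client": selected_client,
        "peer_trusted": peer_is_trusted,
        "scheme": scheme,
        "host": host,
        "upstream": upstream,
        "pool": pool,
        "retry_count": retry_count,
        "request_id": request_id,
    }
    return json.dumps(record, sort_keys=True)


line = boundary_log_line(
    peer="10.0.9.12",
    selected_client="203.0.113.17",
    peer_is_trusted=True,
    scheme="https",
    host="api.shop.test",
    upstream="checkout-v7.internal:8080",
    pool="checkout-stable",
    retry_count=0,
    request_id=resolve_request_id("abc123", True),
)
print(line)
```

Logging both the direct peer and the selected client makes the trust decision auditable after the fact instead of leaving it implicit in the parsing code.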

Architecture Review

For one service, close the lesson and reconstruct the request path from memory:

public client
-> first trusted edge
-> load-balancing decision
-> internal proxy or ingress
-> application instance

Then answer five questions:

Which component is allowed to set forwarding headers?
Which incoming headers are stripped at the public edge?
How does the app decide whether the direct peer is trusted?
Which layer may retry, and for which methods?
Which log line proves the selected client IP, scheme, host, upstream, and request ID?

If any answer is "the framework probably handles it," the boundary is not yet explicit enough. Framework defaults can be good, but only after they are configured to match the actual proxy chain.

Connections

The previous lesson explained where HTTPS terminates and how the edge proves identity to the browser. This lesson starts immediately after that: once the edge decrypts traffic and forwards it inward, the backend needs a trustworthy way to recover original client facts.

The DNS and CDN lessons that follow will add more intermediaries before the request reaches origin infrastructure. The same habit carries forward: name which component observed a fact directly, which component merely forwarded it, and which component is allowed to make a decision from it.

Resources

[RFC] Forwarded HTTP Extension RFC 7239
- Focus: Use it for the standardized Forwarded header and the problem it tries to solve.
[DOC] MDN: X-Forwarded-For
- Focus: Read the security notes on selecting client IPs through trusted proxies.
[DOC] NGINX Reverse Proxy Guide
- Focus: Use it for concrete examples of proxying, passing headers, and upstream configuration.
[DOC] Envoy HTTP Connection Manager
- Focus: Use it to see how modern edge proxies combine routing, headers, retries, tracing, and request processing.
[RFC] HTTP Semantics RFC 9110
- Focus: Use it for the underlying request, header, method, and connection semantics that proxies must preserve or transform deliberately.

Key Takeaways

Reverse proxies and load balancers create new request boundaries: they choose upstreams, rewrite metadata, and may retry or buffer before the app sees a request.
Forwarding headers are trustworthy only when they were set or sanitized by a known proxy path. Outside that path, they are ordinary user input.
Header trust should be configured as a rule: trusted proxy ranges or hop counts, stripped public headers, validated hosts, and observable selected client facts.
Load-balancer decisions are part of application behavior because health checks, retries, connection reuse, and routing policy change what users experience.
