Distributed Schedulers and Control Planes: Admission, Policy, and API Control Surfaces

Lesson 013 | 35 min | advanced

The core idea: Admission control is the boundary where requests become desired state, so the design trade-off is between rejecting unsafe work early and keeping the control-plane front door available, understandable, and fast.

Core Insight

Suppose a team submits a new risk-api deployment while the scheduler policy canary is still active. The request asks for four GPU-backed replicas, a high-priority class, a zone preference near the fraud database, and a label that opts into scheduler-policy-v5. If the API server simply stores the object, every later controller must discover whether the request is allowed, well formed, within quota, compatible with tenant policy, and safe for the canary scope.

Admission control exists because some decisions should happen before an object enters desired state. The API surface is not just a data mailbox. It is the front door to the control plane: it authenticates callers, authorizes actions, fills defaults, rejects invalid combinations, enforces policy, and records objects in a form that downstream controllers can safely reconcile.

The tempting mistake is to put every rule into admission because early rejection feels cleaner. But admission is on the write path. If it becomes slow, unavailable, inconsistent, or too clever, the whole platform becomes harder to operate. Good admission design separates rules that must block writes from rules that can be handled later by reconciliation, scheduling, backpressure, or human review.

The API Write Path

A simplified write path looks like this:

request
  -> authenticate caller
  -> authorize verb and resource
  -> decode and schema-check object
  -> apply defaults and allowed mutations
  -> validate invariants and policy
  -> reserve or check quota when needed
  -> persist desired state
  -> controllers observe and reconcile

Each step answers a different question:

For risk-api, defaulting might add a standard topology spread rule. Validation might reject an invalid priority class. Policy might allow scheduler-policy-v5 only in the canary namespace. Quota might reject a GPU request that would exceed the team's reservation. These checks happen before the scheduler ever sees a pending pod.
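
The write path above can be sketched as an ordered list of steps, where any step may stop the request before it is persisted. This is a minimal illustration, not a real API server: the `Request` type, the `Rejected` exception, the step functions, and the authorization rule are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass, field


class Rejected(Exception):
    """Raised by any step to stop the write before persistence."""


@dataclass
class Request:
    caller: str
    namespace: str
    obj: dict
    trace: list = field(default_factory=list)  # names of steps that passed


def authenticate(req):
    if not req.caller:
        raise Rejected("unauthenticated caller")


def authorize(req):
    # Hypothetical rule: only payments-deployer may write to payments-prod.
    if req.namespace == "payments-prod" and req.caller != "payments-deployer":
        raise Rejected(f"{req.caller} may not create in {req.namespace}")


def apply_defaults(req):
    # Mechanical default: add a topology spread rule if none was given.
    req.obj.setdefault("topologySpread", "zone")


def validate(req):
    if req.obj.get("replicas", 0) < 1:
        raise Rejected("replicas must be at least 1")


STEPS = [authenticate, authorize, apply_defaults, validate]


def admit(req, store):
    """Run every step in order; persist only if none of them rejects."""
    for step in STEPS:
        step(req)
        req.trace.append(step.__name__)
    store.append(req.obj)  # stands in for "persist desired state"
    return req.obj
```

Because persistence is the last action, a rejected request never reaches the store, and controllers only ever observe objects that passed every step.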

Defaulting, Mutation, and Validation

Admission systems often support both mutation and validation. Mutation changes the request before storage. Validation accepts or rejects the final object.

Mutation is useful for mechanical, predictable normalization:

Validation is useful for invariants:

The order matters. A validator should usually inspect the final object after defaulting and mutation. Otherwise a request can be rejected for a field that the platform would have filled, or accepted before mutation creates a conflicting state.
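
A small sketch makes the ordering problem concrete. The `priorityClass` default and the "required" rule below are assumptions made up for this example; the point is only that the same validator gives different answers depending on whether it runs before or after defaulting.

```python
def apply_defaults(obj):
    """Mutation phase: return a copy with platform defaults filled in."""
    out = dict(obj)
    out.setdefault("priorityClass", "standard")  # assumed platform default
    return out


def validate(obj):
    """Validation phase: return a list of violations for the given object."""
    errors = []
    if "priorityClass" not in obj:
        errors.append("priorityClass is required")
    return errors


submitted = {"service": "risk-api", "replicas": 4}

# Wrong order: the validator rejects a field the platform would have filled.
errors_before_defaulting = validate(submitted)

# Intended order: mutate first, then validate the object that will be stored.
errors_after_defaulting = validate(apply_defaults(submitted))
```

Here `errors_before_defaulting` contains one violation and `errors_after_defaulting` is empty, even though the user submitted the same object both times.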

Mutation also has a trust cost. If a request enters as one thing and gets stored as another, users and controllers need clear visibility into what changed. Hidden mutation makes debugging difficult. For important fields, explicit defaults, audit records, and stable API conventions are easier to operate than surprising rewrites.
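
One way to keep mutation visible is to record what each mutator changed alongside the stored object. The sketch below is an assumption about how such a record could look, not a description of any particular platform; `mutate_with_record` and both mutators are hypothetical, and only top-level fields are compared.

```python
import copy


def mutate_with_record(obj, mutators):
    """Apply mutators in order and return (final_object, change_log)."""
    current = copy.deepcopy(obj)  # never modify the caller's object
    log = []
    for mutator in mutators:
        before = copy.deepcopy(current)
        mutator(current)
        # Compare top-level fields only; nested diffs are out of scope here.
        for key in sorted(set(before) | set(current)):
            if before.get(key) != current.get(key):
                log.append({
                    "mutator": mutator.__name__,
                    "field": key,
                    "old": before.get(key),
                    "new": current.get(key),
                })
    return current, log


def add_topology_spread(obj):
    obj.setdefault("topologySpread", "zone")


def normalize_zone(obj):
    if "zonePreference" in obj:
        obj["zonePreference"] = obj["zonePreference"].lower()
```

The change log can be written to an audit record or attached as an annotation, so a user who sees a field they did not set can find out which mutator added it.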

Policy Placement

Not every policy belongs in the same place. A useful design asks where the decision has the best information and the safest failure mode.

Admission is a good home for rules that:

Reconciliation is a better home for rules that:

Scheduling is a better home for placement choices:

For example, admission can reject risk-api if it claims a protected priority class without permission. The scheduler should decide which valid node gets the replica. The quota controller may own durable accounting. The rollout controller may decide whether the canary scope should expand. A clean API control surface keeps those responsibilities separate enough that each failure has a visible owner.

Staleness and Failure Policy

Admission often wants context: tenant quota, active rollout phase, allowed image registries, policy versions, or namespace labels. Some of that context may come from local caches like the ones in the previous lesson. That creates a hard question: what happens if the context is stale or the policy engine is unavailable?

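There are two broad choices: fail closed, which rejects the write when the decision cannot be made, and fail open, which admits it anyway.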

Fail closed protects the platform from unsafe writes, but it can turn a policy service outage into a platform-wide write outage. Fail open keeps users moving, but it may admit work that violates isolation, quota, or security expectations. The right answer depends on the consequence.

Security and ownership boundaries usually fail closed. A request that might run privileged code, consume protected GPU capacity, or bypass tenant isolation should not be admitted from uncertain policy state. Low-risk hints, labels, or non-critical defaults may fail open with an audit event and later repair.
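
The classification can be made explicit by attaching a failure policy to each check rather than to admission as a whole. The following is a minimal sketch under that assumption; `FailurePolicy`, `PolicyUnavailable`, and `evaluate` are illustrative names, not part of any real admission framework.

```python
from enum import Enum


class FailurePolicy(Enum):
    FAIL_CLOSED = "closed"  # reject when the check cannot be evaluated
    FAIL_OPEN = "open"      # admit, but leave an audit trail


class PolicyUnavailable(Exception):
    """The backing policy or quota service could not answer."""


def evaluate(check, failure_policy, audit_log):
    """Run one admission check and decide what an outage means for it."""
    try:
        return check()
    except PolicyUnavailable as exc:
        if failure_policy is FailurePolicy.FAIL_CLOSED:
            return False
        # Fail open: admit now, record the gap so it can be repaired later.
        audit_log.append(f"admitted without check: {exc}")
        return True
```

With this shape, a protected GPU quota check can be registered as `FAIL_CLOSED` while a non-critical labeling check is registered as `FAIL_OPEN`, and that choice is made in review rather than during an incident.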

Freshness should be explicit. If admission uses cached namespace labels to decide whether scheduler-policy-v5 is allowed, the decision should record which policy version and cache version it used. If a canary gate changed ten seconds ago, admission should not silently use yesterday's view.
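
A sketch of an explicit freshness check might look like the following. The five-second bound, the `CachedPolicy` and `Decision` types, and the choice to reject on a stale cache are all assumptions for illustration; a real system would pick the bound and the stale-cache behavior per rule.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CachedPolicy:
    policy_version: str
    cache_version: int
    fetched_at: float             # seconds, on the same clock as `now`
    canary_namespaces: frozenset  # namespaces where the canary is active


@dataclass(frozen=True)
class Decision:
    allowed: bool
    reason: str
    policy_version: str  # recorded so the decision can be audited later
    cache_version: int


MAX_AGE_SECONDS = 5.0  # assumed freshness bound for canary gating


def decide_canary(namespace, cache, now):
    """Decide canary eligibility and record which view the decision used."""
    age = now - cache.fetched_at
    if age > MAX_AGE_SECONDS:
        # Assumed behavior: a stale view is treated as "cannot decide".
        reason = f"policy cache is {age:.0f}s old; limit is {MAX_AGE_SECONDS:.0f}s"
        return Decision(False, reason, cache.policy_version, cache.cache_version)
    allowed = namespace in cache.canary_namespaces
    reason = (
        "canary active"
        if allowed
        else f"scheduler-policy-v5 is not active for namespace {namespace}"
    )
    return Decision(allowed, reason, cache.policy_version, cache.cache_version)
```

Every decision, including a rejection, carries the policy and cache versions it was based on, which makes a wrong admission traceable to the view that produced it.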

Worked Example: Admitting a GPU Workload

Imagine a tenant submits:

service: risk-api
namespace: payments-prod
replicas: 4
resources: 1 GPU, 8 CPU, 32 GiB memory per replica
priorityClass: recovery-critical
schedulerPolicy: v5-canary
zonePreference: zone-b

A disciplined admission path can produce this result:

1. Authenticate the caller as payments-deployer.
2. Authorize create on deployments in payments-prod.
3. Default missing topology-spread and scheduler profile fields.
4. Validate that GPU, CPU, and memory requests are declared.
5. Check that payments-prod may use recovery-critical priority.
6. Check that v5-canary is active for this namespace and service.
7. Reserve or verify quota for the requested GPUs.
8. Store the normalized object with policy and quota decision annotations.

If a request fails, the rejection reason should be actionable. For example, the same request submitted to a namespace where the canary is not active might return:

rejected: schedulerPolicy v5-canary is not active for namespace analytics-dev

That is better than storing the object and letting it sit pending with a vague scheduling failure. It is also better than a generic "forbidden" message that forces the user to guess which policy blocked them.
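
Steps 4 to 6 of the list above can be sketched as a function that returns specific reasons rather than a bare yes or no. The policy tables and field names below are hypothetical stand-ins for whatever the policy store would hold.

```python
# Hypothetical policy tables; real values would come from the policy store.
PROTECTED_PRIORITY = {"recovery-critical": {"payments-prod"}}
CANARY_ACTIVE = {"v5-canary": {"payments-prod"}}
REQUIRED_RESOURCES = ("gpu", "cpu", "memoryGiB")


def check_request(req):
    """Return a list of actionable rejection reasons; empty means admit."""
    reasons = []
    ns = req["namespace"]

    # Step 4: every resource dimension must be declared explicitly.
    for name in REQUIRED_RESOURCES:
        if name not in req.get("resources", {}):
            reasons.append(f"resources.{name} must be declared")

    # Step 5: protected priority classes are limited to listed namespaces.
    pc = req.get("priorityClass")
    if pc in PROTECTED_PRIORITY and ns not in PROTECTED_PRIORITY[pc]:
        reasons.append(f"priorityClass {pc} is not permitted for namespace {ns}")

    # Step 6: a named scheduler policy must be active for this namespace.
    policy = req.get("schedulerPolicy")
    if policy is not None and ns not in CANARY_ACTIVE.get(policy, set()):
        reasons.append(f"schedulerPolicy {policy} is not active for namespace {ns}")

    return reasons
```

Collecting all reasons instead of stopping at the first one lets the user fix the request in a single round trip, and each message names the field, the value, and the scope that blocked it.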

Now imagine the quota service is unavailable. If the workload claims protected GPU capacity, fail closed may be appropriate because admitting it could overcommit a recovery lane. If the request only adds a non-critical annotation, fail open with audit may be enough. Admission design is partly about classifying these consequences before the outage happens.

Operational Failure Modes

Connections

Resources

Key Takeaways
