Validation at Backend Boundaries

Lesson 009, 30 min, intermediate

The core idea: validation is a trade-off governed by what each boundary knows. Each layer should reject the problems it can identify with confidence, while deeper layers protect meaning, authorization, and committed state.

Core Insight

Imagine a backend endpoint for course reviews:

POST /courses/{course_id}/reviews
{
  "rating": 5,
  "comment": "Clear and practical."
}

Several things can go wrong before this review becomes durable state. The JSON may be malformed. The rating may be outside the accepted range. The caller may be unauthenticated. The caller may be authenticated but not enrolled in the course. The course may be archived. Two browser tabs may submit the same valid review at nearly the same time.

Those failures are not one generic category called "bad input." They cross different boundaries. The HTTP edge can reject malformed shape. The authentication layer can establish caller context. The application or domain use case can decide whether this learner is allowed to review this course. The database can enforce integrity when concurrent writes race.

The common mistake is to ask, "Should validation live in controllers, services, or the database?" That turns validation into a turf war. The better question is, "Which boundary has enough knowledge to reject this problem correctly, early, and with the right failure semantics?"

Good validation is layered, but the layers are not arbitrary duplication. Each layer should narrow the assumptions the next one has to make. By the time domain code runs, it should not be parsing raw JSON. By the time the database commits, it should not be the first component to discover that the user-facing action is meaningless.

Four Validation Boundaries

The review request passes through four useful kinds of checks:

raw request
   -> structural validation
   -> caller and context validation
   -> domain validation
   -> persistence validation

Structural validation is about shape. Is the body valid JSON? Is course_id parseable? Is rating an integer? Is it in the allowed range? Is comment short enough for the API contract? These checks are cheap and local. The backend does not need to load enrollment state to reject { "rating": "great" }.
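A minimal sketch of those structural checks, using only the standard library. The names StructuralError and parse_create_review, the lower rating bound of 1, and the 2000-character comment limit are assumptions made for illustration; the lesson only says that ratings are integers in an allowed range and that comments must be short enough for the API contract.

```python
import json
from dataclasses import dataclass

RATING_MIN, RATING_MAX = 1, 5   # lower bound is an assumption
MAX_COMMENT_LENGTH = 2000       # assumed limit; not given in the lesson


class StructuralError(ValueError):
    """Raised when the request body fails a shape check."""


@dataclass(frozen=True)
class CreateReviewRequest:
    rating: int
    comment: str


def parse_create_review(raw_body: str) -> CreateReviewRequest:
    """Turn a raw body into a typed request without loading any product state."""
    try:
        data = json.loads(raw_body)
    except json.JSONDecodeError as exc:
        raise StructuralError("malformed_json") from exc
    if not isinstance(data, dict):
        raise StructuralError("body_must_be_object")

    rating = data.get("rating")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(rating, bool) or not isinstance(rating, int):
        raise StructuralError("rating_must_be_integer")
    if not RATING_MIN <= rating <= RATING_MAX:
        raise StructuralError("rating_out_of_range")

    comment = data.get("comment", "")
    if not isinstance(comment, str):
        raise StructuralError("comment_must_be_string")
    if len(comment) > MAX_COMMENT_LENGTH:
        raise StructuralError("comment_too_long")

    return CreateReviewRequest(rating=rating, comment=comment)
```

Every check here is local to the payload. Nothing in the function needs a database connection, which is what makes it safe to run at the HTTP edge.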

Caller and context validation is about trust. Is there an authenticated user? Does the request carry a tenant, account, or role context that the route needs? Is the caller allowed to attempt this class of operation at all? This connects to the earlier authentication lesson: identity at the boundary gives the backend a trustworthy caller, but it does not automatically prove resource-specific permission.
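As a rough illustration, caller validation can be sketched as a function that either produces a trusted context object or fails before any domain code runs. Everything below is hypothetical: the in-memory SESSIONS table stands in for whatever token or session verification the earlier authentication lesson established, and the "learner" role is an assumed name.

```python
from dataclasses import dataclass
from typing import FrozenSet, Mapping


class Unauthenticated(Exception):
    """Raised when no trustworthy caller can be established."""


@dataclass(frozen=True)
class CallerContext:
    user_id: str
    roles: FrozenSet[str]


# Hypothetical session store; a real system would verify a signed token
# or look the session up in a trusted store.
SESSIONS = {
    "token-alice": CallerContext(user_id="alice", roles=frozenset({"learner"})),
}


def authenticate(headers: Mapping[str, str]) -> CallerContext:
    """Establish who is calling. Says nothing yet about course 417."""
    scheme, _, token = headers.get("Authorization", "").partition(" ")
    if scheme != "Bearer" or token not in SESSIONS:
        raise Unauthenticated("missing_or_invalid_credentials")
    return SESSIONS[token]


def may_attempt_review(caller: CallerContext) -> bool:
    """Coarse check: is this class of operation open to the caller at all?"""
    return "learner" in caller.roles
```

Note what this layer does not decide: whether Alice is enrolled in a specific course. That question needs product state and belongs further in.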

Domain validation is about meaning. A well-formed request from Alice can still be invalid if Alice never enrolled in course 417, if the course is archived, if reviews are disabled, or if Alice already submitted a review. The domain layer can answer those questions because it can read product state and apply product policy.

Persistence validation is about committed integrity. Even if the use case checks that Alice has not reviewed the course, two concurrent requests can race. A unique constraint on (user_id, course_id) is still needed to protect the invariant at commit time.
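The effect of that constraint can be shown with an in-memory SQLite database. This is a sketch under assumptions: the table layout and the insert_review helper are illustrative, and the two calls run sequentially to stand in for two racing requests that both passed the use-case check.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE reviews (
        user_id   TEXT    NOT NULL,
        course_id TEXT    NOT NULL,
        rating    INTEGER NOT NULL,
        comment   TEXT    NOT NULL,
        UNIQUE (user_id, course_id)
    )
    """
)


def insert_review(conn, user_id, course_id, rating, comment) -> bool:
    """Return True if the row was committed, False if a constraint rejected it."""
    try:
        # The connection context manager commits on success
        # and rolls back if the block raises.
        with conn:
            conn.execute(
                "INSERT INTO reviews (user_id, course_id, rating, comment) "
                "VALUES (?, ?, ?, ?)",
                (user_id, course_id, rating, comment),
            )
        return True
    except sqlite3.IntegrityError:
        return False


# Two submissions that both believed no review existed yet.
first = insert_review(conn, "alice", "417", 5, "Clear and practical.")
second = insert_review(conn, "alice", "417", 4, "Sent from a second tab.")
```

The first insert commits and the second is rejected by the constraint, regardless of what either request observed before writing.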

The important distinction is not "early checks are good, late checks are bad." Early checks are cheaper and clearer when the boundary has enough knowledge. Late checks are stronger when only the storage system can protect the invariant under concurrency.

Worked Review Submission

Here is a compact path for POST /courses/417/reviews:

1. HTTP boundary parses the request.
2. Request schema validates rating and comment shape.
3. Auth middleware creates caller context: user_id=alice.
4. Use case loads course and enrollment facts.
5. Domain policy checks whether Alice may review course 417.
6. Repository attempts to insert the review.
7. Database constraints protect duplicate and referential integrity.
8. Handler maps the result or failure into an API response.

In code, the shape should make the boundary movement visible:

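from dataclasses import dataclass


class ReviewRejected(Exception):
    """Domain-level rejection carrying a reason code for the handler to map."""


@dataclass(frozen=True)
class CreateReviewRequest:
    # What the client may send: no user_id and no course_id.
    rating: int
    comment: str


@dataclass(frozen=True)
class CreateReviewCommand:
    # Trusted caller context plus input that already passed shape checks.
    user_id: str
    course_id: str
    rating: int
    comment: str


def create_review(command, courses, enrollments, reviews, unit_of_work):
    # Assumes courses.get returns None for an unknown course id.
    course = courses.get(command.course_id)
    if course is None:
        raise ReviewRejected("course_not_found")
    if course.is_archived:
        raise ReviewRejected("course_archived")

    if not enrollments.exists(command.user_id, command.course_id):
        raise ReviewRejected("not_enrolled")

    # add_once leaves the final duplicate check to the database constraint.
    with unit_of_work:
        reviews.add_once(
            user_id=command.user_id,
            course_id=command.course_id,
            rating=command.rating,
            comment=command.comment,
        )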

The request DTO protects what the client may send. The command carries trusted caller context plus accepted input. The use case checks product meaning. The repository and unit of work hide persistence details while still allowing the database to enforce uniqueness.
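The step between the DTO and the command is where trust changes hands, so it can help to see it on its own. The sketch below assumes a hypothetical to_command helper; the point it illustrates is that user_id comes from the authenticated caller and course_id from the route, never from the request body.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CreateReviewRequest:
    rating: int
    comment: str


@dataclass(frozen=True)
class CreateReviewCommand:
    user_id: str
    course_id: str
    rating: int
    comment: str


def to_command(
    request: CreateReviewRequest,
    *,
    authenticated_user_id: str,
    course_id: str,
) -> CreateReviewCommand:
    """Combine accepted client input with context the client cannot set."""
    return CreateReviewCommand(
        user_id=authenticated_user_id,  # from auth middleware
        course_id=course_id,            # from the URL path
        rating=request.rating,          # from the validated body
        comment=request.comment,
    )


command = to_command(
    CreateReviewRequest(rating=5, comment="Clear and practical."),
    authenticated_user_id="alice",
    course_id="417",
)
```

Because the request type has no user_id field, a client cannot submit a review on behalf of someone else by adding one to the JSON body.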

A useful API also keeps failures distinct:

malformed JSON or bad rating       -> 400 Bad Request
missing authentication             -> 401 Unauthorized
authenticated but not enrolled     -> 403 Forbidden or domain error
course archived / review closed    -> 409 Conflict or 422 Unprocessable
duplicate caught at commit time    -> mapped duplicate-review response

The exact status code depends on the API's conventions. The design point is that the backend should not collapse all failures into "validation failed." A caller should be able to tell whether the request was malformed, unauthorized, domain-invalid, or rejected by an integrity guard.
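One way to keep those categories distinct is a single mapping function at the handler boundary. This is a sketch, not a prescribed convention: the exception names other than ReviewRejected, the response bodies, and the specific status choices (403 for not enrolled, 409 for archived courses and duplicates) are assumptions picked from the options listed above.

```python
from http import HTTPStatus
from typing import Dict, Tuple


class StructuralError(Exception):
    """Malformed JSON or a field that fails a shape check."""


class Unauthenticated(Exception):
    """No trustworthy caller could be established."""


class ReviewRejected(Exception):
    """Domain rule rejected the request; str(exc) is the reason code."""


class DuplicateReview(Exception):
    """The unique constraint fired at commit time."""


# Assumed mapping from domain reason codes to statuses.
DOMAIN_STATUS = {
    "not_enrolled": HTTPStatus.FORBIDDEN,
    "course_archived": HTTPStatus.CONFLICT,
}


def to_response(exc: Exception) -> Tuple[int, Dict[str, str]]:
    """Map a failure to (status, body) without collapsing categories."""
    if isinstance(exc, StructuralError):
        return int(HTTPStatus.BAD_REQUEST), {
            "error": "malformed_request",
            "detail": str(exc),
        }
    if isinstance(exc, Unauthenticated):
        return int(HTTPStatus.UNAUTHORIZED), {"error": "unauthenticated"}
    if isinstance(exc, ReviewRejected):
        reason = str(exc)
        status = DOMAIN_STATUS.get(reason, HTTPStatus.CONFLICT)
        return int(status), {"error": "review_rejected", "reason": reason}
    if isinstance(exc, DuplicateReview):
        return int(HTTPStatus.CONFLICT), {"error": "duplicate_review"}
    # Anything unrecognized is a server fault, not a validation failure.
    return int(HTTPStatus.INTERNAL_SERVER_ERROR), {"error": "internal_error"}
```

Each branch corresponds to one boundary, so a client can tell from the error field alone which layer turned the request away.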

Overlap Without Confusion

Layered validation sometimes looks like duplication. The service checks for an existing review, and the database also has a unique constraint. That overlap is justified because the two checks protect different moments.

The service check gives a clear user-facing answer before attempting a write. The database constraint protects the invariant if two writes race or another code path bypasses the normal use case. If both checks ask "has this user already reviewed this course?", they are not redundant when one is a friendly preflight and the other is the final integrity guard.

Other duplication is wasteful. If the controller validates rating <= 5, the service validates rating <= 5, and the repository validates rating <= 5 for the same reason, the code is repeating ceremony. The better design is to name the invariant and decide which boundary owns it. The database may still keep a check constraint if the value must never be committed outside range, but the use case should not mechanically repeat the same schema rule unless it adds meaning.
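A small sketch of "name the invariant and decide who owns it": the bounds are defined once, the edge schema owns the user-facing check, and the database keeps a check constraint as the commit-time guard. The constant names, the lower bound of 1, and the helper functions are assumptions for illustration; SQLite is used only because it ships with Python.

```python
import sqlite3

# The invariant, named once. The lower bound is an assumption.
RATING_MIN, RATING_MAX = 1, 5


def is_valid_rating(value) -> bool:
    """Edge-owned check, used by the request schema and nowhere else."""
    return (
        isinstance(value, int)
        and not isinstance(value, bool)
        and RATING_MIN <= value <= RATING_MAX
    )


conn = sqlite3.connect(":memory:")
# Interpolating is safe here only because both values are module constants.
conn.execute(
    "CREATE TABLE reviews ("
    "  rating INTEGER NOT NULL"
    f"  CHECK (rating BETWEEN {RATING_MIN} AND {RATING_MAX})"
    ")"
)


def store_rating(conn, rating) -> bool:
    """Return False if the database refuses an out-of-range value."""
    try:
        with conn:
            conn.execute("INSERT INTO reviews (rating) VALUES (?)", (rating,))
        return True
    except sqlite3.IntegrityError:
        return False
```

The use case in between does not re-test the range. It trusts the typed command, and the constraint covers any code path that bypasses the schema.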

The mental model is:

same question, same reason       -> likely duplication
same concept, different failure  -> often justified layering

Operational Failure Modes

Issue: Letting malformed transport data reach domain code.

Clarification / Fix: Parse and validate request shape at the edge. Domain services should receive typed input and trusted caller context, not raw HTTP noise.

Issue: Moving all business rules into request schemas.

Clarification / Fix: Keep schema validation focused on structure. Rules that depend on enrollment, course state, ownership, or workflow belong in the application or domain layer.

Issue: Treating database constraints as the entire validation strategy.

Clarification / Fix: Constraints are the integrity backstop. They should not be the only way users learn that a request was malformed, forbidden, or domain-invalid.

Issue: Calling every layered check "defense in depth."

Clarification / Fix: Write the question each boundary answers. Keep overlap only when it protects a different failure mode, timing window, or bypass path.

Close the lesson and trace the review request from memory. For each boundary, write one sentence: "This layer can reject X because it knows Y." If a layer rejects something without knowing enough to explain it, the validation rule is probably in the wrong place.

Connections

The previous lesson used SOLID to diagnose change pressure and dependency direction. Validation uses the same habit: boundaries should match knowledge and responsibility, not convenience.

The next lesson on DTOs and mapping continues the same path inward. Once validation turns raw requests into accepted input, mapping decides which shapes cross each boundary and which internal fields stay private.

Validation also connects directly to API security. Input validation, authentication, authorization, and persistence constraints are separate controls, but together they define what assumptions the backend may safely make after each boundary.

Resources

[ARTICLE] OWASP Input Validation Cheat Sheet
- Focus: Review validation as part of API and application security, especially allowlists, canonicalization, and boundary handling.
[DOC] Pydantic Documentation
- Focus: See one practical approach to typed request models and structured boundary validation in Python.
[DOC] PostgreSQL Constraints
- Focus: Connect uniqueness, check constraints, and foreign keys to persistence-level integrity.
[BOOK] API Design Patterns
- Focus: Compare validation, error semantics, and resource contracts as part of a deliberate API surface.

Key Takeaways

Validation follows boundary knowledge: each layer should reject the problems it can know and explain correctly.
Structural validity, caller trust, domain permission, and committed integrity are different checks with different owners.
Persistence constraints are essential backstops, but they do not replace clear edge and domain validation.
Layered validation is useful when each layer protects a distinct failure mode, not when it repeats the same rule by habit.
