Repositories, Units of Work, and Persistence Boundaries

LESSON

Backend and API Architecture

007 30 min intermediate

The core idea: Persistence boundaries let application services speak in business operations while one explicit place owns query mechanics, transaction scope, and commit timing.

Core Insight

Imagine the EnrollmentService now handles paid course enrollment. The use case must check whether the learner is already enrolled, reserve a seat, create an enrollment row, and append an audit record. A quick implementation can do all of that with ORM calls directly inside the service.

That version works, but it changes the meaning of the service. The code no longer reads like "enroll this learner if the business rules allow it." It reads like a mix of product policy, query construction, session state, transaction timing, and row-mapping detail. When something fails, the reader has to separate business intent from persistence machinery before they can reason about correctness.

A repository and a unit of work solve two different parts of that problem. A repository gives the use case a domain-facing vocabulary for data operations. A unit of work makes the commit boundary visible: these writes succeed together, or they do not become durable together.

The misconception is that these patterns are mainly about hiding the database forever. They are not. They are about making persistence decisions easier to see. Sometimes direct ORM use is fine. The patterns earn their place when query details or transaction boundaries are drowning out the use case.

The Pressure: Queries Creep Upward

Here is an ordinary version of the problem:

def enroll(learner_id: str, course_id: str, session):
    course = (
        session.query(CourseRow)
        .filter(CourseRow.id == course_id)
        .with_for_update()
        .one()
    )

    existing = (
        session.query(EnrollmentRow)
        .filter(
            EnrollmentRow.learner_id == learner_id,
            EnrollmentRow.course_id == course_id,
        )
        .first()
    )

    if existing:
        raise EnrollmentRejected("already_enrolled")
    if course.available_seats <= 0:
        raise EnrollmentRejected("course_full")

    course.available_seats -= 1
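    enrollment = EnrollmentRow(learner_id=learner_id, course_id=course_id)
    session.add(enrollment)
    # If the primary key is database-generated, flush so enrollment.id is
    # populated before the audit row reads it.
    session.flush()
    session.add(AuditRow(kind="learner_enrolled", entity_id=enrollment.id))
    session.commit()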
    return enrollment

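This code is not bad because SQL or an ORM appears. The problem is that every concern is competing in one place: business rules (already enrolled, course full), query construction, row locking, session state, and commit timing.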

If this use case appears in an HTTP handler, a worker, and a backfill script, those details tend to spread. If the database schema changes, application policy code changes. If transaction handling changes, every copied workflow has to be checked.

The useful question is:

Which persistence details should the use case be allowed to know?

The answer is rarely "none." A service may need to know that seat reservation and enrollment creation belong to one durable action. But it should not need to know every query shape and ORM session behavior just to express the workflow.

Repositories Give the Use Case a Better Language

A repository is a persistence boundary shaped around the questions and actions a use case needs. For enrollment, useful repository operations might be:

courses.get_for_enrollment(course_id)
enrollments.exists(learner_id, course_id)
seats.reserve(course_id)
enrollments.create(learner_id, course_id)
audit.append(kind, entity_id)

That vocabulary is not generic CRUD. It is closer to the application language. The repository implementation can still use SQL, an ORM, indexes, locks, and row mappers. Those details move downward:

application service
    "reserve a seat and create enrollment"

repository boundary
    "translate that request into queries and rows"

database
    "execute storage operations with indexes, locks, constraints"

A repository is useful when it makes the service easier to read and safer to change:

class EnrollmentService:
    def __init__(self, courses, enrollments, seats, audit):
        self.courses = courses
        self.enrollments = enrollments
        self.seats = seats
        self.audit = audit

    def enroll(self, learner_id: str, course_id: str):
        course = self.courses.get_for_enrollment(course_id)

        if self.enrollments.exists(learner_id, course_id):
            raise EnrollmentRejected("already_enrolled")
        if not course.has_available_seat:
            raise EnrollmentRejected("course_full")

        self.seats.reserve(course_id)
        enrollment = self.enrollments.create(learner_id, course_id)
        self.audit.append("learner_enrolled", enrollment.id)
        return enrollment

This code is not pretending the database disappeared. It is stating the business flow in terms a backend engineer can review without mentally parsing every query. The storage implementation can still be tested separately with real database behavior.
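One payoff of this shape is that the business flow can be exercised without a database at all. The sketch below is a minimal illustration, not the lesson's implementation: InMemoryStore, Course, and Enrollment are hypothetical names, and a single in-memory object plays all four repository roles purely to keep the example short.

```python
from dataclasses import dataclass
from itertools import count


class EnrollmentRejected(Exception):
    """Raised when a business rule blocks the enrollment."""


@dataclass
class Course:
    id: str
    available_seats: int

    @property
    def has_available_seat(self) -> bool:
        return self.available_seats > 0


@dataclass(frozen=True)
class Enrollment:
    id: int
    learner_id: str
    course_id: str


class InMemoryStore:
    """Hypothetical test double that answers every repository call from dicts."""

    def __init__(self, courses):
        self._courses = {course.id: course for course in courses}
        self._enrollments = {}
        self._next_id = count(1)
        self.audit_log = []

    def get_for_enrollment(self, course_id):  # courses role
        return self._courses[course_id]

    def exists(self, learner_id, course_id):  # enrollments role
        return (learner_id, course_id) in self._enrollments

    def create(self, learner_id, course_id):  # enrollments role
        enrollment = Enrollment(next(self._next_id), learner_id, course_id)
        self._enrollments[(learner_id, course_id)] = enrollment
        return enrollment

    def reserve(self, course_id):  # seats role
        self._courses[course_id].available_seats -= 1

    def append(self, kind, entity_id):  # audit role
        self.audit_log.append((kind, entity_id))


class EnrollmentService:
    def __init__(self, courses, enrollments, seats, audit):
        self.courses = courses
        self.enrollments = enrollments
        self.seats = seats
        self.audit = audit

    def enroll(self, learner_id: str, course_id: str):
        course = self.courses.get_for_enrollment(course_id)
        if self.enrollments.exists(learner_id, course_id):
            raise EnrollmentRejected("already_enrolled")
        if not course.has_available_seat:
            raise EnrollmentRejected("course_full")
        self.seats.reserve(course_id)
        enrollment = self.enrollments.create(learner_id, course_id)
        self.audit.append("learner_enrolled", enrollment.id)
        return enrollment


# A course with one seat: the first learner gets in, later attempts are rejected.
store = InMemoryStore([Course("py-101", available_seats=1)])
service = EnrollmentService(courses=store, enrollments=store, seats=store, audit=store)
first = service.enroll("ada", "py-101")
```

A fake like this checks the policy branches only. Locking, constraints, and transaction behavior still need tests against the real storage implementation.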

The trade-off is boundary code versus clarity. A repository adds methods and files. That cost is justified when it localizes storage mechanics or gives the use case a better language. It is wasteful when it only wraps save, find, and delete without reducing any coupling.

Units of Work Make Commit Timing Explicit

Repositories answer "how does the use case ask for data operations?" A unit of work answers a different question: "which persistence changes belong to one commit?"

Consider the enrollment writes:

reserve seat
create enrollment
append audit record

If the seat is reserved but the enrollment row is not committed, the course loses capacity that no learner actually holds. If the enrollment is committed but a required audit record is missing, operations lose traceability. These writes share one consistency boundary.

A unit of work makes that boundary visible:

def enroll(self, learner_id: str, course_id: str):
    with self.uow as uow:
        course = uow.courses.get_for_enrollment(course_id)

        if uow.enrollments.exists(learner_id, course_id):
            raise EnrollmentRejected("already_enrolled")
        if not course.has_available_seat:
            raise EnrollmentRejected("course_full")

        uow.seats.reserve(course_id)
        enrollment = uow.enrollments.create(learner_id, course_id)
        uow.audit.append("learner_enrolled", enrollment.id)
        uow.commit()

    return enrollment

The exact API is not the point. The point is that the reader can see when the transaction starts, which repositories participate, and when the commit happens.

This also creates a cleaner place for rollback behavior. If an exception happens before commit(), the unit of work can roll back the transaction. If commit succeeds, the database state becomes durable. That makes the consistency story reviewable.

The trade-off is ceremony versus transactional clarity. A unit of work is usually worth it for multi-write workflows, request-scoped transactions, and services that coordinate several repositories. It is often unnecessary for a simple read or a single write with no meaningful transaction question.
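To make the rollback behavior concrete, here is one possible unit of work built on Python's standard sqlite3 module. It is a sketch under assumptions: the class names, table layout, and the choice to roll back on every exit are illustrative, and a production version would also deal with connection lifecycle and isolation settings.

```python
import sqlite3


class SeatRepository:
    def __init__(self, conn):
        self.conn = conn

    def reserve(self, course_id):
        # No commit here: commit timing belongs to the unit of work.
        self.conn.execute(
            "UPDATE courses SET available_seats = available_seats - 1 WHERE id = ?",
            (course_id,),
        )


class EnrollmentRepository:
    def __init__(self, conn):
        self.conn = conn

    def create(self, learner_id, course_id):
        cur = self.conn.execute(
            "INSERT INTO enrollments (learner_id, course_id) VALUES (?, ?)",
            (learner_id, course_id),
        )
        return cur.lastrowid


class SqliteUnitOfWork:
    """Hypothetical unit of work: one connection, one transaction, one commit."""

    def __init__(self, conn):
        self.conn = conn
        self.seats = SeatRepository(conn)
        self.enrollments = EnrollmentRepository(conn)

    def __enter__(self):
        return self

    def commit(self):
        self.conn.commit()

    def __exit__(self, exc_type, exc, tb):
        # Discard anything that was not committed. After a successful commit
        # there is no open transaction, so this is a no-op.
        self.conn.rollback()
        return False  # let exceptions propagate to the caller


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE courses (id TEXT PRIMARY KEY, available_seats INTEGER NOT NULL)")
conn.execute(
    "CREATE TABLE enrollments (id INTEGER PRIMARY KEY, learner_id TEXT NOT NULL, course_id TEXT NOT NULL)"
)
conn.execute("INSERT INTO courses VALUES ('py-101', 5)")
conn.commit()


def seats_left():
    row = conn.execute("SELECT available_seats FROM courses WHERE id = 'py-101'").fetchone()
    return row[0]


uow = SqliteUnitOfWork(conn)

# Failure between the two writes: the seat reservation must not survive.
try:
    with uow:
        uow.seats.reserve("py-101")
        raise RuntimeError("simulated failure before the enrollment is created")
except RuntimeError:
    pass
seats_after_failure = seats_left()

# Happy path: both writes become durable at the explicit commit.
with uow:
    uow.seats.reserve("py-101")
    enrollment_id = uow.enrollments.create("ada", "py-101")
    uow.commit()
seats_after_success = seats_left()
```

Rolling back unconditionally in `__exit__` is a deliberate choice in this sketch: a workflow that forgets to call commit() loses its writes visibly instead of persisting them by accident.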

Worked Example: Database Commit vs External Side Effects

Now add a notification. After enrollment succeeds, the learner should receive an email, or downstream systems should receive an event:

1. reserve seat
2. create enrollment
3. append audit record
4. publish EnrollmentCreated

It is tempting to put the publish call inside the same unit of work and believe the whole operation is atomic. That is the dangerous mistake. A database transaction can roll back database writes. It cannot automatically roll back an email already sent, a payment provider call, or a message published to a broker unless you have a stronger distributed transaction mechanism.

A safer shape is to commit the state change and record the side effect intent inside the same database transaction:

with uow:
    enrollment = uow.enrollments.create(learner_id, course_id)
    uow.outbox.add(
        event_type="EnrollmentCreated",
        payload={"enrollment_id": enrollment.id},
    )
    uow.commit()

Then a separate publisher reads the outbox table and delivers the event. This pattern does not make the world magically atomic. It makes the handoff recoverable: if the process crashes after commit but before publish, the outbox row is still there.
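The publisher side can be sketched as follows. This is an illustration using Python's standard sqlite3 module, with an assumed outbox schema (a nullable published_at column marks delivery) and a hypothetical `send` callable standing in for the broker or email client. It is not a prescribed implementation.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE enrollments (id INTEGER PRIMARY KEY, learner_id TEXT NOT NULL, course_id TEXT NOT NULL)"
)
conn.execute(
    "CREATE TABLE outbox ("
    "id INTEGER PRIMARY KEY, event_type TEXT NOT NULL, "
    "payload TEXT NOT NULL, published_at TEXT)"
)
conn.commit()


def enroll(learner_id, course_id):
    """Write the state change and the event intent in one transaction."""
    try:
        cur = conn.execute(
            "INSERT INTO enrollments (learner_id, course_id) VALUES (?, ?)",
            (learner_id, course_id),
        )
        enrollment_id = cur.lastrowid
        conn.execute(
            "INSERT INTO outbox (event_type, payload) VALUES (?, ?)",
            ("EnrollmentCreated", json.dumps({"enrollment_id": enrollment_id})),
        )
        conn.commit()
    except Exception:
        conn.rollback()
        raise
    return enrollment_id


def publish_pending(send):
    """Deliver unpublished rows in order; mark each only after send() returns."""
    rows = conn.execute(
        "SELECT id, event_type, payload FROM outbox "
        "WHERE published_at IS NULL ORDER BY id"
    ).fetchall()
    delivered = 0
    for row_id, event_type, payload in rows:
        send(event_type, json.loads(payload))  # if this raises, the row stays pending
        conn.execute(
            "UPDATE outbox SET published_at = datetime('now') WHERE id = ?",
            (row_id,),
        )
        conn.commit()
        delivered += 1
    return delivered


# Simulate a broker that is down for the first attempt.
sent = []
attempts = {"count": 0}


def flaky_send(event_type, payload):
    attempts["count"] += 1
    if attempts["count"] == 1:
        raise ConnectionError("broker unavailable")
    sent.append((event_type, payload))


enroll("ada", "py-101")
try:
    publish_pending(flaky_send)
except ConnectionError:
    pass  # nothing was marked published, so the event is not lost
delivered_on_retry = publish_pending(flaky_send)
```

Because a row is marked only after send() returns, a crash between the send and the mark leads to a second delivery on the next run. That is at-least-once delivery, so consumers of the event should be idempotent.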

This is where persistence boundaries connect to broader backend reliability. The unit of work defines the durable database fact. The outbox or workflow mechanism handles communication with the outside world after that fact exists.

Failure Modes and Design Limits

The first failure mode is a repository that mirrors the ORM instead of the use case. If the service still thinks in sessions, eager-loading flags, query builders, and row-specific exceptions, the boundary is not doing much.
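One way to keep row-specific exceptions out of the service is to translate them at the repository edge. The sketch below assumes Python's sqlite3 module and a UNIQUE constraint on (learner_id, course_id), neither of which the lesson specifies. SqliteEnrollments is a hypothetical name.

```python
import sqlite3


class EnrollmentRejected(Exception):
    """Domain error the application service already understands."""


class SqliteEnrollments:
    def __init__(self, conn):
        self.conn = conn

    def create(self, learner_id, course_id):
        try:
            cur = self.conn.execute(
                "INSERT INTO enrollments (learner_id, course_id) VALUES (?, ?)",
                (learner_id, course_id),
            )
        except sqlite3.IntegrityError as err:
            # Assumption for this sketch: with valid input, the uniqueness rule
            # is the only constraint that can fail here. A fuller implementation
            # would inspect the error before translating it.
            raise EnrollmentRejected("already_enrolled") from err
        return cur.lastrowid  # no commit: the caller owns the transaction


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE enrollments ("
    "id INTEGER PRIMARY KEY, learner_id TEXT NOT NULL, course_id TEXT NOT NULL, "
    "UNIQUE (learner_id, course_id))"
)

enrollments = SqliteEnrollments(conn)
first_id = enrollments.create("ada", "py-101")

try:
    enrollments.create("ada", "py-101")
    outcome = "created"
except EnrollmentRejected as err:
    outcome = str(err)
    cause = err.__cause__  # original driver error kept for logs and debugging
```

The service sees only EnrollmentRejected, while the original driver error stays attached as the cause for anyone debugging the storage layer.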

The second failure mode is a generic base repository that makes every operation look the same. A method named save(entity) may be fine in some systems, but it does not say whether enrollment creation should reserve a seat, enforce uniqueness, or append audit data.

The third failure mode is hidden commits. If repository methods commit internally, the service cannot see which writes rise or fall together. That creates accidental partial success.
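The effect of a hidden commit is easy to reproduce. In the sketch below, which uses Python's sqlite3 module with hypothetical class names, two seat repositories differ only in whether reserve() commits internally. The workflow fails after the reservation and the caller rolls back.

```python
import sqlite3


def make_db():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE courses (id TEXT PRIMARY KEY, available_seats INTEGER NOT NULL)")
    conn.execute("INSERT INTO courses VALUES ('py-101', 5)")
    conn.commit()
    return conn


class CommittingSeats:
    """Anti-pattern: the repository decides commit timing on its own."""

    def __init__(self, conn):
        self.conn = conn

    def reserve(self, course_id):
        self.conn.execute(
            "UPDATE courses SET available_seats = available_seats - 1 WHERE id = ?",
            (course_id,),
        )
        self.conn.commit()  # hidden commit: the caller can no longer undo this


class Seats:
    """Leaves the transaction open so the caller controls commit and rollback."""

    def __init__(self, conn):
        self.conn = conn

    def reserve(self, course_id):
        self.conn.execute(
            "UPDATE courses SET available_seats = available_seats - 1 WHERE id = ?",
            (course_id,),
        )


def seats_after_failed_enrollment(seats_cls):
    conn = make_db()
    seats = seats_cls(conn)
    try:
        seats.reserve("py-101")
        raise RuntimeError("simulated failure while creating the enrollment")
    except RuntimeError:
        conn.rollback()  # the workflow tries to undo everything
    row = conn.execute("SELECT available_seats FROM courses WHERE id = 'py-101'").fetchone()
    return row[0]


leaked = seats_after_failed_enrollment(CommittingSeats)
intact = seats_after_failed_enrollment(Seats)
```

With the committing repository the rollback arrives too late and the course is left with 4 seats and no enrollment. With the non-committing one the count stays at 5.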

The fourth failure mode is pretending a unit of work controls more than it does. It usually controls one persistence store or one transactional resource. External APIs, message brokers, and email systems need explicit delivery, retry, idempotency, or compensation strategies.

The limit is that no boundary removes the need to understand the database. Constraints, isolation levels, locks, indexes, and query plans still matter. A good persistence boundary keeps those concerns localized; it does not make them irrelevant.

Connections

The previous lesson on dependency injection explained how application services receive collaborators instead of constructing them. Repositories and units of work are common collaborators that the composition root wires to concrete database-backed implementations.

The next lesson on SOLID principles gives vocabulary for judging these boundaries. Repositories often apply dependency inversion, but they violate the spirit of SOLID when they become broad, leaky interfaces with unclear responsibilities.

Later tracks on database internals and distributed transactions will revisit the harder limits: isolation levels, deadlocks, two-phase commit, sagas, and outbox-based reliability.
