Day 032: Writing and Reviewing Architecture Proposals
A proposal is not there to prove that the author is smart; it is there to make the design legible enough that the team can challenge it before production does.
Today's "Aha!" Moment
By the time a team writes an architecture proposal, the technical problem is usually not "can we imagine a solution?" The real problem is social and operational: can we make our reasoning explicit enough that other engineers can test it, find the weak assumptions, and understand the costs we are about to lock in? A proposal is therefore not a slide deck, and it is not a ceremony. It is a decision instrument.
Think about the learning platform we have used across this month. The team wants to redesign progress tracking so that learner state converges reliably across devices, notifications stop dropping during bursts, and downstream analytics no longer interfere with the user-facing path. Several architectures could plausibly work. The proposal matters because the choice affects data ownership, retries, observability, rollout risk, and what future teams will inherit. If the document only says "we will use events and a read model," it has not done its job.
The right mental model is this: a good proposal is an argument with evidence. It explains the problem, the invariants, the options considered, the selected design, the failure behavior, and the rollout path. It also makes the costs explicit. What latency improves? What consistency gets weaker? What operational burden moves to another team? What new debugging difficulties appear? If those things are not written down, the implementation team will discover them later, when changing course is more expensive.
That is why proposal writing is the natural capstone for this month. Everything we covered earlier (events, CQRS, edge boundaries, resilience patterns, locality, workload shape, system evolution) only becomes useful in a team when it can be communicated and reviewed clearly. A good proposal turns private architectural intuition into shared engineering judgment.
Why This Matters
The problem: Architecture decisions often live in fragmented conversations, diagrams without context, and undocumented assumptions, which makes reviews shallow and implementation riskier than it needs to be.
Before:
- The team debates tools instead of agreeing on the problem and constraints.
- Failure modes, rollout risk, and operating cost are left implicit.
- Review comments feel subjective because the proposal does not expose its assumptions clearly.
After:
- The proposal states the problem, non-goals, invariants, and pressure points up front.
- Alternatives and trade-offs are visible enough to review meaningfully.
- Implementation and operations teams inherit a decision record, not just a diagram.
Real-world impact: Better proposals reduce rework, improve cross-team coordination, create durable engineering memory, and make incidents easier to analyze because the original assumptions were written down.
Learning Objectives
By the end of this session, you will be able to:
- Write a proposal as a technical argument - Structure problem framing, constraints, alternatives, and chosen design so the decision is inspectable.
- Review proposals for substance, not polish - Check invariants, failure behavior, rollout plan, and operational cost instead of reacting only to the diagram.
- Turn architecture into team memory - Use proposals as durable records that explain why a design was chosen and what risks were accepted.
Core Concepts Explained
Concept 1: A Proposal Must Frame the Problem So Clearly That the Reader Can Disagree Productively
The first section of a proposal should make the reader dangerous in the right way. After one page, they should understand what is broken, who is affected, what must remain true, and what the proposal is intentionally not solving. If that context is weak, every later debate becomes noisy because different reviewers are solving different problems in their heads.
For the learning platform, a weak opening would say: "We propose migrating progress tracking to an event-driven architecture." A strong opening would say something closer to: "Learner progress diverges across devices under retries and burst traffic. Notifications coupled to the write path amplify user-visible latency. We need cross-device convergence within seconds, durable recording of accepted progress events, and decoupled downstream processing. This proposal does not redesign quiz scoring or billing."
That difference matters because it gives the reader a stable evaluation frame. They now know the invariants, the target behavior, and the boundary of the proposal. Only then does the actual architecture discussion become fair.
One useful structure is:
problem
-> user/business impact
-> invariants and constraints
-> non-goals
-> current pain with evidence
The trade-off is that strong framing takes discipline and often exposes uncertainty early. But that is exactly the point. Ambiguity discovered in the proposal is far cheaper than ambiguity discovered during rollout.
Concept 2: The Middle of the Proposal Should Compare Real Options and Show Both Happy and Failure Paths
A proposal becomes credible when it proves the team evaluated alternatives instead of merely documenting the first idea that seemed workable. For the progress-tracking redesign, maybe the team considered:
- keep synchronous writes and harden retries
- add an append-only event log with asynchronous projections
- split the user-facing write path from downstream notification and analytics work
Reviewers should not just see the chosen option. They should see why the rejected options lost. Maybe synchronous coupling kept consistency simpler but could not isolate notification bursts from user writes. Maybe a fully event-sourced design was powerful but too operationally heavy for the current team. Those are design facts, and they belong in the document.
Just as important, the proposal should show both the happy path and at least one meaningful degraded path:
Client write
-> API accepts validated progress event
-> durable store/event log
-> projection updates learner view
-> notifier consumes asynchronously
If the notifier is down:
-> accepted event remains durable
-> learner progress still converges
-> notifications retry later
-> alerting fires on backlog growth
This is where many documents improve dramatically. A design that looks elegant on the happy path may become much less attractive once retries, backlog, idempotency, or rollback are made explicit. The trade-off is document length versus decision quality. Shorter is not always better if the missing sections hide the actual risk.
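The degraded path above can be made concrete with a toy model. This is an illustrative sketch, not a real implementation: all names (`ProgressPipeline`, `accept`, `drain_notifications`) are invented, the "durable log" and "read model" are in-memory stand-ins, and idempotency is modeled with a simple seen-ID set.

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class ProgressEvent:
    event_id: str   # client-supplied ID, so retries can be deduplicated
    learner_id: str
    lesson_id: str
    percent: int

@dataclass
class ProgressPipeline:
    """Toy model of the flow above: accept -> durable log -> projection -> notifier."""
    log: list = field(default_factory=list)      # stand-in for a durable event log
    seen: set = field(default_factory=set)       # idempotency guard for client retries
    view: dict = field(default_factory=dict)     # learner-facing read model (projection)
    backlog: list = field(default_factory=list)  # accepted events awaiting notification
    notifier_up: bool = True

    def accept(self, event: ProgressEvent) -> bool:
        # Duplicate deliveries are acknowledged but not re-applied.
        if event.event_id in self.seen:
            return False
        self.seen.add(event.event_id)
        self.log.append(event)  # durability comes first on the write path
        self.view[(event.learner_id, event.lesson_id)] = event.percent
        self.backlog.append(event)  # notification work is decoupled from the write
        self.drain_notifications()
        return True

    def drain_notifications(self) -> None:
        # Degraded path: while the notifier is down, accepted events simply
        # stay in the backlog. Progress has already converged in the view.
        if not self.notifier_up:
            return
        while self.backlog:
            self.backlog.pop(0)  # stand-in for "send the notification"
```

Walking the sketch through the failure scenario mirrors the flow above: with `notifier_up=False`, an accepted event still lands in the log and the read model, a retried duplicate is rejected, and the backlog grows until the notifier recovers and `drain_notifications` empties it.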
Concept 3: The Best Reviews Test Assumptions, Rollout Safety, and Operating Cost
A good review is not a vote on aesthetics. It is a structured attempt to break the reasoning before production does. That means reviewers should ask questions like:
- which invariant is most likely to be violated first?
- what dependency failure hurts the user path most?
- what metric tells us the rollout is safe?
- how do we roll back if data has already flowed into the new path?
- what new on-call or debugging burden are we creating?
This is also why rollout and observability belong in the proposal itself. If the document says only what the steady state should look like, it ignores the part where real systems are most dangerous: migration and partial adoption. The proposal should tell the team how to measure success, what fallback exists, and which signals prove the new design is healthy.
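One way to make "which signals prove the new design is healthy" concrete in a proposal is to define the alert condition itself. The function below is a hedged sketch: the sampling scheme, window, and threshold are invented for illustration and are not tied to any real monitoring stack.

```python
def backlog_growth_alert(samples: list[int], window: int = 3,
                         growth_threshold: int = 100) -> bool:
    """Illustrative health signal for the proposal's rollout section.

    `samples` is a list of notification-backlog sizes taken at regular
    intervals. The alert fires when the backlog grew by more than
    `growth_threshold` items across the last `window` samples, i.e. the
    notifier is falling behind rather than briefly lagging.
    """
    if len(samples) < window:
        return False  # not enough history yet to judge a trend
    recent = samples[-window:]
    return recent[-1] - recent[0] > growth_threshold
```

Writing the condition down this precisely in the proposal (rather than "we will alert on backlog") forces the team to agree, before rollout, on what "falling behind" actually means.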
One compact review checklist looks like this:
Can I restate the problem?
Can I name the invariant?
Can I explain why this option beat the alternatives?
Can I describe one failure path?
Can I describe the rollout and rollback?
Can I see the new operational burden?
The trade-off is speed versus rigor. Lightweight reviews feel faster, but shallow approval often only postpones the argument until the incident. Strong reviews front-load the disagreement where it is cheapest.
Troubleshooting
Issue: The proposal is mostly diagrams and component names.
Why it happens / is confusing: Diagrams feel efficient, and authors often assume the context is already shared. In reality, different reviewers infer different goals from the same picture.
Clarification / Fix: Add plain-language sections for the problem, invariants, non-goals, chosen trade-offs, and one degraded-path walkthrough. The diagram should support the argument, not replace it.
Issue: Review comments become opinion battles.
Why it happens / is confusing: When assumptions and alternatives are implicit, feedback sounds subjective because reviewers are attacking different invisible versions of the design.
Clarification / Fix: Force the proposal to state assumptions, alternatives, rollout criteria, and operating costs. That gives review a shared object to examine instead of turning it into a taste contest.
Advanced Connections
Connection 1: Architecture Proposals ↔ ADRs
The parallel: A large proposal explains a broad design move; ADRs preserve smaller decisions over time. Both exist to record why the team made a choice, not just what the final system looks like.
Real-world case: A migration proposal may eventually produce several ADRs covering data model choices, retry semantics, or observability decisions that were initially discussed in the larger document.
Connection 2: Proposal Reviews ↔ Incident Analysis
The parallel: Good proposals and good postmortems ask many of the same questions: what assumptions failed, what degradation path was expected, and what part of the system was under-modeled.
Real-world case: When a proposal documents backlog behavior, rollback strategy, and dependency failure handling, later incident analysis becomes much clearer because the original intent is visible.
Resources
Optional Deepening Resources
- These resources are optional and are not required for the core 30-minute path.
- [ARTICLE] Documenting Architecture Decisions
- Link: https://adr.github.io/
- Focus: A lightweight format for preserving architectural reasoning over time.
- [DOC] Google SRE Workbook
- Link: https://sre.google/workbook/table-of-contents/
- Focus: Operational thinking for rollouts, reliability, and failure handling that proposals should surface early.
- [ARTICLE] The C4 Model for Visualising Software Architecture
- Link: https://c4model.com/
- Focus: A simple way to make diagrams readable without letting them replace written reasoning.
Key Insights
- A proposal is a reviewable argument - Its job is to expose reasoning, assumptions, and accepted costs clearly enough that others can challenge them.
- Happy path alone is not architecture - A credible proposal shows alternatives, degraded behavior, and rollout safety.
- Good reviews create team memory - They turn private design intuition into a durable record that future engineers can operate, revisit, and improve.
Knowledge Check (Test Questions)
1. What should a strong proposal make clear before discussing tools or topology?
- A) The full implementation schedule.
- B) The problem, invariants, constraints, and non-goals.
- C) The preferred vendor stack.
2. Why should an architecture proposal include at least one degraded or failure path?
- A) Because the most important trade-offs often appear only when dependencies slow down, retry, or fail.
- B) Because reviewers prefer longer documents.
- C) Because every proposal should predict every possible outage in detail.
3. What makes an architecture review genuinely valuable?
- A) Approving the diagram quickly so implementation can start.
- B) Testing assumptions, alternatives, rollout safety, and operational cost before the design becomes expensive to change.
- C) Limiting feedback to style and naming consistency.
Answers
1. B: Without a clear problem frame, reviewers cannot evaluate whether the chosen design is appropriate or whether the trade-offs are justified.
2. A: Degraded paths expose the real behavior of the design under stress, which is where many critical trade-offs become visible.
3. B: The point of review is to challenge the reasoning while change is still cheap, not to rubber-stamp the first coherent-looking diagram.