Sensitivity Analysis - Identifying What Matters

LESSON

System Dynamics and Causal Modeling

008 30 min intermediate

Day 376: Sensitivity Analysis - Identifying What Matters

The core idea: Sensitivity analysis takes a calibrated, validated model and measures which assumptions actually move the outputs that matter, so teams know where to collect better data, where to add safety margin, and where apparent uncertainty is mostly noise.

Today's "Aha!" Moment

In 07.md, Harbor City validated its Seawall District flood-and-evacuation model on a holdout cloudburst. That was the hard gate: the model had to predict unseen behavior without being retuned. Now the city faces a more operational question. It cannot spend equal effort on every uncertain assumption in the model. Should it prioritize better pump telemetry, debris monitoring at the West Tunnel entrance, more precise household departure surveys, or another round of wall-design refinement?

Sensitivity analysis answers that by asking a narrower question than validation did. Instead of "does the model generalize to new evidence?" it asks "when the remaining credible assumptions move, which ones change the outputs we care about, and by how much?" Harbor City reruns the validated model across calibrated ranges for pump derating, tunnel debris accumulation, household departure delay, and surge height. It then measures what happens to tunnel closure time, flooded parcels, and the ranking of the candidate resilience plans.

The important realization is that not all uncertainty deserves the same attention. If a 5% change in pump derating barely changes parcel flooding but a modest shift in debris accumulation moves tunnel closure by twenty minutes, the city's next dollar should probably go toward tunnel monitoring and contingency planning rather than toward refining a low-impact parameter. Sensitivity analysis is how the team turns "the model is still uncertain" into "these are the assumptions that actually control the decision."

That does not make sensitivity analysis a proof of causation. It is leverage analysis inside the model and its chosen uncertainty ranges. Harbor City can learn that evacuation delay strongly influences route availability under the current model, but that alone does not prove which real-world intervention will shorten delay. That distinction becomes the subject of 09.md, where the curriculum moves from sensitivity to causal inference.

Why This Matters

Harbor City is choosing between a taller fixed barrier and a cheaper staged plan built around pump upgrades, debris management, and earlier evacuation triggers. After calibration and validation, both strategies still look plausible. The problem is that "plausible" is not enough for a budget decision tied to public safety. City staff need to know which assumptions can flip the recommendation and which can safely stay rough.

Without sensitivity analysis, every uncertainty gets discussed at the same volume. Analysts spend weeks refining low-impact parameters, decision-makers treat every caveat as equally threatening, and monitoring budgets drift toward what is easy to measure rather than what changes the outcome. The result is a model that appears sophisticated but still gives weak operational guidance.

Sensitivity analysis changes the workflow. It identifies which variables dominate tunnel closure timing, which ones mostly affect long-run flood depth, and which interactions create the dangerous region where the cheaper plan stops being safe. That lets Harbor City make three production-grade moves: collect better data on the assumptions that matter, attach safety margins to the outputs that are fragile, and stop pretending that every parameter needs the same level of precision.

Learning Objectives

By the end of this session, you will be able to:

  1. Explain what sensitivity analysis measures - Distinguish output sensitivity from validation accuracy and describe why the choice of output metric matters.
  2. Choose an appropriate sensitivity workflow - Compare local and global methods and match them to the decision, model structure, and interaction risk.
  3. Use sensitivity results without overclaiming - Turn ranked drivers into monitoring, data-collection, and policy-buffer decisions while keeping the limits of the model explicit.

Core Concepts Explained

Concept 1: Sensitivity is always about a specific output over a specific uncertainty range

The first mistake Harbor City could make is asking whether the model is "sensitive" in the abstract. Sensitive to what? Tunnel closure time, number of flooded parcels, emergency-route availability, capital cost regret, and plan ranking are different outputs. The same input can matter a great deal for one of them and hardly at all for another. Pump derating might strongly affect parcel flooding during a long event, while debris accumulation near West Tunnel might dominate the exact minute when buses can no longer pass.

That is why sensitivity analysis starts by naming the output tied to the decision. Harbor City is not building a generic leaderboard of important parameters. It is asking a set of decision-linked questions: which assumptions move tunnel closure enough to change evacuation routing, which assumptions change the ranking between the tall-wall plan and the staged plan, and which assumptions mostly widen uncertainty bands without changing the recommended action?

The second thing the city must specify is the uncertainty range. A parameter is not "important" because it can produce a huge effect under absurd values. It is important if it changes the outcome within the calibrated, decision-relevant range the city still considers plausible. That is why sensitivity analysis belongs after 06.md and 07.md. Calibration constrains the ranges, and validation establishes that the model is worth interrogating at all.

One practical way to keep that honest is to map outputs to the assumptions most likely to move them:

output                          key uncertain inputs
-----------------------------   ----------------------------------------
west_tunnel_close_min           debris choke rate, pump derating
homes_above_flood_threshold     surge height, drain blockage, pump derating
households_without_safe_route   departure delay, tunnel closure threshold
preferred_plan                  interaction of cost, closure timing, and depth

The trade-off is that decision-specific sensitivity work produces several ranked views instead of one clean headline number. That is more complex to explain, but the complexity is real. A single "importance score" often hides the fact that different parts of the same model drive different operational choices.
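
One way to keep that discipline is to hold the calibrated ranges and the decision-linked outputs in explicit data structures that every sensitivity run reads from. The names and bounds below are illustrative placeholders, not Harbor City's real calibration:

# Illustrative placeholders: calibrated range per uncertain input and the
# decision-linked outputs each sensitivity run must report.
calibrated_ranges = {
    "pump_derating":       (0.00, 0.30),   # fraction of rated pump capacity lost
    "debris_choke_rate":   (0.10, 0.90),   # relative blockage growth per hour
    "departure_delay_min": (10, 90),       # household departure delay, minutes
    "surge_height_m":      (1.2, 2.4),     # storm surge above datum, metres
}

decision_outputs = [
    "west_tunnel_close_min",
    "homes_above_flood_threshold",
    "households_without_safe_route",
    "preferred_plan",
]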

Concept 2: Local and global methods answer different questions about leverage

Once Harbor City has chosen its outputs and ranges, it still needs the right method. A local sensitivity analysis changes one parameter slightly around a baseline case and watches how the output responds. This is useful when the city wants to know whether today's operating point is fragile. If a 10% worsening in debris accumulation advances tunnel closure by twelve minutes near the current baseline, that is a strong signal that the evacuation trigger should include extra time buffer.
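
A minimal one-at-a-time sketch of that check, assuming a hypothetical run_harbor_city_model(params) wrapper around the calibrated simulation that returns the outputs named earlier; the baseline values and the 10% perturbation are illustrative:

# Local (one-at-a-time) sensitivity: worsen each input slightly around the
# baseline and record how far west_tunnel_close_min moves.
baseline = {"pump_derating": 0.10, "debris_choke_rate": 0.40,
            "departure_delay_min": 35, "surge_height_m": 1.6}
base_close = run_harbor_city_model(baseline)["west_tunnel_close_min"]

for name, value in baseline.items():
    perturbed = dict(baseline)
    perturbed[name] = value * 1.10          # 10% worsening of this input alone
    close = run_harbor_city_model(perturbed)["west_tunnel_close_min"]
    print(f"{name:22s} shifts tunnel closure by {close - base_close:+.1f} min")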

Local analysis is fast and interpretable, but it can miss the structure that actually matters. Seawall District is full of interactions. Departure delay may be harmless when the tunnel stays open late, yet become critical when debris accumulation is high and pump performance is degraded. That is where global methods matter. Harbor City samples the full calibrated ranges, reruns the model many times, and estimates which parameters contribute most to output variation and which combinations create nonlinear jumps.

In practice, the city can stage the analysis instead of choosing one method forever:

# Stage 1: cheap Morris screening across the full calibrated ranges.
samples = sample_parameter_space(calibrated_ranges, method="morris")
results = [run_harbor_city_model(s) for s in samples]
screened = rank_elementary_effects(samples, results)   # coarse importance ranking

# Stage 2: spend the expensive variance-based runs only on the inputs
# that survived screening.
focused_ranges = narrow_to_high_impact_inputs(screened)
sobol_report = estimate_variance_contributions(
    model=run_harbor_city_model,
    ranges=focused_ranges,
    outputs=["west_tunnel_close_min", "preferred_plan"],
)

The workflow reflects two common production needs. Screening methods such as Morris help when the model has many uncertain knobs and the team needs a cheap first pass. Variance-based methods such as Sobol are slower, but they separate main effects from interaction effects and give Harbor City a stronger basis for saying not only that debris matters, but whether it matters mostly on its own or mainly through its interaction with departure delay and surge height.
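
For teams that want a concrete library behind the sketch above, a variance-based pass could look roughly like the following, assuming the SALib package is available and reusing the hypothetical run_harbor_city_model wrapper; the bounds and sample size are placeholders:

# A minimal Sobol sketch with SALib (assumed installed), analyzing one
# decision-linked output; run_harbor_city_model is hypothetical.
import numpy as np
from SALib.sample import saltelli
from SALib.analyze import sobol

problem = {
    "num_vars": 4,
    "names": ["pump_derating", "debris_choke_rate",
              "departure_delay_min", "surge_height_m"],
    "bounds": [[0.0, 0.3], [0.1, 0.9], [10, 90], [1.2, 2.4]],
}

X = saltelli.sample(problem, 1024)          # N * (2D + 2) parameter sets
Y = np.array([
    run_harbor_city_model(dict(zip(problem["names"], row)))["west_tunnel_close_min"]
    for row in X
])

Si = sobol.analyze(problem, Y)              # first-order and total-order indices
for name, s1, st in zip(problem["names"], Si["S1"], Si["ST"]):
    print(f"{name:22s} main effect {s1:.2f}   total (incl. interactions) {st:.2f}")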

The trade-off is compute and interpretation burden. Global analysis needs many more runs, careful handling of stochastic simulations, and credible uncertainty ranges. But skipping it when interactions are plausible creates false confidence. Harbor City's decision is not fragile because one variable is noisy; it is fragile because several variables can push the district across the same threshold together.

Concept 3: The result is an action hierarchy, not just a chart

Suppose Harbor City's final sensitivity report shows three patterns. First, predicted tunnel closure time is dominated by debris accumulation near the tunnel mouth, with pump derating as a secondary driver. Second, total flooded parcels are driven mostly by surge height and only moderately by tunnel assumptions. Third, the policy ranking between the tall barrier and the staged plan flips only when departure delay and debris accumulation are both in the worse half of their calibrated ranges. That is already more useful than a generic claim that "uncertainty remains."

Those results change what the city should do next. If debris sensitivity is high, the city should fund better debris monitoring, inspect the choke point more often, and include a conservative route-closure buffer in operations. If departure delay matters only in combination with early tunnel loss, the emergency office should test whether alert timing and transport staging can keep the district out of that interaction regime. If a parameter barely moves any decision-relevant output, Harbor City can stop spending scarce analyst time refining it.

Sensitivity analysis therefore turns into governance. It tells the city where to place sensors, which uncertainty deserves scenario buffers in policy memos, which assumptions must be refreshed before the next budget cycle, and which outputs should appear on a decision dashboard. In a production setting, that is the real payoff. The goal is not a prettier tornado chart. It is a shorter list of assumptions that leadership must actively manage.

But the city still has to be disciplined about what the ranking means. A parameter can be highly sensitive because the model structure amplifies it, because the chosen range is wide, or because the output threshold is sharp. That does not automatically tell Harbor City which intervention will causally improve the district. A sensitivity ranking is a map of leverage inside the modeled system. The next lesson, 09.md, takes the next step and asks which levers can be interpreted as causal interventions rather than as correlated model drivers.

Troubleshooting

Issue: Every parameter looks important, so the ranking does not help prioritize anything.

Why it happens / is confusing: The team may be mixing very wide ranges, incomparable outputs, or uncalibrated parameters, which lets almost any variable produce dramatic movement somewhere.

Clarification / Fix: Constrain the ranges using calibration, analyze one decision-linked output at a time, and normalize comparisons so Harbor City is comparing plausible influence, not arbitrary scale differences.
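
One common normalization is an elasticity-style measure, the percent change in the output per percent change in the input, which puts inputs with different units on the same footing. A minimal sketch:

# Elasticity-style normalization: percent output change per percent input
# change (assumes non-zero baselines), so inputs in different units compare.
def elasticity(base_in, new_in, base_out, new_out):
    return ((new_out - base_out) / base_out) / ((new_in - base_in) / base_in)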

Issue: A one-at-a-time check says pump derating dominates, but later analysis shows debris and departure delay matter more together.

Why it happens / is confusing: Local sensitivity sees the slope near one baseline. It does not automatically reveal interactions or threshold behavior elsewhere in the plausible range.

Clarification / Fix: Keep local analysis for near-baseline fragility, then add a global method when nonlinear interactions are credible. In Seawall District, joint failure modes are part of the real mechanism, not an edge case.

Issue: Decision-makers treat the most sensitive parameter as the one they should intervene on immediately.

Why it happens / is confusing: Sensitivity results feel quantitative, so they are easy to mistake for causal proof or implementation priority by themselves.

Clarification / Fix: Pair the ranking with intervention feasibility, data quality, and causal reasoning. A high-sensitivity variable tells Harbor City where the model is exposed, not automatically which real-world policy will work best.

Advanced Connections

Connection 1: Validation ↔ Sensitivity Analysis

Validation in 07.md established that Harbor City's model could predict a holdout storm without retuning. Sensitivity analysis builds on that foundation. It asks which remaining uncertainties still control the outputs after the model has already earned conditional trust. If validation fails, sensitivity rankings are mostly noise. If validation passes, those rankings become a disciplined way to focus monitoring and safety margin.

Connection 2: Sensitivity Analysis ↔ Causal Inference

Sensitivity analysis ranks leverage inside the model. Causal inference, which arrives in 09.md, asks whether changing a variable in the real world would actually produce the modeled improvement rather than merely move alongside some hidden cause. Harbor City needs both views. Sensitivity tells the city where the decision is exposed; causal inference tells it which exposed levers are defensible targets for intervention.


Key Insights

  1. Sensitivity is output-specific - An input that strongly moves tunnel closure may barely matter for total flooded area, so importance must be tied to the decision surface.
  2. Method choice changes what you can see - Local analysis reveals near-baseline fragility, while global analysis reveals interactions, nonlinear thresholds, and full-range influence.
  3. The real deliverable is prioritization - Good sensitivity work tells a team where to measure better, where to add margin, and where not to waste effort.