Laws of Caution: Amdahl's Law and Gustafson's Law
Before you invest in parallelizing a workload, two fundamental laws tell you what speedup is theoretically achievable. They often deliver sobering news — and that is precisely why they are called laws of caution.
The Core Problem
Every real program has two parts:
- A sequential fraction
s(0 ≤ s ≤ 1): code that cannot be parallelized (initialization, I/O, serial algorithms, synchronization barriers) - A parallel fraction
(1 - s): code that can run on N processors simultaneously
The sequential fraction is the bottleneck. Both laws quantify its impact — but they make different assumptions.
---
Amdahl's Law (1967)
Assumption: The problem size is fixed. You are speeding up a fixed workload.
Formula:
Speedup(N) = 1 / ( s + (1 - s) / N )
Where:
s= sequential fraction (fraction of execution time that is inherently serial)N= number of processors1 - s= parallelizable fraction
Worked Example — Amdahl
Let s = 0.10 (10% sequential) and N = 16 processors.
Speedup(16) = 1 / (0.10 + (1 - 0.10) / 16)
= 1 / (0.10 + 0.90 / 16)
= 1 / (0.10 + 0.05625)
= 1 / 0.15625
≈ 6.40×
Even with 16 processors and only 10% serial code, you get at most 6.4× speedup — not 16×. The serial 10% becomes the dominant bottleneck.
Amdahl's Ceiling (as N → ∞):
Speedup_max = 1 / s = 1 / 0.10 = 10×
No matter how many processors you add, you can never exceed 10× speedup if 10% of the code is serial. This is a hard upper bound.
---
Gustafson's Law (1988)
Assumption: The problem size scales with the number of processors (weak scaling). More processors are used to solve bigger problems in the same time.
Formula:
Speedup(N) = N − s · (N − 1)
Equivalently: Speedup(N) = s + N · (1 − s)
Worked Example — Gustafson
Same parameters: s = 0.10, N = 16.
Speedup(16) = 16 - 0.10 × (16 - 1)
= 16 - 0.10 × 15
= 16 - 1.5
= 14.5×
Gustafson's Law predicts 14.5× speedup — much more optimistic than Amdahl's 6.4×. Why? Because with 16× more processors, you solve a 16× larger problem. The serial fraction stays the same absolute size but becomes a smaller fraction of the total (now much larger) work.
---
Conceptual Contrast
Comparison Table
| Dimension | Amdahl's Law | Gustafson's Law |
|---|---|---|
| Workload size | Fixed (strong scaling) | Grows with N (weak scaling) |
| Scaling type | Strong scaling | Weak scaling |
| Serial fraction | Grows as % of total work as N increases | Stays constant in absolute terms |
| Implication | Hard ceiling on speedup; discourages large N | Encourages scaling to large N for big problems |
| Optimistic? | Pessimistic | More optimistic |
| Best applied to | Speeding up an existing fixed job | Solving larger problems in the same time budget |
Which Law to Use?
- Amdahl: Use when you want to know "How fast can I make this specific job?"
- Gustafson: Use when you want to know "If I add more resources, what bigger problem can I solve in the same time?"
Cloud example: A company wants to train a larger neural network model. Each doubling of GPUs (Gustafson) doubles the model size they can train in the same wall-clock time — near-linear scaling. But speeding up a fixed model (Amdahl) hits diminishing returns past ~64 GPUs due to communication overhead (the serial fraction s).
Practical Implications
- Minimize s: Every 1% reduction in the serial fraction is more valuable than doubling your hardware.
- Profile first: Identify the serial bottleneck before buying more nodes.
- Choose the right law: Don't use Amdahl's formula to justify weak-scaling workloads, or you'll underestimate potential gains.
- Communication overhead: In real distributed systems,
sincludes synchronization and network latency — often larger than the algorithmic serial fraction.