3.1 NP-hard in practice: the coping strategies
P ≠ NP (presumably) forbids only exact + worst-case-fast + general. Drop any one:
| Drop | Strategy | Example |
|---|---|---|
| exactness | approximation algorithms (this lesson) | 2-approx vertex cover |
| worst-case speed | exact exponential / branch-and-bound / SAT solvers | CDCL solvers |
| generality | special cases, parameterised algorithms | trees, small treewidth, FPT |
| guarantees | heuristics & metaheuristics | simulated annealing, genetic algorithms |
3.2 Approximation algorithms — guaranteed near-optimality
An algorithm A is a ρ-approximation for a minimisation problem if for every instance, A(x) ≤ ρ · OPT(x) (for maximisation: A(x) ≥ OPT(x)/ρ). Polynomial running time + a proved bound — not a vibe, a theorem.
Vertex Cover — the classic 2-approximation:
while edges remain:
pick any edge (u, v); add BOTH u and v to the cover;
delete every edge touching u or v
Proof of ratio 2: the picked edges form a matching M; any cover (even optimal) needs ≥ 1 endpoint of each matched edge, so OPT ≥ |M|; we used 2|M| vertices. ∎ (Improving 2 − ε is impossible under the Unique Games Conjecture — even approximation has a complexity theory.)
Metric TSP — Christofides' algorithm (ratio 3/2): MST + minimum perfect matching on odd-degree vertices + Euler tour + shortcut. Ratio: MST ≤ OPT, matching ≤ OPT/2. For general (non-metric) TSP, any constant-factor approximation is NP-hard — approximability splits where exact complexity didn't.
Set Cover — ln n-approximation: greedily take the set covering the most uncovered elements; ratio H(n) ≈ ln n, and Θ(ln n) is provably the limit unless P = NP.
Knapsack — an FPTAS: for any ε, a (1+ε)-approximation in time poly(n, 1/ε) by scaling values down and running the pseudo-polynomial DP. The best possible world for an NP-hard problem.
The approximability spectrum (each level provably distinct unless P = NP):
FPTAS (knapsack) ⊃ PTAS (Euclidean TSP) ⊃ constant ρ (VC: 2, metric TSP: 1.5)
⊃ logarithmic (set cover: ln n) ⊃ near-none (general TSP, max-clique: n^(1-ε) hard)
That hardness-of-approximation column exists because of the PCP theorem — which belongs to the second half of this lesson.
3.3 Interactive proofs — proofs as conversations
Generalise "certificate + verifier": allow a dialogue between an all-powerful (untrusted!) prover and a polynomial-time randomised verifier who must be convinced with high probability (completeness: true claims convincible; soundness: false claims rejected w.h.p. against any prover strategy).
IP = languages with such interactive proof systems.
The flagship example — Graph Non-Isomorphism (GNI): no short classical certificate is known ("these graphs differ" — how would you show it?), yet a two-round interaction works:
Verifier: secretly picks i ∈ {1,2} at random, sends a random
vertex-relabelling (isomorphic copy) H of G_i.
Prover: must answer which graph H came from.
If G₁ ≇ G₂, the all-powerful prover identifies the source every time → always convinces. If G₁ ≅ G₂, H is statistically identical from either source → the prover can only guess: caught with probability 1/2 per round, amplified by repetition. ∎ Randomness + interaction certify a statement that seemed uncertifiable.
The astonishing theorem (Shamir 1992):
IP = PSPACE.
Interaction + randomness exactly captures polynomial space — a conversation can convince you of TQBF truths, i.e. of winning strategies. (Proof core: arithmetisation — turn the formula into a polynomial identity and have the prover defend it sum-by-sum.) This is also a quintessential non-relativising result — the kind of technique P vs NP is waiting for.
Zero-knowledge proofs (the applied face): interactive proofs revealing nothing but validity — the verifier learns the statement is true and zero else. Born theoretical (Goldwasser–Micali–Rackoff), now deployed: authentication protocols, privacy-preserving credentials and blockchain systems (zk-SNARKs) run on this unit's ideas.
PCP — the bridge back to approximation: the PCP theorem (Arora–Safra, Arora–Lund–Motwani–Sudan–Szegedy 1992): every NP statement has a proof checkable by reading O(1) bits at random positions (NP = PCP(log n, 1)). Equivalent restatement: approximating MAX-3SAT beyond a fixed constant is NP-hard — the source of essentially all hardness-of-approximation results in §3.2.
3.4 Two worked approximation analyses (exam-ready)
Vertex cover, traced. G has edges (a,b), (b,c), (c,d), (d,e). Pick (a,b) → add a and b, delete (a,b) and (b,c). Pick (c,d) → add c and d, delete (c,d) and (d,e). Cover = {a, b, c, d}, size 4. The picked edges {(a,b), (c,d)} form a matching, so OPT ≥ 2 — and indeed {b, d} covers everything: OPT = 2, ratio exactly 2. The bound is tight: on this very family the algorithm genuinely pays double, so the analysis cannot be improved — only the algorithm could be (and, under the Unique Games Conjecture, essentially cannot).
Metric TSP — the simpler 2-approximation (know it even when Christofides is the headline):
1. Build a minimum spanning tree of the cities.
2. Walk the tree depth-first, listing vertices as first visited ("walk it twice around").
3. Shortcut repeated visits — the triangle inequality guarantees shortcuts never cost more.
Cost ≤ 2·MST ≤ 2·OPT, because deleting one edge of the optimal tour leaves a spanning tree, so MST ≤ OPT. ∎ Christofides replaces "walk twice" by "add a minimum matching on the odd-degree vertices", paying OPT/2 instead of a second OPT — that is the entire 3/2 story in one sentence.
3.5 PTAS and FPTAS — definitions with teeth
| Scheme | Guarantee | Running time | Example |
|---|---|---|---|
| PTAS | (1+ε)·OPT for every fixed ε > 0 | polynomial in n for each fixed ε (may be n^O(1/ε)) | Euclidean TSP |
| FPTAS | (1+ε)·OPT | polynomial in both n and 1/ε | knapsack |
Knapsack's FPTAS in three sentences. Let v be the largest item value; scale every value down to ⌊vᵢ/K⌋ with K = εv/n, and run the value-indexed pseudo-polynomial DP on the scaled instance — time O(n³/ε). Scaling loses at most nK = εv ≤ ε·OPT of value, so the returned solution is ≥ (1−ε)·OPT. The trick — trading numeric precision for DP speed* — is the canonical exploitation of pseudo-polynomiality (weak NP-completeness, Unit 3).
Converse fact worth a mark: a strongly NP-complete problem has no FPTAS unless P = NP — an FPTAS with ε small enough would recover the exact optimum in polynomial time, contradicting strong hardness. One more clean bridge between Unit 3's classification and what is achievable here.
3.6 IP formally — the two probability promises
A language L has an interactive proof system if there is a polynomial-time randomised verifier V with:
- Completeness: x ∈ L ⟹ some prover strategy makes V accept with probability ≥ 2/3.
- Soundness: x ∉ L ⟹ every prover strategy (even an all-powerful cheater) makes V accept with probability ≤ 1/3.
The constants are arbitrary: k independent repetitions drive the error to 2^−Ω(k) (amplification). Note who is trusted: nobody — the prover is adversarial; only the protocol protects the verifier. In the GNI protocol, completeness holds with probability 1 and the soundness error is exactly 1/2 per round.
Exam pointer
① "Define ρ-approximation; give the vertex-cover algorithm and prove ratio 2" — algorithm + matching argument: six marks of pure preparation. ② "Short note: PTAS vs FPTAS" — §3.5's table plus the knapsack sentences. ③ "Define interactive proofs; describe the GNI protocol" — the two promises, the coin-flip protocol, why a cheating prover wins at most half the rounds. ④ "State Shamir's theorem / the PCP theorem and their significance" — IP = PSPACE; NP = PCP(log n, 1) and its equivalence to hardness of approximating MAX-3SAT.
Self-check
- Why do the picked edges in the vertex-cover algorithm form a matching, and where exactly does the proof use it?
- Prove MST ≤ OPT for metric TSP in one sentence.
- Which property of knapsack makes an FPTAS possible, and why does strong NP-completeness forbid one?
- In GNI, why can a cheating prover succeed with probability at most 1/2 per round when G₁ ≅ G₂?
- What does NP = PCP(log n, 1) say, in words, about checking proofs?
3.7 The course in one paragraph
We built the measuring tools (Big-O, P, NP), formalised the machines (TMs, non-determinism, circuits, RAMs), proved one problem universal (Cook–Levin) and learned to spread its hardness by reduction (3-SAT, CLIQUE, gadgets). Facing the P vs NP wall, we mapped the territory above it (PSPACE, EXP, games) and the engineering responses to it (approximation with proved ratios, interaction with randomness). The single most valuable habit you take away: when you meet a problem, locate it on the map before you try to conquer it.