1.1 The statement
Cook–Levin Theorem (1971/1973). SAT — the satisfiability problem for Boolean formulas in CNF — is NP-complete.
This is the keystone: the first problem proved NP-complete without using a reduction from another NP-complete problem (there were none yet!). Every subsequent NP-completeness proof stands on it.
1.2 Reminder: what must be proved
- SAT ∈ NP — easy: certificate = a satisfying assignment; verification = evaluate the CNF formula, linear time. ✓
- SAT is NP-hard — the hard part: every language L ∈ NP reduces to SAT. We know nothing about L except that some NTM M accepts L in time nᵏ. So the reduction must encode "machine M accepts input x" as a formula. The slogan:
Computation is local — and local rules are Boolean logic.
1.3 The tableau: computation as a table
Fix L ∈ NP with NTM M running in time t = nᵏ. For input x, an accepting computation of M is a tableau: a t × t table whose row i is M's configuration (tape + state + head position) at step i.
row 0: q0 x1 x2 x3 ... ⊔ ⊔ ← start configuration
row 1: (configuration after step 1)
row 2: (configuration after step 2)
...
row t: ... q_accept ... ← accepting configuration
x ∈ L ⟺ some valid accepting tableau exists. The reduction builds a formula φ that says exactly "a valid accepting tableau exists".
1.4 The variables and the four constraint groups
Variables: x[i, j, s] = "cell (i, j) of the tableau contains symbol s", where s ranges over tape symbols and (state, symbol) composites. Count: t × t × |C| = O(n²ᵏ) variables — polynomial ✓.
φ = φ_cell ∧ φ_start ∧ φ_move ∧ φ_accept:
| Part | Says | How |
|---|---|---|
| φ_cell | every cell holds exactly one symbol | per cell: (at least one) ∧ (no two): (⋁ₛ x[i,j,s]) ∧ ⋀_{s≠s'} ¬(x[i,j,s] ∧ x[i,j,s']) |
| φ_start | row 0 is the start configuration on x | conjunction of literals fixing row 0 |
| φ_move | each row follows from the previous by one legal move of M | windows — see below |
| φ_accept | the accept state appears somewhere | ⋁ᵢⱼ x[i, j, q_accept] |
1.5 φ_move — the 2×3 window trick (the heart of the proof)
Because a TM changes only the cells around its head, correctness of a step is checkable locally: row i+1 follows legally from row i iff every 2×3 window of adjacent cells is "legal" — consistent with δ (or with cells far from the head simply staying put).
col j-1 col j col j+1
row i [ a q,b c ] e.g. δ allows (q,b) → write d, move L, state p
row i+1[ p,a d c ] ← a LEGAL window
M has a constant number of states and symbols, so there is a constant list of legal window patterns. φ_move = ⋀ over all O(t²) window positions of "this window's contents form a legal pattern" — each a constant-size Boolean condition.
1.6 Wrapping up the proof
- φ is satisfiable ⟺ a valid accepting tableau exists ⟺ M accepts x ⟺ x ∈ L. (Non-determinism is absorbed naturally: φ_move allows any δ-legal successor, so satisfying assignments correspond to accepting branches.)
- Size check: O(n²ᵏ) variables, O(n²ᵏ) constant-size clause groups — the formula is polynomial in n and computable in polynomial time. ✓
- Conversion to CNF costs only a constant factor per window (each constant-size condition expands to constantly many clauses). ∎
Hence every NP problem ≤ₚ SAT: SAT is NP-complete.
1.7 Corollary machine — why one theorem built a field
With Cook–Levin paid for, proving any new problem B NP-complete no longer needs tableaux — just:
- B ∈ NP (exhibit a certificate), and
- KNOWN ≤ₚ B for one known NP-complete problem (transitivity of ≤ₚ does the rest).
Karp's 21 problems (1972) cascaded from exactly this, and Levin proved the same result independently in the USSR — hence the joint name. The theorem also reframes P vs NP concretely: P = NP ⟺ SAT has a polynomial-time algorithm. One concrete problem now carries the whole question.
1.8 Why windows of size 2×3 (and not single cells)?
A cell of row i+1 cannot be validated in isolation: whether it should change depends on whether the head was adjacent in row i. A 2×3 window is the smallest frame that always carries enough context — in one step the head affects at most the cell it sits on and one neighbour, so every illegal transition is caught inside some 2×3 frame, while genuinely legal tableaux pass all frames (cells far from the head must simply copy themselves, and "copy yourself" is itself a legal window pattern). If asked "why do local checks suffice?", the one-line answer: a Turing machine step is a local event.
1.9 The size audit, made explicit (a favourite follow-up question)
Let t = nᵏ and let c₀ be the (constant!) number of tableau symbols — tape symbols plus (state, symbol) composites:
| Piece of φ | How many | Size of each | Total |
|---|---|---|---|
| Variables x[i,j,s] | t² positions × c₀ symbols | — | O(n²ᵏ) |
| φ_cell | t² cells | O(c₀²) clauses | O(n²ᵏ) |
| φ_start | one literal per cell of row 0 | O(1) | O(nᵏ) |
| φ_move | t² window positions | O(1) clauses — the legal-window list is constant | O(n²ᵏ) |
| φ_accept | one disjunction | t² literals | O(n²ᵏ) |
Everything is polynomial because the machine M is fixed: |Q|, |Γ| and hence the legal-window list are constants independent of the input x. That sentence — "constant because the machine is fixed" — is the examiners' favourite checkpoint.
1.10 A seven-line reproduction skeleton (memorise this shape)
- Claim: SAT is NP-complete. SAT ∈ NP: certificate = a satisfying assignment; evaluate in linear time. ✓
- Hardness: let L ∈ NP be decided by NTM M in time t = nᵏ; we reduce L to SAT.
- Accepting computations of M on x correspond to t×t tableaux of configurations.
- Variables x[i,j,s]: "cell (i,j) holds symbol s"; s ranges over a constant symbol set.
- φ = φ_cell ∧ φ_start ∧ φ_move ∧ φ_accept, where φ_move says every 2×3 window is legal (a constant pattern list, since M is fixed).
- φ satisfiable ⟺ a valid accepting tableau exists ⟺ M accepts x ⟺ x ∈ L; size and construction time O(n²ᵏ). ✓
- Hence every L ∈ NP satisfies L ≤ₚ SAT; with line 1, SAT is NP-complete. ∎
Practise writing these seven lines from memory — at 8–10 marks this is the single most valuable proof in the course.
1.11 What Cook–Levin does NOT say (viva traps)
- It does not say SAT is unsolvable, or even provably exponential — only that SAT is as hard as everything in NP. A polynomial SAT algorithm is fully consistent with the theorem; it would simply prove P = NP.
- The reduction does not simulate M at runtime — it writes a formula about the tableau; no computation of M is executed while constructing φ.
- CNF is not a lucky accident: each window condition is constant-size, so converting it to CNF clauses costs a constant factor — which is exactly why CNF-SAT (and then 3-SAT) inherit completeness cleanly.
Self-check
- Which half of SAT's NP-completeness is the easy half, and what is the certificate?
- Why is the list of legal windows constant-sized?
- Where does the proof use that M is non-deterministic — and what would change for a deterministic M? (φ_move permits any δ-choice; for a DTM the windows are functionally determined — the proof works verbatim.)
- State the variable count as a function of n and k.
- Complete the sentence: "P = NP if and only if ______ has a polynomial-time algorithm."