1.1 Why we need a formal machine
Statements like "no polynomial algorithm exists" quantify over all possible algorithms — impossible unless "algorithm" has a mathematical definition. The Turing machine (Alan Turing, 1936) is that definition: simple enough to prove theorems about, powerful enough (by the Church–Turing thesis) to express every effective computation.
1.2 The model
A Turing machine has an infinite tape of cells, a read/write head, and a finite control (program). Formally a 7-tuple M = (Q, Σ, Γ, δ, q₀, q_accept, q_reject):
| Component | Meaning |
|---|---|
| Q | finite set of states |
| Σ | input alphabet (no blank symbol ⊔) |
| Γ | tape alphabet, Σ ⊂ Γ, ⊔ ∈ Γ |
| δ : Q × Γ → Q × Γ × {L, R} | transition function: read a symbol → write a symbol, move Left/Right, change state |
| q₀, q_accept, q_reject | start / accept / reject states |
A configuration = (state, tape contents, head position) — a full snapshot. A computation is a sequence of configurations; the machine accepts, rejects, or loops.
- A language is decidable if some TM accepts every string in it and rejects every string not in it (always halts).
- TIME(f(n)) = languages decidable by a TM within f(n) steps; SPACE(f(n)) likewise for cells visited. Then P = ⋃ₖ TIME(nᵏ).
1.3 A concrete TM — deciding {0ⁿ1ⁿ}
Idea: repeatedly cross off one 0 and one matching 1.
1. If tape starts with 1 → reject. If blank → accept.
2. Read a 0, overwrite with X, move right past 0s and Ys.
3. Find the first 1, overwrite with Y; if none → reject.
4. Return to the leftmost unmarked symbol; repeat.
5. When only X/Y remain with no leftover 0s or 1s → accept.
Each pass walks the tape (O(n)) and crosses off two symbols, so n/2 passes: time O(n²), space O(n) — your first complexity analysis of a machine rather than of pseudocode.
1.4 Variants — and why they change (almost) nothing
| Variant | What changes | Simulation cost on a basic TM |
|---|---|---|
| Multi-tape (k tapes) | separate input/work/output tapes | t(n) steps → O(t²(n)) single-tape |
| Two-way infinite tape | tape extends both directions | constant-factor (fold the tape) |
| Larger alphabet | more symbols per cell | constant-factor (encode in binary) |
| Multi-head / 2-D tape | extra flexibility | polynomial |
| Non-deterministic | δ returns a set of moves | exponential (as far as we know!) — next lesson |
| Universal TM | one machine simulating any other from its description | polynomial overhead |
The takeaway (state this in exams): all deterministic variants simulate each other with at most polynomial overhead, so the class P is model-independent — robust exactly as claimed in Unit 1. The one variant that may cost exponentially is non-determinism, and that gap is the P vs NP question.
The Universal TM deserves a sentence of awe: a single machine U that, given ⟨M, x⟩, simulates M on x. It is the theoretical blueprint of the stored-program computer — software as data.
1.5 Limits: decidability vs complexity
Complexity theory lives strictly inside the decidable world:
- The halting problem — does M halt on x? — is undecidable (no TM solves it at all, in any amount of time; proved by diagonalisation).
- Complexity asks the next question: among problems that can be solved, how much time/space must we pay?
Diagonalisation also yields the theorems that give complexity theory its skeleton — the hierarchy theorems: with (slightly) more time or space, a TM can decide strictly more. Hence TIME(n) ⊊ TIME(n³), P ⊊ EXP, L ⊊ PSPACE. More resources genuinely buy more power — the classes of Unit 4 really are different rungs of a ladder.
1.6 Worked trace — the {0ⁿ1ⁿ} machine on input 0011
Track the tape pass by pass (X = crossed-off 0, Y = crossed-off 1):
Pass 1: 0011 → X011 (cross first 0) → scan right → X0Y1 (cross first 1) → return left
Pass 2: X0Y1 → XXY1 (cross next 0) → scan right → XXYY (cross next 1) → return left
Check: the head sweeps XXYY⊔ — no unmarked 0 or 1 remains → q_accept ✓
Now the two failure modes: on input 011 the second pass finds a 1 with no matching 0 left — reject; on input 0 0 1 a 0 survives the final check — reject. Tracing one accepting and one rejecting run is exactly what "design a TM and show its working" questions expect, and it is also where you read off the bound again: 2 passes of length ≤ 4 — in general n/2 passes of O(n) steps = O(n²).
1.7 Configurations, yields, and acceptance — the formal layer
Write a configuration compactly as u q v: tape contents uv, current state q, head on the first symbol of v. The start configuration on input x is q₀x. Configuration C yields C′ if one application of δ transforms C into C′ (for example, ua q bv yields u q′ acv when δ(q, b) = (q′, c, L)). Then:
M accepts x iff there is a sequence C₀, C₁, ..., Cₖ with C₀ = q₀x, each Cᵢ yields Cᵢ₊₁, and the state in Cₖ is q_accept.
This yields relation is precisely what the Cook–Levin proof converts into Boolean window constraints in Unit 3 — learn it here, reuse it there.
1.8 The Church–Turing thesis — and its complexity-theoretic sharpening
Church–Turing thesis: every effectively computable function is computable by a Turing machine. It is a thesis, not a theorem — it links an informal notion ("effective procedure") to a formal one, so it cannot be proved; every proposed model (λ-calculus, RAMs, your laptop) has confirmed it.
Extended / Cobham version: every reasonable deterministic model simulates a TM with at most polynomial overhead — hence P is the same class in all of them. Quantum computing is the one serious challenger to the extended version (BQP may exceed classical polynomial time); it does not touch the original thesis at all.
1.9 The hierarchy theorems, stated for the exam
| Theorem | Statement (informal but precise enough) | Tool |
|---|---|---|
| Time hierarchy | if f(n)·log f(n) = o(g(n)), then TIME(f) ⊊ TIME(g) | diagonalisation against all f-time machines |
| Space hierarchy | if f(n) = o(g(n)) (both ≥ log n), then SPACE(f) ⊊ SPACE(g) | same idea, cleaner (no log factor) |
Corollaries worth quoting verbatim: P ⊊ EXP, NL ⊊ PSPACE, PSPACE ⊊ EXPSPACE. Proof flavour: build a machine D which, on input ⟨M⟩, simulates M on ⟨M⟩ within the smaller bound and outputs the opposite answer; D needs slightly more resource to host the simulation, and differs from every small-bound machine on at least one input — so it lives strictly in the larger class. This is Cantor's diagonal argument wearing a complexity-theoretic coat.
Exam pointer
Three reliable question shapes: ① "Define a TM formally and design one for {0ⁿ1ⁿ} (or palindromes, or 0ⁿ1ⁿ2ⁿ)" — give the 7-tuple, the plan in words, a short trace, and the time/space bound; ② "Explain TM variants and why P is robust" — reproduce §1.4's table and the takeaway sentence; ③ "Differentiate decidability from complexity" — §1.5 plus the halting problem as witness.
Self-check
- Write the 7-tuple and say which components are finite. (All of them — only the tape is infinite.)
- Simulate the {0ⁿ1ⁿ} machine on inputs 01 and 10 — one accepts, one rejects on the very first step. Which is which?
- Which single TM variant is not known to be simulable with only polynomial overhead?
- State the time hierarchy theorem and one corollary.
- Why does "space O(1)" collapse to finite automata? (Only finitely many configurations per head position — no usable memory beyond the states.)