4.1 Why the lambda calculus?
The lambda calculus (Alonzo Church, 1930s) is the mathematical core of every functional language. It has exactly three kinds of expression, yet it can express every computable function — it is equivalent in power to the Turing machine.
4.2 Syntax — three constructions only
A lambda term is one of:
| Form | Name | Reading |
|---|---|---|
x | variable | a name |
λx. M | abstraction | "the function taking x and returning M" |
M N | application | "apply function M to argument N" |
Conventions: application is left-associative (M N P = (M N) P); the body of a λ extends as far right as possible (λx. x y = λx. (x y)); λx y. M abbreviates λx. λy. M.
Haskell connection: λx. x + 1 is written \x -> x + 1.
4.3 Free and bound variables
In λx. M, the λ binds x inside M. A variable that is not bound by any enclosing λ is free.
- In
λx. x y:xis bound,yis free. - In
(λx. x) x: the firstxis bound; the finalxis a different, free variable.
This matters because substitution must never capture a free variable (turn it accidentally into a bound one).
4.4 The three conversions
α-conversion (renaming). Bound variables can be renamed: λx. x ≡α λy. y. Used to avoid capture.
β-reduction (the engine of computation). Applying an abstraction substitutes the argument for the bound variable:
(λx. M) N →β M[x := N]
Example:
(λx. x + 1) 5 →β 5 + 1 = 6
(λx. λy. x) a b →β (λy. a) b →β a
η-conversion (extensionality). λx. M x ≡η M provided x is not free in M — "a function that just forwards its argument to M is M".
A subterm of the shape (λx. M) N is called a redex (reducible expression). A term containing no redex is in normal form — it cannot be reduced further; it is the "answer".
4.5 Not every term has a normal form
Ω = (λx. x x)(λx. x x)
Ω →β Ω →β Ω →β ... -- reduces to itself forever
Ω is the lambda-calculus picture of an infinite loop. Termination is not guaranteed — exactly as in real programming.
4.6 Reduction order and the Church–Rosser theorem
A term can contain several redexes — which one should we reduce first?
- Normal order (leftmost-outermost): reduce the outermost redex first; arguments are substituted unevaluated. (Lazy evaluation is normal order + sharing.)
- Applicative order (innermost): evaluate arguments first, then apply. (This is what C, Java, Python do — eager evaluation.)
Consider: (λx. 42) Ω
Normal order: (λx. 42) Ω →β 42 ✓ terminates
Applicative order: must evaluate Ω first → loops forever ✗
Church–Rosser Theorem (confluence). If a term M reduces to A by one path and to B by another, then there exists a term C such that both A and B reduce to C.
Two crucial corollaries:
- Uniqueness of normal forms — a term has at most one normal form. Whatever order you choose, you can never get two different "answers".
- Normal order is the safest strategy (the standardisation theorem): if any reduction order reaches the normal form, normal order does. This is the theoretical justification for Haskell's lazy evaluation.
4.7 Combinators
A combinator is a lambda term with no free variables (a closed term). The famous ones:
| Combinator | Definition | Behaviour |
|---|---|---|
| I | λx. x | identity: I a = a |
| K | λx y. x | constant: K a b = a |
| S | λf g x. f x (g x) | distributing application |
| Y | λf. (λx. f (x x)) (λx. f (x x)) | fixed point: Y f = f (Y f) — encodes recursion |
Remarkable fact: S and K alone are sufficient to express every lambda term (for example I = S K K). Combinator-based compilation was an early implementation technique for lazy languages, and it is why graph-reduction machines (next lesson) work the way they do.
The Y combinator deserves special attention: it manufactures recursion out of nothing. Defining fact = Y (λf n. if n==0 then 1 else n * f(n-1)) gives a working factorial without any self-reference in the definition itself.
4.8 Worked exam-style reduction
Reduce (λf. λx. f (f x)) (λy. y + 1) 3 to normal form:
(λf. λx. f (f x)) (λy. y+1) 3
→β (λx. (λy. y+1) ((λy. y+1) x)) 3
→β (λy. y+1) ((λy. y+1) 3)
→β (λy. y+1) (3+1)
→β (3+1)+1
= 5
The term λf. λx. f (f x) is the Church numeral 2 — "apply f twice". Church numerals encode arithmetic purely with functions, demonstrating that numbers themselves are not primitive in the lambda calculus.
4.9 Substitution, formally — and the capture problem
β-reduction relies on substitution M[x := N] ("replace free occurrences of x in M by N"). The exam-ready rules:
| Case | Result of M[x := N] |
|---|---|
x[x := N] | N |
y[x := N] (y ≠ x) | y |
(P Q)[x := N] | (P[x := N]) (Q[x := N]) |
(λx. P)[x := N] | λx. P — x is bound here; stop |
(λy. P)[x := N], y ≠ x, y not free in N | λy. (P[x := N]) |
(λy. P)[x := N], y ≠ x, y free in N | α-rename y first, then substitute |
The capture trap (a guaranteed short question): reduce (λx. λy. x) y.
WRONG: (λx. λy. x) y →β λy. y -- the free y got CAPTURED: now means identity!
RIGHT: α-rename first: (λx. λz. x) y →β λz. y -- a constant function returning y ✓
Whenever the argument contains a variable with the same name as a binder you are substituting under, rename the binder first.
4.10 Church booleans and numerals — encoding data as functions
TRUE = λt. λf. t -- choose the first
FALSE = λt. λf. f -- choose the second (note: FALSE = Church numeral 0!)
IF = λb. λt. λe. b t e
IF TRUE a b →β TRUE a b →β (λf. a) b →β a ✓
0 = λf. λx. x -- apply f zero times
1 = λf. λx. f x
2 = λf. λx. f (f x)
SUCC = λn. λf. λx. f (n f x) -- one more application of f
ADD = λm. λn. λf. λx. m f (n f x) -- m applications after n applications
MUL = λm. λn. λf. n (m f) -- n copies of "m copies of f"
Worked: SUCC 1 = 2
SUCC 1
= (λn. λf. λx. f (n f x)) (λg. λy. g y)
→β λf. λx. f ((λg. λy. g y) f x)
→β λf. λx. f ((λy. f y) x)
→β λf. λx. f (f x) = 2 ✓
The takeaway sentence for theory questions: data can be represented by its own elimination behaviour — a boolean is a two-way choice, a number is an iterator.
4.11 The Y combinator in action — recursion from nothing
Y = λf. (λx. f (x x)) (λx. f (x x)). Show the fixed-point property once and you can quote it forever:
Y g = (λx. g (x x)) (λx. g (x x))
→β g ((λx. g (x x)) (λx. g (x x)))
= g (Y g) -- so Y g = g (Y g) = g (g (Y g)) = ...
Y g unfolds g as many times as the computation demands — which is exactly what a recursive definition does. This only works under normal-order reduction: applicative order would try to evaluate Y g fully first and loop.
4.12 Reduction-order summary table
| Normal order (leftmost-outermost) | Applicative order (innermost) | |
|---|---|---|
| What is reduced first | the function position | the arguments |
| Arguments are passed | unevaluated (by name) | fully evaluated (by value) |
| Finds normal form when one exists? | always (standardisation theorem) | not always — may loop |
| May duplicate work? | yes (without sharing) | no |
| Real-language counterpart | Haskell (lazy = normal order + sharing) | C, Java, Python, ML (eager) |
On (λx. 42) Ω | returns 42 | loops forever |
4.13 Extra worked reductions for practice
(a) S K K behaves as I:
S K K a = (λf g x. f x (g x)) K K a
→β (λg x. K x (g x)) K a
→β (λx. K x (K x)) a
→β K a (K a)
= (λx y. x) a (K a) →β (λy. a) (K a) →β a ✓ so S K K = I
(b) Two redexes, two routes, one answer (Church–Rosser live):
(λx. x x) ((λy. y) z)
Route 1 (normal): →β ((λy. y) z) ((λy. y) z) →β z ((λy. y) z) →β z z
Route 2 (applicative): →β (λx. x x) z →β z z -- same normal form ✓
Exam pointers
- "Define α/β/η-conversion" — one line each plus a one-line example each; mention capture-avoidance for α to earn the full mark.
- "State and explain the Church–Rosser theorem" — statement + diamond diagram + the two corollaries (uniqueness of normal forms; normal order is normalising).
- Reductions are marked per step: underline the redex you fire at each step and write
→βbetween lines. - Know I, K, S, Y by heart, including
Y f = f (Y f)and the fact that {S, K} is a complete basis.
Self-check
- Mark free and bound occurrences in
λx. y (λy. x y). - Reduce
(λx. λy. y x) 3 (λz. z + 1)to normal form. - Why does
(λx. λy. x) yneed α-conversion before β-reduction? - Which reduction order corresponds to call-by-value? To call-by-name?
- Show
ADD 1 1 = 2using the Church encodings. - Give a term with a normal form that applicative order fails to find.