3.1 Logic grammars — parsing by deduction
Prolog was born for language processing. A context-free grammar can be written almost verbatim as a DCG (Definite Clause Grammar) using -->:
% A tiny English grammar
sentence --> noun_phrase, verb_phrase.
noun_phrase --> determiner, noun.
verb_phrase --> verb, noun_phrase.
determiner --> [the].
determiner --> [a].
noun --> [cat].
noun --> [mouse].
verb --> [chases].
?- phrase(sentence, [the, cat, chases, a, mouse]).
true.
?- phrase(sentence, [the, cat, sleeps]).
false.
What --> really does: each DCG rule is translated to an ordinary clause with two hidden arguments threading the input list through — sentence(S0, S) :- noun_phrase(S0, S1), verb_phrase(S1, S). These are difference lists: each nonterminal consumes a prefix and passes on the rest. Parsing succeeds when the whole list is consumed.
Building a parse tree — add arguments to the nonterminals:
sentence(s(NP, VP)) --> noun_phrase(NP), verb_phrase(VP).
noun_phrase(np(D, N)) --> determiner(D), noun(N).
verb_phrase(vp(V, NP)) --> verb(V), noun_phrase(NP).
determiner(det(the)) --> [the].
determiner(det(a)) --> [a].
noun(noun(cat)) --> [cat].
noun(noun(mouse)) --> [mouse].
verb(verb(chases)) --> [chases].
?- phrase(sentence(T), [the, cat, chases, a, mouse]).
T = s(np(det(the), noun(cat)),
vp(verb(chases), np(det(a), noun(mouse)))).
Because grammar rules are clauses, the same grammar can parse (list → tree) and generate (tree → list, or enumerate all sentences) — bidirectionality again. DCGs can also carry agreement (number, gender) and even context-sensitivity through extra arguments, which plain CFGs cannot.
3.2 Search techniques — Prolog as a search engine
Backtracking is depth-first search, so Prolog solves search problems with remarkably little code. The standard formulation: states, a successor relation move/2, and a path-finder that avoids revisiting states.
Depth-first search with visited list:
% solve(Start, Goal, Path)
solve(Start, Goal, Path) :- dfs(Start, Goal, [Start], P), reverse(P, Path).
dfs(Goal, Goal, Visited, Visited).
dfs(S, Goal, Visited, Path) :-
move(S, S1),
\+ member(S1, Visited), % loop check!
dfs(S1, Goal, [S1|Visited], Path).
Breadth-first search — guarantees the shortest path, by managing a queue of paths explicitly:
bfs(Start, Goal, Path) :- bfsQ([[Start]], Goal, R), reverse(R, Path).
bfsQ([[Goal|Rest]|_], Goal, [Goal|Rest]).
bfsQ([[S|Rest]|Queue], Goal, Path) :-
findall([S1, S|Rest],
(move(S, S1), \+ member(S1, [S|Rest])),
Extensions),
append(Queue, Extensions, NewQueue), % FIFO → breadth-first
bfsQ(NewQueue, Goal, Path).
Swap the append to put Extensions in front and you get DFS; insert by heuristic cost and you get best-first / A. The queue discipline is* the search strategy.
Generate and test — the third great idiom:
% N-queens style: generate a candidate, test the constraints
solution(Qs) :- permutation([1,2,3,4,5,6,7,8], Qs), safe(Qs).
safe([]).
safe([Q|Qs]) :- noAttack(Q, Qs, 1), safe(Qs).
noAttack(_, [], _).
noAttack(Q, [Q1|Qs], D) :-
Q =\= Q1 + D, Q =\= Q1 - D,
D1 is D + 1, noAttack(Q, Qs, D1).
permutation generates, safe tests, and backtracking automatically interleaves the two. Moving tests earlier (pruning) is the route from exponential blow-up to practical performance — in constraint logic programming the system does that propagation for you.
3.3 The water-jug problem — a complete classic
Two jugs (4 L and 3 L), unlimited water; measure exactly 2 L in the big jug. State = jugs(Big, Small):
move(jugs(_, S), jugs(4, S)). % fill big
move(jugs(B, _), jugs(B, 3)). % fill small
move(jugs(_, S), jugs(0, S)). % empty big
move(jugs(B, _), jugs(B, 0)). % empty small
move(jugs(B, S), jugs(B1, S1)) :- % pour small→big
T is B + S, B1 is min(T, 4), S1 is T - B1, B1 =\= B.
move(jugs(B, S), jugs(B1, S1)) :- % pour big→small
T is B + S, S1 is min(T, 3), B1 is T - S1, S1 =\= S.
?- bfs(jugs(0,0), jugs(2,_), Path).
% Path = [jugs(0,0), jugs(0,3), jugs(3,0), jugs(3,3), jugs(4,2), jugs(0,2), jugs(2,0)]
Seven clauses describe the physics; the search machinery from §3.2 does the thinking. That separation — logic vs control — is the closing message of the whole course: declare the problem precisely, and let deduction do the work.
3.4 DCG translation, made explicit (the standard theory question)
"Show what the DCG rule translates to" — every grammar rule nt --> body gains two extra arguments (S0, S) meaning "this nonterminal consumes the front of S0, leaving S":
% DCG form % translated clause
sentence --> np, vp. sentence(S0, S) :- np(S0, S1), vp(S1, S).
noun --> [cat]. noun([cat|S], S).
The pair (S0, S) is a difference list: the phrase recognised is exactly S0 minus S. Terminals like [cat] translate to a head that peels one token off the front. phrase(sentence, L) simply calls sentence(L, []) — "consume all of L, leave nothing".
Recognition trace for [the, cat, chases, a, mouse]:
sentence([the,cat,chases,a,mouse], [])
→ np([the,cat,chases,a,mouse], S1), vp(S1, [])
np: det consumes 'the' → noun consumes 'cat' → S1 = [chases,a,mouse]
→ vp([chases,a,mouse], [])
verb consumes 'chases' → np([a,mouse], [])
det consumes 'a' → noun consumes 'mouse' → [] ✓ SUCCESS
Agreement via arguments — the upgrade plain CFGs cannot do:
sentence --> np(Num), vp(Num). % subject and verb must AGREE
np(Num) --> det, noun(Num).
noun(sing) --> [cat].
noun(plur) --> [cats].
vp(sing) --> [sleeps].
vp(plur) --> [sleep].
?- phrase(sentence, [the, cat, sleeps]). % true
?- phrase(sentence, [the, cat, sleep]). % false — number mismatch
3.5 Comparing the search strategies (reproduce in the exam)
| Depth-first (DFS) | Breadth-first (BFS) | Best-first / A* | |
|---|---|---|---|
| Frontier discipline | stack (LIFO) | queue (FIFO) | priority queue by f(n) |
| In our code | Extensions in front | append(Queue, Extensions, _) | insert sorted by heuristic |
| Finds shortest path? | no | yes (unit costs) | yes, if heuristic admissible |
| Memory | O(depth) — light | O(breadth^depth) — heavy | between, depends on h |
| Complete? | not on infinite branches | yes (finite branching) | yes |
| Native to Prolog? | yes — backtracking is DFS | needs explicit queue | needs explicit queue + costs |
One crisp sentence to remember: changing only the queue discipline changes the search strategy — the move/2 relation (the logic) never changes.
3.6 N-queens, examined more closely
The generate-and-test solution/1 of §3.2 is correct but tries up to 8! = 40,320 permutations. The constraint-interleaved version places queens column by column, testing as it goes — same logic, far better control:
queens(N, Qs) :- numlist(1, N, Ns), place(Ns, [], Qs).
place([], Qs, Qs).
place(Unplaced, Placed, Qs) :-
select(Q, Unplaced, Rest), % choose a row for the next column
\+ attacks(Q, Placed, 1), % test IMMEDIATELY — prune early
place(Rest, [Q|Placed], Qs).
attacks(Q, [P|_], D) :- Q =:= P + D ; Q =:= P - D.
attacks(Q, [_|Ps], D) :- D1 is D + 1, attacks(Q, Ps, D1).
?- queens(6, Qs).
Qs = [5, 3, 1, 6, 4, 2] .
Because the representation puts one queen per column (position in the list) and one per row (the permutation property via select), only diagonal attacks need testing — note attacks checks Q = P ± D where D is the column distance. Moving the test inside the generator is the single most important optimisation idea of this lesson: prune as early as possible.
3.7 Search-tree picture for the water-jug start
BFS explores level by level, so the 7-state path shown is guaranteed minimal. The visited-list (\+ member(S1, ...)) is what keeps the tree finite — without it, fill/empty cycles repeat forever. In exam answers, always mention the loop check explicitly; omitting it is the most commonly penalised error in search questions.
Exam pointers
- "What is a DCG? Show the translation of a rule." — the
-->to two-extra-arguments translation of §3.4, the words difference list, and thephrase/2call. - "Write a DCG for a small language and parse a sentence" — reuse the cat/mouse grammar; add the parse-tree arguments if asked for structure.
- "Compare DFS and BFS in Prolog" — table of §3.5 plus the one-line code difference (where
Extensionsjoins the queue). - N-queens or water-jug as a "formulate as a search problem" question: define the state representation, the
move/2relation, the goal test, and the loop check — those four labelled parts are the marking scheme.
Self-check
- Translate
vp --> verb, np.anddet --> [a].into plain clauses by hand. - What does
?- phrase(np(T), [the, mouse]).bindTto in the parse-tree grammar? - In the BFS code, exactly which line would you change to obtain DFS, and to obtain best-first search?
- Why does generate-and-test N-queens explode combinatorially, and what does the interleaved version change?
- For the water-jug problem, list all states reachable from
jugs(4, 3)in one move.