3.1 Prolog derivation trees, choices, and unification

To illustrate how Prolog produces answers for programs and goals, consider the following simple datalog program (no functions).
/* program P                       clause #      */   
 
p(a).                              /* #1 */  
p(X) :- q(X), r(X).                /* #2 */  
p(X) :- u(X).                      /* #3 */  
 
q(X) :- s(X).                      /* #4 */  

r(a). /* #5 */ r(b). /* #6 */
s(a). /* #7 */ s(b). /* #8 */ s(c). /* #9 */ u(d). /* #10 */

Exercise 3.1.1 Load program P into Prolog and observe what happens for the goal ?-p(X). When an answer is reported, hit (or enter) ';' so that Prolog will continue to trace and find all of the answers.

Exercise 3.1.2 Load program P into Prolog, turn on the trace, and record what happens for the goal ?-p(X). When an answer is reported, hit (or enter) ';' so that Prolog will continue to trace and find all of the answers. (Use ?- help(trace). first, if needed.)

The following diagram shows a complete Prolog derivation tree for the goal ?-p(X). The edges in the derivation tree are labeled with the clause number in the source file for program P that was used to replace a goal by a subgoal. The direct descendants under any goal in the derivation tree correspond to choices. For example, the top goal p(X) unifies with the heads of clauses #1, #2, #3, three choices.

Fig. 3.1.1
Fig. 3.1.1

The trace (Exercise 3.1.2 above) of the goal ?-p(X) corresponds to a depth-first traversal of this derivation tree. Each node in the Prolog derivation tree was, at the appropriate point in the search, the current goal. Each node in the derivation tree is a sequence of subgoals. The edges directly below a node in this derivation tree correspond to the choices available for replacing a selected subgoal. The current side clause, whose number labels the arc in the derivation tree, is tried in the following way: If the leftmost current subgoal (shown as g1 in the little diagram below) unifies with the head of the side clause (shown as h in the diagram), then that leftmost current subgoal is replaced by the body of the side clause (shown as b1,b2,...,bn in the diagram). Pictorially, we could show this as follows:

Fig. 3.1.2
Fig. 3.1.2

One important thing not shown explicitly in the diagram is that the logical variables in the resulting goal b1,b2,...,bn,g2,g3,... have been bound as a result of the unification, and Prolog needs to keep track of these unifying substitutions as the derivation tree grows down any branch.

Now, a depth first traversal of such a derivation tree means that alternate choices will be attempted as soon as the search returns back up the tree to the point where the alternate choice is available. This process is called backtracking.

Of course, if the tail of the rule is empty, then the leftmost subgoal is effectively erased. If all the subgoals can eventually be erased down a path in the derivation tree, then an answer has been found (or a 'yes' answer computed). At this point the bindings for the variables can be used to give an answer to the original query.

unification of Prolog terms

Prolog unification matches two Prolog terms T1 and T2 by finding a substitution of variables mapping M such that if M is applied T1 and M is applied to T2 then the results are equal.

For example, Prolog uses unification in order to satisfy equations (T1=T2) ...

?- p(X,f(Y),a) = p(a,f(a),Y).
X = a  Y = a

?- p(X,f(Y),a) = p(a,f(b),Y).
No
In the first case the successful substituton is {X/a, Y/b}, and for the second example there is no substitution that would result in equal terms. In some cases the unification does not bind variables to ground terms but result in variables sharing references ...
?- p(X,f(Y),a) = p(Z,f(b),a).
X = _G182   Y = b   Z = _G182 
In this case the unifying substitution is {X/_G182, Y/b, Z/_G182}, and X and Z share reference, as can be illustrated by the next goal ...
?- p(X,f(Y),a) = p(Z,f(b),a), X is d.
X = d   Y = b   Z = d 
{X/_G182, Y/b, Z/_G182} was the most general unifying substitution for the previous goal, and the instance {X/d, Y/b, Z/d} is specialized to satisfy the last goal.

Prolog does not perform an occurs check when binding a variable to another term, in case the other term might also contain the variable. For example (SWI-Prolog) ...

?- X=f(X).
X = f(**) 
The circular reference is flagged (**) in this example, but the goal does succeed {X/f(f(f(...)))}. However ...
?- X=f(X), X=a.
No
The circular reference is checked by the binding, so the goal fails. "a canNOT be unified with f(_Anything)" ...
?- a \=f(_).
Yes
Some Prologs have an occurs-check version of unification available for use. For example, in SWI-Prolog ...
?- unify_with_occurs_check(X,f(X)).
No
The Prolog goal satisfation algoritm, which attempts to unify the current goal with the head of a program clause, uses an instance form of the clause which does not share any of the variables in the goal. Thus the occurs-check is not needed for that.

The only possibility for an occurs-check error will arise from the processing of Prolog terms (in user programs) that have unintended circular reference of variables which the programmer believes should lead to failed goals when they occur . Some Prologs might succeed on these circular bindings, some might fail, others may actually continue to record the bindings in an infinite loop, and thus generate a run-time error (out of memory). These rare situations need careful programming.

It is possible to mimic the general unification algorithm in Prolog. But here we present a specialized version of unification, whose computational complexity is linear in the size of the input terms, and simply matches terms left-to-right. The match(+GeneralTerm,+GroundedTerm) predicate attempts to match its first argument (which may contain variables) against its second argument (which must be grounded). This little program should be considered just as an illustration, or a programming exercise, although we do know of cogent applications for the LR matching algorithm in situations where general unification in not needed. We would not use match, however, in a Prolog application because built-in unification would be so much faster; we would simply have to ensure that the presuppositions for match are appropriately checked when built-in unification is used. The reference Apt and Etalle (1993) discusses the question in general regarding how much of general unification is actually NOT needed by Prolog.


%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%% match(U,V) : 
%%   U may contain variables
%%   V must be ground
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% match a variable with a ground term
match(U,V) :- 
   var(U), 
   ground(V),
   U = V.  % U assigned value V

% match grounded terms
match(U,V) :- 
   ground(U), 
   ground(V),
   U == V.

% match compound terms
match(U,V) :- 
   \+var(U), 
   ground(V),
   functor(U,Functor,Arity),
   functor(V,Functor,Arity),
   matchargs(U,V,1,Arity).

% match arguments, left-to-right
matchargs(_,_,N,Arity) :- 
   N > Arity.
matchargs(U,V,N,Arity) :- 
   N =< Arity,
   arg(N,U,ArgU),
   arg(N,V,ArgV), 
   match(ArgU,ArgV), 
   N1 is N+1, 
   matchargs(U,V,N1,Arity).





Prolog Tutorial Contents