Quintus
Prolog Manual
This chapter gives a number of tips on how to organize your programs for increased efficiency.
One of the more difficult things to master when learning Prolog is the proper use of the cut. Often, when beginners find unexpected backtracking occurring in their programs, they try to prevent it by inserting cuts in a rather random fashion. This makes the programs harder to understand and sometimes stops them from working.
During program development, each predicate in a program should be considered independently to determine whether or not it should be able to succeed more than once. In most applications, many predicates should at most, succeed only once; that is, they should be determinate. Having decided that a predicate should be determinate, it should be verified that, in fact, it is. The debugger can help in verifying that a predicate is determinate (see {manual(e-1-2)}).
Consider the following predicate which calculates the factorial of a number:
fac(0, 1).
fac(N, X) :-
N1 is N - 1,
fac(N1, Y),
X is N * Y.
The factorial of 5 can be found by typing
| ?- fac(5, X).
X = 120
However, backtracking into the above predicate by typing a semicolon at this point, causes an infinite loop because the system starts attempting to satisfy the goals 'fac(-1, X).', 'fac(-2, X).', etc. The problem is that there are two clauses that match the goal 'fac(0, F).', but the effect of the second clause on backtracking has not been taken into account. There are at least three possible ways of fixing this:
fac(0,1) :- !.
Adding the cut essentially makes the first solution the only one for the factorial of 0 and hence solves the immediate problem. This solution is space-efficient because as soon as Prolog encounters the cut, it knows that the predicate is determinate. Thus, when it tries the second clause, it can throw away the information it would otherwise need in order to backtrack to this point. Unfortunately, if this solution is implemented, typing 'fac(-1, X)' still generates an infinite search.
fac(N, X) :-
N > 0,
N1 is N - 1,
fac(N1, Y),
X is N * Y.
This also solves the problem, but it is a more robust solution because this way it is impossible to get into an infinite loop. This solution makes the predicate logically determinate -- there is only one possible clause for any input -- but the Prolog system is unable to detect this and must waste space for backtracking information. The space-efficiency point is more important than it may at first seem; if fac/2 is called from another determinate predicate, and if the cut is omitted, Prolog cannot detect the fact that fac/2 is determinate. Therefore, it will not be able to detect the fact that the calling predicate is determinate, and space will be wasted for the calling predicate as well as for fac/2 itself. This argument applies again if the calling predicate is itself called by a determinate predicate, and so on, so that the cost of an omitted cut can be very high in certain circumstances.
fac(N, X) :-
( N > 0 ->
N1 is N - 1,
fac(N1, Y),
X is N * Y
; N =:= 0 ->
X = 1
).
This solution is as robust as solution b-5-1-1, and more efficient than solution b-5-1-1, since it exploits conditionals with arithmetic tests (see {manual(g-2-7)}, and {manual(b-5-6)}, for more information on optimization using conditionals).
Programs can often be made more readable by the placing of cuts as early as possible in clauses. For example, consider the predicate p/0 defined by
p :- a, b, !, c, d.
p :- e, f.
Suppose that b/0 is a test that determines which clause of p/0 applies; a/0 may or may not be a test, but c/0 and d/0 are not supposed to fail under any circumstances. A cut is most appropriately placed after the call to b/0. If in fact a/0 is the test and b/0 is not supposed to fail, then it would be much clearer to move the cut before the call to b/0.
A tool to aid in determinancy checking is available as library(det). It helps to locate places where cuts are necessary.
Cut is also commonly used in conjunction with the generate-and-test programming paradigm. For example, consider the predicate find_solution/1 defined by
find_solution(X) :-
candidate_solution(X),
test_solution(X),
!.
where candidate_solution/1 generates possible answers on backtracking. The intent is to stop generating candidates as soon as one is found that satisfies test_solution/1. If the cut were omitted, a future failure could cause backtracking into this clause and restart the generation of candidate solutions. A similar example is shown below:
process_file(F) :-
see(F),
repeat,
read(X),
process_and_fail(X),
!,
seen.
process_and_fail(end_of_file) :- !.
process_and_fail(X) :-
process(X),
fail.
The cut in process_file/1 is another example of terminating a generate-and-test loop. In general, a cut should always be placed after a repeat/0 so that the backtracking loop is clearly terminated. If the cut were omitted in this case, on later backtracking Prolog might try to read another term after the end of the file had been reached.
The cut in process_and_fail/1 might be considered unnecessary because, assuming the call shown is the only call to it, the cut in process_file/1 ensures that backtracking into process_and_fail/1 can never happen. While this is true, it is also a good safeguard to include a cut in process_and_fail/1 because someone may unwittingly change process_file/1 in the future.
In Quintus Prolog, predicates are indexed on their first arguments. This means that when a predicate is called with an instantiated first argument, a hash table is used to gain fast access to only those clauses having a first argument with the same primary functor as the one in the predicate call. If the first argument is atomic, only clauses with a matching first argument are accessed. Indexes are maintained automatically by the built-in predicates manipulating the Prolog database (for example, assert/1, retract/1, and compile/1).
Keeping this feature in mind when writing programs can help speed their execution. Some hints for program structuring that will best use the indexing facility are given below. Note that dynamic predicates as well as static predicates are indexed. The programming hints given in this section apply equally to static and dynamic code.
The major advantage of indexing is that it provides fast access to tables of data. For example, a table of employee records might be represented as shown below in order to gain fast access to the records by employee name:
% employee(LastName,FirstNames,Department,Salary,DateOfBirth)
employee('Smith', ['John'], sales, 20000, 1-1-59).
employee('Jones', ['Mary'], engineering, 30000, 5-28-56).
...
If fast access to the data via department is also desired, the data can be organized little differently. The employee records can be indexed by some unique identifier, such as employee number, and additional tables can be created to facilitate access to this table, as shown in the example below. For example,
% employee(Id,LastName,FirstNames,Department,Salary,DateOfBirth)
employee(1000000, 'Smith', ['John'], sales, 20000, 1-1-59).
employee(1000020, 'Jones', ['Mary'], engineering, 30000, 5-28-56).
...
% employee_name(LastName,EmpId)
employee_name('Smith', 1000000).
employee_name('Jones', 1000020).
...
% department_member(Department,EmpId)
department_member(sales, 1000000).
department_member(engineering, 1000020).
...
Indexing would now allow fast access to the records of every employee named Smith, and these could then be backtracked through looking for John Smith. For example:
| ?- employee_name('Smith', Id),
employee(Id, 'Smith', ['John'], Dept, Sal, DoB).
Similarly, all the members of the engineering department born since 1965 could be efficiently found like this:
| ?- department_member(engineering, Id),
employee(Id, LN, FN, engineering, _, M-D-Y),
Y > 65.
The other advantage of indexing is that it often makes possible early detection of determinacy, even if cuts are not included in the program. For example, consider the following simple predicate which joins two lists together:
concat([], L, L).
concat([X|L1], L2, [X|L3]) :- concat(L1, L2, L3).
If this predicate is called with an instantiated first argument, the first argument indexing of Quintus Prolog will recognize that the call is determinate -- only one of the two clauses for concat/3 can possibly apply. Thus, the Prolog system knows it does not have to store backtracking information for the call. This significantly reduces memory use and execution time.
Determinacy detection can also reduce the number of cuts in predicates. In the above example, if there was no indexing, a cut would not strictly be needed in the first clause as long as the predicate was always to be called with the first argument instantiated. If the first clause matched, then the second clause could not possibly match; discovery of this fact, however, would be postponed until backtracking. The programmer might thus be tempted to use a cut in the first clause to signal determinacy and recover space for backtracking information as early as possible.
With indexing, if the example predicate is always called with its first argument instantiated, backtracking information is never stored. This gives substantial performance improvements over using a cut rather than indexing to force determinacy. At the same time greater flexibility is maintained: the predicate can now be used in a non-determinate fashion as well, as in
| ?- concat(L1, L2, [a,b,c,d]).
which will generate on backtracking all the possible partitions of the list [a,b,c,d] on backtracking. If a cut had been used in the first clause, this would not work.
Even if the determinacy detection made possible by indexing (see {manual(b-5-2-2)}) is unavailable to a predicate call, Quintus Prolog still can detect determinacy before determinate exit from the predicate. Space for backtracking information can thus be recovered as early as possible, reducing memory requirements and increasing performance. For instance, the predicate member/2 (found in the Quintus Prolog library) could be defined by:
member(Element, [Element|_]).
member(Element, [_|Rest]) :-
member(Element, Rest).
member/2 might be called with an instantiated first argument in order to check for membership of the argument in a list, which is passed as a second argument, as in
| ?- member(4, [1,2,3,4]).
The first arguments of both clauses of member/2 are variables, so first argument indexing cannot be used. However, determinacy can still be detected before determinate exit from the predicate. This is because on entry to the last clause of a nondeterminate predicate, a call becomes effectively determinate; it can tell that it has no more clauses to backtrack to. Thus, backtracking information is no longer needed, and its space can be reclaimed. In the example, each time a call fails to match the first clause and backtracks to the second (last) clause, backtracking information for the call is automatically deleted.
Because of last clause determinacy detection, a cut is never needed as the first subgoal in the last clause of a predicate. Backtracking information will have been deleted before a cut in the last clause is executed, so the cut will have no effect except to waste time.
Note that last clause determinacy detection is exploited by dynamic code as well as static code in Quintus Prolog.
Another important efficiency feature of Quintus Prolog is tail recursion optimization. This is a space optimization technique which applies when a predicate is determinate at the point where it is about to call the last goal in the body of a clause. For example,
% for(Int, Lower, Upper)
% Lower and Upper should be integers such that Lower =< Upper.
% Int should be uninstantiated; it will be bound successively on
% backtracking to Lower, Lower+1, ... Upper.
for(Int, Int, _Upper).
for(Int, Lower, Upper) :-
Lower < Upper,
Next is Lower + 1,
for(Int, Next, Upper).
This predicate is determinate at the point where the recursive call is about to be made, since this is the last clause and the preceding goals ('<'/2 and is/2) are determinate. Thus tail recursion optimization can be applied; effectively, the stack space being used for the current predicate call is reclaimed before the recursive call is made. This means that this predicate uses only a constant amount of space, no matter how deep the recursion.
To take best advantage of this feature, make sure that goals in recursive predicates are determinate, and whenever possible put the recursive call at the end of the predicate.
This isn't always possible, but often can be done through the use of accumulating parameters . An accumulating parameter is an added argument to a predicate that builds up the result as computation proceeds. For example, in our factorial example (see {manual(b-5-1-1)}), the last goal in the body of the recursive case is is/2, not the recursive call to fac/2.
fac(N, X) :-
( N > 0 ->
N1 is N - 1,
fac(N1, Y),
X is N * Y
; N =:= 0 ->
X = 1
).
This can be corrected by adding another argument to fac/2 to accumulate the factorial.
fac(N, X) :- fac(N, 1, X).
% fac(+N, +M, -X)
% X is M * the factorial of N.
fac(N, M, X) :-
( N > 0 ->
N1 is N - 1,
M1 is N * M,
fac(N1, M1, X)
; N =:= 0 ->
X = M
).
Here we do the multiplication before calling fac/3 recursively. Note that we supply the base case, 1, at the start of the computation, and that we are multiplying by decreasing numbers. In the earlier version, fac/2, we multiply after the recursive call, and so we multiply by increasing numbers. Effectively, the new version builds the result backwards. This is correct because multiplication is associative.
This technique becomes much more important when extended to lists, as in this case it can save much building of unneeded lists through unnecessary calls to append sublists together. For example, the naive way to reverse a list is:
nreverse([], []).
nreverse([H|T], L) :-
nreverse(T, L1),
append(L1, [H], L).
This is very wasteful, since each call to append/3 copies the initial part of the list, and adds one element to it. Fortunately, this can be very easily rewritten to use an accumulating parameter:
reverse(L1, L2) :- reverse(L1, [], L2).
% reverse(+X, +Y, -Z)
% Z is X reversed, followed by Y
reverse([], Z, Z).
reverse([H|T], L0, L) :-
reverse(T, [H|L0], L).
This version of reverse is many times faster than the naive version, and uses much less memory. The key to understanding the behavior of this predicate is the observation made earlier: using an accumulating parameter, we build the result backwards.
Don't let this confuse you. Building a list forward is easy. For example, a predicate which returns a list L of consecutive numbers from 1 to N could be written in two different ways: counting up and collecting the resulting list forward, or counting down and accumulating the result backward.
iota1(N, L) :- iota1(1, N, L).
iota1(N, Max, L) :-
( N > Max ->
L = []
; N1 is N+1,
L = [N|L1],
iota1(N1, Max, L1)
).
or,
iota2(N, L) :- iota2(N, [], L).
iota2(N, L0, L) :-
( N =< 0 ->
L = L0
; N1 is N-1,
iota2(N1, [N|L0], L)
).
Both versions generate the same results, and neither waste any space. The second version is slightly faster. Choose whichever approach you prefer.
The built-in predicate '=..'/2 is a clear way of building terms and taking them apart. However, it is almost never the most efficient way. functor/3 and arg/3 are generally much more efficient, though less direct. The best blend of efficiency and clarity is to write a clearly-named predicate which implements the desired operation and to use functor/3 and arg/3 in that predicate.
Here is an actual example. The task is to reimplement the built-in predicate '=='/2. The first variant uses '=..'/2 (this symbol is pronounced "univ" for historical reasons). Some Prolog textbooks recommend code similar to this.
ident_univ(X, Y) :-
var(X), % If X is a variable,
!,
var(Y), % so must Y be, and
samevar(X, Y). % they must be the same.
ident_univ(X, Y) :- % If X is not a variable,
nonvar(Y), % neither may Y be;
X =.. [F|L], % they must have the
Y =.. [F|M], % same function symbol F
ident_list(L, M). % and identical arguments
ident_list([], []).
ident_list([H1|T1], [H2|T2]) :-
ident_univ(H1, H2),
ident_list(T1, T2).
samevar(29, Y) :- % If binding X to 29
var(Y), % leaves Y unbound,
!, % they were not the same
fail. % variable.
samevar(_, _). % Otherwise they were.
This code performs the function intended; however, every time it touches a non-variable term of arity N, it constructs a list with N+1 elements, and if the two terms are identical, these lists are reclaimed only when backtracked over or garbage-collected.
Better code uses functor/3 and arg/3.
ident_farg(X, Y) :-
( var(X) -> % If X is a variable,
var(Y), % so must Y be, and
samevar(X, Y) % they must be the same;
; nonvar(Y), % otherwise Y must be nonvar
functor(X, F, N), % The principal functors of X
functor(Y, F, N), % and Y must be identical,
ident_farg(N, X, Y) % including the last N args.
).
ident_farg(0, _, _) :- !.
ident_farg(N, X, Y) :- % The last N arguments are
arg(N, X, Xn), % identical
arg(N, Y, Yn), % if the Nth arguments
ident_farg(Xn, Yn), % are identical,
M is N-1, % and the last N-1 arguments
ident_farg(M, X, Y). % are also identical.
This approach to walking through terms using functor/3 and arg/3 avoids the construction of useless lists.
The pattern shown in the example, in which a predicate of arity K calls an auxiliary predicate of the same name of arity K+1 (the additional argument denoting the number of items remaining to process), is very common. It is not necessary to use the same name for this auxiliary predicate, but this convention is generally less prone to confusion.
In order to simply find out the principal function symbol of a term, use
| ?- the_term_is(Term),
| functor(Term, FunctionSymbol, _).
The use of '=..'/2, as in
| ?- the_term_is(Term),
| Term =.. [FunctionSymbol|_].
is wasteful, and should generally be avoided. The same remark applies if the arity of a term is desired.
'=..'/2 should not be used to locate a particular argument of some term. For example, instead of
Term =.. [_F,_,ArgTwo|_]
you should write
arg(2, Term, ArgTwo)
It is generally easier to get the explicit number "2" right than to write the correct number of "don't care" variables in the call to '=..'/2. Other people reading the program will find the call to arg/3 a much clearer expression of the program's intent. The program will also be more efficient. Even if several arguments of a term must be located, it is clearer and more efficient to write
arg(1, Term, First),
arg(3, Term, Third),
arg(4, Term, Fourth)
than to write
Term =.. [_,First,_,Third,Fourth|_]
Finally, '=..'/2 should not be used when the functor of the term to be operated on is known (that is, when both the function symbol and the arity are known). For example, to make a new term with the same function symbol and first arguments as another term, but one additional argument, the obvious solution might seem to be to write something like the following:
add_date(OldItem, Date, NewItem) :-
OldItem =.. [item,Type,Ship,Serial],
NewItem =.. [item,Type,Ship,Serial,Date].
However, this could be expressed more clearly and more efficiently as
add_date(OldItem, Date, NewItem) :-
OldItem = item(Type,Ship,Serial),
NewItem = item(Type,Ship,Serial,Date).
or even
add_date(item(Type,Ship,Serial),
Date,
item(Type,Ship,Serial,Date)
).
There is an efficiency advantage in using conditionals whose test part consists only of arithmetic comparisons or type tests. Consider the following alternative definitions of the predicate type_of_character/2. In the first definition, four clauses are used to group characters on the basis of arithmetic comparisons.
type_of_character(Ch, Type) :-
Ch >= "a", Ch =< "z",
!,
Type = lowercase.
type_of_character(Ch, Type) :-
Ch >= "A", Ch =< "Z",
!,
Type = uppercase.
type_of_character(Ch, Type) :-
Ch >= "0", Ch =< "9",
!,
Type = digit.
type_of_character(_Ch, Type) :-
Type = other.
In the second definition, a single clause with a conditional is used. The compiler generates optimized code for the conditional; the second version of type_of_character/2 runs faster than the first and uses less memory.
type_of_character(Ch, Type) :-
( Ch >= "a", Ch =< "z" ->
Type = lowercase
; Ch >= "A", Ch =< "Z" ->
Type = uppercase
; Ch >= "0", Ch =< "9" ->
Type = digit
; otherwise ->
Type = other
).
Following is a list of builtin predicates that are compiled efficiently in conditionals:
contact: product
support sales information