Search trees and using Cut in Prolog

*******************************************************************

1. a) Start with a simple example.  We want to write the following
predicate:

%% replace(oldList, oldElem, newElem, newList)
%% newList is the same as oldList except all occurrences of oldElem
%% are replaced with newElem. 

Developing different versions of the code.

A first attempt might be the following:

    replace1([],_,_,[]).
    replace1([Old|TOld],Old,New,[New|TNew]) :- 
        replace1(TOld,Old,New,TNew).
    replace1([H|TOld],Old,New,[H|TNew]) :- 
        replace1(TOld,Old,New,TNew).

**** What is the problem?

replace1 will generate the right answers, but also wrong answers,
because clause 3 can match on backtracking even when the head of the
oldList is the item to be replaced:

    ?- replace1([1,2,3,2,1],2,5,X).
    X = [1,5,3,5,1];
    X = [1,5,3,2,1];
    X = [1,2,3,5,1];
    X = [1,2,3,2,1];
    no

Draw the Prolog Search Tree for the following query:


                 ?- replace1([1,2],2,b,R).
                         |
                         | [H1|TOld1] = [1,2] 
                         | Old1 = 2
                         | New1 = b
                         | R = [1|TNew1]
                         |
                     replace1([2],2,b,TNew1)
                        / \
[Old2|TOld2] = [2]     /   \      [H2|TOLd2] = [2] 
Old2 = 2              /     \     Old2 = 2
New2 = b             /       \    New2 = b
[b|TNew2] = TNew1   /         \   [2|TNew2] = TNew1 
                   /           \
  replace1([],2,b,TNew2)   replace1([],2,b,TNew2) 
        |                           |
        | TNew2 = []                | TNew2 = []
        |                           |
     SUCCESS                     SUCCESS
     R = [1,b]                   R = [1,2]


**** Solve the problem using "not" or \=:

    replace2([],_,_,[]).
    replace2([Old|TOld],Old,New,[New|TNew]) :- 
        replace2(TOld,Old,New,TNew).
    replace2([H|TOld],Old,New,[H|TNew]) :- 
        H\=Old,
        replace2(TOld,Old,New,TNew). 

    ?- replace2([1,2,3,2,1],2,5,X).
    X = [1,5,3,5,1];
    no

Here is the search tree: 

                 ?- replace2([1,2],2,b,R).
                         |
                         | [H1|TOld1] = [1,2] 
                         | Old1 = 2
                         | New1 = b
                         | R = [1|TNew1]
                         |
                     replace2([2],2,b,TNew1)
                        / \
[Old2|TOld2] = [2]     /   \      [H2|TOLd2] = [2] 
Old2 = 2              /     \     Old2 = 2
New2 = b             /       \    New2 = b
[b|TNew2] = TNew1   /         \   [2|TNew2] = TNew1 
                   /           \
  replace2([],2,b,TNew2)   2 \= 2  (H2 \= Old2)  
        |                  replace2([],2,b,TNew2) 
        |                           |
        | TNew2 = []                X
        |                          FAIL
     SUCCESS                     
     R = [1,b]                  


**** How can you use cut to do the same thing? 

The following code is equivalent, because a recursive call to replace3
can only give one solution (why is that?), so whether the cut goes
before it or after it doesn't matter -- the important thing is that
you can't succeed on clause 2 and backtrack to clause 3:

    replace3([],_,_,[]).
    replace3([Old|TOld],Old,New,[New|TNew]) :- 
        !,
        replace3(TOld,Old,New,TNew).
    replace3([H|TOld],Old,New,[H|TNew]) :- 
        replace3(TOld,Old,New,TNew).
    
    replace4([],_,_,[]).
    replace4([Old|TOld],Old,New,[New|TNew]) :- 
        replace4(TOld,Old,New,TNew),
        !.
    replace4([H|TOld],Old,New,[H|TNew]) :- 
        replace4(TOld,Old,New,TNew).


    ?- replace3([1,2,3,2,1],2,5,X).
    X = [1,5,3,5,1];
    no
    ?- replace4([1,2,3,2,1],2,5,X).
    X = [1,5,3,5,1];
    no

(Any call to replace3 can only give one solution because it either
matches clause 1 or clause 2 (not both), and if it matches clause 2 it
cannot backtrack to clause 3.)

Here is the search tree. ! cuts all the remaining alternatives (branches). 
Here there is only one other alternative, but if there were many, all of 
them would be cut. The difference between replace3 and replace4 is the time 
when the cutting takes place. replace4 cuts as soon as the element is found. 
replace4 cuts only after the branch succeeded. In this case the result in the
same, but this is not always the case.


                 ?- replace3([1,2],2,b,R).
                         |
                         | [H1|TOld1] = [1,2] 
                         | Old1 = 2
                         | New1 = b
                         | R = [1|TNew1]
                         |
                     replace3([2],2,b,TNew1)
                        / \---------------------------> CUT !
[Old2|TOld2] = [2]     /   \      [H2|TOLd2] = [2] 
Old2 = 2              /     \     Old2 = 2
New2 = b             /       \    New2 = b
[b|TNew2] = TNew1   /         \   [2|TNew2] = TNew1 
                   /           \
  replace3([],2,b,TNew2)   replace3([],2,b,TNew2) 
        |                           |
        | TNew2 = []                | TNew2 = []
        |                           |
     SUCCESS                     SUCCESS
     R = [1,b]                   R = [1,2]
     

**** But, wait.  There's a problem...  What if we query with a
**** variable for the first argument?  That is, asking what oldList
**** could a newList correspond to?

    ?- replace2(X,1,2,[1]).
    no
    ?- replace3(X,1,2,[1]).
    X = [1];
    no

replace2 gives the correct answer -- any 1's in oldList would have
been replaced by 2 in the newList, so [1] cannot be a result.  Why
does replace3 not handle this correctly?

(For replace2: The query only matches clause 3, but then fails the
first subgoal, H\=Old.  This ensures that a newList element can't be
the same as the element to be replaced.

For replace3: The query only matches clause 3, but there is no check
to ensure that the element to be replaced can't be in newList.)

**** Here's another example:

    ?- replace2(X,1,2,[2]).
    X = [1];
    X = [2];
    no
    ?-  replace3(X,1,2,[2]).
    X = [1];
    no
    
Again, replace2 gives the correct answers -- a 2 in newList could have
been a 1 or a 2 in oldList.  replace3, on the other hand, says that a
2 could *only* have come from a 1.  Again, why??

(For replace2: The query matches both clause 2 and clause 3.  When it
matches clause 2, the system will first create oldList with the
element to be replaced as its first element.  When backtracking, it
will match clause 3, and create oldList with the first element of the
newList as the first element of the oldList.  In this case, the
element happens to equal the element to be replaced with, but this is
not disallowed by the definition.

For replace3: The query matches clause 2 and creates oldList with the
element to be replaced as its first element.  Then, because of the
cut, it does not match clause 3, so it only finds the first solution.)

**** What is the lesson?  When practical, it's better to write
**** predicates to behave correctly without using cut, as it's
**** difficult to get it to work correctly in all cases.