%File:  PLES3.TRA  (c)  	09/24/80            The Soft Warehouse %


MATHTRACE: FALSE $
ECHO: TRUE $

%    This is the third in a series of muSIMP programming lessons.

    Often,  it  is  most  natural to represent a  data  aggregate  as  a 
sequence  or  LIST of items rather than as a general binary  tree.   For 
example,  such a sequence is quite natural for the elements of a  vector 
or  of  a set.   We can represent such a sequence in terms of  nodes  by 
having all of the FIRST cells point to the data elements, using the REST 
cells  to link the sequence together.   The last linkage node can have a 
REST  cell which is FALSE to indicate that there are no further  linkage 
nodes:

	       /\
              /  \
           item1 /\
	        /  \
             item2  .
	             .
     	             /\
	            /  \
	         itemN  FALSE


    When  this  diagram is rotated 45 degrees in  the  counter-clockwise 
direction,  it  looks  like  a  clothes line with  the  successive  data 
elements suspended from it.

               +-------+--  -  -  -  ---+----- FALSE
               |       |                |
             item1   item2            itemN

    This  latter  diagram more clearly suggests a sequence  or  list  of 
items.   The  simple regularity of the structure permits correspondingly 
simple function definitions for processing such  structures.   Moreover, 
the  linear structure suggests an external printed representation  which 
is  far  more readable than dotted pairs.   In response to an  ampersand 
terminator,  muSIMP prints the above aggregate in the more natural  LIST 
notation:

                        (item1 item2 ... itemN)

rather than the equivalent dot notation

             (item1 . (item2 .  ... (itemN . FALSE) ... ))

     Conversely,  the  reader  accepts list notation as  an  alternative 
input form to dot notation.   Naturally,  any of the items in a list can 
themselves  be either lists or more general dotted pairs.   The  printer 
uses list notation as much as possible.  Thus, a structure of the form



	       /\
              /  \
           item1 /\
	        /  \
             item2  .
	             .
                     /\
	            /  \
	         itemN  atom

where "atom" is not the atom FALSE, is printed in a mixed notation as

                    (item1 item2 ... (itemN . atom))

The  muSIMP read routines will correctly read such mixed notation.   You 
may wonder why you never noticed list notation being output in PLES1 and 
PLES2.   At  the  beginning  of  those lessons the  function  PRINT  was 
redefined so that it printed expression entirely in dot notation.

    It  is important to fully understand the connection  between  dotted 
pairs and lists,  so take 5 minutes or so to type in some lists,  nested 
lists,  nested  dotted pairs,  and mixtures,  noting carefully how  they 
print.  %  RDS: FALSE $
%  Did your examples include: %

'() &
%    Is that surprising?   Since FALSE is used to signal the end of  the 
list, FALSE and the empty list must be equivalent. %

FALSE EQ '() &
%    Clearly functions which successively process each element of a list 
must somehow determine when the end of the list has been reached.   This 
TERMINAL CASE is easily achieved by an equality test for the name FALSE.  
Since  the need for this test is so pervasive in muSIMP,  the empty list 
recognizer EMPTY is written in machine language for efficiency  reasons.  
However, it could be defined using an EQ test as follows:

FUNCTION EMPTY (LIS),
  LIS EQ '(),
ENDFUN;

    Using EMPTY,  see if you can define a function named  #ITEMS,  which 
returns  the  number  of (top-level) items in its  list  argument.   For 
example, #ITEMS ('(FROG, (FRUIT . BAT), NEWT))  should yield 3.  Here is 
an  incomplete  solution.   All  you have to do is  enter  it  with  the 
portions marked "..." appropriately filled.

FUNCTION #ITEMS (LIS),
  WHEN EMPTY (LIS), ...  EXIT,
  1 + #ITEMS ( ... )
ENDFUN $
%  RDS: FALSE $


%   Actually,  there is already a built-in function called LENGTH, which 
returns  the length of a list.   It is somewhat more general in that  it 
returns  the  number of characters necessary for printing when given  an 
atom.

    Note that with lists it is typical to recur only on the REST of  the 
list,  whereas  with general binary trees it is typical to recur on both 
the FIRST and the REST.

    So  far,  the examples and exercises have been  relatively  isolated 
ones.   Now  we  will focus on writing a collection of  functions  which 
together provide a significant applications package:

    A  list provides a natural representation for a set.   For  example, 
(MANGO,  (CHOCOLATE . FUDGE), (ALFALFA, SPROUTS)) can represent a set of 
three  foods.   Using this representation,  let's write functions  which 
test set membership and form unions, intersections, etc.

    First,  write a function named ISIN, which returns TRUE if its first 
argument  is in the list which is its second argument,  returning  FALSE 
otherwise:  %  RDS: FALSE $
%   Our solution is: %

FUNCTION ISIN (U, LIS),
  WHEN EMPTY (LIS),  FALSE  EXIT,
  WHEN U = FIRST (LIS),  EXIT,
  ISIN (U, REST(LIS))
ENDFUN $
ISIN ('FROG, '(SALAMANDER  NEWT  TOAD)) ;
%   Actually, there is already a built-in version of ISIN called MEMBER.

    A set contains no duplicates,  so we really should have a recognizer 
function  named ISSET,  which returns TRUE if its list argument contains 
no duplicates, returning FALSE otherwise.  Try to write such a function:  
%   RDS: FALSE $
%   Here is a hint, in case you gave up:

FUNCTION SET (LIS),
  WHEN ... EXIT,
  WHEN MEMBER (FIRST(LIS), ... ),  FALSE  EXIT,
  SET ( ... )
ENDFUN;				%	RDS: FALSE $
%   In case it isn't clear by now,  a rule of this game is that you  are 
free  (and  encouraged) to use any functions we have already  discussed, 
whether  they are built-in,  previous examples,  or previous  exercises.  
That  is  one  reason  it  is adviseable for  you  to  actually  do  the 
exercises.

    Now  write a function named SUBSET,  which returns TRUE if  the  set 
which  is  its  first argument is a subset of that which is  its  second 
argument.   (Remember that every set is a subset of itself and the empty 
set is a subset of every set.)  %  RDS: FALSE $
%   Here is a hint, in case you gave up or had a less compact solution:


FUNCTION SUBSET (SET1, SET2),
  WHEN ... EXIT,
  WHEN MEMBER (FIRST(SET1), ...),  SUBSET( ...)  EXIT
ENDFUN;				%	RDS: FALSE $
%    Two sets are equal if and only if they contain the  same  elements.  
However,  the  elements  need  not occur in the  same  order.   Write  a 
corresponding comparator function named EQSET:  %	RDS: FALSE $
%   Ah yes, a hint perhaps?:

FUNCTION EQSET (SET1, SET2),
   ...
ENDFUN;		%	RDS: FALSE $
%   Do you think that's not much of a hint?

    Well, the body of the function really can be written with one modest 
line, so try harder:  %  RDS: FALSE $
%    Remember  the  rules of the game:  You are encouraged  to  use  any 
function discussed previously:  %

FUNCTION EQSET (SET1, SET2),
  SUBSET (SET1, SET2)  AND  SUBSET (SET2, SET1)
ENDFUN;
%   Our examples so far have merely analyzed sets.  We can use ADJOIN to 
construct lists,  just as we used ADJOIN to construct binary trees.   As 
an example of this, write a function named MAKESET, which returns a copy 
of  its  list  argument,  except without duplicates if  there  are  any:  
%   RDS:  FALSE $
%   If you need a hint, here is one, but it is all you will get:

FUNCTION MAKESET (LIS)
  WHEN ...,  '()  EXIT,
  WHEN MEMBER ( ... ), ...  EXIT,
  ADJOIN ( ... )
ENDFUN;		%	RDS: FALSE $
%   Let's see if your solution works correctly:  %

MAKESET ('(FROG, FROG, FROG)) &
%    If  there is a duplicate in the answer,  then back to the  computer 
terminal:  %  RDS: FALSE $
%     (It  helps  to  think  of  nasty  test  cases  BEFORE  you   start 
programming).

    Now  for  the crowning glory of our set package:  The UNION  of  two 
sets  is defined as the set of all elements which are in either (perhaps 
both) sets.  Give it a try:  %  RDS: FALSE $
%   A hint perhaps?   Well, the function body can be written in 3 lines, 
each  of which begins just like the corresponding line in our  hint  for 
MAKESET.  %  RDS: FALSE $
%   Here is our solution: %

FUNCTION UNION (SET1, SET2),
  WHEN EMPTY (SET1),  SET2  EXIT,
  WHEN MEMBER (FIRST(SET1), SET2),  UNION (REST(SET1), SET2)  EXIT,
  ADJOIN (FIRST(SET1), UNION (REST(SET1), SET2))
ENDFUN $
UNION ('(DOG, CAT, 5, RAT), '(-5, CAT, PIG, DOG))&
%   The intersection of two sets is the set of all elements which are in 
both  sets.   Using  our definition of UNION  as  inspiration,  write  a 
corresponding function for the intersection:  %  RDS: FALSE $
%    So far,  our set algebra package has been developed in a  so-called 
BOTTOM-UP manner,  with the most primitive functions defined first,  and 
with  the  more sophisticated functions defined in terms of  them.   The 
opposite  approach is TOP-DOWN,  where we define the most  comprehensive 
functions  in terms of more primitive ones,  then we define  those  more 
primitive ones in terms of still more primitive ones, until no undefined 
functions remain.

    As  an  example of the top-down attitude,  let's write  a  SYMMETRIC 
DIFFERENCE   function  for  our  set-algebra  package.    The  symmetric 
difference  of two sets is the set of all elements which are in  exactly 
one of the two sets.   This is in contrast to the ordinary diference  of 
two sets, which is all of the elements that are in the first set but not 
the second.   However, if an ordinary difference function was available, 
we  could  write the symmetric difference as the union of  the  ordinary 
difference  between set1 and set2,  with the ordinary difference between 
set2  and  set1.   We have already written UNION,  but an  ordinary  set 
difference is not yet available.  Nevertheless, let's bravely proceed to 
write the symmetric difference in terms of the ordinary difference, then 
we will worry about how to write the latter:
%

FUNCTION SYMDIF (SET1, SET2),
  UNION (ORDDIF (SET1, SET2),  ORDDIF (SET2, SET1))
ENDFUN $
%   Now you try to write ORDDIF.  It may help you to know that it can be 
written very similarly to UNION:  %		RDS: FALSE $
%    Some  programmers  are initially uncomfortable  with  the  top-down 
approach because it makes them nervous to refer to undefined  functions:  
there are obvious loose ends during the writing process.  However, it is 
not  necessary  to understand how an auxiliary function can  be  written 
before  daring to refer to it.   All that is necessary is that the  duty 
relegated  to the auxiliary function be somehow more elementary than the 
overall duty performed by the function which refers to it.

    There are necessarily loose ends during the writing of a program  in 
any sequential order.   With the bottom-up approach,  the loose ends are 
neither  written  nor referred to until lower-level functions have  been 
written.   Unfortunately,  as such hidden loose ends are revealed   they 
often  make  apparent the need to completely reorganize and rewrite  all 
subordinate functions into a more suitable organization.   In  contrast, 
the obvious loose ends during a top-down development provide  invaluable 
clues  about  how to organize the remaining  functions.   Moreover,  any 
subsequent changes tend to be easier,  because communication between the 
functions  is more localized,  more independent,  and more  hierarchial.  
For example,  we know that in the definition of SYMDIF we are taking the 
union of two DISJOINT sets,  because from the definition of ORDDIF it is 
clear that ORDDIF (SET1,  SET2)  and  ORDDIF (SET2,  SET1)  cannot  have 
elements  in common.   Hence it would be more efficient merely to append 
the second ordinary set difference to the first ordinary set difference, 


or vice-versa.   Unfortunately,  ADJOIN does not accomplish the  desired 
effect.  

     For example,   ADJOIN ('(5,  9),  '(3,  7))  yields ((5,  9), 3, 7)  
rather than the desired  (5,  9,  3, 7).  What we must do is ADJOIN 9 to 
(3,  7),  then adjoin 5 to that result.   See if you can generalize this 
process into a function named APPEND, which returns a list consisting of 
the list which is its first argument appended onto the beginning of  the 
list which is its second argument:%   RDS: FALSE $
%   How about: %

FUNCTION APPEND (LIS1, LIS2),
  WHEN EMPTY (LIS1),  LIS2  EXIT,
  ADJOIN (FIRST(LIS1),  APPEND (REST(LIS1), LIS2))
ENDFUN $
%    You may not be getting tired,  but my circuits are weary,  so let's 
bring this lesson to a close.  %

ECHO: FALSE $  RDS () $
