%File: PLES4.TRA (c) 09/24/80 The Soft Warehouse % MATHTRACE: FALSE $ ECHO: TRUE $ % This is the fourth in a series of muSIMP programming lessons. Often within a function definition it is necessary to DYNAMICALLY create a list. Suppose we want to make a list of the values of the variables FIRSTNAME, LASTNAME, and MAILADDRESS. It will not do to use the program statement '(FIRSTNAME, LASTNAME, MAILADDRESS), because the quote operator prevents evaluation of the variables. However, the desired effect can be achieved by the statement ADJOIN (FIRSTNAME, ADJOIN (LASTNAME, ADJOIN (MAILADDRESS, '()))). muSIMP provides the function LIST to achieve this effect much more compactly and conveniently. Thus, the above list could be created with the following statement: LIST (FIRSTNAME, LASTNAME, MAILADDRESS). Unlike most functions, LIST can have any arbitrary number of arguments. For example consider the following assignments: % FIRSTNAME: 'JOHN & LASTNAME: 'DOE & MAILADDRESS: 'TIMBUKTU & % Create a list of these variables using the quote operartor and compare it with a list created using the function LIST: % RDS: FALSE $ % A useful utility to have is a constructor function which reverses a list. Writing such a function can be somewhat tricky. The following skeletal definition uses our friends APPEND and LIST as helper functions: FUNCTION REVLIS (LIS), WHEN ... , FALSE EXIT, APPEND ( ... , LIST (FIRST (LIS))), ENDFUN $ See if you can successfully complete this definition. Naturally, you also have to reenter APPEND if a correct version is not around from the previous lesson. (Remember to jot down all function definitions if you are not using a hard-copy terminal.) % RDS:FALSE $ % A well-written APPEND necessarily requires execution time which is approximately proportional to the length of its first argument. The REVLIS function outlined above invokes APPEND n times if n is the length of its original argument, and the average length of the argument to APPEND is n/2. Thus, the time is approximately proportional to n*(n/2), which is proportional to n^2. An technique using COLLECTION VARIABLES permits a list to be reversed in time proportional to n, yielding tremendous time savings for long lists: % FUNCTION REVLIS (LIS, ANS), WHEN EMPTY (LIS), ANS EXIT, REVLIS (REST(LIS), ADJOIN (FIRST(LIS), ANS)) ENDFUN $ TRACE (REVLIS) & REVLIS ('(1, 2, 3)) & % A collection variable accumulates the answer during successive recursive invocations. Then, the resulting value is passed back through successive levels as the returned answer. As is illustrated here, we can invoke a function with fewer arguments than there are parameters. When this is done, the extra parameters are initialized to FALSE, and they are available for use as LOCAL VARIABLES within the function body. Quite often, as in this example, the initial value of FALSE is exactly what we want, because it also represents the empty list. (When we want some other initial value, either the user can supply it, or the function can supply it to an auxiliary function which does the recursion.) Of course, if a user of REVLIS supplies a second argument, then the function returns the reversed first argument appended onto the second argument. This "feature" is occasionally quite useful. What if the user supplies more arguments than there are parameters? The extra arguments are evaluated, but ignored. Up to this point the lessons have taught the "applicative" style of programming. The emphasis has centered on expression evaluation, functional composition, and recursion. The power and elegance of applicative programming was the topic of an influential Turing Lecture by J Backus. The lecture was published in the August 1978 issue of the Communications of the ACM. muSIMP also supports the alternative "Von Neumann" style emphasizing loops, assignments, and other side-effects. To illustrate this style, here is an alternative definition of REVLIS which introduces the LOOP construct: % FUNCTION REVLIS (LIS, ANS), LOOP WHEN EMPTY (LIS), ANS EXIT, ANS: ADJOIN (FIRST(LIS), ANS), LIS: REST (LIS) ENDLOOP ENDFUN $ % muSIMP has a function named REVERSE which is defined in machine language. Since it is entirely equivalent to REVLIS and much faster, REVERSE should normally be used in place of REVLIS in application programs written by the user. An iterative loop is an expression consisting of the keyword LOOP, followed by a sequence of one or more expressions separated by commas, followed by the matching delimiter named ENDLOOP. The body of a loop is evaluated similarly to a function body, except: 1. When evaluation reaches the delimiter named ENDLOOP, evaluation proceeds back to the first expression in the loop. 2. When evaluation reaches an EXIT within the loop, evaluation proceeds to the point immediately following ENDLOOP, and the value of the loop is that of the last expression evaluated therein. There can be any number of conditional exits anywhere in a loop. Ordinarily there is at least one exit unless the user plans to have the loop repeat indefinitely. Now consider the following sequence: % L1: '(THE ORIGINAL ) $ L2: '(TAIL) $ LIS: 'DOG & ANS: 'CAT & REVLIS (L1, L2) & % The above definition of REVLIS makes assignments to its parameters LIS and ANS. For this example, the final assignments are LIS: '() and ANS: '(ORIGINAL, THE, TAIL). So, what do you guess are the corresponding current values for LIS and ANS? See for yourself: % RDS: FALSE $ % The assignments to parameters LIS and ANS in REVLIS has no effect on their values once the function has returned! The restoration of the original environment following the return from a called function allows the programmer to change the value of a function's parameters without fear of damaging the values the parameters of the same name have outside the function. Thus functions can be thought of as "black boxex" which have no effect other than their returned value. A function's parameters can not be used to pass information back to the calling function. If we wish to return more than one piece of information, a list of values can be returned. However, another way is to make assignments within the function body to variables which are not among its parameters. Such variables are called "fluid" or "global" variables. The iterative version of REVLIS using the LOOP construct is slightly faster than the recursive version, but the latter is more compact. When there is such a trade-off between speed and compactness, a good strategy is to program for speed in the crucial most frequently used functions, and program for compactness elsewhere. Another consideration when choosing between iteration and recursion is the amount of storage required to perform a given task. Each time a function is called information must be stored on a STACK so the original environment can be restored when the function returns. Since recursion involves the nesting of function calls, a highly recursive function can exhaust all available memory before completing its task. This will result in the ALL Spaces Exhausted error trap. The use of iteration in this situation might permit an equivalent computation to proceed to termination. For practice with loops, use one to write a nonrecursive recognizer named ISSET, which returns TRUE if its list argument contains no duplicate elements, returning FALSE otherwise. (Compare your definition with the recursive version in lesson PLES3.) % RDS: FALSE $ % Here is our solution: % FUNCTION ISSET (LIS), LOOP WHEN EMPTY (LIS), EXIT, WHEN MEMBER (FIRST(LIS), REST(LIS)), FALSE EXIT, LIS: REST (LIS) ENDLOOP ENDFUN $ ISSET ('(DOG, CAT, COW, CAT, RAT)) & % Another good exercise adapted from PLES3 is to use a loop to write a nonrecursive function named SUBSET, which returns TRUE if its first argument is a subset of its second argument, returning FALSE otherwise: % RDS: FALSE $ % A BLOCK is another control construct which is sometimes convenient, particularly in conjuction with the Von Neumann style. As an illustration of its use, the following iterative version of the MAKESET function from PLES3 returns a set composed of the unique elements in the list which is its first argument: % FUNCTION MAKESET (LIS, ANS), LOOP WHEN EMPTY (LIS), ANS EXIT, BLOCK WHEN MEMBER (FIRST(LIS), ANS), EXIT, ANS: ADJOIN (FIRST(LIS), ANS) ENDBLOCK, LIS: REST (LIS) ENDLOOP ENDFUN $ MAKESET ('(FROG, FROG, FROG, TERMITE)) & % When evaluation reaches an EXIT, it proceeds to the point following the next ENDBLOCK, ENDLOOP, or ENDFUN delimiter -- whichever is nearest. Thus, BLOCK provides a means for alternative evaluation paths which rejoin within the same function body or loop body, without causing an exit from that body. The first expression in a block must be a conditional-exit (anything else can be moved outside anyway), but since there can be any number of other conditional exits or other expressions within the block, the block provides a very general structured control mechanism. For example, the CASE-statement and IF-THEN-ELSE construct of some other languages are essentially special cases of a block. You may not have noticed, but the loop version of MAKESET has the effect of reversing the order of the set elements. Using ADJOIN in a loop generally has this effect, which is why it is so suitable for REVERSE. With sets, incidental list reversal is perhaps acceptable, but for most applications of lists it is not. We could of course use a preliminary or final invocation of REVERSE so that the final list would emerge in the original order, but that would relinquish the speed advantage of the loop approach, while further increasing its greater bulk. Thus, recursion is usually preferable to loops when ADJOIN is involved. For example, recursion is used almost exclusively to implement muMATH, because its symbolic expressions are represented as ordered lists. Loops are also less applicable to general tree structures than to lists, but it is often possible to loop on the REST pointer while recursing on the first pointer, or vice-versa, particularly if ADJOIN is not involved. For example, compare the following semi-recursive definition of #ATOMS with the fully-recursive one in PLES1: % FUNCTION #ATOMS (U, N), N: 1, LOOP WHEN ATOM (U), N EXIT, N: N + #ATOMS (FIRST(U)), U: REST (U) ENDLOOP ENDFUN $ #ATOMS ('((3 . FOO), BAZ)) & % If the answer surprises you, don't forget the FALSE which BAZ is implicitly dotted with. See if you can similarly write a semi-recursive function named DUP which does what the infix operator named "=" does: % RDS: FALSE $ % Those of you with previous exposure to only Von Neumann style programming undoubtedly feel more at home now. The reason we postponed revealing these features until now is that we wanted to force the use of applicative programming long enough for you to appreciate it too. Naturally, one should employ whichever style is best suited for each application, so it is worthwhile to become equally conversant with both styles. Thus endeth the sermon. % ECHO: FALSE $ RDS () $