[ This section is a little out of place--need to introduce type and equality predicates first! Those have been presented in class, so this should be comprehensible anyway. Need to make this a separate hunk, and move it after the next hunk. ] [ Also need to introduce tail recursion somewhere early, and fwd ref the chapter on recursion. ] In this section I'll demonstrate the most common idioms for recursion over simple data structures--lists and trees.
Some of the examples will be implementations of standard
Scheme procedures like
reverse. Scheme already has these procedures built
in, but you should understand how they can be implemented using
simpler procedures like
inevitably have to write special-purpose procedures that are
slightly different, but coded similarly. (In later chapters,
I'll show some more advanced programming techniques that let
you implement more general and/or efficient procedures like
I'll also show a few other handy procedures for operating on lists, e.g., a list-copying routine.
Then I'll show recursion over simple binary trees of pairs. The normal style for recursion over trees in Scheme is slightly different from what you may be used to in languages like C or Pascal--and simpler.
length is the standard Scheme procedure that returns the length
of a list. It only counts the elements along the spine of the list
It's easy to do this using recursion. The length
of a list is 0 if the list is empty, and otherwise it's 1 plus
the length of the rest of the list. Here's the easiest way to
(define (length lis) (cond ((null? lis) 0) (else (+ 1 (length (cdr lis))))))
The main thing to notice about this example is the recursive structure.
The procedure can accept a pointer to either a pair
empty list. The structure of the procedure corresponds directly
to the recursive definition of a (proper) list. The two-part
corresponds to the fact that there are two rules that characterize
lists; it figures out which case we're dealing with.
We explicitly checked for the end-of-list case, but we implicitly
assumed that otherwise the object being operated on is a
pair. This might seem like bad style, but actually it's good,
because it ensures that an error will be signaled if the argument
to length is not the empty list or a pair--the second branch
cond will be taken (erroneously), but the attempt to
(cdr lis) will signal an error.
We could make this clearer by using a three-branch
separate branches for the two valid cases and the error case:
(define (length lis) (cond ((null? lis) 0) ((pair? lis) (+ 1 (length (cdr lis)))) (else (error "invalid argument to length")))
Here I've used the error-signaling procedure
stops execution and signals an error. (In most systems, the error
"invalid argument to length" will be printed and
the user will be presented with a break prompt for debugging
the problem.) Unfortunately,
error is not supported by
all Scheme systems. (Later I'll show an implementation that
should work reasonably well in any Scheme system.)
Also note that in this example, I've used
lis as the name of a list
argument, rather than
list. That's because there's a standard
Scheme procedure named
list, which will be shadowed by any
local variable with the same name. (This is because of Scheme's
unified namespace---you can't have a variable and a procedure
with the same name, for reasons that will be explained later;
seems to be the only identifier for which this is commonly a problem.)
The above definition of
length is not tail recursive--after calling
itself, there must be a return so that
1 can be added to the value
and returned. Later I'll show a more efficient, tail-recursive version of
length, and a more general procedure called
can be used to construct a variety of procedures whose basic
algorithm is similar.
There are two common senses of copying, shallow copying, and deep copying. A shallow copy makes a copy of one object, and the copy has pointers to the same objects that the original did.
A deep copy copies not only the top-level objects in a data structure, but the ones below that, and so on recursively, so that a whole new data structure is created.
For lists, which are made up of more than one object, it is often
useful to copy the spine of the list, i.e., doing a deep copy along
cdr's only. We typically think of a list as being
like a special kind of object, even though it's really a sequence
of pair objects. It's therefore natural to copy "just the list."
If we just want to do a shallow copy, we can define
to copy a pair, without copying anything else.
In these examples, I'll assume we only want to copy list structure--that is a connected set of pairs. Whenever we come to something that's not a pair, we stop copying and the copy shares structure with the original. (These aren't standard Scheme procedures.)
Here's a truly shallow copy, just copying a single pair:
(define (pair-copy pr) (cons (car pr) (cdr pr)))
If we want to do a deep copy, we can use recursion to copy
cdr values that are also pairs. The following code for
pair-tree-deep-copy assumes that the structure to be copied is a
tree of pairs. (If there is any shared structure, it will be copied
each time it is reached, and the copy will not have the same structure.
It will always be a tree. Preserving shared structure while copying
is harder, but can be done. If there's a directed cycle,
pair-tree-deep-copy will loop infinitely.)
(define (pair-tree-deep-copy thing) (if (not (pair? thing)) thing (cons (pair-tree-deep-copy (car thing)) (pair-tree-deep-copy (cdr thing)))))
pair-tree-deep-copy works on improper as well as
proper lists, but only copies the pairs. Where it gets to a non-pair
value, it stops and just uses the same value in the copy, and the copy
shares structure with the original.
The code for
pair-tree-deep-copy directly reflects the kind of
structure it copies. It can handle non-pairs, which are assumed to be
leaves of the graph of pairs that it's copying, and it can handle pairs,
which are assumed to be interior nodes of the tree. Their
cdr values may be leaves of the tree, or other pairs.
So the recursive definition of a pair-tree is:
The first rule is the base case, i.e., is the simple one that doesn't require recursion. The second is the recursive rule, which expresses the fact that an interior node's car and cdr fields can point to any kind of pair-tree: a leaf, or another interior node whose children may likewise be leaves or other interior nodes...
This is the easy way to write recursive routines over data structures--figure out a recursive description that exactly describes the expected data structures, and then use that recursive description to write a recursive description of the result you want. Then you can straightforwardly code routine that will traverse the structure and copmute that result.
Generally, we write the base case first, to make it clear where recursion ends--and so that we don't forget to write it and accidentally write infinite recursions or unhandled cases. If you do this consistently, your code will be more readable and you'll make fewer mistakes.
To copy the spine of a proper list, we can use this description of the answer we want:
A copy of a list is
carvalue is the same as the
carof the original list, and whose
cdrvalue is a copy of the rest of the original list.
Here's the code:
(define (list-copy lis) (cond ((null? lis) '()) (else (cons (car lis) (list-copy (cdr lis))))
As usual, we only check to see if we're a the end of
the list, and otherwise assume the argument is a pair. Since we
car and the
cdr of the pair in the latter case,
we'll get an error if the argument is not a proper list. This is usually
what we want, so that Scheme will signal an error when it gets to
the part of the list with unexpected structure.
list-copy was chosen to suggest that it operates on lists,
and in Scheme terminology "list" means "proper list" by default. If
we want a routine that copies improper lists, we should call it something
else, and write a comment saying what kinds of things it works for.
Actually, lists are so common in Scheme that we could have just called
copy. Most procedure names begin with the name of the kind
of structure they operate on, but exceptions are made for lists and
Two handy operations on lists are
both are standard Scheme procedures.
append takes any number of lists as arguments, and returns a
list with all of their elements.
reverse takes a list and returns
a new list, with the same elements but in the opposite order.
Note that like most Scheme procedures, neither of these procedures is destructive--each creates a new list without side-effecting (modifying) its argument(s).
Append works much like
list-copy except that we have multiple
lists to deal with.
The trick to getting it right is to maintain the essential structure
list-copy, with the right minor differences.
For now, let's keep things simple, and just do a two-argument version
Our strategy is to recurse through the first list, like
copying one element of the list at each step. When we get to the
end, however, the base case is different--rather than terminating
the list with the empty list, we just use the second list as the
"rest" of the copy we're making.
Notice that the base case occurs when the first list is null--the
append of an empty list and another list is just that other
cons zero items onto the front of
that list. Concretely, we can just return that list.
Here's the recursive characterization of the result we want
carof the first list, and whose
appendof the rest of the first list and (all of) the second list.
Here's a simple two-argument version of
(define (append2 lis1 lis2) (cond ((null? lis1) lis2) (else (cons (car lis1) (append2 (cdr lis1) lis2)))))
append2 copies its first list argument, but the
result simply shares a pointer to the last list argument--the
last list is not copied, so the result shares structure with
that list. This is also true of the standard Scheme function
append, which can take any number of lists as arguments.
The first n-1 lists are copied, but the last is shared.
Be sure you understand the concrete operation of the above algorithm. On the way down during recursion, we're taking the first list apart, holding onto one list element at each step. When we hit the end of the first list, recursion stops and we return the second list. On the way back up, we're consing those items onto the new list we're creating, back-to-front.
Suppose we have defined two lists,
(define foo '(x y z)) (define bar '(a b)) (define baz (append bar foo))
The result will be that
baz shares structure with
but not with
bar. Changes to the list via
also be visible via
+----------------------------------------+ | | \|/ | +---+ +---+---+ +---+---+ +---+---+ | foo | *-+--->| * | *-+---->| * | *-+----->| * | * | | +---+ +-+-+---+ +-+-+---+ +-+-+---+ | | | | | \|/ \|/ \|/ | x y z | | | +---+ +---+---+ +---+---+ | bar | *-+--->| * | *-+---->| * | * | | +---+ +-+-+---+ +-+-+---+ | | | | \|/ \|/ | a b | /|\ /|\ | | | | +---+ +---+---+ +---+---+ | baz | *-+--->| * | *-+---->| * | *-+--------------------+ +---+ +---+---+ +---+---+
In general, the result of
append shares structure with the
last argument passed to
append. If you want to avoid
this, you can pass
append an empty list as its last
argument. For example
(append '(1 2 3) '()) will copy
(1 2 3).
If you're worried about efficiency, be aware that
takes time proportional to the length of the lists that must be
copied, i.e., all but the last list being
usually doesn't matter, but it's a consideration for performance-critical
parts of your program, especially if you're appending long lists.
(It's common to
append short lists onto the front
of long lists, and then
reverse the result if necessary.)
reverse returns a reversed copy of a list.
There's an easy (but slow) way to define
reverse in terms
append. We just take the first element off the list,
reverse the rest of the list, and append the first element to the
end of the list. We do this recursively, so that each time we
reverse the rest of the list, we're doing the same thing on
a shorter list. When we get down to the end of the list, reversing
it is a no-op: the reverse of an empty list is the empty list.
(define (reverse lis) (if (null? lis) '() (append (reverse (cdr lis)) (list (car lis)))))
Think about how this actually works.
reverse recurses down the
list, calling itself on the
cdr of the list at each recursive step,
until the recursion stops at the end of the list. (This last call
returns the empty list, which is the reverse of the empty list.)
At each step, we use
car to peel off one element of the
list, and hold onto it until the recursive call returns.
The reversed lists are handed back up through the returns,
with the cars being slapped on the rear of the list at each return
step. (To add a single item to the end of the list using
we must first put it in a one-element list using
We end up constructing the new list back-to-front on the way up from the recursion. Going down recursively tears the list apart, one item at each recursive step, and coming back up adds an element to the end of the new list at each step.
This is a good example to understand, both abstractly and concretely. You should understand the concrete steps involved in taking a list apart and putting it back together backwards. On the other hand, you should also recognize that the algorithm works even if you don't pay attention to that.
Once you get the hang of recursion, it's often easy to write algorithms
without actually thinking about the little steps involved, or thinking
much about the ordering of steps. In this case, it's easy to see that
if we can reverse the rest of the list, and append the first item to
the end of that, we've reversed the whole list. We don't need to think
much about the ordering of the operations, because that falls naturally
out of the way we pass arguments to functions. We can declare that
reverse of a non-empty list is the
reverse of the rest of the list and (a list containing) the
first item in the list", and then write the code accordingly, as a
pure function---one that only depends on the values of its
arguments, and has no side effects.
By writing this recursively, we'll apply the same trick
all the way down the list. Thinking a little
more concretely--but not much--we can see that at each time we
reverse the rest of the list, the list in question will be shorter.
Somewhere we'll hit the end of the list, so we have to handle that
base case. It's usually easy to see what the right thing to do
is for the base case. In this case, we can declare that "the
reverse of the empty list is the empty list," and add
the appropriate branch to
This is a good example of how you can combine functions to create new functions, implementing algorithms without using sequencing or side effects. (Notice that if we had side effects, we'd have to think very carefully about the ordering of steps, to make sure that we used a data structure after certain changes, and before others. Bleah.)
(The following remarks about efficiency are fairly advanced--you
shouldn't worry about these things yet if they get in the way of
learning how to write programs simply and straightforwardly. You
can skip or skim them and come back to them later once you've gotten
the hang of Scheme, and want to tune the time-critical parts of
your programs for maximum efficiency. On the other hand, you
may find that thinking about the concrete details reinforces
the basic ideas.)
There are two problems coding
reverse this very simple way,
reverse turns out to be one of the hardest
"simple" list routines to code efficiently. Later I'll sho
better versions that are more clever, but only very slightly more
complicated. (They'll still be recursive, and won't use loops or
[ where? (Later I need to show a linear-time version that uses list->vector and then reverses the vector into a list tail-recursively... ]
The first problem is that each call to
append takes time
proportional to the length of the list it's given. (Remember
append effectively copies all of the pairs in the first list
it's given, making a backward copy.) We have to copy the "rest" of
the list using
append, starting at each pair in the list. On
average, we copy half the list at a given recursive step, so since we
do this for every pair in the list, we have an order n-squared
Another problem is that we're doing things on the way back up from recursion, which turns out to be more expensive than doing things on the way down. As I'll explain in a later chapter, Scheme can do recursion very efficiently if everything is done in a forward direction, on the way down--Scheme can optimize away all but one of the returns, and the state-saving before the calls. (Luckily, this is easy to do.)
Since Scheme provides a built-in
reverse, you don't have
to think much about this. A good Scheme system will provide
a heavily-optimized implementation of
reverse that is
linear in the length of the list being reversed.
reverse is very handy, and the efficiency of a built-in
reverse is important, because it's usually best to
construct a list in whichever order is easy and efficient,
and then reverse the whole list if necessary. Typically, you
cons one item onto a list at a time, or maybe
a few items at a time, in whatever order it's easiest to create the
list. This allows you to construct the list in linear time;
with a linear-time
reverse, the overall process is still
================================================================== This is the end of Hunk E. TIME TO TRY IT OUT At this point, you should go read Hunk F of the next chapter and work through the examples using a running Scheme system. Then return here and resume this chapter. ==================================================================
(Go to Hunk F, which starts at section Lists (Hunk F).)