An Introduction to Scheme and its Implementation

Go to the first, previous, next, last section, table of contents.

Lists Again

Suppose we want to make a list of symbols whose print names are the English words for the first five integers. We could do it using quoting, of course, like this:

Scheme>(define firstfive '(one two three four five))
#void
Scheme>firstfive
(one two three four five)

We don't have to quote each symbol individually. Within a quote expression, everything is assumed to be literal data, not expressions to evaluate.

We could also do it by calling list to construct the list, and handing it each of the five symbols as literals. To do that, we have to quote them, so that Scheme won't think we're referring to variables named one, two, etc.

Scheme>(define firstfive (list 'one 'two 'three 'four 'five))
#void
Scheme>firstfive
(one two three four five)

Since list is a procedure, its argument expressions are evaluated. We use a quote around each expression, so that it will return a pointer to the appropriate symbol, rather than the value of the variable by the same name.

This works whether or not there is a variable by that name, because names of symbols and names of variables are completely different things.

For example, even after evaluating the above expressions, attempting to evaluate the expression four will be an error, unless we've defined a variable named four. The existence of a symbol with a given print name doesn't say anything about the existence of a variable with that name.

Heterogeneous Lists

Since Scheme is dynamically typed, we can put any kind of object in a list. So far, we've made a list of integers and a list of symbols. We can also make a list of different kinds of things, such as a list of integers, symbols, and lists.

Scheme>(define mixed5 '(one 2 (three and a) "four" 5))
#void
Scheme>mixed5
(one 2 (three and a) "four" 5)

Here we've constructed a mixed list whose first element is a symbol, the second is an integer, the third is a list of symbols, the fourth is a string, and the fifth is another integer. (The technical term for a mixed list is a "heterogeneous list.")

We can draw it like this:

       +-----+
mixed5 |  +--+-->+---+---+  +---+---+  +---+---+  +---+---+  +---+---+
       +-----+   | + | +-+->| + | +-+->| + | +-+->| + | +-+->| + | * |
                 +-+-+---+  +-+-+---+  +-+-+---+  +-+-+---+  +-+-+---+
                   |          |          |          |          |
                  \|/        \|/         |         \|/        \|/
                  one         2          |        "four"       5
                                         |
                                        \|/
                                       +---+---+  +---+---+  +---+---+
                                       | + | +-+->| + | +-+->| + | * |
                                       +-+-+---+  +-+-+---+  +-+-+---+
                                         |          |          |
                                        \|/        \|/        \|/
                                       three       and         a

Notice that we draw the symbols (one, three, and, and a) as simple sequences of characters. This is just a drawing convention. They're really objects, like pairs are. We draw strings similarly, but with double quotes around them. Don't be fooled--these are objects on the heap, too. We just draw them this way to keep the picture from getting cluttered up.

Operations on Lists

We've already seen two list-processing procedures that you'll use a lot, car and cdr. car takes a pointer to a pair, and extracts the value of its first (car) field. cdr takes a pointer to a pair and returns the value of its second (cdr) field.

(It might have been better if car had been called first and cdr had been called rest, since that's more suggestive of how they're used: a pointer to the first item in a list, and a pointer to the pair that heads the rest of the list.)

Given our list stored in mixed5, we can extract parts of the list using car and cdr.

Scheme>(car mixed5)
one
Scheme>(cdr mixed5)
(2 (three and a) "four" five)

By using car and cdr multiple times, we can extract things beyond the first element. For example, taking the cdr of the cdr of a list skips the first two elements, and returns the rest:

Scheme>(cdr (cdr mixed5))
((three and a) "four" 5)

Taking the car of that list (that is, the car of the cdr of the cdr) returns the first item in that list:

Scheme>(car (cdr (cdr mixed5)))
(three and a)

We can keep doing this, for example taking the second element of that sublist by taking the car of its cdr.

Scheme>(car (cdr (car (cdr (cdr mixed5)))))
and

This starts to get tedious and confusing--too many nestings of procedures that do too little at each step--so Scheme provides a handful of procedures that do two list operations at a whack. The two most important ones are cadr and cddr.

cadr takes the car of the cdr, which gives you the second item in the list. cddr takes the cdr of the cdr, skipping the first two pairs in a list and returning the rest of the list.

This lets us do the same thing we did above, a little more concisely and readably:

Scheme>(cadr (car (cddr mixed5)))
and

With a little practice, it's not hard to read a few nested expressions like this. In this example, taking the cddr of mixed5 skips down the list two places, giving us the list that starts with the sublist we want. Then taking the car of that gives us the sublist itself off the front of that list, and taking the cadr of that gives us the second item in the sublist.

Of course, even if Scheme didn't provide cadr and cdr, you could write them yourself in terms of car and cdr:

(define (cadr x)
   (car (cdr x)))

(define (cddr x)
   (cdr (cdr x)))

Scheme actually provides predefined list operations for all combinations of up to four car's and cdr's. For example, cadadr takes the cadr of the cadr. (The naming scheme is that the pattern of a's and d's reflects the equivalent nesting of calls to car and cdr.)

You probably won't want to bother with most of those, because the names aren't very intuitive. Two procedures that are worth knowing are list-ref and list-tail.

(list-ref list n) extracts the nth element of a list list, which is equivalent to n-1 applications of cdr followed by car. For example, (list-ref '(a b c d e) 3) is equivalent to (car (cdr (cdr '(a b c d e)))), and returns d.)

In effect, you can index into a list as though it were an array, using list-ref. (Of course, the access time for an element of a list is linear in the index of the element. If you need constant-time access, you can use vectors, i.e., one-dimensional arrays.) Notice that the numbering is zero-based, which is why (list-ref lis 3) returns the fourth element of a lis. This is consistent with the indexing of vectors, which are also zero-based, as well as reflecting the number of cdr operations.

(list-tail list n) skips the first n elements of a list and returns a pointer to the rest, which is equivalent to repeated applications of cdr. (This is standard R4RS Scheme, but not IEEE Scheme. If your Scheme doesn't provide list-tail, you can easily write your own.)

These two procedures can make it much clearer what you're doing when you extract elements from nested lists. Suppose that we have a list foo, which is a triply-nested list--a list of lists of lists, and we want to extract the second element of the bottom-level list that is the third element of the middle-level list that is the fourth element of the outermost list.

We could write (car (cdr (car (cdr (cdr (car (cdr (cdr (cdr foo))))))))), but that's pretty hard to read. If we use cadr, caddr, and cadddr, we can make it somewhat more readable by using one function call at each level of structure: (cadr (caddr (cadddr foo))). But it's still clearer to write (list-ref (list-ref (list-ref foo 4) 3) 2)

or (indented)

(list-ref (list-ref (list-ref foo 4)
                    3)
          2)

list-ref and list-tail are much more convenient than things like caddr when the indexes into a list vary at run time. For example, we might use an index variable i (or some other expression that returns an integer) to pick out the ith member of a list: (list-ref foo i). Writing this with car and cdr would require writing a loop or recursion to perform n-1 cdr's and a car.

==================================================================
This is the end of Hunk N

At this point, you should go back to the previous chapter and
read Hunk O before returning here and continuing this tutorial.
==================================================================

(Go BACK to read Hunk O, which starts at section Tail Recursion (Hunk O).)

Go to the first, previous, next, last section, table of contents.