In this section, I'll present a new interpreter for a bigger subset of Scheme; it handles all of the essential special forms of Scheme, except for macro definitions. (A macro facility would be easy to add, as well, and would make it easy to implement the remaining special forms by automatic transformation, in terms of the special forms the interpreter "understands" directly. A later chapter will show how to do this.)
The new interpreter is very much like the one from the last chapter, with three important differences:
let) may create a new environment, and subexpressions (such as the
letbody) can simply be evaluated in the new environment by recursive calls to
let. The arguments are bound in a local environment (like
letvariables), and the body is interpreted in that environment.
Here is our new
(define (eval expr envt) (cond ((symbol? expr) (eval-symbol expr envt)) ((pair? expr) (eval-list expr envt)) ((self-evaluating? expr) expr) (#t (error "Illegal expression form" expr))))
Notice that not much has changed---
eval still just analyzes
expressions and dispatches to more specialized helper procedures that
handle particular kinds of expressions.
The important difference is that
eval expects an environment
envt, which represents the binding environment in
which to evaluate an expression. That is, the environment
argument is used to keep track of the meaning of variable names--what
storage they refer to--as teh interpretation process moves in and
out of scopes.
Instead of using the old "flat" representation of an environment, which was just a table of name-value pairs, we'll represent nested environments as a list of tables, or environment chain.
When we begin interpreting, the environment chain will consist
of one table, the top-level environment. When we evaluate a binding
construct such as a
let, we will create a new table, or
environment frame, which binds the local variables. This
frame will contain the name-value pairs bound locally, plus a pointer
to the next enclosing environment. The environment chain is thus
a linked list that acts like a stack, for the most part--new enviornment
frames are pushed on the front of the list when entering a binding
construct, and popped off the front of the list when exiting it.
We could implement this stack-like behavior with an explicit stack data structure in the interpreter, but it's easier to use the activation "stack" of the language we're using to implement the interpreter. (In this case, that happens to be Scheme, but if we were implementing the interpreter in C, we could use C's activation stack.) We'll just use recursion to evaluate subexpressions, and rely on the language we're implementing the interpreter in to remember where we were in interpreting the enclosing expressions.
At any given point during evaluation, the current environment
is the environment referred to by the interpreter's internal
envt, an in particular the most recent binding of
When we evaluate an expression that doesn't change the interpretive
environment, and call
eval recursively to evaluate subexpressions,
we simply pass the
envt variable's value to the recursive
calls. This will ensure that the subexpressions execute in the
same environment as the enclosing expression expression.
When we evaluate a binding construct, and evaluate subexpressions in
that environment, we create a new environment and pass that
to the recursive calls to
eval, so the subexpressions will
execute in the new enviornment instead.
Notice that we don't actually modify the environment chain when creating
a new environment--we simply create a new frame which holds a pointer
to the old environment, and pass it to the recursive
fact that we don't actually modify the structure of the environment
is important--it's will let us implement closure correctly.
When the interpreter returns from evaluating a subexpression, it
returns to an enclosing invocation of
eval; the old
environment will become visible again because we return to
eval where that environment is the value of the
For example, consider what happens when we interpret the following expression, starting at the top level:
(let ((foo 1)) (if (a) (let ((bar 2)) (if (b) (c) (d)) (e)) (f) (g))
[ We'll focus on the nested calls to eval corresponding to the nesting of let, if, let, if ]
If we look at the nested calls to
eval, we first see a call
that evaluates the whole expression in the top-level environment:
+-----+ eval expr: (let...) envt: | *--+--> [toplevel envt] +-----+
(I've given a textual representation of the
expr argument, but
a pictorial representation of the
envt argument to
eval will dispatch to
eval-let, passing it the same
eval-let will evaluate the initial value expression
1 in that environment, and create a new environment binding
(I'll ignore the recursive call to
eval to evaluate the argument.)
It will then call
eval recursively to evaluate the
body in that environment.
I'll depict the nested invocations of
top-to-bottom, showing the stack growing toward the bottom of the
picture. (This just turns out to be simpler than drawing the stack
+-----+ eval expr: (let...) envt: | *--+--> [toplevel envt] +-----+ /|\ /|\ | | +-----+ | | eval-let expr: (let...) envt: | *--+-------+ | +-----+ | | +-----+ | eval expr: (if...) envt: | *--+--> [ [foo 1] * ] +-----+
eval-if will evaluate the condition expression
in the given environment. We'll ignore that recursive call to
eval, but assume it returns a true value. In that case,
eval-if will evaluate its consequent, the inner
expression, by another recursive call to
At this point, the "stack" of invocations of
eval-if looks like this:
+-----+ eval expr: (let...) envt: | *--+--> [toplevel envt] +-----+ /|\ /|\ | | +-----+ | | eval-let expr: (let...) envt: | *--+-------+ | +-----+ | | +-----+ | eval expr: (if...) envt: | *--+---> [ [foo 1] * ] +-----+ /|\ | | +-----+ | eval-if expr: (if...) envt: | *--+-------+ +-----+ | | +-----+ | eval expr: (let...) envt: | *--+-------+ +-----+
let will evaluate the intial value expression,
by a recursive call to
eval, which we will ignore here. Then
it will bind
bar in a new environment frame, and call
recursively to evaluate the body in that environment. The body consists
of another if, so
eval-if will be called, and it will evaluate
its argument expression and either the consequent or the alternative
in that environment.
Assuming the condition returns true and it evaluates the consequent,
(c), here's the "stack" of invocations of
eval-if at the point where
+-----+ eval expr: (let...) envt: | *--+--> [toplevel envt] +-----+ /|\ /|\ | | +-----+ | | eval-let expr: (let...) envt: | *--+-------+ | +-----+ | | +-----+ | eval expr: (if...) envt: | *--+---> [ [foo 1] * ] +-----+ /|\ /|\ | | | | +-----+ | | eval-if expr: (if...) envt: | *--+-------+ | +-----+ | | | | +-----+ | | eval expr: (let...) envt: | *--+-------+ | +-----+ | | +-----+ | eval expr: (if...) envt: | *--+---> [ [bar 2] * ] +-----+ /|\ | +-----+ | eval expr: (c) envt: | *--+-------+ +-----+
[ Note that the pictures above all depict evaluation of nested non-tail
expressions. In the case of tail expressions, the "stack" will not
include as much information, because the state of the calls to
etc., will not be saved before the calls that evaluate subexpressions.
Our interpreter is written in good tail-recursive style, with tail calls to evaluate expressions that are tails of expressions in the language we're interpreting. This means that the intepreter is tail-recursive wherever the program it's implementing is tail-recursive, and since it's implemented in a tail-recursive language (Scheme), we preserve the tail-recurson of the program we're interpreting. In effect, we snarf tail-call optimization from the underlying Scheme system. If we were implementing our interpreter in C, we'd have to use special tricks to preserve tail recursion. We'll show how this can be done later, when we discuss our compiler. ]
In the interpreter in the last chapter, we implemented special forms
directly in the interpreter---
eval-list checked compound
expressions to see if they began with a special form name. In effect,
we hardcoded the meanings of special form names in the procedure
In our new interpreter, we'll use a cleaner approach, which treats special form definitions pretty much like variable definitions. This will let us put special forms in particular environments, and use the normal scoping mechanisms to look up the routines that compile them.
This has several advantages. The first is that it makes our interpreter more modular. We can create different environments with different special forms, and use the same interpreter to interpret different languages. That is, we separate out the basic operation of the interpreter from the particular special forms we decide on.
The second advantage is that it will allow us to build an elegant macro facility, so that new special forms can be defined in terms of old ones. (This will be described in detail in [ a later chapter ].)
[ this is out of place, but fwd ref idea anyway? Shorten? Or just move?]
A Scheme interpreter or compiler only needs to "understand"
procedure calling and a few basic special forms---
quote, and one very special
special form for defining new special forms (macros). (We can write
cond as a macro using
let as a macro using
letrec as a macro using
set!, and so on.)
The third advantage is that we can use the same scoping rules for special forms that we use for variables. This will be very convenient later, because we will be able to define local macros, in much the same way we define local procedures.
To support this, we need to represent bindings slightly differently. In the simple interpreter from the last chapter, each binding was just a name-value pair. Now we'll have a third part to each binding, telling what kind of binding it is--a variable binding, a special form binding, or a macro binding.
We can still use associations to represent the bindings. Where the
simpler interpreter representing each binding as an association of
(name value), the new one will use bindings of
(name type whatever
). In the case of a
normal variable binding, the "whatever" is the actual value of the
variable. In the case of a special form, the "whatever" is the
information the interpreter needs to interpret that particular special
form, including the procedure to evaluate it. For example, when binding
let, we can store a pointer to the procedure
right there in the binding information.
Since the exact representation of bindings is irrelevant, and we may
want to change it, we'll call the whole thing a
data structure. This reflects that fact that it may not hold just
a binding, but also any auxiliary information we want to store.
To abstract away from exactly how bindings are implemented, we'll define
several procedures that operate on
binding-info's. These include:
bdg-type, which returns a symbol saying what kind of binding it is:
<variable>for a normal variable,
<special-form>for a built-in special form binding, and
<syntax>for a syntax (macro) binding.
bdg-variable-ref, which returns the value of a normal variable binding.
bdg-special-form-evaluator, which returns an evaluation procedure for a special form binding.
For now we'll ignore
<syntax> bindings, which will be discussed
in a later chapter.
[ give actual code for accessors, etc? ]
Here's our new
eval-list for handling compound expressions:
(define (eval-list list-expr envt) ;; only try to consider it specially if the head is a symbol (if (symbol? (car list-expr)) ;; look it up in the current lexical environment (let ((binding-info (envt-lexical-lookup envt (car list-expr)))) ;; switch on the type of thing that it is (cond ((not binding-info) (error "Unbound symbol" (car list-expr))) (else (cond ;; special forms just call the special-form ;; evaluator, which is stored in the binding-info ;; object itself ((eq? (binding-type binding-info) '<special-form>) ((bdg-special-form-evaluator binding-info) list-expr envt)) ((eq? (binding-type binding-info) '<variable>) (eval-combo (bdg-variable-ref binding-info) (cdr list-expr) envt)) ((eq? (binding-type binding-info) '<syntax>) (eval-macro-call (bdg-syntax-transformer binding-info) list-expr envt)) (else (error "Unrecognized binding type")))))) ;; the head of the list is not a symbol, so evaluate it ;; and then do an eval-combo to evaluate the args and ;; call the procedure (eval-combo (eval (car list-expr) envt) (cdr list-expr) envt)))
eval-list first checks to see whether the head of the list
is a symbol; if not, it's just a combination (procedure call expression),
and is handled by
eval-combo. (Remember that a combination
can have an arbitrary expression as its operator, and that expression
is assumed to return a procedure to call.)
If it is a symbol, the binding of the variable is looked up. If it's a special form binding, the evaluation procedure is extracted from the binding info, and called to evaluate the expression.
If the head of the list is just the name of a normal variable, that's
also just a combination, and
eval-combo is called in that
If the head of the list is the name of a syntax binding (macro), we
eval-macro-call to deal with it; don't worry about
this for now--it will be discussed in detail in Chapter [ whatever ].
Notice that in all cases, the environment is passed along unchanged to whatever procedure handles the expression.