Exercise 5.36.
What order of evaluation does our compiler produce for operands of a
combination? Is it left-to-right, right-to-left, or some other order?
Where in the compiler is this order determined? Modify the compiler
so that it produces some other order of evaluation. (See the
discussion of order of evaluation for the explicit-control evaluator
in section 5.4.1.) How does changing the order of
operand evaluation affect the efficiency of the code that constructs
the argument list?
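(For orientation: the order is determined in construct-arglist of
section 5.5.3, which reverses the list of operand codes so that the
argument list can be built by consing; operands are therefore
evaluated right-to-left. Below is a hedged sketch of a left-to-right
variant (not the book's code) that reuses the adjoin-arg operation of
section 5.4.1 as a machine operation. Because each new value must be
appended to the end of argl at run time, the constructed code does
more list manipulation than the right-to-left version.)

;; A left-to-right construct-arglist: evaluate operands in order and
;; adjoin each value to the end of argl.  Assumes the instruction-
;; sequence machinery of section 5.5.1 and an (op adjoin-arg) machine
;; operation (the list-appending procedure of section 5.4.1).
(define (construct-arglist-ltr operand-codes)
  (if (null? operand-codes)
      (make-instruction-sequence '() '(argl)
       '((assign argl (const ()))))
      (let ((code-to-get-first-arg
             (append-instruction-sequences
              (car operand-codes)
              (make-instruction-sequence '(val) '(argl)
               '((assign argl (op list) (reg val)))))))
        (if (null? (cdr operand-codes))
            code-to-get-first-arg
            (preserving '(env)
             code-to-get-first-arg
             (code-to-adjoin-rest-args (cdr operand-codes)))))))

(define (code-to-adjoin-rest-args operand-codes)
  (let ((code-for-next-arg
         (preserving '(argl)
          (car operand-codes)
          (make-instruction-sequence '(val argl) '(argl)
           '((assign argl (op adjoin-arg) (reg val) (reg argl)))))))
    (if (null? (cdr operand-codes))
        code-for-next-arg
        (preserving '(env)
         code-for-next-arg
         (code-to-adjoin-rest-args (cdr operand-codes))))))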
Exercise 5.37.
One way to understand the compiler's preserving mechanism for
optimizing stack usage is to see what extra operations would be
generated if we did not use this idea. Modify preserving so that it
always generates the save and restore operations. Compile some simple
expressions and identify the unnecessary stack operations that are
generated. Compare the code to that generated with the preserving
mechanism intact.
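For comparison, one way to defeat the optimization (a sketch; the
original preserving appears in section 5.5.1) is simply to drop the
needs-register?/modifies-register? test, so that every register in
regs is unconditionally saved and restored around seq1:

;; preserving, modified for exercise 5.37: always wrap seq1 in
;; save/restore for each register, whether or not seq2 needs the
;; register or seq1 modifies it.  The helpers (list-union,
;; list-difference, registers-needed, registers-modified, statements)
;; are those of section 5.5.1.
(define (preserving regs seq1 seq2)
  (if (null? regs)
      (append-instruction-sequences seq1 seq2)
      (let ((first-reg (car regs)))
        (preserving (cdr regs)
         (make-instruction-sequence
          (list-union (list first-reg) (registers-needed seq1))
          (list-difference (registers-modified seq1) (list first-reg))
          (append `((save ,first-reg))
                  (statements seq1)
                  `((restore ,first-reg))))
         seq2))))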
Exercise 5.38.
Our compiler is clever about avoiding unnecessary stack operations,
but it is not clever at all when it comes to compiling calls to the
primitive procedures of the language in terms of the primitive
operations supplied by the machine. For example, consider how much
code is compiled to compute (+ a 1): The code sets up an argument
list in argl, puts the primitive addition procedure (which it finds
by looking up the symbol + in the environment) into proc, and tests
whether the procedure is primitive or compound. The compiler always
generates code to perform the test, as well as code for primitive and
compound branches (only one of which will be executed). We have not
shown the part of the controller that implements primitives, but we
presume that these instructions make use of primitive arithmetic
operations in the machine's data paths. Consider how much less code
would be generated if the compiler could open-code primitives – that
is, if it could generate code to directly use these primitive machine
operations. The expression (+ a 1) might be compiled into something
as simple as [43]
(assign val (op lookup-variable-value) (const a) (reg env))
(assign val (op +) (reg val) (const 1))
In this exercise we will extend our compiler to support open coding
of selected primitives. Special-purpose code will be generated for
calls to these primitive procedures instead of the general
procedure-application code. In order to support this, we will augment
our machine with special argument registers arg1 and arg2. The
primitive arithmetic operations of the machine will take their inputs
from arg1 and arg2. The results may be put into val, arg1, or arg2.

The compiler must be able to recognize the application of an
open-coded primitive in the source program. We will augment the
dispatch in the compile procedure to recognize the names of these
primitives in addition to the reserved words (the special forms) it
currently recognizes. [44] For each special form our compiler has a
code generator. In this exercise we will construct a family of code
generators for the open-coded primitives.
a. The open-coded primitives, unlike the special forms, all need
their operands evaluated. Write a code generator spread-arguments for
use by all the open-coding code generators. Spread-arguments should
take an operand list and compile the given operands targeted to
successive argument registers. Note that an operand may contain a
call to an open-coded primitive, so argument registers will have to
be preserved during operand evaluation. (One possible approach is
sketched following part d below.)
b. For each of the primitive procedures =, *, -, and +, write a code
generator that takes a combination with that operator, together with
a target and a linkage descriptor, and produces code to spread the
arguments into the registers and then perform the operation targeted
to the given target with the given linkage. You need only handle
expressions with two operands. Make compile dispatch to these code
generators.
c. Try your new compiler on the factorial example. Compare the
resulting code with the result produced without open coding.
d. Extend your code generators for + and * so that they can handle
expressions with arbitrary numbers of operands. An expression with
more than two operands will have to be compiled into a sequence of
operations, each with only two inputs.
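The following is one hedged sketch of parts a, b, and d, assuming the
instruction-sequence machinery of section 5.5 (compile, preserving,
end-with-linkage, make-instruction-sequence) and the
operator/operands selectors of section 4.1.2; the names
compile-open-code-+ and expand-open-code are ours, not the book's.

;; Part a: compile two operands targeted to arg1 and arg2.  arg1 must
;; be preserved around the second operand's code, since that operand
;; may itself be an open-coded call that clobbers arg1; env is
;; preserved in case the first operand's code modifies it.
(define (spread-arguments operands)
  (let ((code1 (compile (car operands) 'arg1 'next))
        (code2 (compile (cadr operands) 'arg2 'next)))
    (preserving '(env)
     code1
     (preserving '(arg1)
      code2
      ;; an empty sequence that "needs" both registers, so that
      ;; preserving will actually save and restore arg1
      (make-instruction-sequence '(arg1 arg2) '() '())))))

;; Part b: a generator for two-operand +.  The generators for -, *,
;; and = differ only in the operation name.
(define (compile-open-code-+ exp target linkage)
  (end-with-linkage linkage
   (append-instruction-sequences
    (spread-arguments (operands exp))
    (make-instruction-sequence '(arg1 arg2) (list target)
     `((assign ,target (op +) (reg arg1) (reg arg2)))))))

;; Part d: reduce a call with more than two operands to nested
;; two-operand calls at the source level, e.g. (+ a b c) is rewritten
;; as (+ (+ a b) c) before the two-operand generator is applied.
(define (expand-open-code exp)
  (let ((op (operator exp))
        (args (operands exp)))
    (if (<= (length args) 2)
        exp
        (expand-open-code
         (cons op (cons (list op (car args) (cadr args))
                        (cddr args)))))))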
One of the most common optimizations performed by compilers is the
optimization of variable lookup. Our compiler, as we have implemented
it so far, generates code that uses the lookup-variable-value
operation of the evaluator machine. This searches for a variable by
comparing it with each variable that is currently bound, working
frame by frame outward through the run-time environment. This search
can be expensive if the frames are deeply nested or if there are many
variables. For example, consider the problem of looking up the value
of x while evaluating the expression (* x y z) in an application of
the procedure that is returned by
(let ((x 3) (y 4))
  (lambda (a b c d e)
    (let ((y (* a b x))
          (z (+ c d x)))
      (* x y z))))
Since a let expression is just syntactic sugar for a lambda
combination, this expression is equivalent to
((lambda (x y)
   (lambda (a b c d e)
     ((lambda (y z) (* x y z))
      (* a b x)
      (+ c d x))))
 3
 4)
Each time lookup-variable-value searches for x, it must determine
that the symbol x is not eq? to y or z (in the first frame), nor to
a, b, c, d, or e (in the second frame). We will assume, for the
moment, that our programs do not use define – that variables are
bound only with lambda. Because our language is lexically scoped, the
run-time environment for any expression will have a structure that
parallels the lexical structure of the program in which the
expression appears. [45] Thus, the compiler can know, when it
analyzes the above expression, that each time the procedure is
applied the variable x in (* x y z) will be found two frames out from
the current frame and will be the first variable in that frame.
We can exploit this fact by inventing a new kind of variable-lookup
operation, lexical-address-lookup, that takes as arguments an
environment and a lexical address that consists of two numbers: a
frame number, which specifies how many frames to pass over, and a
displacement number, which specifies how many variables to pass over
in that frame. Lexical-address-lookup will produce the value of the
variable stored at that lexical address relative to the current
environment. If we add the lexical-address-lookup operation to our
machine, we can make the compiler generate code that references
variables using this operation, rather than lookup-variable-value.
Similarly, our compiled code can use a new lexical-address-set!
operation instead of set-variable-value!.
In order to generate such code, the compiler must be able to
determine the lexical address of a variable it is about to compile a
reference to. The lexical address of a variable in a program depends
on where one is in the code. For example, in the following program,
the address of x in expression <e1> is (2,0) – two frames back and
the first variable in the frame. At that point y is at address (0,0)
and c is at address (1,2). In expression <e2>, x is at (1,0), y is at
(1,1), and c is at (0,2).
((lambda (x y)
   (lambda (a b c d e)
     ((lambda (y z) <e1>)
      <e2>
      (+ c d x))))
 3
 4)
One way for the compiler to produce code that uses lexical addressing
is to maintain a data structure called a compile-time environment.
This keeps track of which variables will be at which positions in
which frames in the run-time environment when a particular
variable-access operation is executed. The compile-time environment
is a list of frames, each containing a list of variables. (There will
of course be no values bound to the variables, since values are not
computed at compile time.) The compile-time environment becomes an
additional argument to compile and is passed along to each code
generator. The top-level call to compile uses an empty compile-time
environment. When a lambda body is compiled, compile-lambda-body
extends the compile-time environment by a frame containing the
procedure's parameters, so that the sequence making up the body is
compiled with that extended environment. At each point in the
compilation, compile-variable and compile-assignment use the
compile-time environment in order to generate the appropriate lexical
addresses.
Exercises 5.39 through 5.43 describe how to complete this sketch of
the lexical-addressing strategy in order to incorporate lexical
lookup into the compiler. Exercise 5.44 describes another use for the
compile-time environment.
Exercise 5.39.
Write a procedure lexical-address-lookup that implements the new
lookup operation. It should take two arguments – a lexical address
and a run-time environment – and return the value of the variable
stored at the specified lexical address. Lexical-address-lookup
should signal an error if the value of the variable is the symbol
*unassigned*. [46] Also write a procedure lexical-address-set! that
implements the operation that changes the value of the variable at a
specified lexical address.
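A minimal sketch of both procedures, assuming a lexical address is
represented as a two-element list (frame-number displacement-number)
and that environments and frames are represented as in section 4.1.3
(so frame-values selects a frame's list of values):

(define (lexical-address-lookup address env)
  (let* ((frame (list-ref env (car address)))   ; skip frame-number frames
         (value (list-ref (frame-values frame)  ; skip displacement values
                          (cadr address))))
    (if (eq? value '*unassigned*)
        (error "Unassigned variable at" address)
        value)))

;; The mutator walks to the same slot and overwrites it in place.
(define (lexical-address-set! address value env)
  (let ((frame (list-ref env (car address))))
    (set-car! (list-tail (frame-values frame) (cadr address))
              value)))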
Exercise 5.40.
Modify the compiler to maintain the compile-time environment as
described above. That is, add a compile-time-environment argument to
compile and the various code generators, and extend it in
compile-lambda-body.
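In sketch form, only the last line of compile-lambda-body (section
5.5.2) changes; here cenv is the compile-time-environment argument
threaded through compile and the code generators, and compile-sequence
is assumed to accept and pass it along in the same way:

(define (compile-lambda-body exp proc-entry cenv)
  (let ((formals (lambda-parameters exp)))
    (append-instruction-sequences
     (make-instruction-sequence '(env proc argl) '(env)
      `(,proc-entry
        (assign env (op compiled-procedure-env) (reg proc))
        (assign env (op extend-environment)
                (const ,formals) (reg argl) (reg env))))
     ;; extend the compile-time environment with a frame of formals
     (compile-sequence (lambda-body exp) 'val 'return
                       (cons formals cenv)))))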
Exercise 5.41.
Write a procedure find-variable that takes as arguments a variable
and a compile-time environment and returns the lexical address of the
variable with respect to that environment. For example, in the
program fragment that is shown above, the compile-time environment
during the compilation of expression <e1> is
((y z) (a b c d e) (x y)). Find-variable should produce

(find-variable 'c '((y z) (a b c d e) (x y)))
(1 2)

(find-variable 'x '((y z) (a b c d e) (x y)))
(2 0)

(find-variable 'w '((y z) (a b c d e) (x y)))
not-found
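One possible implementation scans the compile-time environment frame
by frame, counting the frame number and the displacement within each
frame:

(define (find-variable var cenv)
  ;; position of var within one frame, or #f if absent
  (define (scan-frame vars disp)
    (cond ((null? vars) #f)
          ((eq? var (car vars)) disp)
          (else (scan-frame (cdr vars) (+ disp 1)))))
  ;; walk outward through the frames, counting how many are passed
  (define (scan-env frames fnum)
    (if (null? frames)
        'not-found
        (let ((disp (scan-frame (car frames) 0)))
          (if disp
              (list fnum disp)
              (scan-env (cdr frames) (+ fnum 1))))))
  (scan-env cenv 0))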
Exercise 5.42.
Using find-variable from exercise 5.41, rewrite compile-variable and
compile-assignment to output lexical-address instructions. In cases
where find-variable returns not-found (that is, where the variable is
not in the compile-time environment), you should have the code
generators use the evaluator operations, as before, to search for the
binding. (The only place a variable that is not found at compile time
can be is in the global environment, which is part of the run-time
environment but is not part of the compile-time environment. [47]
Thus, if you wish, you may have the evaluator operations look
directly in the global environment, which can be obtained with the
operation (op get-global-environment), instead of having them search
the whole run-time environment found in env.) Test the modified
compiler on a few simple cases, such as the nested lambda combination
at the beginning of this section.
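A hedged sketch of the rewritten compile-variable (compile-assignment
is analogous, using lexical-address-set! on the value computed into
val); the not-found branch fetches the global environment into the
target register and searches only there, as suggested above:

(define (compile-variable exp target linkage cenv)
  (let ((address (find-variable exp cenv)))
    (end-with-linkage linkage
     (if (eq? address 'not-found)
         ;; not in the compile-time environment: it can only be
         ;; global, so look it up directly in the global environment
         (make-instruction-sequence '() (list target)
          `((assign ,target (op get-global-environment))
            (assign ,target (op lookup-variable-value)
                    (const ,exp) (reg ,target))))
         ;; otherwise fetch it by lexical address at run time
         (make-instruction-sequence '(env) (list target)
          `((assign ,target (op lexical-address-lookup)
                    (const ,address) (reg env))))))))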
Exercise 5.43.
We argued in section 4.1.6 that internal definitions for block
structure should not be considered “real” defines. Rather, a
procedure body should be interpreted as if the internal variables
being defined were installed as ordinary lambda variables initialized
to their correct values using set!. Section 4.1.6 and exercise 4.16
showed how to modify the metacircular interpreter to accomplish this
by scanning out internal definitions. Modify the compiler to perform
the same transformation before it compiles a procedure body.
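One hedged sketch, following exercise 4.16: apply a scan-out-defines
transformation to the body before compiling it (for instance, at the
start of compile-lambda-body). The version below assumes the
definition selectors of section 4.1.2 and a standard filter
procedure; note that our compiler has no code generator for let, so
you must either add one or have the transformation produce the
equivalent lambda application instead.

;; Rewrite a body containing internal defines into a single let that
;; binds the defined names to '*unassigned* and initializes them with
;; set! before the remaining body expressions.
(define (scan-out-defines body)
  (let ((defines (filter definition? body))
        (rest    (filter (lambda (e) (not (definition? e))) body)))
    (if (null? defines)
        body
        (list
         (cons 'let
               (cons (map (lambda (d)
                            (list (definition-variable d)
                                  ''*unassigned*))
                          defines)
                     (append
                      (map (lambda (d)
                             (list 'set!
                                   (definition-variable d)
                                   (definition-value d)))
                           defines)
                      rest)))))))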