Structure and Interpretation of Computer Programs (99 page)

Read Structure and Interpretation of Computer Programs Online

Authors: Harold Abelson and Gerald Jay Sussman with Julie Sussman

BOOK: Structure and Interpretation of Computer Programs
13.46Mb size Format: txt, pdf, ePub

gc-loop
  (test (op =) (reg scan) (reg free))
  (branch (label gc-flip))
  (assign old (op vector-ref) (reg new-cars) (reg scan))
  (assign relocate-continue (label update-car))
  (goto (label relocate-old-result-in-new))

At
update-car
, we modify the
car
pointer of the pair being
scanned, then proceed to relocate the
cdr
of the pair. We
return to
update-cdr
when that relocation has been accomplished.
After relocating and updating the
cdr
, we are finished scanning
that pair, so we continue with the main loop.

update-car
  (perform
   (op vector-set!) (reg new-cars) (reg scan) (reg new))
  (assign old (op vector-ref) (reg new-cdrs) (reg scan))
  (assign relocate-continue (label update-cdr))
  (goto (label relocate-old-result-in-new))
update-cdr
  (perform
   (op vector-set!) (reg new-cdrs) (reg scan) (reg new))
  (assign scan (op +) (reg scan) (const 1))
  (goto (label gc-loop))

The subroutine
relocate-old-result-in-new
relocates objects as
follows: If the object to be relocated (pointed at by
old
) is
not a pair, then we return the same pointer to the object unchanged
(in
new
). (For example, we may be scanning a pair whose
car
is the number 4. If we represent the
car
by
n4
, as
described in section 
5.3.1
, then we want the
“relocated”
car
pointer to still be
n4
.) Otherwise, we
must perform the relocation. If the
car
position of the pair to
be relocated contains a broken-heart tag, then the pair has in fact
already been moved, so we retrieve the forwarding address (from the
cdr
position of the broken heart) and return this in
new
.
If the pointer in
old
points at a yet-unmoved pair, then we move
the pair to the first free cell in new memory (pointed at by
free
) and set up the broken heart by storing a broken-heart tag and
forwarding address at the old location.
Relocate-old-result-in-new
uses a register
oldcr
to hold the
car
or the
cdr
of the object pointed at by
old
.
18

relocate-old-result-in-new
  (test (op pointer-to-pair?) (reg old))
  (branch (label pair))
  (assign new (reg old))
  (goto (reg relocate-continue))
pair
  (assign oldcr (op vector-ref) (reg the-cars) (reg old))
  (test (op broken-heart?) (reg oldcr))
  (branch (label already-moved))
  (assign new (reg free)) 
; new location for pair
  
;; Update 
free
 pointer.
  (assign free (op +) (reg free) (const 1))
  
;; Copy the 
car
 and 
cdr
 to new memory.
  (perform (op vector-set!)
           (reg new-cars) (reg new) (reg oldcr))
  (assign oldcr (op vector-ref) (reg the-cdrs) (reg old))
  (perform (op vector-set!)
           (reg new-cdrs) (reg new) (reg oldcr))
  
;; Construct the broken heart.
  (perform (op vector-set!)
           (reg the-cars) (reg old) (const broken-heart))
  (perform
   (op vector-set!) (reg the-cdrs) (reg old) (reg new))
  (goto (reg relocate-continue))
already-moved
  (assign new (op vector-ref) (reg the-cdrs) (reg old))
  (goto (reg relocate-continue))

At the very end of the garbage-collection process, we interchange the
role of old and new memories by interchanging pointers: interchanging
the-cars
with
new-cars
, and
the-cdrs
with
new-cdrs
. We will then be ready to perform another garbage
collection the next time memory runs out.

gc-flip
  (assign temp (reg the-cdrs))
  (assign the-cdrs (reg new-cdrs))
  (assign new-cdrs (reg temp))
  (assign temp (reg the-cars))
  (assign the-cars (reg new-cars))
  (assign new-cars (reg temp))

5
We could represent memory as lists of items.
However, the access time would then not be independent of the index,
since accessing the
n
th element of a list requires
n
- 1
cdr
operations.

6
For completeness, we should specify a
make-vector
operation that constructs vectors. However, in the present
application we will use vectors only to model fixed divisions of the
computer memory.

7
This is
precisely the same
“tagged data” idea we introduced in chapter 2 for
dealing with generic operations. Here, however, the data types are
included at the primitive machine level rather than constructed
through the use of lists.

8
Type information may be encoded in a variety of
ways, depending on the details of the machine on which the Lisp
system is to be implemented. The execution efficiency of Lisp
programs will be strongly dependent on how cleverly this choice is
made, but it is difficult to formulate general design rules for good
choices. The most straightforward way to implement typed pointers is
to allocate a fixed set of bits in each pointer to be a
type
field
that encodes the data type. Important questions to be
addressed in designing such a representation include the following:
How many type bits are required? How large must the vector indices
be? How efficiently can the primitive machine instructions be used to
manipulate the type fields of pointers? Machines that include special
hardware for the efficient handling of type fields are said to have
tagged architectures
.

9
This decision on the
representation of numbers determines whether
eq?
, which tests
equality of pointers, can be used to test for equality of numbers. If
the pointer contains the number itself, then equal numbers will have
the same pointer. But if the pointer contains the index of a location
where the number is stored, equal numbers will be guaranteed to have
equal pointers only if we are careful never to store the same number
in more than one location.

10
This is just like writing a number as a sequence of
digits, except that each “digit” is a number between 0 and the
largest number that can be stored in a single pointer.

11
There are other ways
of finding free storage. For example, we could link together all the
unused pairs into a
free list
. Our free locations are
consecutive (and hence can be accessed by incrementing a pointer)
because we are using a compacting garbage collector, as we will see in
section 
5.3.2
.

12
This is essentially the implementation of
cons
in terms of
set-car!
and
set-cdr!
, as described in
section 
3.3.1
. The operation
get-new-pair
used in that implementation is realized here by the
free
pointer.

13
This may not be true eventually,
because memories may get large enough so that it would be impossible
to run out of free memory in the lifetime of the computer. For
example, there are about 3× 10
13
, microseconds in a year, so
if we were to
cons
once per microsecond we would need about
10
15
cells of memory to build a machine that could operate for 30
years without running out of memory. That much memory seems absurdly
large by today's standards, but it is not physically impossible. On
the other hand, processors are getting faster and a future computer
may have large numbers of processors operating in parallel on a single
memory, so it may be possible to use up memory much faster than we
have postulated.

14
We
assume here that the stack is represented as a list as described in
section 
5.3.1
, so that items on the stack are
accessible via the pointer in the stack register.

15
This idea was invented and first implemented
by Minsky, as part of the implementation of
Lisp for the PDP-1 at the
MIT Research Laboratory of Electronics. It was further developed by
Fenichel and Yochelson (1969) for use in the Lisp implementation for
the Multics time-sharing system. Later,
Baker (1978) developed a
“real-time” version of the method, which does not require the
computation to stop during garbage collection. Baker's idea was
extended by
Hewitt, Lieberman, and Moon (see Lieberman and Hewitt
1983) to take advantage of the fact that some structure is more volatile
and other structure is more permanent.

An alternative commonly used garbage-collection technique is the
mark-sweep
method. This consists of tracing all the structure
accessible from the machine registers and marking each pair we reach.
We then scan all of memory, and any location that is unmarked is
“swept up” as garbage and made available for reuse. A full
discussion of the mark-sweep method can be found in Allen 1978.

The Minsky-Fenichel-Yochelson algorithm is the dominant algorithm in
use for large-memory systems because it examines only the useful part
of memory. This is in contrast to mark-sweep, in which the sweep
phase must check all of memory. A second advantage of stop-and-copy
is that it is a
compacting
garbage collector. That is, at the
end of the garbage-collection phase the useful data will have been
moved to consecutive memory locations, with all garbage pairs
compressed out. This can be an extremely important performance
consideration in machines with virtual memory, in which accesses to
widely separated memory addresses may require extra paging
operations.

16
This list of registers does not include
the registers used by the storage-allocation system –
root
,
the-cars
,
the-cdrs
, and the other registers that will be
introduced in this section.

17
The term
broken heart
was coined by David Cressey, who wrote a garbage
collector for
MDL, a dialect of Lisp developed at MIT during the early
1970s.

18
The garbage collector uses the low-level predicate
pointer-to-pair?
instead of the list-structure
pair?
operation because in a real system there might be various things
that are treated as pairs for garbage-collection purposes.
For example, in a Scheme system that conforms to the IEEE standard
a procedure object may be implemented as a special kind of “pair”
that doesn't satisfy the
pair?
predicate.
For simulation purposes,
pointer-to-pair?
can be implemented as
pair?
.

5.4  The Explicit-Control Evaluator

In section 
5.1
we saw how to
transform simple Scheme programs into descriptions of register
machines. We will now perform this transformation on a more complex
program, the metacircular evaluator of
sections
4.1.1
-
4.1.4
,
which shows how
the behavior of a Scheme interpreter can be described in terms of the
procedures
eval
and
apply
.
The
explicit-control
evaluator
that we develop in this section shows how the underlying
procedure-calling and argument-passing mechanisms used in the
evaluation process can be described in terms of operations on
registers and stacks. In addition, the explicit-control evaluator can
serve as an implementation of a Scheme interpreter, written in a
language that is very similar to the native machine language of
conventional computers. The evaluator can be executed by the
register-machine simulator of section 
5.2
.
Alternatively, it can be used as a starting point for building a
machine-language implementation of a Scheme evaluator, or even a
special-purpose machine for evaluating Scheme expressions.
Figure 
5.16
shows such a hardware implementation: a
silicon chip that acts as an evaluator for Scheme. The chip designers
started with the data-path and controller specifications for a
register machine similar to the evaluator described in this section
and used design automation programs to construct the
integrated-circuit layout.
19

Registers and operations

In designing the explicit-control evaluator, we must specify the
operations to be used in our register machine. We described the
metacircular evaluator in terms of abstract syntax, using procedures
such as
quoted?
and
make-procedure
. In implementing the
register machine, we could expand these procedures into sequences of
elementary list-structure memory operations, and implement these
operations on our register machine. However, this would make our
evaluator very long, obscuring the basic structure with
details. To clarify the presentation, we will include as primitive
operations of the register machine the syntax procedures given in
section 
4.1.2
and the procedures for
representing environments and other run-time data given in
sections 
4.1.3
and 
4.1.4
.
In order to completely specify an evaluator that could be programmed
in a low-level machine language or implemented in hardware, we would
replace these operations by more elementary operations, using the
list-structure implementation we described in
section 
5.3
.

Figure 5.16:
  A silicon-chip implementation of an evaluator for
Scheme.

Other books

The Gallipoli Letter by Keith Murdoch
The Journey Begun by Judisch, Bruce
Black Water Creek by Brumm, Robert
The Island by Minkman, Jen
Amy Chelsea Stacie Dee by Mary G. Thompson
Broken Storm Part One by May C. West
Shadow Man by Grant, Cynthia D.
Velocity by Cassandra Carr
El club erótico de los martes by Lisa Beth Kovetz