Read Structure and Interpretation of Computer Programs Online
Authors: Harold Abelson and Gerald Jay Sussman with Julie Sussman
The basic phenomenon here is that synchronizing different processes,
establishing shared state, or imposing an order on events requires
communication among the processes. In essence, any notion of time in
concurrency control must be intimately tied to communication.
51
It is
intriguing that a similar connection between time and communication
also arises in the
Theory of Relativity, where the speed of light (the
fastest signal that can be used to synchronize events) is a
fundamental constant relating time and space. The
complexities we encounter in dealing with time and state in our
computational models may in fact mirror a fundamental complexity of
the physical universe.
34
Most real processors actually execute a few
operations at a time, following a strategy called
pipelining
. Although this technique greatly improves the effective
utilization of the hardware, it is used only to speed up the execution
of a sequential instruction stream, while retaining the behavior of
the sequential program.
35
To quote some graffiti seen on a Cambridge
building wall: “Time is a device that was invented to keep everything
from happening at once.”
36
An even worse failure for this system
could occur if the two
set!
operations attempt to change the
balance simultaneously, in which case the actual data appearing in
memory might end up being a random combination of the information
being written by the two processes. Most computers have interlocks on
the primitive memory-write operations, which protect against such
simultaneous access. Even this seemingly simple kind of protection,
however, raises implementation challenges in the design of
multiprocessing computers, where elaborate
cache-coherence
protocols are required to ensure that the various processors will
maintain a consistent view of memory contents, despite the fact that
data may be replicated (“cached”) among the different processors to
increase the speed of memory access.
37
The factorial program in
section
3.1.3
illustrates this for a single
sequential process.
38
The columns show the contents of Peter's wallet,
the joint account (in Bank1), Paul's wallet, and Paul's private account
(in Bank2), before and after each withdrawal (W) and deposit (D).
Peter withdraws $10 from Bank1; Paul deposits $5 in Bank2,
then withdraws $25 from Bank1.
39
A more formal way to express this idea is to say that
concurrent programs are inherently
nondeterministic
. That
is, they are described not by single-valued functions, but by
functions whose results are sets of possible values. In
section
4.3
we will study a
language for expressing nondeterministic
computations.
40
Parallel-execute
is not part of standard Scheme, but
it can be implemented in MIT Scheme. In our implementation, the
new concurrent processes also run concurrently with the original
Scheme process. Also, in our implementation, the value returned
by
parallel-execute
is a special control object that can be used
to halt the newly created processes.
41
We have simplified
exchange
by exploiting the fact
that our
deposit
message accepts negative amounts. (This is a
serious bug in our banking system!)
42
If the account balances start out as $10,
$20, and $30, then after any number of concurrent exchanges, the
balances should still be $10, $20, and
$30 in some order. Serializing the deposits to individual accounts is not
sufficient to guarantee this. See exercise
3.43
.
43
Exercise
3.45
investigates why
deposits and withdrawals are no longer automatically serialized
by the account.
44
The term “mutex” is an abbreviation for
mutual
exclusion
. The general problem of arranging a mechanism that permits
concurrent processes to safely share resources is called the mutual
exclusion problem. Our mutex is a simple variant of the
semaphore
mechanism (see exercise
3.47
), which was
introduced in the
“THE” Multiprogramming System developed at the
Technological University of Eindhoven and named for the university's
initials in Dutch (Dijkstra 1968a). The acquire and
release operations were originally called
P and V, from the Dutch
words
passeren
(to pass) and
vrijgeven
(to release), in
reference to the semaphores used on railroad systems. Dijkstra's
classic exposition (1968b) was one of the first to clearly present the
issues of concurrency control, and showed how to use semaphores to
handle a variety of concurrency problems.
45
In most
time-shared operating systems, processes that are
blocked by a mutex do
not waste time “busy-waiting” as above. Instead, the system
schedules another process to run while the first is waiting, and the blocked
process is awakened when the mutex becomes available.
46
In MIT Scheme for a single processor, which uses a time-slicing
model,
test-and-set!
can be implemented as follows:
(define (test-and-set! cell)
(without-interrupts
(lambda ()
(if (car cell)
true
(begin (set-car! cell true)
false)))))
Without-interrupts
disables
time-slicing interrupts while its procedure argument is being executed.
47
There are many variants of such
instructions – including test-and-set, test-and-clear, swap,
compare-and-exchange, load-reserve, and store-conditional – whose
design must be carefully matched to the machine's processor-memory
interface.
One issue that arises here is to determine what happens
if two processes attempt to acquire the same resource
at exactly the same time by using such an instruction.
This requires some mechanism
for making a decision about which process gets control. Such a
mechanism is called an
arbiter
. Arbiters usually boil down to
some sort of hardware device.
Unfortunately, it is possible to prove that one cannot physically
construct a fair arbiter that works 100% of the time unless one
allows the arbiter an arbitrarily long time to make its decision. The
fundamental phenomenon here was originally observed by the fourteenth-century
French philosopher
Jean Buridan in his commentary on
Aristotle's
De caelo
. Buridan argued that a perfectly rational
dog placed between two equally attractive sources of food will starve
to death, because it is incapable of deciding which to go to first.
48
The general technique for avoiding deadlock by numbering the
shared resources and acquiring them in order is due to
Havender
(1968). Situations where deadlock cannot be avoided require
deadlock-recovery
methods, which entail having processes “back out”
of the deadlocked state and try again. Deadlock-recovery
mechanisms are widely used in database management systems, a topic that
is treated in detail in Gray and Reuter 1993.
49
One such alternative to serialization is called
barrier
synchronization
. The programmer permits concurrent processes to
execute as they please, but establishes certain synchronization points
(“barriers”) through which no process can proceed until all the
processes have reached the barrier. Modern processors provide machine
instructions that permit programmers to establish synchronization
points at places where consistency is required. The
PowerPC
T
M
, for example, includes for this purpose two instructions called
SYNC and
EIEIO (Enforced In-order Execution of Input/Output).
50
This may seem like a strange point of view, but there are
systems that work this way. International charges to credit-card
accounts, for example, are normally cleared on a per-country basis,
and the charges made in different countries are periodically
reconciled. Thus the account balance may be different in
different countries.
51
For distributed
systems, this perspective was pursued by
Lamport (1978), who showed how
to use communication to establish “global clocks” that can be used
to establish orderings on events in distributed systems.
We've gained a good understanding of assignment as a tool in modeling,
as well as an appreciation of the complex problems that assignment
raises. It is time to ask whether we could have gone about things in a
different way, so as to avoid some of these problems. In this
section, we explore an alternative approach to modeling state, based
on data structures called
streams
. As we shall see, streams can
mitigate some of the complexity of modeling state.
Let's step back and review where this complexity comes from. In an
attempt to model real-world phenomena, we made some apparently
reasonable decisions: We modeled real-world objects with local state
by computational objects with local variables. We identified time
variation in the real world with time variation in the computer. We
implemented the time variation of the states of the model objects in
the computer with assignments to the local variables of the model
objects.
Is there another approach? Can we avoid identifying time in the
computer with time in the modeled world? Must we make the model
change with time in order to model phenomena in a changing world?
Think about the issue in terms of mathematical functions. We can
describe the time-varying behavior of a quantity
x
as a function of
time
x
(
t
). If we concentrate on
x
instant by instant, we think of
it as a changing quantity. Yet if we concentrate on the entire
time history of values, we do not emphasize change – the function
itself does not change.
52
If time is measured in discrete steps, then we can model a time function as
a (possibly infinite) sequence. In this section, we will see how to
model change in terms of sequences that represent the time histories
of the systems being modeled. To accomplish this, we introduce new
data structures called
streams
. From an abstract point of view,
a stream is simply a sequence. However, we will find that the
straightforward implementation of streams as lists (as in
section
2.2.1
) doesn't fully reveal
the power of stream processing. As an alternative, we introduce the
technique of
delayed evaluation
, which enables us to represent
very large (even infinite) sequences as streams.
Stream processing lets us model systems that have state without ever
using assignment or mutable data. This has important implications,
both theoretical and practical, because we can build models that avoid
the drawbacks inherent in introducing assignment. On the other hand,
the stream framework raises difficulties of its own, and the question
of which modeling technique leads to more modular and more easily
maintained systems remains open.
As we saw in section
2.2.3
,
sequences can serve as standard interfaces for combining program
modules. We formulated powerful abstractions for manipulating
sequences, such as
map
,
filter
, and
accumulate
, that
capture a wide variety of operations in a manner that is both succinct
and elegant.
Unfortunately, if we represent sequences as lists, this elegance is
bought at the price of severe inefficiency with respect to both the
time and space required by our computations.
When we represent manipulations on sequences as transformations
of lists, our programs must construct and copy data structures (which
may be huge) at every step of a process.
To see why this is true, let us compare two programs for computing the
sum of all the prime numbers in an interval. The first program is
written in standard iterative style:
53
(define (sum-primes a b)
(define (iter count accum)
(cond ((> count b) accum)
((prime? count) (iter (+ count 1) (+ count accum)))
(else (iter (+ count 1) accum))))
(iter a 0))
The second program performs the same computation using the sequence
operations of section
2.2.3
:
(define (sum-primes a b)
(accumulate +
0
(filter prime? (enumerate-interval a b))))
In carrying out the computation, the first program needs to store only
the sum being accumulated. In contrast, the filter in the second
program cannot do any testing until
enumerate-interval
has
constructed a complete list of the numbers in the interval. The
filter generates another list, which in turn is passed to
accumulate
before being collapsed to form a sum. Such large
intermediate storage is not needed by the first program, which we can
think of as enumerating the interval incrementally, adding each prime
to the sum as it is generated.
The inefficiency in using lists becomes painfully apparent if we use
the sequence paradigm to compute the second prime in the interval from
10,000 to 1,000,000 by evaluating the expression