It Began with Babbage (43 page)

Read It Began with Babbage Online

Authors: Subrata Dasgupta

BOOK: It Began with Babbage
6.86Mb size Format: txt, pdf, ePub
IX

The FORTRAN language as it was originally conceived in 1954 to 1956 (Backus called it FORTRAN
I
48
) would evolve in its features and capabilities over several versions: FORTRAN II, III, and IV; it spawned other “dialects.” It became, among other things, a machine-independent or high-level language. To speak later of FORTRAN was to speak of a language genus, one might say. However, the fundamental and unmistakable
style
of this genus was established with the original ancestor. We get a sense of this style in the little and trivial program written in the “original” FORTRAN shown in
Table 13.1
.

This program reads an array of 20 integers into an array data structure
A
. It then initializes
X, Y, Z
, and
W
to zero, then scans in sequence the integers in
A
and keeps count (in
X
) of the number of integers that are between –50 and + 50 and tallies the total of these numbers in
Y
. It also keeps count of the number of integers that fall outside the range in a separate counter (
Z
) and also adds the total of these numbers in
W
. When all 20 numbers in
A
have been accounted for, it prints out the values of the counters and the corresponding totals and stops.

The formulaic nature of most of these
statements
—by 1956, this word was used to refer to each of the basic meaningful units of action, although they were originally called formulas,
49
hence, no doubt, the name of the language—is quite apparent. Yet, algebraic or formulaic though they seemed, they were not mathematical expressions but descriptions of actions to be performed by a computer.
X
=
X
+ 1 is not an algebraic expression in FORTRAN (indeed, mathematically or logically it would not make sense); rather, it signifies an action. The = is not the mathematical equality symbol; it denotes the assignment operation, synonymous with Zuse's => in Plankalkül (see Section III, this chapter). The statement means: Read the value of the variable
X
, increment by one, and write the resulting value into
X
.

TABLE
13.1 A FORTRAN Program

 

DIMENSION A(20)

 

READ A

 

X=0

 

Y=0

 

Z=0

 

W=0

 

DO 2 I=1,20

 

IF ABS(A(I)) > 50 GO TO 1

 

X=X+1

 

Y=Y+A(I)

 

GO TO 2

1

Z=Z+1

 

W=W+A(I)

2

CONTINUE

 

PRINT X,Y

 

PRINT Z,W

 

STOP

Other statements in
Table 13.1
, such as the
DO
or
IF
would seem, however, less transparent to the neophyte FORTRAN user; the former denotes an iteration, the latter a conditional branch. The purpose of the FORTRAN
Programmer's Reference Manual
was to explain
all
aspects of the language for the user's benefit. Such an explanation took the form of first specifying the legal form of each statement type and the rules governing this form, and an explanation of what the statement meant.

Form and meaning. In the realm of a natural language like English, linguists usually call form the
syntax
of the language, and the meaning its
semantics
. The FORTRAN
Programmer's Manual
of 1956 marked the emergence of a new consciousness among language designers about the linguistic (rather than merely the notational) nature of programming languages. The terms
syntax
and
semantics
(as we will see) would come to be used very soon thereafter, but we find a presence of this consciousness in the writing of the
Programmer's Manual
.

The text describing a programming language must describe all and only what the language is about. For the user of that language, the text is all. But, unlike the postmodernist's “text,” which can be “read” in as many different ways as there are readers, and the writer's intention is not relevant, the text describing a programming language must
provide only one “reading”—there should be only one way of making meaning, one way of interpreting the language.

The FORTRAN group took this obligation seriously. In the
Programmer's Manual
each type of statement is described in terms of its form and meaning, syntax and semantics.

“DO n i = m
1
, m
2
” or “DO n i = m
1
, m
2
, m
3
” where n is a statement number, I is a … variable, and m
1
, m
2
, m
3
are each either an unsigned … constant or a nonsubscripted … variable. If m
3
is not stated it is taken to be 1.
50

Here, then, was the “general form” of the
DO
statement. Alongside were given examples: DO 30 I = 1, 10, DO 30 I = 1, M, 3. Then followed the meaning of the
DO
statement:

The DO statement is a command to “DO the statements which follow, to and including the statement with statement number n, repeatedly, the first time with i = m
1
, and with I increased by m
3
for each succeeding time; after they have been done with i equal to the highest of this sequence of values which does not exceed m
2
let control reach the statement following the statement with statement number n.…
51

Conciseness and elegance—attributes valued by mathematicians, as many of the FORTRAN group were—is in evidence in certain sections of the language description. “Recursion,” the mathematical strategy of defining a concept in terms of itself (such as defining the factorial of a number recursively,
N! = N*(N–1)!
, and
0!=1
) was used to specify the rules for forming algebraic expressions (such as
X+Y
):

By repeated use of the following rules, all permissible expressions may be derived …

4. If E is an expression the (E) is an expression

5. If E and F are expressions … then E+F, E-F, E*F, E/F are expressions.…
52

X

Yet, by Backus's own admission, language design played a secondary role in the FORTRAN endeavor.
53
Rather, as (it was once said) Britain acquired India as part of its empire in a fit of absent-mindedness, so also, it almost seems, reading Backus, that the FORTRAN language was designed absent-mindedly. But Backus and his colleagues, collectively, were too much the engineer–scientists to be absent-minded, even in a metaphorical sense. What Backus really implied, as he would write in 1981, was that—no matter how satisfying the FORTRAN programming language was—as a programming tool, it would never have sufficed. The emphasis had to be on program efficiency. It was this criterion that ruled above all else.
54

The burden of responsibility for producing efficient IBM 704 machine code from a FORTRAN program—later, the former came to be called
object program
or
object code
and the latter,
source program
or
source code
—fell on the compiler or, more precisely, on the compiler writer. By the time the first book-length texts and monographs on compilers began to appear during the mid to late 1960s,
55
a great deal of knowledge had accumulated on the theory, design, and implementation of compilers. And the programmers who built the first FORTRAN compiler made no small contribution to this branch of what, in the 1960s, came to be called
programming systems
.
56

A compiler is a computer program of a rather complex and peculiar sort, for it manipulates and processes other computer programs. It takes as input a program written in a machine-independent or high-level programming language (such as FORTRAN, the source program) and produces as output a
functionally
equivalent program (the object code) that is executable on a specific computer.
57
A complier, then, is an
automatic
translator in the strict linguistic sense; it transforms automatically a text in a programming language
L
into another functionally equivalent text specified in a machine language
M
. Both texts are programs. Both texts are liminal. But, the abstract aspect of a program in the programming language (when it describes an algorithm meant
only
for human–human communication, a piece of “social text”) is more pronounced than in the case of a program in a machine language.

The great difference between a compiler and the “automatic machine translators” that Warren Weaver and others had so desired (see
Chapter 11
, Section IX), was that automatic machine translation referred to translating texts in one
natural
language (say, Russian) into another natural language (say, English). The compiler's task is far more modest. It translates from one
artificial
language into another. And because artificial languages are invented, they can (at least in principle) be defined precisely and restricted in features, without the richness, vagueness, ambiguity, and variability of natural languages.

Thus, the challenges faced by the compiler writers (as, somewhat curiously, programmers who designed and built compilers would come to be called
58
) may not have been as overwhelming as the task faced by programmers who built automatic machine translation systems. However, the challenge of compiler writing was, in many ways, more complex than any other kind of programming experienced until then.

A compiler is, in fact, a system of many components performing many different tasks. It is an amalgam of many different algorithms of varying intricacy and complexity. It must first ensure that the program syntax adheres to the rules of the language. It must “make sense” of the overall structure, and this involves recording the flow of information and the flow of control between program components. To manage the complexity of its work, the compiler must decompose the source program into subunits (called
basic blocks
) that can be analyzed separately. The compiler must gather and record all the information about the program that is necessary for the ultimate task of the complier: generating machine code (look ahead to
Figure 15.2
).

Hovering like the sword of Damocles over all this is the “efficiency imperative.” The proof of the pudding lies in the eating, and “eating”—in this context—means how efficient the object code is compared with “hand-coded” programs in assembly or machine language.

There is a “cultural” gap between the language of the source program and the language of the object program. The former language is abstract and symbolic. In fact, it is itself an abstract artifact; its syntax and semantics refer only implicitly to some (abstract) computer. The latter language is relatively more concrete and physical; it is liminal because, although abstract, it is very close to some real physical computer. The compiler has to straddle the two cultures and “map” texts from one to the other. The sword of Damocles mandates that the object code must be “optimized” as best as possible for both time to execute the object program and space required to store the object program in computer memory.

To repeat, by the mid 1960s, a fairly large body of knowledge about compiling would accumulate. However, the writers of the FORTRAN compiler in the mid 1950s had very little prior knowledge or experience to guide them;
they
were the producers of some of the knowledge that entered the texts and monographs of the 1960s and 1970s.

The FORTRAN compiler was, in present-centered language, a
one-pass compiler
: the source program in its original form was “fed” to the compiler just once; the latter “saw” the source program only once.
59
Thereafter, the source program would be analyzed, dissected, deconstructed—
mangled
, so to speak—all in the cause of producing “optimized” code for the IBM 704.
60

In the case of the FORTRAN compiler, there was not a single writer. Compiler writing was a group effort and different individuals given responsibility for different “sections” of the compiler. The compiler was organized into six sections to work in sequence, each section passing its output to the one that followed, with the last section assembling the final object program in IBM 704 machine code.
61
Naturally, “code optimization” was a significant functional goal for the compiler, and various kinds of optimization strategies were distributed among the different sections. Some of the algorithms invented for object code generation and code optimization would have consequences, and they came to influence later compilers. These included an algorithm developed for compiling assignment statements involving arithmetic expressions (for example, a statement such as
X
= (
A
[
I
] *
B
[
I
])/
C
[
K
])
62
and an algorithm for allocating machine registers to variables (the
register allocation problem
, in present-centered language).
63
As a later author would write, for these reasons the FORTRAN compiler for the IBM 704 was certainly the most significant of the first generation of compilers.
64

Other books

The Blackmail Club by David Bishop
Scrappy Little Nobody by Anna Kendrick
Let's Play in the Garden by Grover, John
The Savages by Matt Whyman
Dead in the Dog by Bernard Knight
Heart Fate by Robin D. Owens
Stephen King's N. by Marc Guggenheim, Stephen King, Alex Maleev
Critical by Robin Cook
The Darke Crusade by Joe Dever