Origins of C languages
BCPL was designed by Martin Richards in the mid-1960s while he was visiting MIT, and was used during the early 1970s for several interesting projects, among them the OS6 operating system at Oxford [Stoy 72] and parts of the seminal Alto work at Xerox PARC [Thacker 79]. We became familiar with it because the MIT CTSS system [Corbato 62] on which Richards worked was used for the development of Multics. The original BCPL compiler was transported to both the Multics system and the GE-635 GECOS system by Rudd Canaday and others at Bell Labs [Canaday 69]. During the final throes of Multics' life at Bell Labs, and immediately thereafter, it was the language of choice among the group of people who would later become involved with Unix.
BCPL, B, and C all fit firmly into the traditional procedural family of Fortran and Algol 60. They are particularly oriented towards system programming, are small and compactly described, and can be translated by simple compilers. They are 'close to the machine' in that the abstractions they introduce are easily grounded in the concrete data types and operations provided by conventional computers, and they rely on library routines for input-output and other interactions with the operating system. With less success, they also use library procedures to specify interesting control structures, such as coroutines and procedure closures. At the same time, their abstractions lie at a sufficiently high level that it is possible, with care, to achieve portability between machines.
BCPL, B, and C differ syntactically in many details but are broadly similar. Programs consist of a sequence of global declarations and function (procedure) declarations. Procedures may be nested in BCPL, but may not refer to non-static objects defined in containing procedures. B and C avoid this restriction by imposing a more severe one: no nested procedures at all. Each language (except the earliest versions of B) recognizes separate compilation and provides a means to include text from named files.
Several of BCPL's syntactic and lexical mechanisms are more elegant and regular than those of B and C. For example, BCPL's procedure and data declarations have a more uniform structure, and it supplies a more complete set of looping constructs. Although BCPL programs are notionally supplied from an undelimited stream of characters, clever rules allow most semicolons to be elided after statements that end on a line boundary.
B and C omit this convenience and end most statements with semicolons. Despite the differences, most BCPL statements and operators map directly onto corresponding ones in B and C.
Some of the structural differences between BCPL and B stemmed from limitations on intermediate memory. BCPL declarations, for example, may take the form
let P1 be
command
and P2 be
command
and P3 be
command
...
where the program text represented by the commands contains whole procedures. The subdeclarations connected by and occur simultaneously, so the name P3 is known inside procedure P1. Similarly, BCPL can package a group of declarations and statements into an expression that yields a value, for example
E1 := valof
( declarations ; commands ; resultis E2 ) + 1
The BCPL compiler readily handled such constructs by storing and analyzing a parsed representation of the entire program in memory before producing output. Storage limitations on the B compiler demanded a one-pass technique in which output was generated as soon as possible, and the syntactic redesign that made this possible was carried forward into C.
Some less pleasant aspects of BCPL stemmed from its own technological problems and were consciously avoided in the design of B. For example, BCPL uses a 'global vector' mechanism to communicate between separately compiled programs.
Other fiddles in the transition from BCPL to B were introduced as a matter of taste, and some remain controversial, such as the decision to use the single character = for assignment instead of :=. Similarly, B uses /* */ to enclose comments, where BCPL uses // to ignore text up to the end of the line.
Not every difference between the BCPL language documented in Richards's book [Richards 79] and B was deliberate; we started from an earlier version of BCPL [Richards 67]. For example, the endcase that escapes from a BCPL switchon statement was not present in the language when we learned it in the 1960s, and so the overloading of the break keyword to escape from the B and C switch statement is due to divergent evolution rather than conscious change.
In BCPL, a vector is declared by writing
let V = vec 10
or in B,
auto V[10];
The effect is the same: a cell named V is allocated, then another group of 10 contiguous cells is set aside, and the memory index of the first of these is placed into V. By a general rule, in B the expression
*(V+i)
adds V and i, and refers to the i-th location after V. Both BCPL and B add special notation to sweeten such array accesses; in B, an equivalent expression is
V[i]
and in BCPL
V!i
Even at the time, this approach to arrays was unusual; C would later assimilate it in an even less conventional way.
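The cell model described above can be imitated in C. The following is only an illustrative sketch, not code from either compiler: the flat array `mem`, the type name `word`, and the helpers `alloc_cells`, `declare_vector`, and `cell` are all hypothetical names chosen for this example.

```c
#include <assert.h>

enum { MEM_WORDS = 64 };

typedef int word;            /* hypothetical: one untyped machine cell */
word mem[MEM_WORDS];         /* hypothetical flat, word-addressed memory */
word next_free = 1;          /* index of the next unused cell */

/* Reserve n contiguous cells and return the index of the first one. */
word alloc_cells(word n) {
    assert(n > 0 && next_free + n <= MEM_WORDS);
    word first = next_free;
    next_free += n;
    return first;
}

/* Imitates `auto V[10];`: allocate the cell named V, then n more cells,
   and store the index of the first of those cells into V.
   Returns the index of the cell named V. */
word declare_vector(word n) {
    word v_cell = alloc_cells(1);
    mem[v_cell] = alloc_cells(n);
    return v_cell;
}

/* Imitates B's *(V+i) and V[i], and BCPL's V!i:
   add the two words, then treat the sum as a cell index. */
word *cell(word v, word i) {
    return &mem[v + i];
}
```

If `v` is the result of `declare_vector(10)`, then `mem[v]` holds the index of the first element, and `*cell(mem[v], 3) = 42;` stores into the fourth element. Nothing in this model distinguishes a vector from any other word holding an index, which is the point of the passage above.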
None of BCPL, B, or C supports character data strongly in the language; each treats strings much like vectors of integers and supplements the general rules with a few conventions. In both BCPL and B, a string literal denotes the address of a static area initialized with the characters of the string, packed into cells. In BCPL, the first packed byte contains the number of characters in the string; in B, there is no count and strings are terminated by a special character, which B spelled '*e'. This change was made partly to avoid the limit on string length caused by holding the count in an 8-bit or 9-bit slot, and partly because, in our experience, maintaining the count seemed less convenient than using a terminator.
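The two string conventions can be contrasted with a short C sketch. The function names are hypothetical, and the terminator used here is C's `'\0'`, chosen only for illustration; it is not a claim about the character value B used for '*e'.

```c
#include <assert.h>
#include <stddef.h>

/* BCPL-style convention: the first byte holds the character count,
   so an 8-bit count limits a string to 255 characters. */
size_t counted_length(const unsigned char *s) {
    assert(s != NULL);
    return s[0];
}

/* B-style convention: no count is stored; the length is found by
   scanning for the terminator (assumed here to be '\0', as in C). */
size_t terminated_length(const char *s) {
    assert(s != NULL);
    size_t n = 0;
    while (s[n] != '\0') {
        n++;
    }
    return n;
}
```

The counted form answers a length query in constant time but caps the length at whatever the count slot can hold; the terminated form has no such cap but must scan the string.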
Individual characters in a BCPL string were usually manipulated by spreading the string out into another array, one character per cell, and then repacking it later; B provided corresponding routines, but people more often used other library functions that accessed or replaced individual characters in a string.
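Both styles of character access can be sketched in C. This is an illustration under stated assumptions, not the historical library: it assumes four 8-bit characters packed into each 32-bit cell, lowest byte first, and the names `get_char`, `put_char`, `unpack`, and `pack` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { CHARS_PER_CELL = 4 };  /* assumption: 8-bit chars in 32-bit cells */

/* Read character i of a packed string without unpacking it. */
unsigned get_char(const uint32_t *s, size_t i) {
    unsigned shift = 8u * (unsigned)(i % CHARS_PER_CELL);
    return (unsigned)((s[i / CHARS_PER_CELL] >> shift) & 0xFFu);
}

/* Replace character i of a packed string in place. */
void put_char(uint32_t *s, size_t i, unsigned c) {
    assert(c <= 0xFFu);
    unsigned shift = 8u * (unsigned)(i % CHARS_PER_CELL);
    uint32_t *cell = &s[i / CHARS_PER_CELL];
    *cell = (*cell & ~((uint32_t)0xFFu << shift)) | ((uint32_t)c << shift);
}

/* Spread n packed characters out, one character per cell. */
void unpack(const uint32_t *packed, uint32_t *cells, size_t n) {
    for (size_t i = 0; i < n; i++) {
        cells[i] = get_char(packed, i);
    }
}

/* Repack n one-per-cell characters into packed cells.
   The caller provides at least (n + 3) / 4 cells in packed. */
void pack(const uint32_t *cells, uint32_t *packed, size_t n) {
    for (size_t i = 0; i < n; i++) {
        put_char(packed, i, (unsigned)cells[i]);
    }
}
```

With `unpack` and `pack`, a program edits the one-per-cell copy and repacks afterwards; with `get_char` and `put_char`, it touches a single character of the packed string directly, which is the style the paragraph above says people more often preferred in B.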