Monday, July 31, 2006

8 bits = ?

Prior to IBM's 1964 introduction of the System/360 family of computers [1], computer memories were generally measured and addressed using the same units in which they were accessed: words (of 8, 12, 16, 24, 29, 32, 48, or 60 bits, depending on the computer), characters (generally represented by 6 or 7 bits), or digits (BCD, represented by 4 bits).

All System/360 computers executed the same instruction set, despite their wide range of size and performance. Naturally, the larger, faster machines accessed memory in larger chunks [2], so there could no longer be a direct correspondence between the unit of addressing and the unit of access. IBM chose the unit of addressing to be 8 bits, and introduced a new set of 8-bit character codes called EBCDIC (Extended Binary Coded Decimal Interchange Code). All System/360 computers could execute instructions that operated on 8-, 16-, 32-, or 64-bit quantities, regardless of the size of a memory access on that particular model.

IBM also chose the term byte for an 8-bit chunk of information, and called the successively larger units halfwords, words, and doublewords.

At that time, magnetic core memory was priced at about $1 per byte. Punning on the American slang of bit for an eighth of a dollar [3] and buck for a dollar, I suggested that the units should instead be called bucks, quarter-eagles, half-eagles, and eagles.

Fortunately, no one took my suggestion seriously. Following the hyperinflation of the memory exchange rate over the past 40 years, can you imagine buying a multi-GigaBuck memory card for your hundred-buck PDA, cellphone, or camera?

Notes:
[1] According to the IBM Archives, "On April 7, 1964, IBM introduced the System/360, the first large 'family' of computers to use interchangeable software and peripheral equipment. It was a bold departure from the monolithic, one-size-fits-all mainframe. Fortune magazine dubbed it 'IBM's $5 billion gamble.' System/360 offered a choice of five processors and 19 combinations of power, speed and memory. A user could operate the same magnetic tape and disk products as another user with a processor 100 times more powerful."
[2] Various models in the family fetched and stored data 8, 16, 32, or 64 bits at a time.
[3] "Two bits, four bits, six bits, a dollar. All those for [name of school] stand up and holler."


Friday, July 28, 2006

Recursion?

In the summer of 1961 I attended a lecture in Los Angeles by a little-known Danish computer scientist. His name was Peter Naur [1], and his topic was the new algorithmic language Algol 60. He covered a lot of concepts that I hadn't picked up from my limited study of Algo, though of course I didn't really absorb very much of them from one lecture.

But I do remember one thing clearly:
At the end there was a question period, and the man next to me stood up. "It seems to me that there is an error in one of your slides."

Peter was puzzled: "No, I don't think so. Which slide, and what do you think is wrong with it?"

"It's the slide that shows a subroutine calling itself. [2] That's impossible to implement." [3]

Now Peter was even more puzzled: "But we have implemented the whole language, and run all the examples through our compiler."

The man sat down, but was still muttering to himself, "Impossible! Impossible!" And I suspect that much of the audience agreed with him.

Had the questioner but realized it, Algol 60's name parameters were a far greater implementation challenge than recursion, but the GIER compiler got them right, too.

Edited to add: I have just encountered some evidence that Alan Turing anticipated an important element of the call stack in his Pilot ACE proposal.

Notes:
[1] Peter Naur is the latest recipient of the prestigious A. M. Turing Award, in large part for his work connected with the Algol 60 Report and the GIER compiler. Here is a picture of me with him at the ACM Awards Banquet in San Francisco last April.

[2] This may have been the striking example of doing double integration by calling the numerical integration procedure with a call of itself as a name parameter, and the other parameters controlled by "Jensen's device." (A sketch of the idea follows these notes.)
[3] At the time it was fairly common practice to statically allocate the memory for a subroutine's code, its local variables, and its return address all together. The call stack had been independently invented at least twice in Europe, under different names, but was still not widely understood in America.
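For readers who haven't met Jensen's device, here is a hypothetical sketch (in Python, purely for compactness) that emulates Algol 60 call-by-name with closures; the names integrate, expr, and set_var are my own, not anything from the GIER compiler:

    def integrate(expr, set_var, lo, hi, n=100):
        """Midpoint-rule integration, call-by-name style.

        expr() re-evaluates the integrand each time it is "named";
        set_var(v) assigns the variable of integration, as the thunk
        pair for an Algol name parameter would.
        """
        h = (hi - lo) / n
        total = 0.0
        for i in range(n):
            set_var(lo + (i + 0.5) * h)
            total += expr()
        return total * h

    # Double integration by passing the integrator a call of itself
    # as the "name parameter"; integral of x*y over the unit square:
    env = {'x': 0.0, 'y': 0.0}
    result = integrate(
        lambda: integrate(lambda: env['x'] * env['y'],
                          lambda v: env.__setitem__('y', v), 0.0, 1.0),
        lambda v: env.__setitem__('x', v), 0.0, 1.0)
    assert abs(result - 0.25) < 1e-6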


Sunday, July 16, 2006

Bit-serial arithmetic

[This is part of a series of expository, rather than anecdotal, notes, but it provides background for my discussion of the Bendix G-15 architecture. Skip it if you're interested only in my stories.]

With the "embarrassing parallelism" of today's microcircuits, where the challenge is to keep many of the millions of available transistors busy at the same time, it is easy to forget just how little hardware is needed to mechanize binary arithmetic, performed serially. [1]

The basic algorithms are closely akin to those for multi-digit arithmetic that I was taught in elementary school. (For brevity, I will abbreviate "as in grade school arithmetic" to "AIGSA".) However, the algorithms are much simpler for binary, because of the small size of the addition and multiplication tables (0 times 0 is 0, 0 times 1 is 0, 1 times 0 is 0, 1 times 1 is 1) and their ease of translation into binary logic (x times y is just x AND y).

Number Representation

Today's computers almost universally represent negative numbers in "two's-complement" form (the result of subtracting the number's magnitude from zero and ignoring the underflow). But in the 50s and 60s other binary representations were common, notably sign-magnitude and one's-complement. The G-15 used two's-complement for addition and subtraction, and sign-magnitude for multiplication and division. It was the responsibility of the programmer to keep track of which form a value was in. The sign-magnitude form was commonly used for I/O and storage in main memory.

Fortunately, bit-at-a-time conversion between the two forms is easy when working right-to-left (least significant bit first), and requires only two extra bits of memory (flip-flops):
  • If the sign is positive, pass all the bits unchanged.
  • If the sign is negative, pass all bits up to and including the first 1 bit unchanged; thereafter, complement each bit--replacing 0 by 1 and 1 by 0. [2]

The two's complement operation is also self-inverse--the complement of the complement of x is x.
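For the curious, here is a minimal sketch of that conversion rule in Python (the names are mine; the machine, of course, did this with a couple of flip-flops and gates, not code):

    def serial_complement(bits, negative):
        """Convert between sign-magnitude and two's-complement form.

        Bits arrive least significant first.  One extra flip-flop
        (seen_one), plus the sign bit, is all the state required.
        """
        seen_one = False
        out = []
        for b in bits:
            if not negative or not seen_one:
                out.append(b)                   # pass unchanged, up to and
                seen_one = seen_one or b == 1   # including the first 1 bit
            else:
                out.append(1 - b)               # thereafter, complement
        return out

    # magnitude 6 = 011000 (LSB first); its two's complement is 010111
    assert serial_complement([0, 1, 1, 0, 0, 0], True) == [0, 1, 0, 1, 1, 1]
    # self-inverse: complementing twice restores the original
    six = [0, 1, 1, 0, 0, 0]
    assert serial_complement(serial_complement(six, True), True) == six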

The G-15 processed all numbers right-to-left (least significant bit first). [3] However, the sign bit preceded all the other bits, to enable complementation on the fly when needed.

The Binary Point, Scaling, and Shifting

Fixed-point arithmetic is not restricted to integers. Many pocket calculators, for example, allow the user to choose the number of digits to display after the decimal point (typically two for "dollars and cents" calculations). This is even more important for a machine without floating-point arithmetic.

In the G-15, a value's "binary point," unlike its sign, had no explicit internal representation. It was up to the programmer to keep track of the binary points of all values throughout the course of a computation. Values were often treated as integers (binary point 0, i.e., immediately following the least significant digit) or fractions (binary point 28 or 57, i.e., immediately preceding the most significant digit), but this was entirely the programmer's choice.

To align the binary points of scaled numbers (or to change their scaling for other reasons) it is necessary to shift them left or right--equivalent to doubling or halving them, respectively. The G-15 provided two double-precision shift registers, one for each direction. [4][5]
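A tiny hypothetical Python illustration of this bookkeeping; remember that nothing like the point parameter below existed in the machine itself:

    # The same bit pattern means different values at different scalings;
    # only the programmer knows which is intended.
    pattern = 0b01000110                 # raw bits; no point is stored

    def value(bits, point):
        """Interpret bits with `point` bits after the binary point."""
        return bits / (1 << point)

    assert value(pattern, 0) == 70.0     # read as an integer
    assert value(pattern, 3) == 8.75     # read as 01000.110
    # shifting left doubles; shifting right halves
    assert value(pattern << 1, 3) == 2 * value(pattern, 3)
    assert value(pattern >> 1, 3) == value(pattern, 3) / 2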

Unsigned Addition

AIGSA, operate one digit (bit) at a time, from the right. One extra flip-flop is needed to propagate a carry to the digit to the left (the next "bit time"). At each bit time there are three bits of input--the two operands and the carry bit. The result bit is the xor or "parity" of those three bits--0 if the number of 1s is even, 1 if it is odd. The carry bit is 0 if none or one of them is 1, and 1 if two or three are. That's it. Proceed to the next bit.

A carry bit of 1 following calculation of the last (most significant) bit indicates overflow.
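A minimal Python sketch of such a serial adder, assuming the operands arrive least significant bit first (again, the names are mine):

    def serial_add(a_bits, b_bits):
        """Bit-serial addition, least significant bit first.

        One flip-flop of state: the carry.  The sum bit is the parity
        of the two operand bits and the carry; the new carry is the
        majority of the same three bits.  Returns (sum_bits,
        final_carry); for unsigned operands a final carry of 1 means
        overflow.
        """
        carry = 0
        out = []
        for a, b in zip(a_bits, b_bits):     # operands already aligned
            out.append(a ^ b ^ carry)        # parity of the three inputs
            carry = int(a + b + carry >= 2)  # majority of the three inputs
        return out, carry

    # 5 + 6 = 11: 0101 + 0110 = 1011 (written LSB first below)
    assert serial_add([1, 0, 1, 0], [0, 1, 1, 0]) == ([1, 1, 0, 1], 0)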

AIGSA, it is necessary to align the binary points of the operands before adding them. E.g.,

  100.01
+  10.
--------
  110.01

rather than (digits naively right-justified, ignoring the binary points)

  100.01
+  0010.
--------
  100.11
The binary point of the result is the same as that of the two aligned operands.

Signed Addition

Use two's-complement form for negative numbers, but add them as above.

If both operands have the same sign, that is the true sign of the result. A final carry bit that differs from the computed sign bit of the result indicates overflow.

If the operands have opposite signs, no overflow can occur, and the final carry bit indicates the sign of the result. [6]
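Continuing the sketch from the unsigned case, the sign and overflow rules might look like this in Python (with the sign bit last, unlike the G-15; see note [6]):

    def signed_add(a_bits, b_bits):
        """Two's-complement addition, LSB first, sign bit last here.

        Reuses serial_add from the previous sketch.
        """
        total, carry = serial_add(a_bits, b_bits)
        if a_bits[-1] == b_bits[-1]:           # same signs:
            overflow = carry != total[-1]      # carry vs. result sign bit
        else:                                  # opposite signs:
            overflow = False                   # overflow is impossible
        return total, overflow

    # 3 + (-5) = -2: 0011 + 1011 = 1110 (LSB first below); the final
    # carry of 0 correctly indicates a negative result
    assert signed_add([1, 1, 0, 0], [1, 1, 0, 1]) == ([0, 1, 1, 1], False)
    # 5 + 5 overflows a four-bit word:
    assert signed_add([1, 0, 1, 0], [1, 0, 1, 0])[1]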

Subtraction

To subtract a number, change its sign, then add it (complementing if negative).

Multiplication and Division

Addition and subtraction can be done in a single serial "pass," doing complementation as needed "on the fly."

Multiplication and division can more easily be done AIGSA with simple iterative processes using the sign-magnitude representation. The processes involve three registers (two for the operands and one for the result): an accumulator to which values can be added or subtracted, and two shift registers, as mentioned previously.

For either multiplication or division, the sign of the result is the xor of the signs of the two operands. On the G-15, it was computed and stored in the IP flip-flop as the operands were loaded, and inserted in the result when it was stored.

Multiplication

Initially zero the accumulator, place the multiplier in the right-shifting register, and the multiplicand in the left-shifting register.

AIGSA, work from right-to-left, performing a series of multiplications of the multiplicand by a single digit of the multiplier, and summing the partial products. Since the single digits can only be 0 or 1, each partial product is either 0 or the shifted multiplicand. Rather than forming all the partial products and then adding them, add each one as it is formed. I.e., at each step, test the least significant bit of the multiplier register, and if it is 1, add the multiplicand register to the accumulator. Repeat with the registers shifted, until the entire product has been formed.

The binary point of the result is the sum of the binary points of the operands.
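A hypothetical Python sketch of the loop, using integers to stand in for the three registers (sign handling, as noted above, is a separate xor):

    def serial_multiply(multiplier, multiplicand):
        """Shift-and-add multiplication of unsigned magnitudes."""
        acc = 0                        # accumulator, initially zero
        while multiplier:
            if multiplier & 1:         # low bit of the multiplier
                acc += multiplicand    # add the shifted multiplicand
            multiplier >>= 1           # the right-shifting register
            multiplicand <<= 1         # the left-shifting register
        return acc

    assert serial_multiply(0b1011, 0b0110) == 11 * 6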

Division

Initially place the numerator in the accumulator register and the denominator in the right-shifting register. Zero the left-shifting register.

AIGSA, work from left to right. Subtract the "trial divisor" times the denominator from the numerator and observe the sign. (In binary arithmetic, 1 is the only interesting trial divisor.) If the sign is negative, the trial divisor "did not go"; the product should be added back, and a smaller divisor tried (in binary, this can only be 0). If the sign is positive, the trial divisor "did go," and is the next digit of the result. Insert it as the low-order bit of the developing quotient in the left-shifting register. Shift the denominator to the right to continue generating digits of the quotient.

The G-15 used a simple identity to simplify the implementation of the iteration described above. If the trial divisor of 1 did not go, rather than adding the product back, and then in the next iteration subtracting the product of the next trial divisor (which would be exactly half of the previous product), it did not do the "add back," but simply added the shifted denominator at the next step.

The binary point of the result is the difference between the binary points of the operands.
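A Python sketch of the restoring version described above; the G-15's no-add-back refinement is omitted for clarity, and the bits parameter is my own addition:

    def serial_divide(numerator, denominator, bits):
        """Restoring division of unsigned magnitudes.

        The denominator starts shifted left so that `bits` right
        shifts generate `bits` bits of quotient.
        """
        acc = numerator                  # the accumulator register
        den = denominator << bits        # the right-shifting register
        quotient = 0                     # the left-shifting register
        for _ in range(bits):
            den >>= 1                    # shift the denominator right
            acc -= den                   # trial divisor of 1
            if acc < 0:                  # "did not go":
                acc += den               #   add the product back
                bit = 0
            else:                        # "did go":
                bit = 1
            quotient = (quotient << 1) | bit  # insert as low-order bit
        return quotient, acc             # quotient and remainder

    assert serial_divide(11, 4, 4) == (2, 3)   # 11 = 2*4 + 3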

Notes:
[1] Most serial binary machines must have used algorithms similar to those I describe here, but I had personal experience only with the G-15. I shall describe what I know.
[2] The proof of this is left as an exercise for the reader, as will be the proofs of various other simple properties of binary arithmetic.
[3] Bendix hardware engineers used special Tektronix oscilloscopes that swept from right-to-left, rather than left-to-right, to make numbers more readable on the scope.
[4] A single register that could be shifted in either direction would suffice for this purpose, but these registers had additional uses, as we shall see presently.
[5] It should be obvious that only a single extra flip-flop is needed to shift to the left (delay by one bit time), but it may be less obvious how to shift to the right using a single extra flip-flop. No, it does not require the mythical if gate that emits a pulse one bit time before receiving it.
[6] "But the sign preceded the least significant bit!" Yes, and fortunately, with a recirculating accumulator, it is available again at just the right time. Technically, this is the first bit-time of the next word-time, but the programmer cannot detect the late substitution. "The hardware must cheat, but it must not get caught."


Tuesday, July 04, 2006

My first high-level language: Algo

Bendix Computer Division produced one of the first implementations of a language in the Algol family. The Algo system for the G-15 was based on Algol 58. It was a three-pass compiler [1] that produced code for a virtual machine rather similar to Intercom 1000.

Due to memory limitations, intermediate data between passes of the Algo compiler had to be punched out on paper tape and then read back in for the next pass. So to compile and run a program the user needed to:
0) Punch the source program on paper tape (online or offline).
1) Place the cassette [2] for pass 1 in the tape reader and boot it.
2) Replace the cassette by the source program and read it in.
3) Remove the output of pass 1 from the tape punch.
4) Replace the source program tape by the cassette for pass 2 and boot it.
5) Replace the cassette by the output of pass 1 and read it in.
6) Remove the output of pass 2 from the tape punch.
7) Replace the intermediate tape by the cassette for pass 3 and boot it.
8) Replace the cassette by the output of pass 2 and read it in.
9) Remove the output of pass 3 (the compiled program) from the tape punch.
10) Replace the intermediate tape by the cassette for the interpreter and boot it.
11) Replace the cassette by the compiled program, read it in, and begin execution.

For a really short program, this could take as little as 15 minutes. Of course, to run a previously-compiled program, only steps 10 and 11 were needed.

My first attempt to use Algo was for a VIP demonstration in the DPL. Milton Barber and I started work the night before [3] on an example to illustrate how "without a special knowledge of programming for electronic computers, anyone with a background of high school Algebra may express a program in the Algo language for the G-15 computer."

I no longer remember what program we were trying to write, but it was probably something simple enough to explain to our audience, like finding the roots of a quadratic equation. We fed our first program in, and 15 minutes later discovered that it didn't work. We consulted the manual and the cryptic diagnostics, found a problem, thought we fixed it, fed that program in, discovered it didn't work, either. (Lather, rinse, repeat.)

By 10:00 the next morning, with a 2:00 deadline looming, we gave up on Algo, wrote the program in machine language, debugged it, and had it ready for the demo. I doubt that the VIPs would have understood the importance of algebraic notation for programming, even if we could have made it work.

Oh, yes. This was also the last program I wrote in Algo.

Notes:
[1] The names of two of the passes have stuck with me: LYZER/LATER and RATOR/CATOR.
[2] Because of the unruliness of rolls of punched paper tape, frequently-used programs were stored in aluminum cassettes rather like oversized versions of today's magnetic tape cassettes.
[3] Even then it was accepted that programmers work better at night and under pressure.
