The Cray-1 System

Figure :
./images/1976_cray.jpg
Figure :
./images/CRAY.jpg
This is what a CRAY 1 looks like without its skins. The tower portion is the actual computer, and the electronics in the "seat" below are the power supplies, about 150 kilowatts. An upholstered bench, on which one could sit, was fitted all around the machine. During a seminar at Stanford University, Seymour Cray told the students he did things that way because it was good to sit there keeping warm on those cold Wisconsin nights. (The first CRAY 1s were designed and initially built in the Cray Research facility in Chippewa Falls, Wisconsin.)
Figure :
./images/Cray1.jpg


The Cray-1, while arguably a natural outgrowth of the CDC 6600, 7600, and Star-100 systems, was also unique in many ways. Its mechanical structure, its cooling and power supply systems, and its logic were new in ways that ranged from novel to revolutionary.

The physical structure of the system consisted of a tall cylinder made of columns of circuit board racks. The outer cylinder at the base contained the 5 V DC power supply, and the chassis acted as the ground node for this power. Positive voltage was distributed to the circuitry through solid copper bus bars that carried very high currents. One of the later systems delivered experienced a long series of odd breakdowns until, after nearly every component had been changed, it was discovered that the bus bars were slightly out of spec, causing too much voltage drop and preventing the logic circuits from functioning reliably.

The circuit boards slid horizontally into the columnar racks, with the central layer (of five) making physical contact with the slot edges. This contact provided the ground node connection for the circuitry and also allowed heat transfer from the cards to the chassis. The chassis members themselves were hollow conduits for refrigerant, and the efficiency of this cooling method enabled the logic to operate at hitherto impractical speeds, which in turn required huge power dissipation.

The logic circuitry was implemented with just three types of small- and medium-scale integrated circuits: a dual NAND-gate chip, a 1000-bit memory chip, and a 12-bit register chip. All logic elements produced both positive and negative values together, so the circuits presented a purely resistive load to the power supply and the reactive components produced by the time-varying currents and voltages cancelled out. The backplane wiring connecting the circuit boards was hand-assembled, and all conductors were of identical length, equalizing signal propagation delays across the machine. Each cable was a twisted pair, enabling electromagnetic cancellation, as is common in high-speed transmission lines today.

The machine was clocked at 80 MHz, a high speed for its time. Memory was interleaved into 64 banks and, in the absence of conflicts, could respond to a request in four clocks. This enabled vectors to be read or written at the rate of one element per clock. The CPU contained both scalar and vector registers and function units. Most interesting were the vector registers and units, but the machine supported fast scalar calculations as well. Each operand register held 64 bits, plus error correction code bits. The address registers that accompanied each operand register held 24 bits. Loading a value into an address register caused a read or write of the corresponding operand register from or to memory. Vector registers held 64 elements, and could be loaded or stored at a rate of one element per clock. Vector function units were pipelined, and streams of operands could flow between memory, registers, and function units in such a way as to allow multiple instructions to "chain" together at execute time. The longest chain could use up to three vector function units and a register at each end, delivering as much as 240 MFLOPS for the length of time necessary to stream 64 elements through this pipe.
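The figures in the paragraph above can be checked with a little arithmetic. The Python sketch below is only a toy model, not a description of the real memory controller: it assumes one memory request can issue per clock and that a request stalls until its bank has finished a four-clock access. The function names (clocks_to_stream, peak_mflops) and the stall rule are illustrative assumptions; the constants come from the text.

```python
# Toy model of the Cray-1 figures quoted above. The constants are taken from
# the text; the stall rule and all function names are illustrative assumptions.

CLOCK_HZ = 80_000_000     # 80 MHz clock
NUM_BANKS = 64            # memory interleaved into 64 banks
BANK_BUSY_CLOCKS = 4      # a bank answers a request in four clocks
VECTOR_LENGTH = 64        # elements per vector register


def clocks_to_stream(stride, n_elements=VECTOR_LENGTH):
    """Clocks needed to issue n_elements requests at a given word stride.

    Assumed model: at most one request issues per clock, and a request
    waits until the bank it addresses has finished its previous access.
    """
    bank_free_at = [0] * NUM_BANKS   # first clock at which each bank is idle
    clock = 0
    for i in range(n_elements):
        bank = (i * stride) % NUM_BANKS
        clock = max(clock, bank_free_at[bank])       # stall on a bank conflict
        bank_free_at[bank] = clock + BANK_BUSY_CLOCKS
        clock += 1                                   # next issue slot
    return clock


def peak_mflops(chained_units):
    """Peak rate when each chained unit delivers one result per clock."""
    return chained_units * CLOCK_HZ / 1e6


if __name__ == "__main__":
    for stride in (1, 16, 32, 64):
        print(f"stride {stride:2d}: {clocks_to_stream(stride)} clocks")
    print(f"three chained units: {peak_mflops(3):.0f} MFLOPS")
    print(f"time to stream one vector: {VECTOR_LENGTH / CLOCK_HZ * 1e6:.1f} microseconds")
```

Under these assumptions a unit-stride vector touches each bank once and streams at one element per clock, while a stride of 64 sends every request to the same bank and takes roughly four times as long. Three chained units at 80 MHz give the 240 MFLOPS quoted above, sustained for the 0.8 microseconds it takes 64 elements to pass through.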

It was demonstrated that this architecture, and its true (perhaps the first true) Reduced Instruction Set, allowed the fastest computations then possible. However, it turned out to excel at more than numerical work. A symbolic math package named Reduce was ported to it from a CDC 7600. This package relied on a Lisp interpreter from the University of Texas, named UTLisp. Thus, for a while, the Cray-1 was also the fastest Lisp machine in existence, being able to dereference up to 80,000,000 pointers per second.