@Hammer
You'd be correct except for a few things:
To begin with, IBM's 970 pipeline is only one stage shorter. That
makes for a top clock speed of around 5 GHz on the next-generation
process. But IBM did something Intel didn't: it made the pipes WIDE.
IBM's throwing upwards of 12 instructions down the pipes at once!
That means over 200 instructions can be in flight through this
thing's pipes at any one time.
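
For anyone who wants to sanity-check the "wide pipes" arithmetic,
here's a back-of-the-envelope sketch. It assumes the in-flight count
is simply width times depth, which is an upper bound, not how any real
chip is specified. The 12 and the 200 are the figures from this post;
the 17-stage depth is a made-up number just to make the math land.

```python
def instructions_in_flight(width, depth):
    """Upper bound: every slot in every pipeline stage holds an instruction."""
    return width * depth

def implied_depth(in_flight, width):
    """How many stages' worth of slots a given in-flight count implies."""
    return in_flight / width

# 12-wide is the figure quoted above; 17 stages is a hypothetical depth.
print(instructions_in_flight(12, 17))    # 204
# Working backwards from "over 200 in flight" at 12 wide:
print(round(implied_depth(200, 12), 1))  # 16.7
```

So "12 wide" and "over 200 in flight" hang together if you picture
roughly 17 stages' worth of slots, all full.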
Then there's the throughput issue. Intel's still using the
shared-bus approach it pioneered so many years ago to save pin count,
assuming that a single, wide bus would be faster. In truth, it is
only cheaper than two smaller dedicated-route buses, as done in
older machines such as the Cray. IBM's got the edge in I/O handling,
being able to both send *AND* receive data on the same cycle.
The latency in Intel's bus design, as it switches from send to
receive, adds hundreds of thousands of wait cycles to the system, all
in the extremely critical FSB area. IBM's fixed-task buses, by
comparison, can make full use of the available bandwidth, delivering
on the potential of the design.
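
Here's a toy model of the turnaround argument, nothing more. It
assumes one transfer per cycle on the wide shared bus, a fixed penalty
every time the bus flips direction, and two half-width one-way buses
that each need 2 cycles per transfer. The function names, the 2-cycle
penalty, and the traffic patterns are all made up for illustration;
they are not Intel's or IBM's actual numbers.

```python
def shared_bus_cycles(transfers, turnaround):
    """Cycles for one wide half-duplex bus.

    transfers: string of 'S' (send) / 'R' (receive), one cycle each.
    turnaround: idle cycles paid whenever the direction changes.
    """
    cycles = 0
    prev = None
    for direction in transfers:
        if prev is not None and direction != prev:
            cycles += turnaround  # bus sits idle while it flips direction
        cycles += 1
        prev = direction
    return cycles

def dedicated_bus_cycles(transfers, cycles_per_transfer=2):
    """Cycles for two narrower one-way buses running in parallel.

    Each bus is assumed half as wide, so a transfer takes 2 cycles,
    but sends and receives overlap and nothing ever turns around.
    """
    sends = transfers.count("S")
    receives = transfers.count("R")
    return max(sends, receives) * cycles_per_transfer

# Alternating traffic is the worst case for the shared bus.
print(shared_bus_cycles("SRSR", turnaround=2))  # 10
print(dedicated_bus_cycles("SRSR"))             # 4
# Batched traffic narrows the gap but doesn't close it.
print(shared_bus_cycles("SSRR", turnaround=2))  # 6
print(dedicated_bus_cycles("SSRR"))             # 4
```

With one-way traffic only, the wide shared bus wins in this model,
which is exactly why it looks good on paper.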
Now, let's add in IBM's multiple-processor approach: separate channels
per processor versus Intel's shared-bus approach. That means that if
you provide a large enough memory pipe, IBM's MP systems will be
zooming past anything Intel can throw at them.
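
To put rough numbers on the MP point, here's a sketch that assumes
bandwidth splits evenly between processors. The function names and
the GB/s figures are hypothetical, picked only to show where the
"large enough memory pipe" condition kicks in.

```python
def per_cpu_shared(bus_bw, n_cpus):
    """All CPUs contend for one bus, so each gets an even share of it."""
    return bus_bw / n_cpus

def per_cpu_dedicated(link_bw, mem_bw, n_cpus):
    """Each CPU has its own link; memory is the only shared resource."""
    return min(link_bw, mem_bw / n_cpus)

# Hypothetical: 4 CPUs, 6.4 GB/s per bus or link.
print(per_cpu_shared(6.4, 4))           # 1.6
# Memory pipe big enough: every CPU gets its full link.
print(per_cpu_dedicated(6.4, 25.6, 4))  # 6.4
# Memory pipe too small: memory becomes the bottleneck instead.
print(per_cpu_dedicated(6.4, 12.8, 4))  # 3.2
```

The dedicated-channel layout only pays off fully once the memory side
can feed every link at once, which is the caveat in the paragraph
above.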