It's about Altivec. It seems that the AmigaONE with Altivec is twice as fast as an Athlon 1.8.
In that particular benchmark, yes..
But Distributed.net thinks that “the RC5 client is a poor benchmark to use in determining the speed or performance of a particular CPU.”
http://n0cgi.distributed.net/faq/cache/55.html
Couldn’t you find a benchmark that relies less on a certain type of instruction?
As for the reasons why the G4 kicks butt in RC5, I would like to refer to a Slashdot IRC chat with the guys from distributed.net:
http://www.slashnet.org/forums/DCTI-20020928.html
“G4 CPUs have some architectural features very suited for RC5.”
“First, in the fastest cores, all processing is done in the vector unit of the chip (Altivec).”
“Intel and AMD CPUs do have integer vector units (SSE2 and MMX), but they're less suited to RC5 than Altivec for two main reasons:
More registers available (32 in the PowerPC versus 8 in MMX and SSE2), plus 128-bit wide registers (MMX is only 64-bit wide), and the existence of a hardware vector rotate instruction in Altivec, which isn't available in MMX and SSE2.”
“These reasons make it less worthy to use vector units on x86, where all processing is done in the standard scalar ALU.
Oh, and did I mention that the [G4] vector unit allows for the processing of 4 keys simultaneously?”
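To picture what "4 keys simultaneously" means, here is a rough portable C sketch that emulates a 128-bit register as four 32-bit lanes, one candidate key per lane, and rotates each lane by its own amount. The names `vec4u32` and `vec4_rotl` are made up for illustration and this is not the distributed.net core. On a G4 the loop would presumably collapse into a single vector rotate, while MMX and SSE2 have no equivalent instruction.

```c
#include <stdint.h>

#define LANES 4  /* 128-bit register / 32-bit words = 4 keys at once */

/* Hypothetical stand-in for a 128-bit vector register. */
typedef struct { uint32_t lane[LANES]; } vec4u32;

/* Rotate every lane left by the amount held in the matching lane of n.
   Only the low 5 bits of each amount are used, as in RC5 with 32-bit words. */
static vec4u32 vec4_rotl(vec4u32 x, vec4u32 n) {
    vec4u32 r;
    for (int i = 0; i < LANES; i++) {
        uint32_t s = n.lane[i] & 31u;
        /* The second mask keeps the shift count below 32 when s == 0. */
        r.lane[i] = (x.lane[i] << s) | (x.lane[i] >> ((32u - s) & 31u));
    }
    return r;
}
```

Without a hardware vector rotate, each lane needs two shifts and an OR, and because the rotate amounts differ per lane they cannot share one shift count. That fits the quote's point about why the x86 cores stay in the scalar ALU.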
“The most recent AMD processors have better hardware rotate support than the most recent Intel ones, to answer the second question. RC5 uses rotate _a lot_”
“The AMD Athlon possesses 3 superscalar ALUs, all capable of doing a rotate instruction with single-cycle latency.
Given that rotates are the core of RC5 processing, this is very advantageous for the Athlon.
Also, the P4 latency for rotate instructions is 4 cycles, and while I'm not sure off the top of my head whether it has 1 or 2 superscalar barrel shifters, it is clearly in disadvantage here.”
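To show why rotate latency matters so much, here is a minimal C sketch of one RC5 encryption round on 32-bit words. The function names `rotl32` and `rc5_round` are my own, and `s0` and `s1` stand for two words of the expanded key table; this is only an illustration of the cipher's round structure, not the client's actual code.

```c
#include <stdint.h>

/* Rotate x left by n bits; only the low 5 bits of n are used. */
static uint32_t rotl32(uint32_t x, uint32_t n) {
    n &= 31u;
    /* The second mask keeps the shift count below 32 when n == 0. */
    return (x << n) | (x >> ((32u - n) & 31u));
}

/* One RC5 round: each half is XORed with the other half, rotated by an
   amount taken from the other half, then a key word is added. */
static void rc5_round(uint32_t *a, uint32_t *b, uint32_t s0, uint32_t s1) {
    *a = rotl32(*a ^ *b, *b) + s0;
    *b = rotl32(*b ^ *a, *a) + s1;
}
```

Each round contains two rotates, and the second one depends on the result of the first, so the rotate latency sits directly on the critical path. A 1-cycle rotate versus a 4-cycle rotate therefore shows up almost directly in keys per second.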