ARM and PPC are load/store (RISC) machines, while x86 and 68K are CISC machines.
The decoding is much simpler for RISC machines.
This means that when you compare really simple cores that execute 1 instruction per cycle, the RISC machine is smaller and needs less power.
Also, decoding multiple instructions per cycle is, in the naive approach, a lot simpler with RISC machines.
This means developing a super-scalar decoder is simpler for RISC.
But Intel, AMD and also new 68K chips have found their own solutions for quickly decoding several instructions per cycle.
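
To make the naive decoding argument a bit more concrete, here is a rough sketch in C. It is only a toy model, not how any real decoder is built: the 4-byte fixed width and the `lengths` array (one already-known byte length per instruction) are assumptions made up for illustration. With fixed-width instructions every start offset can be computed independently; with variable-length instructions each start offset depends on all the lengths before it.

```c
#include <assert.h>
#include <stddef.h>

/* Fixed-width ISA (assumed: 4 bytes per instruction).
   The start of instruction k is known without looking at any
   other instruction, so several decoders can work in parallel. */
size_t fixed_start(size_t k) {
    return 4 * k;
}

/* Variable-length ISA. lengths[] is a hypothetical array holding the
   byte length of each instruction in the stream. The start of
   instruction k is the sum of all earlier lengths, which is a
   serial chain in the naive approach. */
size_t variable_start(const unsigned char *lengths, size_t k) {
    size_t offset = 0;
    for (size_t i = 0; i < k; i++) {
        assert(lengths[i] > 0); /* every instruction occupies at least 1 byte */
        offset += lengths[i];
    }
    return offset;
}
```

Real CISC front ends get around the serial chain with extra hardware, which is the kind of solution mentioned above.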
Now a CISC machine also has several advantages.
1) CISC instructions are much more powerful than RISC instructions.
For example:
ADDi.L #12456,(48,A0,Dn*

That is 1 instruction on CISC, and some CISC chips can even execute it in 1 cycle.
You need about 6 instructions to do the same on POWER.
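
As a rough illustration of why one such instruction turns into several on a load/store machine, here is a small C model. It is a sketch under assumptions: memory is a plain byte array, `scale` is passed in as a parameter because the scale factor is not given in the example above, and the function names are invented. The step count in the second function is only indicative; the exact number of instructions depends on the ISA and on whether the constants fit into immediate fields.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Helpers modelling 32-bit memory accesses on a byte array. */
static uint32_t load32(const uint8_t *mem, uint32_t addr) {
    uint32_t v;
    memcpy(&v, mem + addr, sizeof v);
    return v;
}

static void store32(uint8_t *mem, uint32_t addr, uint32_t v) {
    memcpy(mem + addr, &v, sizeof v);
}

/* CISC view: one read-modify-write operation on a memory operand
   addressed as base + index*scale + displacement. */
void cisc_addi_mem(uint8_t *mem, uint32_t a0, uint32_t dn, uint32_t scale) {
    uint32_t addr = a0 + dn * scale + 48;
    store32(mem, addr, load32(mem, addr) + 12456);
}

/* Load/store view: the same work split into register-only steps. */
void risc_addi_mem(uint8_t *mem, uint32_t a0, uint32_t dn, uint32_t scale) {
    uint32_t r1 = dn * scale;            /* scale the index register    */
    uint32_t r2 = a0 + r1;               /* add the base register       */
    uint32_t r3 = load32(mem, r2 + 48);  /* load with displacement 48   */
    uint32_t r4 = r3 + 12456;            /* add the immediate           */
    store32(mem, r2 + 48, r4);           /* store the result back       */
}
```

Both functions leave memory in the same state; the difference is only in how many separate instructions the machine has to fetch and decode.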
2) CISC instructions are much more compact.
This means caches can hold more instructions, and the cache can also supply more instructions per cycle to the CPU.
A well-designed CISC machine can do a lot of work per cycle.
It's not easy even for good RISC machines to keep up with this.
RISC has some clear advantages.
RISC chips are easier to design.
Low-performance (that is, simple) RISC chips need little power.
When you go high end, the more complex CISC decoder is no longer the only problem. Other limits show up:
- Instruction Cache bandwidth limitations
- dependencies between instructions
These are the important topics.
RISC has no advantage here.
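
The dependency point can be sketched with a toy model in C. This is an idealized illustration, not a model of any real chip: it assumes unlimited issue width, 1 cycle per instruction, at most one producer per instruction, and a made-up `dep` array where `dep[i]` is the index of the earlier instruction whose result instruction `i` needs (or -1 for none). Under those assumptions the minimum cycle count is the length of the longest dependency chain, no matter which instruction set the code is written in.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_INSNS 64  /* arbitrary limit for this toy model */

/* Returns the minimum number of cycles needed to execute n
   instructions on an idealized machine with unlimited issue width
   and 1-cycle latency. dep[i] is the index of the producer that
   instruction i waits for, or -1 if it has no dependency. */
size_t min_cycles(const int *dep, size_t n) {
    size_t depth[MAX_INSNS];
    size_t longest = 0;
    assert(n <= MAX_INSNS);
    for (size_t i = 0; i < n; i++) {
        assert(dep[i] < (int)i); /* a producer must come earlier */
        depth[i] = (dep[i] < 0) ? 1 : depth[dep[i]] + 1;
        if (depth[i] > longest) {
            longest = depth[i];
        }
    }
    return longest;
}
```

Four independent instructions (`dep = {-1, -1, -1, -1}`) can finish in 1 cycle on this ideal machine, while a chain where each instruction needs the previous result (`dep = {-1, 0, 1, 2}`) takes 4 cycles however wide the decoder is.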
EPIC tries to address some of those but has its very own pitfalls.
So yes, I can see that ARM has an advantage by design in the low-performance region.
But in the high-performance region the problems are different, and RISC no longer has an advantage there.