-edit-
Sorry, bit of an essay :lol:
-/edit-
Waccoon wrote:
Isn't true RISC code about 1.5 times bigger than CISC code in executable format?
Quite often RISC code is bigger than CISC (there are exceptions, however), but it's not really much of an issue. RISC architectures tend to have very regular, fixed-width instruction sets (that is, all instructions are the same size).
On the other hand, CISC instruction sizes can vary like anything. For example, a simple 680x0 'move' instruction can be from 1 to 11 16-bit words long depending on the addressing modes used.
Data usually makes up a large part of any executable. To be flat honest, code I have compiled for both 680x0 and PPC has resulted in pretty similar sized programs, certainly not 50% larger.
Go figure...
I'm no electrical engineer, obviously, but are all 32 bits used for instructions, or instructions AND data? How many instructions are there on a PPC?
I don't doubt emulating PPC is tough and slow, but does *any* CPU have 65K+ instructions? :-?
Not 65K instructions, no. However it isn't about the actual number of instructions, rather it is about the total number of potential opcodes, which in turn depends on the opcode size.
Consider the 680x0 again. It has only a few dozen instructions at the assembler level (think of add, addx, etc).
However, at the binary level, the instruction opcode is 16 bits wide. The patterns of bits in the opcode have particular meanings. You can see this in any 680x0 Programmer's Manual.
For instance, consider the basic 680x0 integer add instruction. In assembler we can write it thus:
add.<size> <ea>, d<N>
add.<size> d<N>, <ea>
Where
<size> is the operand size (.b, .w, .l)
<ea> is the effective addressing mode
<N> is the number of the data register (0-7)
Just from the above description we can see that there is actually a lot of implicit information here. This information is encoded into the 16-bit instruction word as follows:
Bits 0-2 are the register field of the effective address
Bits 3-5 are the effective address mode
Bits 6-8 are the operation mode (size/direction)
Bits 9-11 are the data register number N
Bits 12-15 are the opcode identifier (here 1101)
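To make the bit layout above concrete, here is a small Python sketch that pulls those fields out of a 16-bit word. The function name and the dictionary keys are my own, purely for illustration; only the bit positions come from the list above.

```python
def decode_add_word(word):
    """Split a 16-bit 680x0 'add' instruction word into its bit fields."""
    return {
        "ea_reg":  word & 0b111,            # bits 0-2: register field of the EA
        "ea_mode": (word >> 3) & 0b111,     # bits 3-5: effective address mode
        "opmode":  (word >> 6) & 0b111,     # bits 6-8: size/direction
        "dreg":    (word >> 9) & 0b111,     # bits 9-11: data register number N
        "opcode":  (word >> 12) & 0b1111,   # bits 12-15: 1101 for add
    }

# add.w d1,d0 assembles to the single word 0xD041
fields = decode_add_word(0xD041)
print(fields)
```

For add.w d1,d0 this gives EA register 1, EA mode 0 (data register direct), operation mode 1 (word, result to the data register), data register 0 and opcode 1101.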
So in reality there are literally hundreds of possible 16-bit values that correspond to different variations of the above 'add' instruction. Depending upon the effective address mode used, several 16-bit extension words may follow the instruction word.
As I said, 680x0 emulation often simply uses a table with one entry for each of the 65,536 possible 16-bit values, each pointing to a function that handles that specific opcode case.
All of the nonsense values will point to the same function that emulates an illegal instruction trap.
This approach is used because it is the simplest, and hence quickest, way to look up a function in a table and call it (and it takes constant time).
Any other solution that more intelligently breaks the opcode word down into parts involves more stages and increases the complexity, and hence the time taken, which is very bad for any emulation.
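A rough Python sketch of that table-driven approach, just to show the shape of it. Everything here is made up for illustration (the cpu dictionary, the handler names), it only fills in the register-to-register forms of add, and it ignores the condition codes entirely. A real emulator would be written in C or assembler and would populate far more of the table.

```python
SIZE_MASK = {0: 0xFF, 1: 0xFFFF, 2: 0xFFFFFFFF}  # opmode 0/1/2 = byte/word/long

def op_illegal(cpu, word):
    # Stand-in for emulating the illegal instruction trap.
    cpu["trapped"] = True

def op_add_dn_dn(cpu, word):
    # add.<size> d<src>, d<dst> (flags not modelled in this sketch)
    src = word & 7
    dst = (word >> 9) & 7
    mask = SIZE_MASK[(word >> 6) & 7]
    result = (cpu["d"][dst] + cpu["d"][src]) & mask
    # Only the low byte/word/long of the destination is replaced.
    cpu["d"][dst] = (cpu["d"][dst] & ~mask & 0xFFFFFFFF) | result

# One entry per possible 16-bit opcode word; everything starts out illegal.
TABLE = [op_illegal] * 0x10000
for dst in range(8):
    for opmode in range(3):
        for src in range(8):
            TABLE[0xD000 | (dst << 9) | (opmode << 6) | src] = op_add_dn_dn

def step(cpu, word):
    # Constant-time dispatch: one index, one call.
    TABLE[word](cpu, word)

cpu = {"d": [0] * 8, "trapped": False}
cpu["d"][0], cpu["d"][1] = 0x0001FFFF, 1
step(cpu, 0xD041)            # add.w d1,d0
print(hex(cpu["d"][0]))      # low word wraps to 0, high word untouched
```

Note that the dispatch itself does no decoding at all. The handler still picks out its register numbers, but it already knows which instruction it is dealing with.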
Back to PPC...
The same breakdown of opcode words into meaningful fields found on the 680x0 also applies to PPC opcodes.
The PPC, being a RISC system, has only a relatively small number of basic instructions, but like the 680x0 it encodes much other data into its 32-bit instruction word.
For example, a typical 'inst rA, rB, rC' type instruction has to encode 3 registers, each of which needs 5 bits (there are 32 integer registers and 2^5 = 32).
Additionally, RISC architectures such as the PPC tend to define subtle variations of instructions. For example, there are integer arithmetic instruction variants that do not update the chip's status registers: if you don't need to check them, why update them? This means that when writing (or generating) optimal code, cycles can be saved by using instruction variations that eliminate redundant work.
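As an illustration of the register fields and the "update the status register or not" variants, here is a Python sketch that decodes a PPC three-register arithmetic instruction. The field positions are the layout of this instruction format as I understand it from the PowerPC manuals, not something spelled out above, and the function and key names are just mine.

```python
def decode_ppc_arith(insn):
    """Split a 32-bit PPC three-register arithmetic instruction into fields."""
    return {
        "primary": (insn >> 26) & 0x3F,   # 6-bit primary opcode
        "rD":      (insn >> 21) & 0x1F,   # destination register, 5 bits
        "rA":      (insn >> 16) & 0x1F,   # first source register, 5 bits
        "rB":      (insn >> 11) & 0x1F,   # second source register, 5 bits
        "OE":      (insn >> 10) & 1,      # overflow-enable variant bit
        "xo":      (insn >> 1) & 0x1FF,   # 9-bit extended opcode
        "Rc":      insn & 1,              # 1 = also update the condition register
    }

plain = decode_ppc_arith(0x7C642A14)      # add  r3,r4,r5
record = decode_ppc_arith(0x7C642A15)     # add. r3,r4,r5
print(plain)
print(record)
```

The two words differ only in the lowest bit: same instruction, same registers, but one variant records the result in the condition register and the other does not.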
So, although the PPC may only define a handful of instructions (similar to 680x0), the number of possible opcode values is very large indeed.
You simply cannot use the same 'one table entry for every possible opcode' strategy as for 680x0 emulation, because 2^32 means roughly 4 billion entries in the table :-o !
Now, as I hinted earlier, the trick is to isolate the part of the 32-bit opcode that defines the instruction, mask out the register fields etc., and make tables of those (now much smaller). There would then be several tables, each with only a few dozen entries.
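To put numbers on that, a quick back-of-the-envelope calculation in Python. The 4-byte pointer size is my assumption (a 32-bit host); double the results for 64-bit pointers.

```python
def table_bytes(opcode_bits, pointer_bytes=4):
    """Size of a flat dispatch table with one pointer per possible opcode."""
    return (1 << opcode_bits) * pointer_bytes

m68k_table = table_bytes(16)   # 65,536 entries
ppc_table = table_bytes(32)    # 4,294,967,296 entries

print(m68k_table // 1024, "KiB")       # 256 KiB
print(ppc_table // 1024 ** 3, "GiB")   # 16 GiB
```

A quarter of a megabyte is nothing; sixteen gigabytes just for the lookup table is out of the question.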
However, this is exactly the sort of thing I said we needed to avoid when talking about 680x0 emulation because it increases complexity. Masking out the bits takes time, and there are also many instructions that don't follow the same bit patterns that you have to check for.
You can imagine the overhead in first checking that the instruction isn't one of many special cases, then masking out the instruction identifier bits, shifting them down to get an index number, indexing the table, finally calling the function etc. etc.
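Here is roughly what the mask, shift and index approach looks like as a Python sketch. The table sizes come from my own reading of the PPC encoding (a 6-bit primary opcode, plus a 10-bit extended opcode field under primary opcode 31), and the cpu dictionary and handler names are invented for the example. Only addi and add are filled in, and the status-register-updating variants are not modelled.

```python
def op_illegal(cpu, insn):
    cpu["trapped"] = True

def op_addi(cpu, insn):
    # addi rD, rA, SIMM (rA field of 0 means the literal value 0)
    rD, rA = (insn >> 21) & 31, (insn >> 16) & 31
    simm = insn & 0xFFFF
    if simm & 0x8000:                 # sign-extend the 16-bit immediate
        simm -= 0x10000
    base = 0 if rA == 0 else cpu["r"][rA]
    cpu["r"][rD] = (base + simm) & 0xFFFFFFFF

def op_add(cpu, insn):
    # add rD, rA, rB (Rc bit ignored in this sketch)
    rD, rA, rB = (insn >> 21) & 31, (insn >> 16) & 31, (insn >> 11) & 31
    cpu["r"][rD] = (cpu["r"][rA] + cpu["r"][rB]) & 0xFFFFFFFF

# Second-level table for primary opcode 31, indexed by the extended opcode.
EXT31 = [op_illegal] * 1024
EXT31[266] = op_add

def dispatch_31(cpu, insn):
    EXT31[(insn >> 1) & 0x3FF](cpu, insn)   # mask, shift, index, call

# First-level table, indexed by the 6-bit primary opcode.
PRIMARY = [op_illegal] * 64
PRIMARY[14] = op_addi
PRIMARY[31] = dispatch_31

def step(cpu, insn):
    PRIMARY[(insn >> 26) & 0x3F](cpu, insn)

cpu = {"r": [0] * 32, "trapped": False}
step(cpu, 0x38800007)   # addi r4,0,7
step(cpu, 0x38A00003)   # addi r5,0,3
step(cpu, 0x7C642A14)   # add  r3,r4,r5
print(cpu["r"][3])      # 10
```

Compare this with the 680x0 case: the add now costs two table lookups and two calls before any real work is done, and that is before handling any of the special cases.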
Unfortunately, the choice is either this or a 4G entry table (basically impossible).
In short, it's a performance nightmare.