Author Topic: Enter the Dragon or enter the vapor? (Read 11309 times)

Karlos · « **on:** May 17, 2007, 10:13:43 PM »

Quote

eslapion wrote:
I just spoke to an electrical engineer who's more into digital electronics than me.

He said the trapping method could be replaced by a huge (about 256MB) look up table that would essentially become the microcode for a conversion processor.

Essentially, you get the coldfire to run as a sort of interpreter that runs in loops into the 256MB and that tells it how to interpret the real 68k code.

This way, there is no flushing the pipeline.

;-) :-) :-D :lol: :roflmao:

Trust me, pipeline flushing would probably be *much* faster than this! Randomly accessed large lookup tables (anything larger than the cache) hammer any CPU, simply because memory access is generally one of the slowest things they do and such lookups tend to defeat caches completely.

However, I doubt that such a lookup table would need to be quite that large. If you assume 680x0 code uses 16-bit instruction words most of the time, you'd need 65536 entries in your table. It would be larger than this due to extended opcodes, but 256MB is basically an immense overestimate.
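To put rough numbers on that estimate, here is a small sketch in C. It assumes the table is indexed directly by the first 16-bit opcode word; the function names and the 4-byte entry size used below are illustrative assumptions, not anything taken from the proposal.

```c
#include <assert.h>
#include <stddef.h>

enum { OPCODE_BITS = 16 };  /* assumed: table indexed by the first 16-bit opcode word */

/* Entries needed to index directly by one opcode word. */
size_t table_entries(void)
{
    return (size_t)1 << OPCODE_BITS;
}

/* Total table size for a given per-entry size in bytes. */
size_t table_bytes(size_t entry_size)
{
    return table_entries() * entry_size;
}

/* Bytes available per entry if the whole table occupies total_bytes. */
size_t bytes_per_entry(size_t total_bytes)
{
    assert(total_bytes % table_entries() == 0);
    return total_bytes / table_entries();
}
```

With a 4-byte handler address per entry, `table_bytes(4)` comes to 262144 bytes (256 KB). Going the other way, `bytes_per_entry((size_t)256 << 20)` is 4096, so a 256MB table would be spending 4 KB on every possible opcode word.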

Regardless, you are still talking about various memory reads and computed jumps before you even get to emulating your opcode. This is not going to be quick at all.

I once wrote a small VM as an exercise that works in the manner you are suggesting. It has 256 possible instructions (an enumeration) and 16 general purpose registers (and some stack pointers) employing a load-store architecture. Instructions generally consist of byte pairs, one for the instruction and one for the effective address (mostly register to register, but depends on the instruction type).

A hand-optimised assembly version of the interpreter uses a computed jump that is about as efficient as it can get for this (each instruction handler has the code required to calculate the next jump inlined onto the end of it, so you don't branch from a central loop out to a handler and back). The code table is about 16K, each handler starting at a cache-aligned address.
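For anyone unfamiliar with the technique, this is roughly what that dispatch style looks like in C. It is only a sketch, not the VM described above: the four opcodes, the nibble-packed operand byte and the name `vm_run` are made up for illustration, and it relies on the "labels as values" extension (`&&label`, `goto *ptr`) supported by GCC and Clang rather than standard C.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical opcode set; a real VM would have many more. */
enum { OP_HALT, OP_LDI, OP_ADD, OP_MUL, OP_COUNT };

/* Runs a program of byte pairs (opcode, operand) and returns register 0.
 * Operand byte: high nibble = destination register,
 * low nibble = source register (or a 4-bit immediate for OP_LDI). */
int32_t vm_run(const uint8_t *pc)
{
    /* Jump table of handler addresses (GNU C labels-as-values). */
    static void *const handlers[OP_COUNT] = {
        &&do_halt, &&do_ldi, &&do_add, &&do_mul
    };
    int32_t reg[16] = {0};
    uint8_t ea;

    /* Fetch the next byte pair and jump straight to its handler.
     * This is inlined at the end of every handler, so there is no
     * central loop to return to. */
#define DISPATCH()                    \
    do {                              \
        uint8_t op = *pc++;           \
        ea = *pc++;                   \
        assert(op < OP_COUNT);        \
        goto *handlers[op];           \
    } while (0)

    DISPATCH();

do_ldi:
    reg[ea >> 4] = ea & 0x0F;
    DISPATCH();
do_add:
    reg[ea >> 4] += reg[ea & 0x0F];
    DISPATCH();
do_mul:
    reg[ea >> 4] *= reg[ea & 0x0F];
    DISPATCH();
do_halt:
    return reg[0];
#undef DISPATCH
}
```

Feeding it the byte pairs `OP_LDI 0x03`, `OP_LDI 0x14`, `OP_ADD 0x01`, `OP_MUL 0x01`, `OP_HALT 0x00` loads 3 and 4 into registers 0 and 1, adds them, multiplies by register 1 and returns 28. Even in this stripped-down form, every emulated instruction costs two byte fetches, a table read and an indirect jump before any real work happens.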

It's an order of magnitude simpler than a real 68K and it gets about 2 MIPS on a 25MHz 040. With any luck you'll see this is not going to be a realistic option for a coldfire native 68K emulation.

Karlos · « **Reply #1 on:** May 18, 2007, 01:22:01 PM »

Quote

MskoDestny wrote:
A dynarec/JIT doesn't really have to be all that slow. You don't have to deal with the mess of emulating a fundamentally different architecture, just expanding certain unimplemented instructions into multiple implemented ones. You don't have to deal with register mapping or simulating flag behavior (well except for those previously mentioned multiply instructions). Plus in theory, a sufficiently advanced dynarec can actually improve performance. HP's Dynamo is a dynarec that doesn't do any translation between architectures it just does processor specific optimizations and optimizations that can only be reasoned about at runtime.

Precisely. If you look up the other N threads about coldfire/68k compatibility you'll see I've given HP Dynamo as a working example of how a coldfire 68K-JIT could work.