Author Topic: Enter the Dragon or enter the vapor? (Read 9230 times)

Donar · « **Reply #29 on:** May 17, 2007, 07:31:20 PM »

In my understanding the V4(e) core needs a program (the CF68KLib, which is provided by Freescale and programmed by MicroAPL) to run 68k code properly.

There are two ways this CF68KLib works:
a) Run code, trap out when needed -> The opcodes that exist in both the 68k and Coldfire instruction sets but behave differently cannot trap out, so they do whatever they do in the Coldfire world. Maybe that is a bit different to what they would do on a 68k.
There are only a few (4?) opcodes that exist in both the 68060 and Coldfire V4e instruction sets and behave differently, but it is a valid point when speaking about binary compatibility.

b) Set up a virtual 68k on the Coldfire, but without MMU and FPU, and run 68k code. One problem could be that if you want to emulate a 68060 (because only a few (4?) opcodes are shared between the 68060 and Coldfire V4e instruction sets yet behave differently), you probably need an MMU for your AMIGA to work...
You could emulate a 68030, but I think it has more unsupported instructions, which should slow down the emulation...

Someone (i think his name was Darek Smirtana?) from Elbox wrote in a forum post that they had to write their own software because the one that was provided by Freescale did not work, without getting more specific.

I also think (like eslapion) that if the Coldfire V4e at 266MHz gives raw power on 68k code that is at least at the level of a 68060 @ 100MHz, I will be fine with it. One could argue that a 68k emulator on an embedded PPC is faster than "emulation" on the Coldfire and probably cheaper, but that is another thing.

The price difference between the chips (68060 vs Coldfire) and the goodies (USB/Ethernet/PCI) that come with the Coldfire speak for the Coldfire. Performance-wise there are only guesses, and a little screenshot from the ATARI Coldfire Project where the "raw power" of a Coldfire developer board (V4 @ 200MHz, TOS under 68k emulation) is equal to TOS running on a real ATARI that is equipped with a CT63 accelerator (68060 @ 100 MHz).

As for the x86, i somehow have a bad feeling about having one in my AMIGA. It's fine for my PC running UAE but not for the Miggy. If it were for the money i would have sold my Blizzard 1260 on eBay for 200€-300€ already...

Tripitaka · « **Reply #30 on:** May 17, 2007, 07:58:32 PM »

:-o Deja Vu :-o

This thread has happened before....

Lemmink · « **Reply #31 on:** May 17, 2007, 08:00:03 PM »

Eslapion, I don't want to destroy your hopes, but the Elbox coldfire will never go faster than any 68060 accelerator for the Amiga. The outcome of the presentation was that the card ran unmodified / optimised software at the speed of a 68040 at about 33 MHz. What you thought was DivX playback was just a DVD player hooked up to the TV card in the demo machine.

Anyway, I'm surprised that the Dragon (even as a prototype) came into existence at all, something I always heavily doubted. I stand corrected on that one.

Donar · « **Reply #32 on:** May 17, 2007, 08:03:54 PM »

Quote

Tripitaka wrote:
:-o Deja Vu :-o

This thread has happened before....

As there is not much to talk about in Amiga land... better to talk about the Coldfire than about the (bad) weather... :lol:

Donar · « **Reply #33 on:** May 17, 2007, 08:11:49 PM »

Quote

Lemmink wrote:
... but the Elbox coldfire will never go faster than any 68060 accelerator for the Amiga. The outcome of the presentation was that the card ran unmodified / optimised software at the speed of a 68040 at about 33 MHz. ..

The thing that I do not understand is: the Coldfire is (MIPS-wise) roughly 4 times faster than a 68060 @ 70 MHz, and it shares a good lot of instructions/addressing modes with the 68k (75%?). Why shouldn't it reach the performance of a 68060 or more? :-?

« **Reply #34 on:** May 17, 2007, 08:12:33 PM »

Quote

Lemmink wrote:
Eslapion, I don't want to destroy your hopes, but the Elbox coldfire will never go faster than any 68060 accelerator for the Amiga. The outcome of the presentation was that the card ran unmodified / optimised software at the speed of a 68040 at about 33 MHz. What you thought was DivX playback was just a DVD player hooked up to the TV card in the demo machine.

Well, that really leaves one wanting more... at least when compared to the performance that seems to have been obtained with the Atari Coldfire project.

Donar · « **Reply #35 on:** May 17, 2007, 08:23:01 PM »

I must admit it's only a screenshot, actually it is about porting PCI drivers to Atari but as the guy had no ATARI PCI device he used his Coldfire Evaluation Board (from the ATARI Coldfire Project) for development, and i stumbled over the Screenshot...

Link, look in lower half -> the two pictures before the GFX Card...

I do not know which instructions are used in the Tests/Benchmark and how accurate it is.

AJCopland · « **Reply #36 on:** May 17, 2007, 08:34:58 PM »

We'll get no real answer to all this until we get an actual Coldfire based accelerator.

The compatibility issue is seemingly no worse than going from 68040 to 68060 and everything that wouldn't work we'd either have to pre-process, emulate or shunt back over to the 68020 (in A1200 or whatever).

Can it be done? Yes. Will it be done? Who cares anymore.

eslapion · « **Reply #37 on:** May 17, 2007, 08:39:01 PM »

Here is another project I have located for the Amiga that is based on the Coldfire.

http://www.cdtv.org.uk/coldfire/

Comi · « **Reply #38 on:** May 17, 2007, 09:07:03 PM »

Why not contact them and try for an interview about the Dragon and Shark, and the future of Elbox?
Now is a good time because it is half-time between Amiga and Hyperion.

MskoDestny · « **Reply #39 on:** May 17, 2007, 09:16:41 PM »

Quote

Donar wrote:
Quote

Lemmink wrote:
... but the Elbox coldfire will never go faster than any 68060 accelerator for the Amiga. The outcome of the presentation was that the card ran unmodified / optimised software at the speed of a 68040 at about 33 MHz. ..

The thing that I do not understand is: the Coldfire is (MIPS-wise) roughly 4 times faster than a 68060 @ 70 MHz, and it shares a good lot of instructions/addressing modes with the 68k (75%?). Why shouldn't it reach the performance of a 68060 or more? :-?

Exceptions are rather expensive on modern pipelined processors as the pipeline has to be flushed. The V4e has 9 pipeline stages, so each time it hits an unimplemented instruction it burns through 9 cycles before the first instruction that emulates the missing one completes, and it then needs to execute several instructions to simulate the missing one. The 68060 can execute >1 instruction per cycle on average (note that it takes longer than 1 cycle for an instruction to complete because of pipelining, but more than one instruction is in flight at a time). So it probably takes a ColdFire CPU upwards of 10 cycles to execute certain instructions that might have taken effectively 1 cycle or less on the 060, but the ColdFire is only clocked about 4 times as fast.

A well written dynarec should have a much lower performance penalty, but it's much harder to write than a trap based solution.
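To put rough numbers on the trap cost described above, here is a back-of-the-envelope sketch. Everything in it is an assumption for illustration, not a measurement: one cycle per non-trapping instruction on the ColdFire, one cycle per instruction on the 68060 (which undersells the 060 a little), a 9-cycle flush, a hypothetical 5-cycle handler, and a 4x clock advantage.

```python
def effective_speed_ratio(trap_fraction, flush_cycles=9, handler_cycles=5,
                          clock_ratio=4.0, m68060_cpi=1.0):
    """Estimate ColdFire speed relative to a 68060 (1.0 means equal).

    trap_fraction  -- share of 68k instructions that trap to an emulation handler
    flush_cycles   -- cycles lost to the pipeline flush (9 stages on the V4e)
    handler_cycles -- assumed cycles spent emulating the missing instruction
    clock_ratio    -- ColdFire clock divided by 68060 clock
    m68060_cpi     -- assumed average 68060 cycles per instruction
    """
    # Average ColdFire cycles spent per original 68k instruction.
    cf_cpi = (1.0 - trap_fraction) * 1.0 + trap_fraction * (flush_cycles + handler_cycles)
    return clock_ratio * m68060_cpi / cf_cpi


if __name__ == "__main__":
    for fraction in (0.0, 0.05, 0.10, 0.25):
        print(f"{fraction:.0%} trapping -> {effective_speed_ratio(fraction):.2f}x a 68060")
```

Under these assumptions the 4x clock advantage is used up once about 3 in 13 instructions (roughly 23%) have to trap, so the real result depends heavily on how often a given program hits unimplemented instructions.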

amiga92570 · « **Reply #40 on:** May 17, 2007, 09:17:52 PM »

I wrote them last month and they claim the dragon's still planned to be released. Just finishing up software.

eslapion · « **Reply #41 on:** May 17, 2007, 09:35:06 PM »

I just spoke to an electrical engineer who's more into digital electronics than me.

He said the trapping method could be replaced by a huge (about 256MB) look up table that would essentially become the microcode for a conversion processor.

Essentially, you get the coldfire to run as a sort of interpreter that runs in loops into the 256MB and that tells it how to interpret the real 68k code.

This way, there is no flushing the pipeline.

Zac67 · « **Reply #42 on:** May 17, 2007, 09:48:43 PM »

Flushing the pipeline is a problem when using an exception based approach (which isn't entirely possible) - you're talking about emulation. Plus, there are more efficient ways to emulate without huge lookup tables.

The only way you'd really get the most out of a CF would be to combine all these methods and let the task scheduler choose the appropriate one based on a database: all unknown tasks run through a JIT compiler (lowest speed), some known tasks are patched during load time and flagged as CF compatible (full speed).

This would add a tiny bit of overhead to the scheduler, but permit 'clean' software to run at full speed. You can even start out JITing everything and add patches later through updates.
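A minimal sketch of that database lookup, just to show how small the per-task decision could be. The task names, the `Mode` enum and the `KNOWN_TASKS` table are all hypothetical; a real scheduler would key on something more robust than a name.

```python
from enum import Enum


class Mode(Enum):
    NATIVE = "native"  # patched at load time, flagged as CF compatible (full speed)
    JIT = "jit"        # unknown task, run through the JIT compiler


# Hypothetical database of tasks known to be ColdFire-clean after patching.
KNOWN_TASKS = {
    "patched_app": Mode.NATIVE,
}


def choose_mode(task_name, database=KNOWN_TASKS):
    """Pick the execution method for a task; anything unknown falls back to the JIT."""
    return database.get(task_name, Mode.JIT)


if __name__ == "__main__":
    for name in ("patched_app", "unknown_app"):
        print(name, "->", choose_mode(name).value)
```

Adding a patch later through an update would then amount to adding one entry to the table.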

After all, it's not impossible, but may not be worth the while.

AJCopland · « **Reply #43 on:** May 17, 2007, 09:57:12 PM »

Quote

eslapion wrote:
Here is another project I have located for the Amiga that is based on the Coldfire.

http://www.cdtv.org.uk/coldfire/

That would be Oli_hd's project that has fallen on hard times.

The difficulties of developing such a thing on your own are quite amazing I'd imagine. Still he did seem to get pretty far with it all, certainly at the electrical level, not so sure on the software running front.

I've always hoped he'd open source everything if he didn't plan to take it any further *hint-hint* :-D

Andy

Karlos · « **Reply #44 on:** May 17, 2007, 10:13:43 PM »

Quote

eslapion wrote:
I just spoke to an electrical engineer who's more into digital electronics than me.

He said the trapping method could be replaced by a huge (about 256MB) look up table that would essentially become the microcode for a conversion processor.

Essentially, you get the coldfire to run as a sort of interpreter that runs in loops into the 256MB and that tells it how to interpret the real 68k code.

This way, there is no flushing the pipeline.

;-) :-) :-D :lol: :roflmao:

Trust me, pipeline flushing would probably be *much* faster than this! Randomly accessed large lookup tables (anything larger than the cache) hammer any CPU, simply because memory access is generally one of the slowest things they do and such lookups tend to defeat caches completely.

However, I doubt that such a lookup table would need to be quite that large. If you assume 680x0 code uses 16-bit instruction words most of the time, you'd need 65536 entries in your table. It would be larger than this due to extended opcodes, but 256MB is basically an immense overestimate.
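As a quick check of that estimate, the arithmetic looks like this. The 4-byte entry size is an assumption (one 32-bit handler address per 16-bit instruction word), and extended opcodes are ignored.

```python
OPCODE_BITS = 16   # 680x0 instruction words are mostly 16 bits wide
ENTRY_BYTES = 4    # assumption: one 32-bit handler address per table entry

entries = 2 ** OPCODE_BITS                 # number of table entries
table_bytes = entries * ENTRY_BYTES        # size of the dispatch table in bytes
claimed_bytes = 256 * 1024 * 1024          # the 256MB figure from the quoted post

print(f"{entries} entries, {table_bytes // 1024} KiB table")
print(f"256MB is {claimed_bytes // table_bytes} times larger")
```

Even at a few hundred KiB, such a table is still larger than a typical on-chip cache of that era, so the point about cache misses on random lookups stands.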

Regardless, you are still talking about various memory reads and computed jump instructions before you even get to emulating your opcode. This is not going to be quick at all.

I once wrote a small VM as an exercise that works in the manner you are suggesting. It has 256 possible instructions (an enumeration) and 16 general purpose registers (and some stack pointers) employing a load-store architecture. Instructions generally consist of byte pairs, one for the instruction and one for the effective address (mostly register to register, but depends on the instruction type).

A hand optimised assembly version of the interpreter uses a computed jump that is about as efficient as it can get for this (each instruction handler has the code required to calculate the next jump inlined onto the end of it, so you don't branch from a central loop out to a handler and back). The code table is about 16K, each handler starting at a cache aligned address.

It's an order of magnitude simpler than a real 68K and it gets about 2 MIPS on a 25MHz 040. With any luck you'll see this is not going to be a realistic option for a coldfire native 68K emulation.
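For anyone who has not seen this kind of interpreter, here is a toy sketch of table-driven dispatch in the spirit of the VM described above. It is not the original code: the four opcodes and their encodings are made up, and it uses a plain central loop rather than the inlined computed jump of the hand optimised version. It only keeps the shape: 256 possible opcodes, 16 registers, and instructions as byte pairs.

```python
MASK32 = 0xFFFFFFFF  # keep register values to 32 bits


def op_halt(regs, dst, src):
    return False  # returning False stops the interpreter


def op_ldi(regs, dst, src):
    regs[dst] = src  # low nibble of the operand byte is a 4-bit immediate
    return True


def op_mov(regs, dst, src):
    regs[dst] = regs[src]
    return True


def op_add(regs, dst, src):
    regs[dst] = (regs[dst] + regs[src]) & MASK32
    return True


def op_illegal(regs, dst, src):
    raise ValueError("illegal opcode")


# One handler per possible opcode byte; unused slots point at the illegal handler.
DISPATCH = [op_illegal] * 256
DISPATCH[0x00] = op_halt
DISPATCH[0x01] = op_ldi
DISPATCH[0x02] = op_mov
DISPATCH[0x03] = op_add


def run(program):
    """Execute byte-pair instructions until HALT and return the 16 registers."""
    regs = [0] * 16
    pc = 0
    while True:
        opcode, operand = program[pc], program[pc + 1]
        pc += 2
        # High nibble selects the destination register, low nibble the source.
        if not DISPATCH[opcode](regs, operand >> 4, operand & 0x0F):
            return regs


# LDI r1,7 ; LDI r2,5 ; ADD r1,r2 ; HALT
program = bytes([0x01, 0x17, 0x01, 0x25, 0x03, 0x12, 0x00, 0x00])

if __name__ == "__main__":
    print(run(program)[1])
```

Every emulated instruction pays for a fetch, a table lookup and an indirect call before any useful work happens, which is the overhead being described.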
