Author Topic: Coldfire AGAIN (Read 9651 times)

SamOS39 · « **Reply #14 on:** March 29, 2008, 12:58:53 PM »

Someone better start stripping down that emulator then .. :-)

biggun · « **Reply #15 on:** March 29, 2008, 01:07:56 PM »

Quote

The Coldfire is an interesting design, for sure! But if we are talking about a small efficient CPU core for an ASIC/FPGA that is to be used for Emulating a 68k...
then the Coldfire offers us nothing....

You are aware that you can run 68k code on the Coldfire, aren't you?
Yes, the Coldfire does not implement ALL 68k instructions natively, but it implements many of them natively.

The other day I ran an old AMIGA packer with the Coldfire library and funnily enough there were only 4-5 instructions in the whole binary which were not Coldfire native.

Maybe this was a lucky example but it shows that you do not need to emulate every instruction.
Depending on your application the Coldfire can get away running 90% of the instructions natively.

Cheers

bloodline · « **Reply #16 on:** March 29, 2008, 02:15:40 PM »

Quote

biggun wrote:
Quote

The Coldfire is an interesting design, for sure! But if we are talking about a small efficient CPU core for an ASIC/FPGA that is to be used for Emulating a 68k...
then the Coldfire offers us nothing....

You are aware that you can run 68k code on the Coldfire, aren't you?

I've never had the chance to develop for a Coldfire so I can't say for sure just how 68k compatible it is. I have read the developer documents though.

Quote

Yes, the Coldfire does not implement ALL 68k instructions natively, but it implements many of them natively.

Many... but the 68k offers something like 1500 possible opcode/operand combinations... The Coldfire seems to lack a significant number of instructions (not a problem, as they can be trapped, but a speed penalty nonetheless), and it's missing a lot of addressing modes, a big problem with a CISC design like the 68k (still, they can be trapped etc...)... Now the two big show stoppers for me: first, the instructions that are functionally different, which would require a proper emulator to correct... and second, the thing that finally puts me off the Coldfire: the supervisor mode is totally different, and the only way around that is to build a new AmigaOS... or preferably use AROS, if we can get some more 68k devs on board.

Until I get a chance to play with one I can't be convinced from the documentation that the Coldfire is good for our use.

Quote

The other day I ran an old AMIGA packer with the Coldfire library and funnily enough there were only 4-5 instructions in the whole binary which were not Coldfire native.

4 or 5? With something like a CPU we can't be vague... programs either work or they don't... there are no half measures.

Quote

Maybe this was a lucky example but it shows that you do not need to emulate every instruction.
Depending on your application the Coldfire can get away running 90% of the instructions natively.

But HOW do you know which instructions you need to emulate without a full Emulator?

HenryCase · « **Reply #17 on:** March 29, 2008, 02:16:56 PM »

Quote

biggun wrote:
Depending on your application the Coldfire can get away running 90% of the instructions natively.

Which is why Coldfire is the best upgrade for 68k architecture, and if it works, would be a good choice for the Natami.

Quote

biggun wrote:
You can buy the "source" of the Coldfire for an affordable sum. This means that you can "bake" your own Coldfire including AGA/SuperAGA in one Chip.

If you can "bake" your own Coldfire, would it be possible to fix the few misbehaving instructions to make a fully 68k-compatible custom Coldfire CPU? I'm assuming the fact that one CPU uses a 16-bit architecture vs a 32-bit architecture in the other CPU doesn't matter as the vast majority of instructions already work perfectly.

bloodline · « **Reply #18 on:** March 29, 2008, 02:24:00 PM »

Quote

HenryCase wrote:
Quote
biggun wrote:
Depending on your application the Coldfire can get away running 90% of the instructions natively.

Which is why Coldfire is the best upgrade for 68k architecture, and if it works, would be a good choice for the Natami.

Sure... but only if we had our 68k source code... With the functionally different instructions and totally different supervisor mode... you may as well use a CPU that has nothing to do with the 68k... but is better supported...

Quote

Quote
biggun wrote:
You can buy the "source" of the Coldfire for an affordable sum. This means that you can "bake" your own Coldfire including AGA/SuperAGA in one Chip.

If you can "bake" your own Coldfire, would it be possible to fix the few misbehaving instructions to make a fully 68k-compatible custom Coldfire CPU?

You buy a licence to use the core, not modify it. You would need the development documents too... and there are reasons why the instructions work differently, it's to get the speed up!!!

Quote

I'm assuming the fact that one CPU uses a 16-bit architecture vs a 32-bit architecture in the other CPU doesn't matter as the vast majority of instructions already work perfectly.

Err... the coldfire is 32bit, just like the 68k... :-?

HenryCase · « **Reply #19 on:** March 29, 2008, 02:32:03 PM »

Quote

bloodline wrote:
You buy a licence to use the core, not modify it.

Shame.

Quote

bloodline wrote:
You would need the development documents too... and there are reasons why the instructions work differently, it's to get the speed up!!!

Surely a hardware implemented function would be faster than an emulated one?

Quote

bloodline wrote:
Quote
I'm assuming the fact that one CPU uses a 16-bit architecture vs a 32-bit architecture in the other CPU doesn't matter as the vast majority of instructions already work perfectly.

Err... the coldfire is 32bit, just like the 68k... :-?

I thought the 68k family of CPUs was 16-bit, classic Amigas were always referred to as 16-bit computers, right?

bloodline · « **Reply #20 on:** March 29, 2008, 02:47:18 PM »

Quote

HenryCase wrote:
Quote
bloodline wrote:
You buy a licence to use the core, not modify it.

Shame.

Quote
bloodline wrote:
You would need the development documents too... and there are reasons why the instructions work differently, it's to get the speed up!!!

Surely a hardware implemented function would be faster than an emulated one?

The Coldfire engineers removed all the bits of the 68k that slowed the design down... if you put them back in, you slow the design down.

Quote

Quote
bloodline wrote:
Quote
I'm assuming the fact that one CPU uses a 16-bit architecture vs a 32-bit architecture in the other CPU doesn't matter as the vast majority of instructions already work perfectly.

Err... the coldfire is 32bit, just like the 68k... :-?

I thought the 68k family of CPUs was 16-bit, classic Amigas were always referred to as 16-bit computers, right?

Just the external data bus of the 68000... nothing to do with the architecture of the CPU.
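
To illustrate the distinction, here is a minimal Python sketch. The function name `read_long_over_16bit_bus` and the four-byte memory image are hypothetical, made up for this example and not taken from any real emulator. It shows a 32-bit longword fetch carried over a 16-bit data bus as two big-endian word transfers: the value the programmer sees is still 32 bits wide, only the number of bus transfers changes.

```python
def read_long_over_16bit_bus(memory: bytes, address: int):
    """Return (value, transfers) for a 32-bit big-endian read done 16 bits at a time."""
    value = 0
    transfers = 0
    for offset in (0, 2):
        # Each pass models one 16-bit transfer on the external data bus.
        word = int.from_bytes(memory[address + offset:address + offset + 2], "big")
        value = (value << 16) | word
        transfers += 1
    return value, transfers


# Hypothetical memory contents, chosen only to make the result easy to read.
memory = bytes([0x12, 0x34, 0x56, 0x78])
value, transfers = read_long_over_16bit_bus(memory, 0)
print(hex(value), transfers)
```

The narrow bus costs extra transfers per longword, but the instruction set and register model are unaffected.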

minator · « **Reply #21 on:** March 29, 2008, 03:17:32 PM »

You need a binary scanner of some form, it scans the binary before you run it and adds in routines to replace the unsupported instructions.

It would be a lot easier than writing a full emulator or JIT engine for a different CPU.
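
A rough Python sketch of the scanning idea, under loud assumptions: the input is a list of `(address, mnemonic, operands)` tuples that some disassembler (not shown) has already produced, the `UNSUPPORTED` set is purely illustrative and should be checked against the ColdFire reference manual, and the `emu_*` helper names are hypothetical.

```python
# Illustrative assumption only: mnemonics treated as missing on the target core.
UNSUPPORTED = {"dbra", "movep", "abcd"}


def scan_and_patch(instructions):
    """Replace unsupported instructions with a call to a helper routine.

    Returns the patched instruction list and a list of (address, mnemonic)
    pairs recording what was replaced.
    """
    patched = []
    patches = []
    for address, mnemonic, operands in instructions:
        if mnemonic in UNSUPPORTED:
            # Hypothetical helper that would reproduce the original behaviour.
            patched.append((address, "jsr", "emu_" + mnemonic))
            patches.append((address, mnemonic))
        else:
            # Supported instructions pass through untouched.
            patched.append((address, mnemonic, operands))
    return patched, patches


program = [
    (0x1000, "move.l", "d0,d1"),
    (0x1002, "dbra", "d0,loop"),
    (0x1006, "rts", ""),
]
patched, patches = scan_and_patch(program)
print(patches)
```

A real scanner would also have to tell code from data, cope with replacement sequences that are longer than the instruction they replace, and pass the original operands to the helper. None of that is modelled here.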

bloodline · « **Reply #22 on:** March 29, 2008, 03:28:57 PM »

Quote

minator wrote:
You need a binary scanner of some form, it scans the binary before you run it and adds in routines to replace the unsupported instructions.

It would be a lot easier than writing a full emulator or JIT engine for a different CPU.

Such a solution might run faster than a full emulator... though what you are suggesting is just a JIT, that sometimes spits out the instructions unchanged...

Karlos · « **Reply #23 on:** March 29, 2008, 03:36:01 PM »

Quote

though what you are suggesting is just a JIT, that sometimes spits out the instructions unchanged.

*cough* Dynamo-style JIT *cough* ;-)

Dynamo (a JIT made by Hewlett Packard) demonstrates the amusing (and at first glance ludicrous) fact that a hotspot JIT can 'emulate' code for the same processor it is itself running on and, in some cases, run it faster than the CPU runs that code natively.

The reason this is possible is down to the fact that at runtime you know more state information than you ever did at compile time. Consequently, a lot of if/else/switch/case/for/while etc code ends up taking only one or two possible paths at runtime (compared to many more possible paths at compile time) and unused code paths can be optimised away by the JIT.

The main overhead of any JIT system is the on-the-fly recompilation stage that's kicked off when the system encounters new code. Translating code for one CPU to another can be quite expensive where their architectures are very different. However, when most of your "recompilation" involves simply copying (rather than translating) the original code, that overhead is mitigated substantially.

Using such a mechanism, I expect a current generation coldfire core could run 680x0 code extremely well and without any of the performance problems that trapping individual unimplemented instructions causes.

If only there were 24 more hours in my day I'd look at it.
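
The "mostly copying" point can be sketched in Python as follows. This is only a toy model under stated assumptions: instructions are plain strings, `is_native` pretends that `dbra` is the single missing instruction, and `rewrite` emits a call to a hypothetical `emu_*` helper. It does not attempt the hotspot detection or dead-path pruning that Dynamo performs; it only shows that each block is translated once, that native instructions are copied unchanged, and that later executions reuse the cached result.

```python
class TranslationCache:
    """Translate each block once; later runs reuse the cached result."""

    def __init__(self, is_native, rewrite):
        self.is_native = is_native  # predicate: can the host run this as-is?
        self.rewrite = rewrite      # returns a list of replacement instructions
        self.cache = {}
        self.translations = 0       # how many blocks were actually translated

    def get(self, block_address, block):
        if block_address not in self.cache:
            out = []
            for insn in block:
                if self.is_native(insn):
                    out.append(insn)                # copied unchanged
                else:
                    out.extend(self.rewrite(insn))  # translated
            self.cache[block_address] = out
            self.translations += 1
        return self.cache[block_address]


def is_native(insn):
    # Assumption for illustration only: "dbra" is the one missing instruction.
    return not insn.startswith("dbra")


def rewrite(insn):
    # Placeholder replacement: call a hypothetical helper routine.
    return ["jsr emu_" + insn.split()[0]]


cache = TranslationCache(is_native, rewrite)
block = ["move.l d0,d1", "add.l d1,d2", "dbra d0,loop"]
first = cache.get(0x1000, block)
second = cache.get(0x1000, block)
print(first, cache.translations)
```

The second lookup returns the cached block without running the translator again, which is where the one-off recompilation cost gets amortised.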

bloodline · « **Reply #24 on:** March 29, 2008, 04:08:15 PM »

Quote

Karlos wrote:
Quote
though what you are suggesting is just a JIT, that sometimes spits out the instructions unchanged.

*cough* Dynamo-style JIT *cough* ;-)

Dynamo (a JIT made by Hewlett Packard) demonstrates the amusing (and at first glance ludicrous) fact that a hotspot JIT can 'emulate' code for the same processor it is itself running on and, in some cases, run it faster than the CPU runs that code natively.

The reason this is possible is down to the fact that at runtime you know more state information than you ever did at compile time. Consequently, a lot of if/else/switch/case/for/while etc code ends up taking only one or two possible paths at runtime (compared to many more possible paths at compile time) and unused code paths can be optimised away by the JIT.

The main overhead of any JIT system is the on-the-fly recompilation stage that's kicked off when the system encounters new code. Translating code for one CPU to another can be quite expensive where their architectures are very different. However, when most of your "recompilation" involves simply copying (rather than translating) the original code, that overhead is mitigated substantially.

Using such a mechanism, I expect a current generation coldfire core could run 680x0 code extremely well and without any of the performance problems that trapping individual unimplemented instructions causes.

If only there were 24 more hours in my day I'd look at it.

Dynamo is a bit more than a JIT :-) since it's more like the front end of a CPU such as the Athlon, done in software! Which is out of the scope of this project... especially while we don't know the implementation details of the Coldfire.

If we are to use the coldfire, then the JIT is the only way to go... even though I see this as a good opportunity to rid ourselves of the 68k (no matter how much I like it).

Karlos · « **Reply #25 on:** March 29, 2008, 04:18:12 PM »

Semantics, my dear fellow :-D. Dynamo has been described by its creators as a hotspot JIT (like most other JIT implementations, it also allows non-critical code to run through in interpreted mode). It dynamically recompiles critical sections to eliminate dead code branches, early returns etc. It simply happens to be the case that the target CPU is the same class as the source.

What you are alluding to are the deep implementation details of how it works. That it is similar to the AthlonXP's instruction queue/decoder doesn't mean it is fundamentally different to any existing optimizing JIT, as most of them employ the same sorts of code pruning.

biggun · « **Reply #26 on:** March 29, 2008, 04:21:39 PM »

Quote

But HOW do you know which instructions you need to emulate without a full Emulator?

Bloodline, you are funny :-)

You are not too shy to give advice on which CPU to use and to rant about the Coldfire, but are you sure that you understood the Coldfire correctly?

No offence, but the risk that a 68k program runs into problems on the Coldfire is very, very small.

If you reread the Coldfire manual, you will realize that there are nearly no 68000 instructions that are executed differently and could cause a problem.

A good starting point is:

http://www.microapl.co.uk/Porting/ColdFire/Download/pa68kcf.pdf

Cheers

bloodline · « **Reply #27 on:** March 29, 2008, 04:35:30 PM »

Quote

biggun wrote:
Quote

But HOW do you know which instructions you need to emulate without a full Emulator?

Bloodline, you are funny :-)

I like to think so! :-)

Quote

You are not too shy to give advice on which CPU to use and to rant about the Coldfire, but are you sure that you understood the Coldfire correctly?

I may well rant, but I have stated clearly that I've not actually used a coldfire, ever.

Quote

No offence, but the risk that a 68k program runs into problems on the Coldfire is very, very small.

Any risk is too much... but that's not the point, you don't have any detailed stats yet... and I don't intend to test it out myself. Until it's been tested we can't know.

There is, however, sufficient evidence that the Coldfire is unsuitable. Number one is that no coldfire boards exist for the Amiga, despite it being nearly a decade since the coldfire was released. Number two, from what I've read, it doesn't seem so 68k object code compatible...

Quote

If you reread the Coldfire manual, you will realize that there are nearly no 68000 instructions that are executed differently and could cause a problem.

A good starting point is:

http://www.microapl.co.uk/Porting/ColdFire/Download/pa68kcf.pdf

Cheers

Yes, I've read it :-)

Karlos · « **Reply #28 on:** March 29, 2008, 04:41:56 PM »

Well, FWIW, I don't think a straightforward trap-and-emulate based amiga accelerator mechanism would work that well, otherwise we'd have seen one by now.

I seem to recall, but I may be wrong, the problem is that certain opcodes actually behave differently to the same operations on m68k. That is to say, they are implemented but operate slightly differently to the 680x0.

I mean an instruction that works but works differently to what you expect is probably worse than one that isn't implemented at all as you can't really trap it in the first place.

biggun · « **Reply #29 on:** March 29, 2008, 04:57:47 PM »

Quote

Karlos wrote:

I seem to recall, but I may be wrong, the problem is that certain opcodes actually behave differently to the same operations on m68k. That is to say, they are implemented but operate slightly differently to the 680x0.

Can you give a real example, or is this hearsay from the rumor mill?
