Author Topic: Coldfire AGAIN  (Read 25745 times)

Offline biggun

  • Sr. Member
  • ****
  • Join Date: Apr 2006
  • Posts: 397
    • Show all replies
    • http://www.greyhound-data.com/gunnar/
Re: Coldfire AGAIN
« on: March 29, 2008, 06:49:19 AM »
@rkauer
Quote

Whats the point?

I don't know much about how all this stuff works, but yeah I could see if you were just going to emulate "all" the 68k codes it seems like it would make more sense to just use a more widely available CPU like a CORE2Duo or something.  Seems like it would be 'as' difficult.  But I don't know.  Hard work either way it seems.

Hey, not so quick with your conclusions. :-)

Let's look at all the PROS and CONS first!

The Coldfire has one major advantage.
You can buy the "source" of the Coldfire for an affordable sum. This means that you can "bake" your own Coldfire including AGA/SuperAGA in one chip.

Continuing from FrenchShark's point:
You can bake a Coldfire including AGA => which is then basically an AMIGA on a single chip.
You can get a Coldfire including the AGA chipset to 400/500 MHz.

If you are into comparing numbers then:
- The resulting SuperAGA blitter can be about 200 times faster than the AMIGA AGA blitter was.
- The CPU is net about 10 times faster than a 68060.
- It is more than 100 times faster than an A1200/020.

Even in FrenchShark's "68k emul mode" it's still about 10 times faster than an A1200/020.

Nothing to sneeze at.

The key is having it all in a single chip.
Based on this you can create a 500 MHz AMIGA of the size and price of a fat USB stick.



Offline biggun

Re: Coldfire AGAIN
« Reply #1 on: March 29, 2008, 07:54:51 AM »
Quote

rkauer wrote:
 From my knowledge, the goal is emulate only the bad instructions, but even this way, you have to emulate the supervisor mode of the CPU.

 So here is the catch: construct a resident-hardware interpreter outside the Amiga memory space (a simple "mini" Spartan or other FPGA can do this) and, with the right code, forward to the CPU only the "good code" untouched. Almost the same approach as a table interpreter.


I see where you are coming from but I think this is not worth doing.

I'll try to explain why:

For the sake of easy discussion let's quickly compare the net
performance of the Coldfire and 68k.
Every application has of course different needs, but let's just use one example for the sake of argument.

Let's say we have a piece of code (a loop) that uses:
4 register operations (e.g. cmp, or add.l dx,dy)
2 memory operations (e.g. add ,dx)
1 multiplication
2 indexed addressing modes (e.g. move.l 2(a0,d0),d1)
1 branch (taken)

Now look how long (many clocks) a 68000 will need for this:
4 reg x 6
2 mem x 18
1 mul x 40
2 ind x 18
1 bra x 10
------
146 clocks

Now look how long (many clocks) a 68020 will need for this:
4 reg x 2
2 mem x 6
1 mul x 28
2 ind x 9
1 bra x 6
------
72

Now look how long (many clocks) a 68060 will need for this:
4 reg x 1
2 mem x 1
1 mul x 2
2 ind x 1
1 bra x 0
------
10 => 7 clocks

Yes, the 68060 can execute every instruction in one clock (the multiplication takes 2).
The 68060 can do loops for free (taken branch = 0 clocks)
And the 68060 has two instruction units, allowing it to do two instructions per clock! Depending on your code structure, the 10 instructions in the example loop could be folded down to 4.5 clocks in the best case.
For the sake of argument lets say the 68060 needs 7 clocks.
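
The clock totals above can be re-added mechanically. The following is a minimal sketch in Python: the per-instruction clock counts are copied straight from the tables in this post, while the names `LOOP_MIX`, `CLOCKS` and `clocks_per_loop` are made up for illustration.

```python
# Instruction mix of the example loop: how often each class occurs per iteration.
LOOP_MIX = {"reg": 4, "mem": 2, "mul": 1, "ind": 2, "bra": 1}

# Clocks per instruction class, copied from the tables in the post.
CLOCKS = {
    "68000": {"reg": 6, "mem": 18, "mul": 40, "ind": 18, "bra": 10},
    "68020": {"reg": 2, "mem": 6, "mul": 28, "ind": 9, "bra": 6},
    "68060": {"reg": 1, "mem": 1, "mul": 2, "ind": 1, "bra": 0},
}


def clocks_per_loop(cpu):
    """Sum count * clocks over all instruction classes (no dual issue)."""
    costs = CLOCKS[cpu]
    return sum(count * costs[kind] for kind, count in LOOP_MIX.items())


if __name__ == "__main__":
    for cpu in CLOCKS:
        print(cpu, clocks_per_loop(cpu), "clocks")
```

This only reproduces the single-issue sums (146, 72 and 10 clocks). The reduction from 10 to 7 clocks on the 68060 comes from dual issue and is an estimate in the post, not something the sketch models.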


Looking at these numbers will give us a better feeling for the CPUs.

Now compare the CPUs at their typical clock rates (iterations of the example loop per second):
68000 7.09 MHz =   48K
68020 14.2 MHz =  196K
68060 50.0 MHz = 7142K

In other words:
The A1200 (14 MHz 68020) is 4.0 times faster than a 68000
The 50 MHz 68060 is 148 times faster than a 68000
The 50 MHz 68060 is  36 times faster than a 68020
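
As a rough cross-check, the loop rates and ratios can be recomputed from the clock rates and the clocks per loop given in this post. This is only a sketch: the function names are invented for illustration, and because it works with unrounded values the ratios land within a couple of percent of the rounded figures quoted above rather than matching them digit for digit.

```python
# Clock rates and clocks-per-loop as stated in the post
# (the 68060 uses the assumed 7 clocks, not the single-issue 10).
CLOCK_HZ = {"68000": 7.09e6, "68020": 14.2e6, "68060": 50.0e6}
CLOCKS_PER_LOOP = {"68000": 146, "68020": 72, "68060": 7}


def loops_per_second(cpu):
    """Iterations of the example loop per second."""
    return CLOCK_HZ[cpu] / CLOCKS_PER_LOOP[cpu]


def speedup(fast, slow):
    """How many times more loops per second `fast` manages than `slow`."""
    return loops_per_second(fast) / loops_per_second(slow)


if __name__ == "__main__":
    for cpu in CLOCK_HZ:
        print(f"{cpu}: {loops_per_second(cpu) / 1000:.0f}K loops/s")
    print(f"68020 vs 68000: {speedup('68020', '68000'):.1f}x")
    print(f"68060 vs 68000: {speedup('68060', '68000'):.0f}x")
    print(f"68060 vs 68020: {speedup('68060', '68020'):.0f}x")
```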


The above example is a realistic code sequence.
It shows us the net CPU performance.


Now the behavior of the Coldfire V5 is very much comparable to a 68060, but higher clocked.

In other words, (our assumed) 500 MHz Coldfire AMIGA has the CPU power to be 1480 times faster than an A500 (68000) with fastmem. Yes, that's over one thousand four hundred times!

This single chip 500 MHz AMIGA also has the CPU power
to be 360 times faster than an A1200 (68020) with fastmem. Yes, that's over three hundred times!


Of course this is net CPU power.
Considering cache misses and memory latency, the effective gained performance is of course a bit lower.

The point that we are trying to make here is:

If you were running an application on an original Amiga, then its speed was limited by the GFX subsystem (blitter) and by the CPU power.

As the SuperAGA blitter is over 100 times faster, there is no GFX speed limit anymore!

The Coldfire is so much faster that even in full emulation mode you have many times the CPU power of the A1200.
The Coldfire is net 360 times faster than the A1200.
If FrenchShark's code divides this by 20, it's still over 10 times faster than the A1200 was.
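
The scaling argument is simple enough to write down. This is a sketch under the assumptions used in this post: a Coldfire V5 behaves per clock like a 68060, performance scales linearly from 50 MHz to 500 MHz, and full emulation costs a factor of 20. The names are hypothetical.

```python
# Net speed factors of a 50 MHz 68060, as quoted in the post.
NET_060_VS_68000 = 148
NET_060_VS_68020 = 36

# Assumptions taken from the post: linear scaling with clock rate,
# and a factor-20 penalty for running in full emulation mode.
CLOCK_SCALE = 500 / 50
EMULATION_SLOWDOWN = 20


def coldfire_factor(net_060_factor, emulated=False):
    """Speed factor of the assumed 500 MHz Coldfire over the old CPU."""
    factor = net_060_factor * CLOCK_SCALE
    return factor / EMULATION_SLOWDOWN if emulated else factor


if __name__ == "__main__":
    print("vs A500, native:    ", coldfire_factor(NET_060_VS_68000))
    print("vs A1200, native:   ", coldfire_factor(NET_060_VS_68020))
    print("vs A1200, emulated: ", coldfire_factor(NET_060_VS_68020, emulated=True))
```

With these inputs the emulated case comes out at a factor of 18, which is where the "still over 10 times faster" claim comes from.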


New applications with "coldfire" clean 68k code will be able to leverage the full performance of the Coldfire.
In other words this single Chip Amiga will then be 10 times faster than the fastest Cyberstorm was.


I don't know about you but for me this is enough!


If you would force me to add something to this chip then I would add a programmable DSP into the chipset.
Something like the AXE of the new e300 Freescale chip.
Such a SuperCopper could be used to decode MP4, MP3, DIVX etc. for free.

Then this one Chip Amiga can do all that I need.

Offline biggun

Re: Coldfire AGAIN
« Reply #2 on: March 29, 2008, 11:50:28 AM »
If someone is serious about doing Coldfire development:

There is a limited number of sponsored (free) Coldfire V4 Development systems available on http://www.powerdeveloper.org/


If you propose a sensible project there is a good chance that you get a free board.

I assume that I'll get my board early next week. :-)



Offline biggun

Re: Coldfire AGAIN
« Reply #3 on: March 29, 2008, 12:31:10 PM »
Quote

bloodline wrote:
I really don't get the obsession with the Coldfire! if you want a small core to use for emulating a 68k... either design one yourself, I made a big post about this in a previous thread... or licence an ARM or MIPS, both of which are smaller, better supported and preferable to the hacked up mess of a CPU that the Coldfire is...

In fact I would go as far to say the MIPS is the better choice, since it has more registers than the 68k, which is a really good idea....


Come on Bloodline,
you could create a higher quality post than this, couldn't you?


Quote
hacked up mess of a CPU that the Coldfire

WTF? The Coldfire is a very logical, clean design.


Quote
Design one yourself

Very thoughtless proposal.
How many people do you know that can design a fully fledged CPU like the Coldfire, and can do this cheaper than the core is at Freescale?


Quote
MIPS is the better choice

On what experience do you base your claim?
Have you ever developed for MIPS?

Offline biggun

Re: Coldfire AGAIN
« Reply #4 on: March 29, 2008, 01:07:56 PM »
Quote

The Coldfire is an interesting design, for sure! But if we are talking about a small efficient CPU core for an ASIC/FPGA that is to be used for Emulating a 68k...
then the Coldfire offers us nothing....


You are aware that you can run 68k code on the Coldfire, aren't you?
Yes, the Coldfire does not implement ALL 68k instructions natively, but it implements many 68k instructions natively.


The other day I ran an old AMIGA packer with the Coldfire library and funnily enough there were only 4-5 instructions in the whole binary which were not Coldfire native.

Maybe this was a lucky example but it shows that you do not need to emulate every instruction.
Depending on your application the Coldfire can get away running 90% of the instructions natively.
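
To get a feeling for what "90% native" means for speed, here is a small sketch of a trap-and-emulate cost model. It is an illustration, not a measurement: the model (every native instruction costs 1 unit, every trapped instruction costs a fixed `emulation_cost` units) and the example cost of 50 are pure assumptions, and the real overhead depends on the exception handler and the instruction being emulated.

```python
def effective_slowdown(native_fraction, emulation_cost):
    """Average cost per instruction relative to fully native code.

    native_fraction: share of executed instructions that run natively (0..1).
    emulation_cost:  assumed cost of one trapped-and-emulated instruction,
                     in units of one native instruction (hypothetical value).
    """
    if not 0.0 <= native_fraction <= 1.0:
        raise ValueError("native_fraction must be between 0 and 1")
    trapped_fraction = 1.0 - native_fraction
    return native_fraction * 1.0 + trapped_fraction * emulation_cost


if __name__ == "__main__":
    # 90% native, each trapped instruction assumed to cost 50 native ones.
    print(effective_slowdown(0.90, 50))
    # Only a handful of trapped instructions, as in the packer example.
    print(effective_slowdown(0.999, 50))
```

The point of the model is that the slowdown is driven by how often the non-native instructions are actually executed, not by how many of them appear in the binary.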

Cheers

Offline biggun

Re: Coldfire AGAIN
« Reply #5 on: March 29, 2008, 04:21:39 PM »
Quote


But HOW do you know which instructions you need to emulate without a full Emulator?


Bloodline, you are funny :-)

You are not too shy to give advice on which CPU to use and to rant about the Coldfire, but are you sure that you understood the Coldfire correctly?

No offence, but the risk that a 68k program runs into a problem on the Coldfire is very, very small.

If you reread the Coldfire manual, you will realize that there are nearly no 68000 instructions that are executed differently and could cause a problem.

A good starting point is:

http://www.microapl.co.uk/Porting/ColdFire/Download/pa68kcf.pdf


Cheers

Offline biggun

Re: Coldfire AGAIN
« Reply #6 on: March 29, 2008, 04:57:47 PM »
Quote

Karlos wrote:

I seem to recall, but I may be wrong, the problem is that certain opcodes actually behave differently to the same operations on m68k. That is to say, they are implemented but operate slightly differently to the 680x0.


Can you give a real example, or is this hearsay from the rumor mill?

Offline biggun

Re: Coldfire AGAIN
« Reply #7 on: March 29, 2008, 05:37:43 PM »
Quote

Karlos wrote:

Well, for one, I seem to recall that MULS and MULU fail to set the overflow bit of the condition code register.



To be precise here:
The 68000 instruction MULS.W NEVER set the overflow bit on 68k.
This instruction works 100% the same on the Coldfire.


The instruction that you are referring to is MULS.L, and this instruction is 68020+ only!
Programs compiled for the 68000 could never include this instruction.
So if you have an A500 program, this issue can never show up.

BTW, if MULS.L does set the overflow bit, then the calculation is 100% wrong anyway and there is NO way of recovering from it!
The only way to correct this is to use the 64-bit MUL instruction or a proper multiplication routine.

The proper usage of this instruction is to use it only when your values will not overflow. And in this case the Coldfire version will work 100% the same.

Please remember that the issue that you are referring to does not exist for A500 programs.
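
The overflow argument can be shown with a small model of a 32x32 => 32 bit signed multiply. This is a sketch in Python, not Coldfire or 68k code: `muls_l` is a made-up helper that returns the truncated 32-bit result together with the overflow flag as a 68020 would report it. On the Coldfire the flag is simply always cleared, while the 32-bit result is the same.

```python
def to_signed32(value):
    """Wrap an integer to a signed 32-bit value (two's complement)."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value & 0x80000000 else value


def muls_l(a, b):
    """Model of a signed 32x32 => 32 bit multiply.

    Returns (result, overflow): the low 32 bits of the product and
    whether the full product did not fit into 32 bits (68020-style V flag).
    """
    full = to_signed32(a) * to_signed32(b)
    low = to_signed32(full)
    return low, low != full


if __name__ == "__main__":
    print(muls_l(1000, 1000))        # fits: result is the true product
    print(muls_l(-3, 7))             # fits: negative results work too
    print(muls_l(0x10000, 0x10000))  # does not fit: low 32 bits are useless
```

Whenever the flag would be set, the 32-bit result no longer equals the real product, so code that is correct must either avoid such operands or use a wider multiply. Code that only uses operands that fit never sees a difference between the two CPUs.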

Offline biggun

Re: Coldfire AGAIN
« Reply #8 on: March 29, 2008, 07:08:17 PM »
Quote

Karlos wrote:
Quote

biggun wrote:

BTW muls.L would calculate wrong if you get the overflow and there is NO way of recovering from it besides using the 64bit MUL version or a proper multiplication routine.
In other words if your code can overflow you will never use this instruction in the first place.


I think you'll find it's used in most 68020+ compiler-generated code where the effects of overflow aren't really defined by the language standard.


But in this case the MUL.L of the Coldfire will behave correctly too.

It's clear that there can be a certain number of cases where this behavior was expected and where the saturation will not hit.

The fact is that A500 applications are 100% unaffected by this.
And if you think about it, you will agree that over 95% of all 68020 games / applications are certainly not using this affected sequence either.

So what is the effect: 100% of A500 games are unaffected.
And if 1 out of 100 A1200 applications is affected, how terrible is this?


So what is the real effect?
A very limited number of tools might become buggy.
But 99% of AMIGA applications will run correctly on the Coldfire.

This is how it really looks.

That some people state that the Coldfire is not able to
run 68k code is certainly a 100% overstatement.

Offline biggun

Re: Coldfire AGAIN
« Reply #9 on: March 29, 2008, 08:19:29 PM »
Karlos wrote:
Quote

So far we've only looked at the user mode. Coldfire supervisor mode is a bit different and if I recall clearly, it doesn't have a separate supervisor stack pointer. This might not sound a big deal but it does have very real implications.

[...]

Can you say with certainty that the 100% of A500 applications you refer to as being compatible aren't doing anything like this?



This is no problem.

You are referring to the very first Coldfire versions.
The V4 and V5 Coldfire have a separate supervisor stack pointer.

Offline biggun

Re: Coldfire AGAIN
« Reply #10 on: March 29, 2008, 08:26:27 PM »
Quote

bloodline wrote:
Quote

HenryCase wrote:
Quote
bloodline wrote:
How does this hardware know what is Code and what is Data? It can sit there as a parasite on the Data Bus, but it won't know what it's looking at...


Surely in 68k ASM the first "half" of code is the instruction and the second "half" of the code is the data? As the code would be of fixed length (16-bit? 32-bit?) the 'parasite' would know exactly where to look for an instruction, right?


Nope. Not the instruction format... the actual information traversing the Data bus, could be Code or Data... Only the CPU knows what the information actually is.

-Edit- And the 68k has variable length instructions... The parasite doesn't have a hope in hell's chance of ever correctly identifying the Code.



The 68k "flags" instruction fetches on the debug pins (Harvard architecture).
So yes, from the outside you can distinguish code fetches from data fetches.
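
A minimal sketch of how an external observer could tell the two apart, assuming that the "flags" meant here are the function code outputs FC2-FC0 that 68000-family CPUs drive on every bus cycle. That reading is an assumption, since the post only speaks of debug pins, and the helper names are invented for illustration.

```python
# Function code encodings (FC2 FC1 FC0) of the 68000 family.
FUNCTION_CODES = {
    0b001: "user data",
    0b010: "user program",
    0b101: "supervisor data",
    0b110: "supervisor program",
    0b111: "CPU space / interrupt acknowledge",
}


def classify_bus_cycle(fc):
    """Name the address space of a bus cycle from its 3-bit function code."""
    return FUNCTION_CODES.get(fc & 0b111, "reserved")


def is_code_fetch(fc):
    """True if the cycle accesses program space (user or supervisor)."""
    return (fc & 0b111) in (0b010, 0b110)


if __name__ == "__main__":
    for fc in range(8):
        print(f"FC={fc:03b}: {classify_bus_cycle(fc)}, code fetch: {is_code_fetch(fc)}")
```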

That said, the idea of the parasite is not worth doing, as it's much too complicated for the possible benefit.

If you really want to do a lot of work then you would rather alter the Coldfire core directly.
BTW, why did you claim that you are not allowed to change the Coldfire if you buy it?


I think the real story is that you would not want to change it, as it's too much work.

The Coldfire is quite fast and powerful as it is.

Offline biggun

Re: Coldfire AGAIN
« Reply #11 on: March 30, 2008, 08:51:34 AM »
Quote

The amiga is a multitasking computer, so why not use 2 coldfires, running separate tasks, while one chip is trapping and emulating code, the other coldfire continues running it's task, hiding the speed penalty that emulation brings.


Running two "normal" applications on two Coldfire CPUs (like SMP) does not work.
For this cache coherency (bus snooping) is required.
The Coldfire does not do bus snooping.
The 68040 and 68060 supported bus snooping.
You could create great working multi processor systems out of 68040 and 68060 but not out of Coldfire.


Lets be clear here.

The Coldfire has some advantages:

1st)
Freescale has the Coldfire set up to work like LEGO.
You can easily put together parts, as you please.
If you look at the Freescale site you will see that there are dozens of different Coldfire CPUs put together.
This somewhat shows how simple it is to put new Coldfires together.

This LEGO feature is what makes the Coldfire interesting as the key for the AMIGA is to get something like SuperAGA into the Chip quickly.

Compared to the classic AMIGAs the Coldfire is quite fast.
Yes, there are other higher clocked CPUs available, but
the Coldfire V5 runs at 400 MHz and has about the same performance as a 68060 clocked at 400 MHz.

So a slim AMIGA OS system on a 400 MHz Coldfire / 400 MHz 68060 does fly.

The only situation where you would want more power is for something like video encoding. As you might know, the Coldfire has a MAC unit, which is something like a poor man's ALTIVEC. The MAC unit helps to accelerate stuff like FFTs quite well.
So for tasks like video encoding the Coldfire has more power than a 400 MHz 68060 would have.

Another option which will fit perfectly into the AMIGA spirit is putting a dedicated unit on these tasks. Freescale offers DSP cores which will be perfect for doing video encoding. Freescale could without any problem put SuperAGA, a 400 MHz Coldfire and a 400 MHz DSP into one chip.

The resulting chip will be relative small (read: cheap),
will be quick to produce, and it will be powerful enough to let AMIGA OS fly.

The DSP could be nicely integrated into AMIGA OS.
Think of it as a 2nd Copper aka a SuperCopper.
The DSP could be used to play video and audio Datatypes.

The SuperAGA chipset already includes DMA engines for stuff like YUV conversion. The idea to upgrade this with a powerful SuperCopper that would be specialized for the heavy lifting needed for video encoding makes good sense.
We were thinking about designing our own mini-DSP for this - but as Freescale has powerful DSP cores in their LEGO toy box, it might be clever to just take an existing DSP which already has a wealth of audio and video datatypes developed for it.


This is what makes sense to me.

Please mind the target is not to create a CPU which is faster than a 8-way Opteron or CELL.

The goal is to create a CPU which is not expensive, which runs passively cooled and which is fast enough to let AMIGA OS fly.

The big advantage of the AMIGA OS system is the elegant design and the resulting low memory and speed requirements of it.

There is a market for AMIGA OS for a small system.
You can think of it as an AMIGA joystick, AMIGA smartphone or AMIGA Wii.
There is no market for a desktop system anymore.
The desktop area is fully saturated with Windows, Apple and now even Linux.

If you want to build a new high end system take x86 and Linux.

I believe that the future (if any) for AMIGA OS is to power a sub $100 device.
And for this target market the Coldfire is a quite sensible choice.

AMIGA OS again needs something like an A500.
Low price but powerful for its size and price.

Offline biggun

Re: Coldfire AGAIN
« Reply #12 on: April 01, 2008, 10:33:56 AM »
My 2 cents,

* Memory protection is nearly impossible to implement under the idea of AMIGA OS.

* That AMIGA OS does not require memory protection gives it a VERY BIG speed boost.

* To secure a system, CPU-based memory protection can help.
But the system can still be destroyed by the BLITTER or by wrongly set up DMA channels going berserk through your system.
To prevent this, the OS needs to forbid the direct usage of the Blitter or userspace disk DMA. If you do this you will sacrifice a huge amount of performance.
=> This is 100% the opposite of the idea and spirit of the Amiga OS.

* I would like to point out that there are other ways to stabilize a system. 99% of crashes come from bad pointer arithmetic. You can try to reduce the harm caused by bad pointers by enforcing memory protection (at a high cost), or you can use coding styles which will not cause this problem in the first place. I would like to point out that the Amiga Oberon programs NEVER crashed!

I agree that this topic has nothing to do with the Coldfire.
And that for continued discussion, opening another thread makes good sense.

Quote

Someone actually had an interesting topic going on here about the ColdFire and I was getting the impression that new versions V4e and V5 of the cores might actually have the necessary supervisor stacks to run 680x0 code using the cf68klib.


Yes, this is true.


Quote

@FrenchShark
That asm listing you dumped earlier on, what version of the CF core was it for that made it necessary for you to emulate the entire instruction set?


FrenchShark's post looked very interesting.
I would like to know if he is interested in working together on this project?


Offline biggun

Re: Coldfire AGAIN
« Reply #13 on: April 01, 2008, 12:28:13 PM »
Quote

Einstein wrote:

Quote
* I would like to point out that there are other ways to stabilize a system. 99% of crashes come from bad pointer arithmetic. You can try to reduce the harm caused by bad pointers by enforcing memory protection (at a high cost), or you can use coding styles which will not cause this problem in the first place. I would like to point out that the Amiga Oberon programs NEVER crashed!


It's like saying we don't need police departments: if only people behaved, we could get rid of them and gain an economic boost. Unfortunately, this is not reality.




Your post clearly shows that you are NOT understanding the concept of AMIGA OS.


Memory protection on CLASSIC AMIGA is impossible and useless ! That's a fact which is obvious to those with some programming experience.
 
If you want to argue about useless things ok.
But please use another thread for this



Offline biggun

Re: Coldfire AGAIN
« Reply #14 on: April 01, 2008, 01:18:46 PM »

It's not about glorifying. It's about facts!

For "proper" protection you need to sandbox and abstract all and everything.

This means no direct access to Blitter or any HW anymore!

This is the opposite direction of the original AMIGA and the NATAMI or Coldfire designs.

Asking for memory protection only makes sense for people and systems that are willing to trade a lot of performance for abstraction.

Full memory protection => No more direct access to the blitter.
A Coldfire with SuperAGA will be very fast, but
if you ask for MP then it will crawl.

Please understand that Coldfire and MP are mutually exclusive.
If you want to argue for MP DON'T do it in the Coldfire thread.