Author Topic: Coldfire AGAIN (Read 25827 times)

minator · « **Reply #164 from previous page:** April 03, 2008, 12:02:32 AM »

Quote

You could do the same trick in the copperlist, DMA and blitter as was done by the MMU. You could design the SuperAGA chipset in such a way that the OS can say to the chipset what actions are allowed by certain tasks like the MMU allows to limit access to certain memory areas. This would normally not impact speed as it is done in parallel. Only when some non-allowed action is asked an exception is raised, just like an exception is raised by the MMU when a wrong memory address is accessed.
This would work best when the SuperAGA and CPU are integrated in one chip.

Don't even have to do that - address range restrictions could be imposed by including an MMU of sorts in the FPGA. Hit specific memory areas and an exception could be raised on the Coldfire, the MP OS would then kick in and handle it - i.e. the MP OS could handle things like the disc I/O.

BTW it doesn't necessarily need to be an MP OS; it could be, say, AROS. The aim is just to ensure the Classic OS couldn't mangle the second OS's memory or display, without affecting the speed of the Classic OS.

biggun · « **Reply #165 on:** April 03, 2008, 08:15:23 AM »

Quote

minator wrote:
Quote
You could do the same trick in the copperlist, DMA and blitter as was done by the MMU. You could design the SuperAGA chipset in such a way that the OS can say to the chipset what actions are allowed by certain tasks like the MMU allows to limit access to certain memory areas. This would normally not impact speed as it is done in parallel. Only when some non-allowed action is asked an exception is raised, just like an exception is raised by the MMU when a wrong memory address is accessed.
This would work best when the SuperAGA and CPU are integrated in one chip.

Don't even have to do that - address range restrictions could be imposed by including an MMU of sorts in the FPGA. Hit specific memory areas and an exception could be raised on the Coldfire, the MP OS would then kick in and handle it - i.e. the MP OS could handle things like the disc I/O.

No offense, but these proposals are frankly total nonsense.

A Blitter that has an MMU will be extremely expensive to build and its performance will be disappointing.
Using the Blitter will have a huge overhead, as your MMU Blitter table will need to be replaced when switching tasks.

Please get a clue about the HW costs of implementing an MMU and the overhead of maintaining the MMU tables by the OS.

Hans_ · « **Reply #166 on:** April 03, 2008, 02:43:14 PM »

Quote

biggun wrote:

No offense, but these proposals are frankly total nonsense.

A Blitter that has an MMU will be extremely expensive to build and its performance will be disappointing.
Using the Blitter will have a huge overhead, as your MMU Blitter table will need to be replaced when switching tasks.

Please get a clue about the HW costs of implementing an MMU and the overhead of maintaining the MMU tables by the OS.

I don't think that the performance will be as bad as you claim. The MMU table could be quite coarse, and would only cover chip RAM. Swapping the tables would only have to be done for tasks that actually use the blitter. Probably just changing the base-pointer to the table would be best; the table itself only needs to be checked if the blitter is actually used. So you don't have to swap the entire table every time.
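A minimal sketch of the base-pointer idea above, in Python purely for illustration. The chip RAM size, the block size and all names (`BlitterGuard`, `make_table`) are assumptions made up for this example; nothing here describes a real chipset.

```python
# Hypothetical model: sizes and names are illustrative assumptions only.
CHIP_RAM_SIZE = 2 * 1024 * 1024   # assume 2 MB of chip RAM
BLOCK_SIZE = 64 * 1024            # assume coarse 64 KB protection blocks
NUM_BLOCKS = CHIP_RAM_SIZE // BLOCK_SIZE


def make_table(allowed_blocks):
    """Build a per-task table: one boolean per chip RAM block."""
    table = [False] * NUM_BLOCKS
    for block in allowed_blocks:
        table[block] = True
    return table


class BlitterGuard:
    """Keeps only a reference (the 'base pointer') to the active task's table."""

    def __init__(self):
        self.current_table = None  # no table installed: no blitter access

    def switch_task(self, table):
        # Only the reference changes; the table itself is never copied.
        self.current_table = table

    def check(self, start, length):
        """True if every block touched by [start, start + length) is allowed."""
        if self.current_table is None or length <= 0:
            return False
        end = start + length - 1
        if start < 0 or end >= CHIP_RAM_SIZE:
            return False
        first, last = start // BLOCK_SIZE, end // BLOCK_SIZE
        return all(self.current_table[b] for b in range(first, last + 1))
```

The table is only consulted inside `check`, i.e. when a blit is actually requested, which is the point being made: a task switch costs one pointer update, not a table copy.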

Having said that, I'm pretty sure that PCI cards can DMA to memory that a particular task doesn't own. The solution, of course, is to provide drivers that do the access. In the case of the Amiga Blitter, having all functions contain pointers to bitmap structures and thus, enforcing blitter within those bitmaps only, would be an easier, and more efficient, solution. This way you could use the CPU's MMU to check that the bitmaps are accessible by the current task.
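As a rough illustration of the API approach, here is a sketch in Python. The `BitMap` fields, the `owner` tag (standing in for whatever the CPU's MMU would actually check) and the function name `blit_allowed` are all hypothetical, chosen only for this example.

```python
from dataclasses import dataclass


@dataclass
class BitMap:
    width: int    # pixels
    height: int   # pixels
    owner: str    # hypothetical task identifier, stands in for an MMU check


def blit_allowed(task, src, dst, sx, sy, dx, dy, w, h):
    """True only if the task owns both bitmaps and both rectangles fit inside them."""
    if w <= 0 or h <= 0:
        return False
    if src.owner != task or dst.owner != task:
        return False
    fits_src = 0 <= sx and sx + w <= src.width and 0 <= sy and sy + h <= src.height
    fits_dst = 0 <= dx and dx + w <= dst.width and 0 <= dy and dy + h <= dst.height
    return fits_src and fits_dst
```

Because the check happens in the API call, the blitter hardware itself needs no protection logic; the trade-off is that programs banging the blitter registers directly bypass it.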

To be honest, I think that MP can help programmers detect more bugs. I find OS4's current level of protection very useful, as it catches mistakes that would have gone undetected on OS3.x and lower.

Hans

Hans_ · « **Reply #167 on:** April 03, 2008, 02:44:21 PM »

... and back to Coldfire. Where did the Coldfire discussion leave off before the MP tangent?

Hans

darksun9210 · « **Reply #168 on:** April 03, 2008, 03:22:06 PM »

i don't get it.

and i'll be the first to put my hand up and say so.

68k can be emulated,
coldfire can be emulated,
PPC can even be emulated.

so. why bother with a coldfire core at all? why not go for an system-on-a-chip or x86 based mobile core design?
readily available parts, core2duo mobile, bit of ddr2 so-dimm action, onboard USB,firewire,ethernet,gfx,audio, PCI hook up to mate up to an amiga's CPU slot, emulate a 68k and PPC on the x86. we've apparently already got the software to do the job.

bish bash bosh, job's a good'un.
£250 to you guv'ner.
done and done.

that is the only way, except using real 68k chips, i can see a new accelerator seeing light of day going forward from this point.

:-?

biggun · « **Reply #169 on:** April 03, 2008, 03:35:15 PM »

Quote

To be honest, I think that MP can help programmers detect more bugs. I find OS4's current level of protection very useful, as it catches mistakes that would have gone undetected on OS3.x and lower.

Hans

As I said before, UAE is great for development.
You can patch UAE to get your MMU feature for no money.
Then you can use this feature during development to detect bugs.

Adding an MMU to real HW would increase the price of the device significantly!
If you add the MMU into the HW you will double the cost and lower the performance of the final device.

A clever solution might be to use a free UAE version for the testing. This development UAE can include all the bells and whistles of an emulated MMU. It will allow you to detect and fix your bugs early.

minator · « **Reply #170 on:** April 03, 2008, 08:47:01 PM »

Quote

A Blitter that has an MMU will be extremely expensive to build and its performance will be disappointing.
Using the Blitter will have a huge overhead, as your MMU Blitter table will need to be replaced when switching tasks.

Please get a clue about the HW costs of implementing an MMU and the overhead of maintaining the MMU tables by the OS.

You seem to have an odd idea of just how big and slow MMUs are - they're neither big nor slow. They're so small in fact they can be found in almost every processor except the very smallest microcontrollers.
They might be big in high end PC processors but I'm not talking about those.

I'm talking about a much simpler mechanism that checks for writes into forbidden memory areas. There'll be a small number of these at most and the permissions can be represented by a single bit (i.e. 0 = Yes, 1 = No).
It can be set up like an MMU with (very simple) tables, but there's no need for virtual memory or anything like that.

The cost of switching tables on a task switch is zero - you won't need to switch tables on tasks and you won't even need to change them on switching OSs, you just need a single bit to represent the mode (MP OS or classic) and you just switch it. The blitter then gets access to everything - the MP OS can look after its own memory...
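A minimal sketch of that mode-bit scheme, written in Python only to make the logic concrete. The class name `WriteFilter`, the constants and the example address range are assumptions for illustration, not a description of any actual FPGA design.

```python
CLASSIC, MP = 0, 1   # the single mode bit


class WriteFilter:
    """A handful of protected ranges plus one mode bit; no address translation."""

    def __init__(self, protected_ranges):
        # Each range is (start, end) with end exclusive; the list never changes
        # on a task switch or an OS switch.
        self.protected = list(protected_ranges)
        self.mode = CLASSIC

    def write_allowed(self, addr):
        if self.mode == MP:
            # In MP mode everything is reachable; the MP OS guards its own memory.
            return True
        # In classic mode a write into any protected range would raise an exception.
        return not any(start <= addr < end for start, end in self.protected)
```

Switching between the two OSs is then just `f.mode = MP` or `f.mode = CLASSIC`; the range list is set up once.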

Hans_ · « **Reply #171 on:** April 03, 2008, 09:19:19 PM »

Quote

biggun wrote:

Adding an MMU to real HW would increase the price of the device significantly!
If you add the MMU into the HW you will double the cost and lower the performance of the final device.

Not that I'm advocating the creation of a chipset MMU, but I don't think that it will be anywhere near as expensive as you claim. You wouldn't need the address remapping part, just a single bit per page indicating if the task is allowed to access the RAM or not. You could even group multiple contiguous pages into blocks for efficiency.
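To put a rough number on "a single bit per page", here is a back-of-the-envelope calculation in Python. The 2 MB chip RAM size, the 4 KB page size and the grouping factor of 16 are assumed values picked for the example, and `table_bits` is a hypothetical helper.

```python
def table_bits(ram_bytes, page_bytes, pages_per_block=1):
    """Bits needed for a one-bit-per-block permission table (rounding up)."""
    pages = -(-ram_bytes // page_bytes)       # ceiling division
    return -(-pages // pages_per_block)       # ceiling division again


# Assumed figures: 2 MB of chip RAM, 4 KB pages.
per_page = table_bits(2 * 1024 * 1024, 4096)        # 512 bits, i.e. 64 bytes
grouped = table_bits(2 * 1024 * 1024, 4096, 16)     # 32 bits, i.e. one longword
```

Under those assumptions the whole table for one task fits in 64 bytes, or in a single 32-bit register once pages are grouped into 64 KB blocks.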

Of course it will lower performance slightly, but no more than the MMU in modern CPUs. All modern processors have them.

Personally I'd be more in favour of an API that prevents you from doing anything stupid (look at the rest of my previous post).

Hans

kreciu · « **Reply #172 on:** April 03, 2008, 09:43:44 PM »

I see you guys have more idea about hardware than me (I can build everything... but I don't know much about "how things work"), BUT...

For me Workbench (even 3.x) "is" dead... because of hardware... and lack of software since hardware "is slow". One big circle...

Personally I don't like it when a system has a small lake with swimming fish in the background (think Vista). For me a system should be FAST and EASY to understand (e.g. when I install a program I know EXACTLY where the parts needed to run the thing are...). Yes, WB is VERY good here. SO...

We NEVER had a chance to use WB (except in UAE, which I just don't like...) on a FAST "native" computer. We always build these "sandwiches": an A1200 is like a double burger... an A4000 just a burger. And to make everything work after 10 years... ehh...

We need some nice motherboard which will be compatible with OS3.x and fast enough to: play DVDs (etc.), play mp3 (etc.), work with pictures, run WB in higher resolution smoooooth, make a presentation in Hollywood or use the new OWB for 68k?

And we need some people who will develop software for that... I can try to "donate". I'm just a poor user

Kreciu

PS. In general computers DO NOT develop so much today. There is a set of software we "need" for everyday use. Sure we can develop/change some stuff, but don't be :crazy:

Einstein · « **Reply #173 on:** April 03, 2008, 10:21:57 PM »

Quote

biggun wrote:

As I said before, UAE is great for development.
You can patch UAE to get your MMU feature for no money.
Then you can use this feature during development to detect bugs.

You can't detect all bugs as those rarely reveal themselves in a few sessions, only later when the poor user gets his/her data in other process(es) sent to neverland (directly, or indirectly through corrupted OS data/code).

biggun · « **Reply #174 on:** April 04, 2008, 09:09:43 AM »

Quote

minator wrote:
Quote
A Blitter that has an MMU will be extremely expensive to build and its performance will be disappointing.
Using the Blitter will have a huge overhead, as your MMU Blitter table will need to be replaced when switching tasks.

Please get a clue about the HW costs of implementing an MMU and the overhead of maintaining the MMU tables by the OS.

You seem to have an odd idea of just how big and slow MMUs are - they're neither big nor slow. They're so small in fact they can be found in almost every processor except the very smallest microcontrollers.

In other words, my friend, you have no clue about HW design.

Please get a clue how many resources that would eat in an FPGA design and then come back with proposals.

If I had the choice to
A) add an MMU around my blitter

B) get for the same resources a vector unit comparable to CELL/ALTIVEC/SSE

For me the choice will be clear.

metalman · « **Reply #175 on:** April 05, 2008, 05:12:50 AM »

People have been referencing a Coldfire V5 chip in this thread, but this is the current top of the line Coldfire chip that I can find listed on the Freescale site.
MCF5484
It's listed as a V4e core. (New V4 versions for running Linux® applications are the MCF5445X series.) Are there any actual V5 core chips being produced?

The MCF5485 has a digikey price of $33 per unit at quantity 1 (it includes an MMU, FPU, Ethernet, USB, DDR/SDR-SDRAM Controller, etc.)

The coldfire series is similar to the 680x0 series but not 100% code compatible. It seems like missing instructions have been added back to the coldfire as new core generations are released. It's possible that some future V5 core coldfire chips might make it possible to emulate a 680x0, but based on the problems raised in this thread, it seems unlikely.

So wouldn't it be more practical to just use a current coldfire V4e core (like the MCF5485) as a system co-processor and run the OS on an actual 680x0, and only run on the coldfire those routines (blitter routines, etc.) that can be rewritten as native coldfire code, while getting the benefit of the additional HW functions the MCF5485 makes available as part of the 68000 family of chips?

biggun · « **Reply #176 on:** April 05, 2008, 08:09:25 AM »

Hi Metalman,

Quote

metalman wrote:
People have been referencing a Coldfire V5 chip in this thread, but this is the current top of the line Coldfire chip that I can find listed on the Freescale site.
MCF5484
It's listed as a V4e core. (New V4 versions for running Linux® applications are the MCF5445X series.) Are there any actual V5 core chips being produced?

Yes, there are V5 cores.

http://www.google.com/url?sa=t&ct=res&cd=1&url=http%3A%2F%2Fwww.freescale.com%2Fwebapp%2Fsps%2Fsite%2Foverview.jsp%3FnodeId%3D0162468rH3YTLC6951fgqk7SDQBWB3&ei=vxz3R9KOHoyQ-ALpp9UQ&usg=AFQjCNGLN56Z3xK5lVnjc1IjPbCh7agRAw&sig2=wiy0UeX65B-siy_LoBI8MQ
For example, the HP LaserJet P2015dn printers use a 400 MHz Motorola ColdFire V5 as their processor.

Quote

The MCF5485 has a digikey price of $33 per unit at quantity 1 (it includes an MMU, FPU, Ethernet, USB, DDR/SDR-SDRAM Controller, etc.)

Yes, you can get a 266 MHz Coldfire V4e for $20.

Quote

The coldfire series is similar to the 680x0 series but not 100% code compatible. It seems like missing instructions have been added back to the coldfire as new core generations are released. It's possible that some future V5 core coldfire chips might make it possible to emulate a 680x0, but based on the problems raised in this thread, it seems unlikely.

The V4 can execute 68k binaries.
The question is not whether it works, but how big the average performance impact is.
BTW several people are evaluating in this direction already:
http://projects.powerdeveloper.org/project/coldfire/707

Quote

So wouldn't it be more practical to just use a current coldfire V4e core (like the MCF5485) as a system co-processor and run the OS on an actual 680x0, and only run on the coldfire those routines (blitter routines, etc.) that can be rewritten as native coldfire code, while getting the benefit of the additional HW functions the MCF5485 makes available as part of the 68000 family of chips?

I see where you are coming from. And for a test system your idea is okay.

Regarding using the Coldfire. I can only speak for the concept idea of the NatAmi www.natami.net here.

The Natami draws a lot of performance out of the SuperAGA blitter. A good Blitter will always be faster than a good CPU. This is because a blitter can pipeline more effectively and can fully use the chip select lines, which a CPU cannot. When you connect the same memory to a blitter and a CPU, the CPU can reach at best 50% of the possible blitter speed.
As the SuperAGA Blitter is many times faster than the Coldfire it makes no sense to use the Coldfire as blitter.
It could make sense to use the Coldfire as main CPU.

I'm very curious to see the results of the Coldfire performance evaluation.
I think that a Coldfire combined with SuperAGA in one chip has the potential to be a winner.
The beauty of this SOC is that you can get blistering fast Blitter plus a decent CPU in one chip for $20.

The question that I'm wondering about a bit is how fast we need to be. Yes, I know it's cool to be faster than the fastest Cell.
But seriously, how fast does an AMIGA OS system need to be to be fast?
Is the performance of a 68030 with 50 MHz OK?
Is the performance of a 68030 with 100 MHz OK?
Is the performance of a 68030 with 200 MHz OK?
Is the performance of a 68030 with 500 MHz OK?
Is the performance of a 68030 with 1000 MHz OK?

Cheers

minator · « **Reply #177 on:** April 05, 2008, 02:37:35 PM »

Quote

Quote
You seem to have an odd idea of just how big and slow MMUs are - they're neither big nor slow. They're so small in fact they can be found in almost every processor except the very smallest microcontrollers.

In other words my friend you have no clue about HW design.

If you are going to reply to me and quote me you could at least read the entire post.

Secondly, you've blatantly quoted me out of context.

If you really think that comment is inaccurate or shows any faulty knowledge on my part, please explain why.

However...

An MMU might have been a big deal in 1985, but it's not today. Some of the high end processors may have large MMUs but as I said (on the next line that you didn't quote) we're not talking about these. In any case much of that size will be taken up by the TLBs, which will not be necessary here.

Quote

Please get a clue how many resources that would eat in an FPGA design and then come back with proposals.

Quote

If I had the choice to
A) add an MMU around my blitter

B) get for the same resources a vector unit comparable to CELL/ALTIVEC/SSE

For me the choice will be clear.

A vector unit comparable to those is going to be quite considerably larger than any MMU, if you don't believe me have a look at a die photo of one of the Cell's SPEs - then compare how big it is to the MMU *it contains*.

--

There is a long-standing general aversion to using MMUs in the Amiga community, possibly based on the assumption that they slow memory access. More knowledgeable folks could argue that page table walks are slow and you have to switch page tables every time you switch tasks.

However, today, none of this is true.

MMUs will increase memory latency but the effect of this is utterly insignificant. I learned this when I first used BeOS about ten years ago. It had full memory protection but it was every bit as responsive as any Amiga.

CPU designers know the slow parts in their designs and fix them in subsequent designs. TLBs cache pages and this means page table walks are relatively rare. Modern processors also shouldn't need to change the page table every time they switch tasks, e.g. the processors in my mobile phone have full memory protection and virtual memory support and they will not switch tables on a task switch - I know because I happen to have had the features of those particular processors explained to me yesterday. I cannot say if it's true for all modern processors though.
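A toy model of why a TLB makes page table walks rare, in Python for illustration only. The 4 KB page size, the LRU replacement policy and the class name `TinyTLB` are assumptions for this sketch; real TLBs differ in size, associativity and replacement.

```python
from collections import OrderedDict

PAGE_SIZE = 4096  # assumed page size


class TinyTLB:
    """Caches recent page translations; counts how often the table is walked."""

    def __init__(self, entries, page_table):
        self.entries = entries            # number of cached translations
        self.page_table = page_table      # dict: virtual page -> physical page
        self.cache = OrderedDict()        # kept in least-recently-used order
        self.walks = 0

    def translate(self, vaddr):
        vpage, offset = divmod(vaddr, PAGE_SIZE)
        if vpage in self.cache:
            self.cache.move_to_end(vpage)         # hit: no walk needed
        else:
            self.walks += 1                       # miss: one page table walk
            self.cache[vpage] = self.page_table[vpage]
            if len(self.cache) > self.entries:
                self.cache.popitem(last=False)    # evict least recently used
        return self.cache[vpage] * PAGE_SIZE + offset
```

Reading two whole pages one longword at a time is 2048 accesses but only two walks in this model, which is the effect being described.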

However, as I said in my previous post, what I'm suggesting is much simpler than this so won't have any of these complexities.

I have some ideas for implementation so I will put these down.

Fats · « **Reply #178 on:** April 05, 2008, 08:14:51 PM »

Quote

biggun wrote:

A Blitter that has an MMU will be extremely expensive to build and its performance will be disappointing.
Using the Blitter will have a huge overhead, as your MMU Blitter table will need to be replaced when switching tasks.

Please get a clue about the HW costs of implementing an MMU and the overhead of maintaining the MMU tables by the OS.

I don't think you need a full-blown MMU. I think one or a few mask registers can do the trick. Paged tables are needed when implementing virtual memory and page swapping, which is not needed in this case.
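One way to read "mask registers" is a mask/match pair per protected region, sketched here in Python. The register layout, the example values and the function name `blocked` are assumptions for illustration, not something specified in this thread.

```python
def blocked(addr, guards):
    """True if addr falls in any region described by a (mask, match) register pair."""
    return any((addr & mask) == match for mask, match in guards)


# Assumed example: one pair protecting the 512 KB region 0x100000-0x17FFFF.
# The mask keeps the upper address bits; the low 19 bits are ignored.
GUARDS = [(0xFFF80000, 0x00100000)]
```

In hardware each pair is just an AND and a comparator running alongside the address bus, with no table lookup; the limitation is that regions must be power-of-two sized and aligned.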

greets,
Staf.

metalman · « **Reply #179 on:** April 07, 2008, 06:24:20 AM »

Quote

biggun wrote:
Quote
metalman wrote:
People have been referencing a Coldfire V5 chip in this thread, but this is the current top of the line Coldfire chip that I can find listed on the Freescale site.
MCF5484
It's listed as a V4e core. (New V4 versions for running Linux® applications are the MCF5445X series.) Are there any actual V5 core chips being produced?

Yes, there are V5 cores.

V5 ColdFire Core: Full Superscalar
For example, the HP LaserJet P2015dn printers use a 400 MHz Motorola ColdFire V5 as their processor.

You found what I did when I searched the first time for a V5 coldfire. The document you linked is a roadmap document; what I can't find is a link to an actual V5 chip datasheet.

Quote

biggun wrote:
Quote
metalman wrote:
The MCF5485 has a digikey price of $33 per unit at quantity 1 (it includes an MMU, FPU, 10/100 Ethernet, USB 2.0, DDR/SDR-SDRAM Controller, PCI Interface, etc.)

Yes, you can get a 266 MHz Coldfire V4e for $20.

Which one?
The MCF5445X series and the MCF548x series which are designed to work with the Linux Development kits, with royalty-free, open-source software demonstration applications provided, seem to me to be the best choices.

M5484LITE: Linux Development Kit for the ColdFire MCF548x Family

Quote

biggun wrote:
Quote
metalman wrote:
The coldfire series is similar to the 680x0 series but not 100% code compatible. It seems like missing instructions have been added back to the coldfire as new core generations are released. It's possible that some future V5 core coldfire chips might make it possible to emulate a 680x0, but based on the problems raised in this thread, it seems unlikely.

The V4 can execute 68k binaries.
The question is not whether it works, but how big the average performance impact is.
BTW several people are evaluating in this direction already:
Coldfire MCF54455 Project

Seems there are some major problems, or products like the Dragon would be shipping by now. Maybe if some more 680x0 instructions are added back in a V5 chip it might work.

Quote

So wouldn't it be more practical to just use a current coldfire V4e core (like the MCF5485) as a system co-processor and run the OS on an actual 680x0, and only run on the coldfire those routines (blitter routines, etc.) that can be rewritten as native coldfire code, while getting the benefit of the additional HW functions the MCF5485 makes available as part of the 68000 family of chips?

I see where you are coming from. And for a test system your idea is okay.

Regarding using the Coldfire. I can only speak for the concept idea of the NatAmi www.natami.net here.

The Natami draws a lot of performance out of the SuperAGA blitter. A good Blitter will always be faster than a good CPU. This is because a blitter can pipeline more effectively and can fully use the chip select lines, which a CPU cannot. When you connect the same memory to a blitter and a CPU, the CPU can reach at best 50% of the possible blitter speed.
As the SuperAGA Blitter is many times faster than the Coldfire it makes no sense to use the Coldfire as blitter.
It could make sense to use the Coldfire as main CPU.

Cool!!! Wasn't considering someone designing a new AGA hardware blitter.

Use the coldfire as a co-processor to run coldfire.library routines that have been rewritten to run as native code on the coldfire, such as floating-point math, etc.

Quote

biggun wrote:
I'm very curious to see the results of the Coldfire performance evaluation.
I think that a Coldfire combined with SuperAGA in one chip has the potential to be a winner.
The beauty of this SOC is that you can get blistering fast Blitter plus a decent CPU in one chip for $20.

I see the Coldfire as a way to add MMU, FPU, USB, Ethernet, DDR/SDR-SDRAM Controller, PCI Interface, hardware functions using the Coldfire as a co-processor.

MCF548x Reference Manual

Quote

biggun wrote:
The question that I'm wondering about a bit is how fast we need to be. Yes, I know it's cool to be faster than the fastest Cell.
But seriously, how fast does an AMIGA OS system need to be to be fast?
Is the performance of a 68030 with 50 MHz OK?
Is the performance of a 68030 with 100 MHz OK?
Is the performance of a 68030 with 200 MHz OK?
Is the performance of a 68030 with 500 MHz OK?
Is the performance of a 68030 with 1000 MHz OK?

Cheers

A computer only seems as fast as its slowest bottleneck.

The Amiga hardware design philosophy was to offload as many functions as possible to fast co-processors.
Giving the main CPU more idle time by offloading more routines to co-processors (video, FPU, etc.) makes the whole system's apparent speed faster.
