Author Topic: Motorola 68060 FPGA replacement module (idea) (Read 53014 times)

ChaosLord · « **Reply #74 on:** January 08, 2013, 05:21:26 AM »

Quote from: JimDrew;721748

Well, there are quite a few Amiga programs - including several of my own that all follow 100% legal programming practices (according to common sense and the RKMs) that will not run on an 060 with superscalar and/or branch caching enabled.

Are these programs all emulators?

Emulators have to do weird exotic things and/or deal with weird exotic things in order to get good performance. These weird exotic things are things that a regular programmer never deals with.

Quote

I don't recall all of the reasons behind the issues.

I am not entirely clear whether your complaints about the 060 come down to the fact that the first 060s all had bugs in them. Later on those bugs were ironed out. IIRC at least one of the bugs only happened when superscalar mode was active, so it could be avoided by turning it off.

I am thinking that if you tried a later revision 060 you might like it.

Quote

I should go look at the mmu.library replacement that we made for EMPLANT and FUSION... I know I commented some things there.

I remembered u coded Emplant but I didn't know about Fusion. I never actually used either one. The only mac emu I ever used was A-Max (I think that is what it was called) way back in prehistoric times.

Quote

I know that self modifying code is definitely one of the things that causes a problem when one of the cached instructions in the pipeline has been modified (like a branch table). Yes, I consider self-modifying code 100% legal. You are supposed to flush the caches (or turn them off) with self modifying code, but when you do that you are then running at sub-030 speeds.

If u only modify your code once, flush the cache and go on, then it's not such a big deal... but if you have to do it in a loop then speed dies.
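To put rough numbers on the "once vs. in a loop" point, here is a toy sketch in Python. It is only a model under loud assumptions: `ToyICache`, `count_misses` and the 8-entry "program" are made up for illustration and say nothing about the real 68060 cache organization or timing. It just shows that code patched without a flush is fetched stale, and that flushing on every pass turns every fetch into a miss.

```python
CODE_SIZE = 8  # hypothetical size of the hot loop, in "instructions"


class ToyICache:
    """Toy instruction cache: keeps a private copy of each fetched address."""

    def __init__(self, memory):
        self.memory = memory  # backing store: address -> opcode string
        self.lines = {}       # cached copies, filled on first fetch
        self.misses = 0

    def fetch(self, addr):
        # A miss copies the opcode from memory; later fetches reuse the copy,
        # even if memory has been modified in the meantime.
        if addr not in self.lines:
            self.misses += 1
            self.lines[addr] = self.memory[addr]
        return self.lines[addr]

    def flush(self):
        # Throw away every cached copy so the next fetch re-reads memory.
        self.lines.clear()


def count_misses(iterations, patch_every_pass):
    """Run the toy loop and return how many fetches missed the cache."""
    memory = {addr: "NOP" for addr in range(CODE_SIZE)}
    cache = ToyICache(memory)
    for i in range(iterations):
        if patch_every_pass or i == 0:
            memory[0] = "PATCH%d" % i  # self-modifying store
            cache.flush()              # required, otherwise fetch(0) is stale
        for addr in range(CODE_SIZE):
            cache.fetch(addr)
    return cache.misses
```

In this model, patching once costs one set of misses for the whole run, while patching on every pass makes every single fetch a miss, which is the "speed dies" case.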

Does your Emu scan opcodes and replace at runtime all those ILLEGAL instructions that MacOS used for Quickdraw etc.?

Quote

The 060 really only adds dual instruction pipelining and a 4-way cache. A higher speed (100MHz+) 040 core would probably be better in the long run, especially if it handled floating point without completely stalling the core like the 060 does.

The 040 core is slow at math. The 060 is fast at math. The 040 has no branch prediction either.

freqmax · « **Reply #75 on:** January 08, 2013, 07:59:40 AM »

Is a built-in 68881 vs. 68882 what causes the speed difference?

matthey · « **Reply #76 on:** January 08, 2013, 08:40:38 AM »

Quote from: Mrs Beanbag;721644

It might be a good starting point. [ColdFire] Differences are:

1. No DBcc
2. No bitwise rotation (rol, ror)
3. No bitfield operations
4. Multiply instructions don't set flags. From the Coldfire manual:
CCR[V] is always cleared by MULS/U, unlike the 68K family processors

1-3 on your list are not 68k conflicts but trapping would make them very slow. Here is a list of 68k and ColdFire conflicts that are not fixable by trapping (i.e. ColdFire.library):

1. ColdFire stack is 4 byte aligned (68k 2 byte). MOVE.B/W (SP)+ and -(SP) fail.
2. REMS/REMU encoding is incompatible with DIVSL/DIVUL encoding.
3. ColdFire multiply instructions don't set flags like the 68k.

In addition, practically anything in Supervisor mode will not work.
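As a rough illustration of conflict 3, here is a small Python sketch of a signed 32x32 -> 32-bit multiply and its overflow (V) flag. It assumes the 68020+ long multiply sets V when the product does not fit in 32 bits, and that ColdFire always clears V, as the manual quote above says. The function name `muls_l` and the `coldfire` switch are made up for this example, and the inputs are assumed to already be in signed 32-bit range.

```python
from typing import Tuple


def muls_l(a: int, b: int, coldfire: bool = False) -> Tuple[int, bool]:
    """Return (32-bit signed result, V flag) for a signed long multiply."""
    product = a * b                    # exact product, arbitrary precision
    low = product & 0xFFFFFFFF         # keep only the low 32 bits
    # Reinterpret the low 32 bits as a signed two's complement value.
    result = low - 0x100000000 if low & 0x80000000 else low
    overflow = result != product       # True if the product did not fit
    # Assumption from the quoted manual text: ColdFire always clears V.
    v_flag = False if coldfire else overflow
    return result, v_flag
```

Code that tests V after a multiply (for example with BVS) would take a different path on the two CPUs whenever the product overflows, and a trap handler cannot repair that because no exception is raised.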

Quote from: Mrs Beanbag;721644

Coldfire also has a few extra commands (some of which would be quite useful, such as saturate and multiply-accumulate)

MVS, MVZ and BYTEREV should have been in the 68060. SATS is good for DSP/Codec type processing but where is SATU and ABS? The CF MAC processor is powerful but is a bolt on that doesn't fit with the 68k/ColdFire IMO. It's a poor man's SIMD as the CF is low end and cheap, cheap, cheap. Freescale will sell you PPC or now ARM (which they sadly license) if you need some real processing power.

Quote from: JimDrew;721748

Well, there are quite a few Amiga programs - including several of my own that all follow 100% legal programming practices (according to common sense and the RKMs) that will not run on an 060 with superscalar and/or branch caching enabled. I don't recall all of the reasons behind the issues. I should go look at the mmu.library replacement that we made for EMPLANT and FUSION... I know I commented some things there. I know that self modifying code is definitely one of the things that causes a problem when one of the cached instructions in the pipeline has been modified (like a branch table). Yes, I consider self-modifying code 100% legal. You are supposed to flush the caches (or turn them off) with self modifying code, but when you do that you are then running at sub-030 speeds.

Self modifying code needs to flush the caches (including the branch cache), which negates the advantage of the caches and any speed gains of self modifying code. If you don't like caches, stick to the 68000 until you change your mind :/. Some early 68060.libraries may not have flushed all the caches properly, may not have fixed the superscalar bugs in the 68060 properly, or may have had bugs in the CPU support code used for trapping. The best ones matured and work fine. Fusion works fine on the 68060 here except for an occasional random crash. The last ShapeShifter was more stable though. That was using the last version of Fusion, which I bought in your Fusion/PCx CD bundle. Fusion had some nice features over ShapeShifter, like the file transfer and auto screen mode changes from within the Mac, but stability is more important. I would still use Fusion if it was more stable and supported more hard drive options, which ShapeShifter is better at.

The Natami FPGA CPU was going to use write-through caching with snooping and auto flushing of detected dirty cache lines. This is a good option that allows very large caches with excellent compatibility. It would also be possible to auto flush the branch cache entries in the address range of the dirty lines detected by snooping. With the faster memory and larger caches of today, this should give cache performance close to that of the 68060 with better tolerance for self modifying code.
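A minimal sketch of that idea in Python, purely as a software model: every store goes straight to memory (write-through), and the instruction cache "snoops" the store address and drops the matching line plus any branch cache entries inside that line. The class name `SnoopingICache`, the 16-byte line size and the dictionary-based branch cache are assumptions for illustration, not the Natami design.

```python
LINE_SIZE = 16  # hypothetical cache line size in bytes


class SnoopingICache:
    """Toy instruction cache that invalidates lines when it sees a store."""

    def __init__(self, memory: bytearray):
        self.memory = memory
        self.lines = {}         # line number -> bytes copy of that line
        self.branch_cache = {}  # branch address -> predicted target address

    def fetch(self, addr: int) -> int:
        line = addr // LINE_SIZE
        if line not in self.lines:
            # Miss: copy the whole line from memory into the cache.
            start = line * LINE_SIZE
            self.lines[line] = bytes(self.memory[start:start + LINE_SIZE])
        return self.lines[line][addr % LINE_SIZE]

    def write(self, addr: int, value: int) -> None:
        self.memory[addr] = value      # write-through: memory is always current
        line = addr // LINE_SIZE
        self.lines.pop(line, None)     # snoop hit: drop only the stale line
        # Also drop branch predictions that live inside the modified line.
        start = line * LINE_SIZE
        stale = [a for a in self.branch_cache if start <= a < start + LINE_SIZE]
        for a in stale:
            del self.branch_cache[a]
```

Only the touched line is refetched after a self-modifying store, so the rest of the cache stays warm and the program needs no explicit flush call.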

Quote from: JimDrew;721748

The 060 really only adds dual instruction pipelining and a 4-way cache. A higher speed (100MHz+) 040 core would probably be better in the long run, especially if it handled floating point without completely stalling the core like the 060 does.

The MC68060UM says:

"The MC68060 allows simultaneous execution of two integer instructions (or an integer and a float instruction) and one branch instruction during each clock."

"The MC68060's FPU operates in parallel with the integer unit. The FPU performs numeric calculations while the integer unit continues integer processing."

The 68060 FPU was a nice improvement over the 68040 FPU. It dropped a few 040 FPU instructions that were very rarely used and added back the FINT and FINTRZ instructions which compilers use commonly. The execution speeds were also improved across the board and more parallel operation is possible. The 040 can do some limited parallel operation also.

The 68060 is a great processor which does a lot of parallel work, but it's not easy to make and it's probably no easier to make in an FPGA. A faster-clocked, more 68040-like CPU makes sense in an FPGA. Bigger caches, a branch cache and more parallel operation are needed for maximizing performance though.

JimDrew · « **Reply #77 on:** January 08, 2013, 02:14:28 PM »

I never saw any crashing of the Mac OS itself under FUSION, but there were plenty of apps that did not like superscalar and/or branch caches enabled. However, in all fairness these apps were written for the 020/030, so minimal cache flushing was required for self modifying code, which is quite common for decompression programs, encryption for copy protection, etc. Some of these issues could be due to buggy superscalar mode in early 060 revs. I went from the first rev available from Phase 5 to the PPC, so I never tested any newer revs.

I know that floating point math scores were often times faster with the 68040-33MHz X-Calibur board than the Cyberstorm 060-50MHz.

Sadly even a 75MHz 060 is really slow compared to modern Intel processors, so the idea of a dedicated Intel CPU adapter is probably the fastest and least expensive option.

wawrzon · « **Reply #78 on:** January 08, 2013, 03:46:32 PM »

actually someone is actively working on an a600 fpga accelerator:

http://www.natami.net/knowledge.php?b=6&note=32232&x=7

http://www.a1k.org/forum/showthread.php?p=589135#post589135

he may not be very experienced, but he is motivated, seems to learn quickly, and may need all the assistance and support he can get from experienced hardware experts (preferably without sarcastic remarks).

it seems to be a rule that the experienced users don't start such projects for whatever reason, maybe being bored with it at work or maybe knowing how much effort it is. on the other hand there are those willing newbies. being stubborn and treated right, like the guy recreating thylacine-usb also on a1k, they actually deliver some results. so maybe just try to support them as much as possible.

as for coldfire: if there were free, open source, modifiable cores, one might adapt them to grant backwards 68k compatibility. i guess the natami team might have had such a chance, as they were provided coldfire dev boards by bbrv at some point. but there are no open cf cores as far as i can see, and it's unlikely freescale will reveal its designs, so this isn't an opportunity.

JimDrew · « **Reply #79 on:** January 08, 2013, 04:29:35 PM »

The experienced users that have this much hardware/software/firmware knowledge generally do not work for free. So, with an extremely limited market, there is really no desire to work on something where you won't at least recoup your time investment.

There are quite a few neat Amiga products I would make if I could sell thousands of each of them. But, there is no chance for that. The C64 market is MUCH bigger (as I am finding out and making products for it).

Cosmos · « **Reply #80 on:** January 08, 2013, 05:13:13 PM »

Quote from: JimDrew;721789

The experienced users that have this much hardware/software/firmware knowledge generally do not work for free. So, with an extremely limited market, there is really no desire to work on something where you won't at least recoup your time investment.

For saving the Amiga, the last Amiga Classics experts must work together for free, in one direction... Together, we are stronger !

I have never found anyone who understands this obvious point...

wawrzon · « **Reply #81 on:** January 08, 2013, 05:43:19 PM »

perhaps by focusing the efforts of all involved, the (intellectual and work) investment for each one could be minimized while solutions may be reached more easily, as happens with open source efforts such as aros. when the common (technical) goal has been reached, the actual technology can anyway only be provided by those who are able to handle it. therefore supporting the common goal might be beneficial for all.

on the other hand, the c64 hardware market, besides being bigger, is probably easier to satisfy with simpler solutions.

freqmax · « **Reply #82 on:** January 08, 2013, 05:53:26 PM »

Some developers may want to make the most compatible solution, others the fastest with the most bells and whistles, which of course ends with you losing the starting point.

Some have different coding styles. Or just use different schematic CAD tools. It might be more fun to make something new than to integrate with existing creations. Some stuff just requires a heavy start, like Kickstart+Workbench, and thus requires dedicated work like that undertaken by AROS-m68k, etc.

There are reasons why efforts diverge.

freqmax · « **Reply #83 on:** January 08, 2013, 06:30:06 PM »

Quote from: wawrzon;721785

actually someone is actively working on an a600 fpga accelerator:

http://www.natami.net/knowledge.php?b=6&note=32232&x=7

http://www.a1k.org/forum/showthread.php?p=589135#post589135

He has his own site: http://www.majsta.com/

Seems several FPGAs had to give their life to that project due to soldering technique. But he seems on track now.

billt · « **Reply #84 on:** January 08, 2013, 06:33:54 PM »

Quote from: JimDrew;721789

The experienced users that have this much hardware/software/firmware knowledge generally do not work for free. So, with an extremely limited market, there is really no desire to work on something where you won't at least recoup your time investment.

But a few are. That's how we got Minimig, TG68, aoOCS, Suska, Zet, OpenGraphics, etc. Check out opencores.org for a variety of things.

billt · « **Reply #85 on:** January 08, 2013, 06:45:29 PM »

Quote from: wawrzon;721785

actually someone is actively working on an a600 fpga accelerator

Indeed, and I've been very excited about that. It's basically what I imagine for this discussion, only with a 68000 plug on the bottom rather than 040/060. It's not exactly shaped like a 68000, it has level shifting to be 5V safe, has power and memory onboard. But very much the same idea. He's working with the TG68, which has had some issues to work out to fit onto a standard 68000 bus. I'd really like to see the Suska 68000 code in there instead, as I think it would more readily fit the standard bus than the TG68. (Though I understand that further work on TG68 core is improving that as well, in addition to enhancing to 020 compatibility)

wawrzon · « **Reply #86 on:** January 08, 2013, 07:06:10 PM »

Quote from: freqmax;721792

Some developers may want to make the most compatible solution, others the fastest with the most bells and whistles, which of course ends with you losing the starting point.

Some have different coding styles. Or just use different schematic CAD tools. It might be more fun to make something new than to integrate with existing creations. Some stuff just requires a heavy start, like Kickstart+Workbench, and thus requires dedicated work like that undertaken by AROS-m68k, etc.

There are reasons why efforts diverge.

all that has to be overcome in a software open source community project as well. all an individual has to understand about it is that together we can get further and faster than each on his own. all that has some drawbacks, like making appointments and coordinating the effort to a certain extent, but the gain is usually worth the cost. otherwise we wouldn't have any industry, hell we would not have a civilisation. it's just a basic social thing.

that said, i'm not trying to insist on anything, just pointing out stuff to consider. look at natami, where an isolated one man project tends to end after all that effort. nothing has been gained, save for thomas himself. maybe even he considers it wasted time now. if it was open someone else might have followed up..

wawrzon · « **Reply #87 on:** January 08, 2013, 07:08:27 PM »

Quote from: freqmax;721796

He has his own site: http://www.majsta.com/

Seems several FPGAs had to give their life to that project due to soldering technique. But he seems on track now.

i said he may not be experienced but stubborn. that's valuable too. if it was a common effort someone else could solder for him..

billt · « **Reply #88 on:** January 08, 2013, 07:27:38 PM »

Quote from: JimDrew;721783

Sadly even a 75MHz 060 is really slow compared to modern Intel processors, so the idea of a dedicated Intel CPU adapter is probably the fastest and least expensive option.

Cheapest? It adds an Intel CPU and whatever minimal supporting chipset is required for that to run to the FPGA and whatever is required to safely connect the FPGA to the 680x0 socket. Maybe you can get a smaller and thus cheaper FPGA for a minimal PC to 680x0 bus bridge compared to putting as high-end a 680x0 softcore as can be put into an FPGA, but I don't think that will offset the price of the PC motherboard. (ie, can an x86 CPU work without a PCH/FCH chip?) Yes, I know there are some ludicrously expensive FPGAs, but I don't expect they'd be chosen for this sort of product. We should be able to fit this sort of thing into something reasonable.

Would the x86 doing nothing more than emulating a 68K be higher performance than the FPGA? That's possible.

Of the big companies I've tried to get NDAs from for various hardware projects, Intel is one of the few that said we were not worth the time to process NDA paperwork. This 68K emulator from a tiny modern PC would probably be equally uninteresting to them. A custom CPU accelerator with an X86 is unlikely with Intel. I want to see if a PCH or FCH chip would connect to a PowerPC PCI-Express slot, like in Sam460... AMD is much easier to get in with, at least for their embedded class stuff, maybe not their high-end.

I suppose you could see if a nano or pico-ITX PC would be small enough and have an appropriate bus to plug into the FPGA bridge. But that means you still have to buy that ?-ITX computer, and I'm not sure if they have a PCI slot/header/something available. It'd be silly to have USB in your CPU socket pathway...

In the FPGA CPU softcore vs x86 emulator debate, my own interest lies on the FPGA softcore side, perhaps partially as I'm just an FPGA fan and am interested in learning how to do that. Doing a bridge between the 680x0 socket and something on a PC motherboard, also inside an FPGA, is probably an easier Verilog/VHDL project compared to a CPU softcore, but doesn't interest me personally as much. So I will tend to favor FPGA softcore regardless of other benefits on the PC motherboard side of the debate, but my preference gets into somewhat subjective preference and interest areas. It just sounds more fun.

mongo · « **Reply #89 on:** January 08, 2013, 07:55:31 PM »

Quote from: billt;721798

Indeed, and I've been very excited about that. It's basically what I imagine for this discussion, only with a 68000 plug on the bottom rather than 040/060. It's not exactly shaped like a 68000, it has level shifting to be 5V safe, has power and memory onboard. But very much the same idea. He's working with the TG68, which has had some issues to work out to fit onto a standard 68000 bus. I'd really like to see the Suska 68000 code in there instead, as I think it would more readily fit the standard bus than the TG68. (Though I understand that further work on TG68 core is improving that as well, in addition to enhancing to 020 compatibility)

TobiFlex had the TG68 core running on an A500 about 3 years ago.

http://www.a1k.org/forum/showthread.php?t=20223
