Author Topic: 68060 and the 68882 (Read 15688 times)

matthey · « **on:** April 21, 2011, 01:28:05 AM »

Quote from: SpeedGeek;632671

But you won't see the performance you would have with a 50 MHz 68030 by using the instruction trap kludge.

Trapping would slow the CPU down to a crawl. The 68882 is also a dog compared to the 68060 FPU. Roughly 1/8 of the speed on average at the same clock rate comes to mind (not counting any trapping overhead). If I remember correctly, Motorola also did something to keep the 68881/68882 from being easily used with the 68040 and later. Don't quote me on the last 2 statements though. Here is a chart of some common FPU instructions and their timings in cycles for the 68882, 68040 and 68060, in that order...

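| Instruction | 68882 | 68040 | 68060 |
|---|---|---|---|
| FMove FPn,FPn | 21 | 2 | 1 |
| FMove.D `<ea>`,FPn | 40 | 3 | 1 |
| FMove.D FPn,`<ea>` | 44 | 3 | 1 |
| FAdd FPn,FPn | 21 | 3 | 3 |
| FSub FPn,FPn | 21 | 3 | 3 |
| FMul FPn,FPn | 76 | 5 | 3 |
| FDiv FPn,FPn | 108 | 38 | 37 |
| FSqrt FPn,FPn | 110 | 103 | 68 |
| FAdd.D `<ea>`,FPn | 75 | 3 | 3 |
| FSub.D `<ea>`,FPn | 75 | 3 | 3 |
| FMul.D `<ea>`,FPn | 95 | 5 | 3 |
| FDiv.D `<ea>`,FPn | 127 | 38 | 37 |
| FSqrt.D `<ea>`,FPn | 129 | 103 | 68 |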

For trapping, add in 19 cycles for the trap and 17 for the RTE instruction on the 68060. Also consider that integer instructions and branches can operate in parallel with FPU instructions on the 68060 and can't while trapping. The 68060 would probably be faster with an all software floating point library in most cases. You should look at the Natami project if you want a faster 68k CPU and FPU.
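To put rough numbers on this, here is a minimal Python sketch that takes the register-to-register timings from the chart plus the 19 and 17 cycle trap and RTE figures and works out how much slower the 68882 is per instruction, and a lower bound on the cost of one trapped instruction. The function names and the `handler_cycles` parameter are made up for illustration. The post gives no figure for the work done inside the trap handler, so that part is left as a placeholder rather than a measured value.

```python
# Cycle counts for the register-to-register forms, copied from the chart
# in the post: (68882, 68040, 68060).
TIMINGS = {
    "FMove": (21, 2, 1),
    "FAdd": (21, 3, 3),
    "FSub": (21, 3, 3),
    "FMul": (76, 5, 3),
    "FDiv": (108, 38, 37),
    "FSqrt": (110, 103, 68),
}

TRAP_CYCLES = 19  # exception entry on the 68060, per the post
RTE_CYCLES = 17   # return from exception on the 68060, per the post


def slowdown_vs_68060(name):
    """How many times more cycles the 68882 needs than the 68060 FPU."""
    c68882, _c68040, c68060 = TIMINGS[name]
    return c68882 / c68060


def trapped_cycles(name, handler_cycles=0):
    """Rough lower bound for one trapped instruction run on a 68882.

    handler_cycles is a hypothetical placeholder for the decode and
    bus-transfer work done inside the trap handler; the post gives no
    figure for it, so it defaults to 0.
    """
    return TRAP_CYCLES + RTE_CYCLES + handler_cycles + TIMINGS[name][0]


if __name__ == "__main__":
    for name in TIMINGS:
        print(f"{name:6} {slowdown_vs_68060(name):5.1f}x slower, "
              f"at least {trapped_cycles(name)} cycles when trapped")
```

Even with the handler cost set to zero, the trap and RTE alone add 36 cycles to every FPU instruction, which is more than a native 68060 FAdd, FSub or FMul takes in total.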

matthey · « **Reply #1 on:** April 21, 2011, 02:34:47 AM »

Quote from: Iggy;632683

A further question. On 68Ks without FPUs, do all floating point operations produce exceptions?

Yes.

Quote from: Iggy;632683

Further, would it be possible to program an FPGA to emulate (or improve upon) these trapped illegal opcodes?

It's not that simple. The CPU is set up to communicate with co-processors. The FPU instructions actually have a 3 bit co-processor ID specified in them. When set up properly, the instructions are sent to the appropriate co-processor without trapping, and the co-processor signals when it's done with the instruction. If I remember correctly, Motorola changed something in the 68040 and later so that the old external FPU co-processors didn't work any more. They wanted people using the newer style and faster built-in FPU, as well as customers buying them. You could probably research how external co-processors were done in the 68020/68030 and 68881/68882 manuals. More than 1 FPU was possible too. Someone at C= supposedly made a 16 math co-processor card (8 should be the limit of co-processor IDs). That would probably have more processing power than a 68060 FPU if they could all be used in parallel. Still, some operations like fmove have less overhead when the FPU is integrated into the CPU.
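As a small illustration of where that co-processor ID lives, here is a Python sketch that pulls the fields out of the first 16-bit word of a co-processor (F-line) instruction. On the 68020/68030 these opcodes have 1111 in the top four bits and the 3 bit ID in bits 11-9, and the FPU conventionally sits at ID 1. The function name and the dictionary keys are just made up for this example.

```python
def decode_f_line(opword):
    """Split a 16-bit 68k opcode word into its coprocessor fields.

    Returns None when the word is not an F-line (coprocessor) opcode.
    """
    if ((opword >> 12) & 0xF) != 0xF:
        return None
    return {
        "cp_id": (opword >> 9) & 0x7,  # 3-bit coprocessor ID, 0..7
        "type": (opword >> 6) & 0x7,   # coprocessor instruction type
        "ea": opword & 0x3F,           # effective address mode/register
    }


if __name__ == "__main__":
    # 0xF200 is the first word of a general FPU instruction with ID 1.
    print(decode_f_line(0xF200))
    # 0x4E75 (RTS) is not an F-line opcode at all.
    print(decode_f_line(0x4E75))
```

With only 3 bits there are 8 possible IDs, which is where the limit of 8 co-processors comes from.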

An FPGA can contain a full FPU running much faster than a 68882. If it's not integrated with the CPU, there is going to be a bottleneck even if the traps can be avoided. The CPU+FPU can be contained in one FPGA without that overhead. Fewer clocks per instruction, without a longer pipeline than the 68060, are possible. Gunnar (Natami project) claims 1 cycle for a floating point multiply (fmul) should be possible, for example.

Quote from: Iggy;632690

Interesting idea. Are you suggesting that an EC processor with a software floating point library might be faster than using the built in FPU of a full 68060?

No. A 68060 without an FPU using a software floating point library would likely be faster than one using a 68882. But a 68060 without an FPU running software floating point would have to be clocked several times faster than a 68060 with FPU to match its performance. There is still a trap here as well unless the AmigaOS IEEE math libraries are used.
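To give an idea of what a software floating point library has to do with plain integer operations, here is a minimal Python sketch of a single precision multiply that works on the raw IEEE 754 bit patterns. It is only an illustration under stated assumptions: it handles normal numbers whose result is also a normal number (no zeros, denormals, infinities, NaNs or overflow), and the function names are made up. A real library has to cover all of those cases, which is part of why it is slow.

```python
import struct


def f32_bits(x):
    """Round a Python float to single precision and return its bits."""
    return struct.unpack(">I", struct.pack(">f", x))[0]


def bits_f32(b):
    """Turn a 32-bit pattern back into a Python float."""
    return struct.unpack(">f", struct.pack(">I", b))[0]


def soft_fmul_single(a_bits, b_bits):
    """Multiply two normal single precision numbers using integers only.

    Assumes normal inputs and a normal result; rounds to nearest even.
    """
    sign = (a_bits ^ b_bits) & 0x80000000
    exp = ((a_bits >> 23) & 0xFF) + ((b_bits >> 23) & 0xFF) - 127
    # Restore the implicit leading 1 to get 24-bit mantissas.
    ma = (a_bits & 0x7FFFFF) | 0x800000
    mb = (b_bits & 0x7FFFFF) | 0x800000
    prod = ma * mb  # up to 48 bits wide

    # Normalize: the product of two values in [1, 2) lies in [1, 4).
    if prod & (1 << 47):
        shift = 24
        exp += 1
    else:
        shift = 23
    mant = prod >> shift
    rem = prod & ((1 << shift) - 1)
    half = 1 << (shift - 1)

    # Round to nearest, ties to even.
    if rem > half or (rem == half and (mant & 1)):
        mant += 1
        if mant == (1 << 24):  # rounding carried out of the mantissa
            mant >>= 1
            exp += 1
    return sign | (exp << 23) | (mant & 0x7FFFFF)


if __name__ == "__main__":
    result = soft_fmul_single(f32_bits(1.5), f32_bits(2.5))
    print(bits_f32(result))
```

Note that even this stripped-down version needs a 24x24 bit multiply with a 48 bit result, plus the unpacking, normalizing and rounding steps around it.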

matthey · « **Reply #2 on:** April 21, 2011, 04:42:08 AM »

Quote from: Iggy;632700

Thanks matthey,
While I'm not concerned about the lack of an MMU, I was trying to find a work around for the FPU functions. If trapping exceptions and using a software library is plausible it might provide one solution.

Thomas Richter created a program much like OxyPatcher called MuRedox that avoided the trapped instructions with replacement code on the fly. He posts on the Natami forum frequently. His code would need a fair amount of work to support all FPU instructions though. Optimized 68060 integer code and no traps should be faster than a 68882. If single precision was all that was required then the integer unit might be as fast as 1/4 the speed of a 68060 with FPU (my guess). Most calculations are done with extended precision though, which can be time consuming for an integer processor, especially multiplication (no 64 bit integer multiplication in 68060), division and square root. A better option is for everyone who wants a fast 68060 to let the Natami team know, so that once there is enough demand and the bugs are fixed, an N68070 with FPU can be burned into a real chip. Think 300-500MHz and faster per MHz than a 68060. Probably won't be ready for another year or two though.
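Here is a rough Python sketch of why the extended precision multiply hurts on an integer unit. Extended precision has a 64 bit mantissa, so the product needs a 64x64 to 128 bit multiply. The sketch assumes the only cheap full-width building block is a 16x16 to 32 bit multiply, and builds the wide product out of partial products from that. The function names are made up, and the sketch ignores the carry handling and register juggling that real 68k code would also need.

```python
MASK16 = 0xFFFF
MASK32 = 0xFFFFFFFF


def mul16(a, b):
    """16x16 -> 32 multiply, the building block assumed to be cheap."""
    assert 0 <= a <= MASK16 and 0 <= b <= MASK16
    return a * b


def mul32_to_64(a, b):
    """32x32 -> 64 multiply from four 16x16 partial products."""
    a_hi, a_lo = a >> 16, a & MASK16
    b_hi, b_lo = b >> 16, b & MASK16
    return ((mul16(a_hi, b_hi) << 32)
            + ((mul16(a_hi, b_lo) + mul16(a_lo, b_hi)) << 16)
            + mul16(a_lo, b_lo))


def mul64_to_128(a, b):
    """64x64 -> 128 multiply from four 32x32 -> 64 partial products."""
    a_hi, a_lo = a >> 32, a & MASK32
    b_hi, b_lo = b >> 32, b & MASK32
    return ((mul32_to_64(a_hi, b_hi) << 64)
            + ((mul32_to_64(a_hi, b_lo) + mul32_to_64(a_lo, b_hi)) << 32)
            + mul32_to_64(a_lo, b_lo))


if __name__ == "__main__":
    # Two arbitrary 64-bit mantissa-sized values.
    a = 0xC90FDAA22168C235
    b = 0xADF85458A2BB4A9A
    print(hex(mul64_to_128(a, b)))
```

Done this way, one 64 bit mantissa product costs sixteen 16x16 multiplies plus all the shifts and adds to combine them, before any normalizing or rounding.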
