Amiga.org

Amiga computer related discussion => General chat about Amiga topics => Topic started by: polyp2000 on November 11, 2015, 08:45:57 AM

Title: FPGA Amiga Possibilities ?
Post by: polyp2000 on November 11, 2015, 08:45:57 AM
This popped up on one of my feeds today:

http://makezine.com/2015/10/12/snickerdoodle-dev-board-fpga-arm-processor/

Sounds like an interesting combination, don't you think?

Anyone with knowledge about FPGAs care to speculate on whether this could make a good platform for some unusual Minimig or even accelerators?

Nick
Title: Re: FPGA Amiga Possibilities ?
Post by: Iggy on November 11, 2015, 02:19:38 PM
Interesting.
Xilinx already makes lines that combine FPGA with dual PPC (I have seen one of those chips going for about $50).

But this is a whole board.
Title: Re: FPGA Amiga Possibilities ?
Post by: matthey on November 11, 2015, 04:26:49 PM
Quote from: polyp2000;799083
This popped up on one of my feeds today:

http://makezine.com/2015/10/12/snickerdoodle-dev-board-fpga-arm-processor/

Sounds like an interesting combination, don't you think?


The ARM+FPGA SoCs have been available for a while, but a $55 base price for a whole board is cheap and should attract attention. They will probably sell hundreds of thousands of this board, mostly for embedded and educational purposes. Meanwhile, the Amiga PPC is lucky to sell a few thousand low end boards at 10x the price or more as they beat a dying old horse.

Quote from: polyp2000;799083

Anyone with knowledge about FPGAs care to speculate on whether this could make a good platform for some unusual Minimig or even accelerators?


The number of FPGA I/O pins is pretty good, with 154 or 179 available. The FPGA is still fairly small though, with 430k and 1.3M gates (the FPGA Arcade uses a Xilinx 1.6M gate FPGA). The Apollo project had a sandwich accelerator which connected to a standard FPGA board with a similar price to the high end Snickerdoodle but had a little bigger FPGA as I recall. I wouldn't be surprised if the base $55 Snickerdoodle is sold at cost to attract attention and the profit is in the high end board and value added options. These guys understand marketing and paid attention to the Raspberry Pi's success (piSmasher anyone?). An Amiga 68k+FPGA would have standard hardware with software already available. Amiga developers might even come back with good sales numbers. Meanwhile, Amiga PPC continues to beat that dying old horse.
Title: Re: FPGA Amiga Possibilities ?
Post by: kolla on November 11, 2015, 08:37:44 PM
It should be OK to put Minimig chipset on FPGA and emulate 68k on one of the ARM cores. Or do funny things with native AROS ARM and Amiga chipset on FPGA. Or... there are many possibilities with such a board.
Title: Re: FPGA Amiga Possibilities ?
Post by: polyp2000 on November 12, 2015, 08:55:29 AM
Quote from: kolla;799101
It should be OK to put Minimig chipset on FPGA and emulate 68k on one of the ARM cores. Or do funny things with native AROS ARM and Amiga chipset on FPGA. Or... there are many possibilities with such a board.


That sounds pretty cool. With an ARM chip of this calibre, what kind of speed could we hope to achieve?
Title: Re: FPGA Amiga Possibilities ?
Post by: matthey on November 12, 2015, 06:50:12 PM
Quote from: polyp2000;799114
That sounds pretty cool. With an ARM chip of this calibre, what kind of speed could we hope to achieve?

The Raspberry Pi 2 will emulate a 68000+ECS Amiga generally without slowdown. Full speed 68020+AGA emulation is a challenge (not there yet) on the Raspberry Pi 2 but Amiga emulators are only using 1 core (using more cores to boost performance is also challenging).

Raspberry Pi 1 single core ARM1176JZF-S@700MHz 1.25 DMIPS/MHz/core = 875 DMIPS/core
Raspberry Pi 2 quad core ARM Cortex-A7@900MHz 1.90 DMIPS/MHz/core = 1710 DMIPS/core
Snickerdoodle dual core ARM Cortex-A9@667MHz 2.50 DMIPS/MHz/core = 1668 DMIPS/core
Snickerdoodle+ dual core ARM Cortex-A9@866MHz 2.50 DMIPS/MHz/core = 2165 DMIPS/core

Tabor dual core PPC P1022@1200MHz 2.4 DMIPS/MHz/core = 2880 DMIPS/core
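
The per-core figures above are simply the clock speed multiplied by the quoted DMIPS/MHz rating. A minimal sketch of that arithmetic, using only the numbers listed in this post (the names `CPUS` and `dmips_per_core` are made up for illustration):

```python
# Clock (MHz) and DMIPS/MHz per core, exactly as quoted in the post.
CPUS = {
    "Raspberry Pi 1 (ARM1176JZF-S)": (700, 1.25),
    "Raspberry Pi 2 (Cortex-A7)": (900, 1.90),
    "Snickerdoodle (Cortex-A9)": (667, 2.50),
    "Snickerdoodle+ (Cortex-A9)": (866, 2.50),
    "Tabor (P1022)": (1200, 2.40),
}


def dmips_per_core(mhz, dmips_per_mhz):
    """Estimated Dhrystone MIPS for one core: clock times per-MHz rating."""
    return mhz * dmips_per_mhz


if __name__ == "__main__":
    for name, (mhz, rating) in CPUS.items():
        print(f"{name}: {dmips_per_core(mhz, rating):.0f} DMIPS/core")
```

This only restates the table; Dhrystone says nothing about memory bandwidth or how well a given emulator uses the core.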

The Tabor board puts the low integer performance in perspective as these are all power efficient weak RISC CPU designs. The P1022 is basically a PPC G3 design (introduced in 1997 like the Pentium II) with faster buses, more bandwidth, die shrinks for increased clock speeds (necessary for a shallow pipeline CPU design) and a non-standard PPC FPU. I won't bother mentioning the costs. Supposedly the Tabor board has a Lattice FPGA of unknown size. A big enough FPGA with unused gates could simulate the Amiga custom chips in the Snickerdoodle and Tabor, offloading a lot of work from the CPU. There are many variables including how the FPGA is connected, what kind of graphics are available, etc.
Title: Re: FPGA Amiga Possibilities ?
Post by: kolla on November 12, 2015, 09:30:31 PM
More interesting for this board is if one can couple m68k emulation on ARM CPU with Amiga chipset on FPGA. Where Tabor/PPC et al. fit into this I am not sure :)
Title: Re: FPGA Amiga Possibilities ?
Post by: matthey on November 12, 2015, 10:11:43 PM
Quote from: kolla;799131
More interesting for this board is if one can couple m68k emulation on ARM CPU with Amiga chipset on FPGA. Where Tabor/PPC et al. fit into this I am not sure :)

The Tabor board could also couple m68k emulation (but on a PPC CPU) with Amiga chipset in the FPGA, if the FPGA was big enough and available for use. This could allow a PPC Amiga without the overhead of emulation although with non-standard PPC FPU, possibly not open enough hardware and AmigaOS to make it easy and with a much higher price. If you are thinking of a cheap Amiga then the Tabor probably doesn't "fit" here even though it was useful for CPU speed comparison. The FPGA Arcade and Mist could be considered also, despite being higher in price than the Pi or Snickerdoodle. Their fast and more accurate Amiga custom chip simulation more than makes up for their slower CPU simulation.
Title: Re: FPGA Amiga Possibilities ?
Post by: Acill on November 12, 2015, 10:42:35 PM
Don't let them hear you talking about this as an Amiga emulation platform. It will raise the price!! ;)
Title: Re: FPGA Amiga Possibilities ?
Post by: fishy_fiz on November 14, 2015, 05:08:09 AM
@Matthey

RPi2 is quad core, not dual core.
Also, performance of uae on rpi2 is around the '060 type range when using uae4arm (which has jit CPU emulation). Far in excess of "not quite there yet" (in regards to 68020+aga).

Seriously, did you just pick random numbers? :)
Title: Re: FPGA Amiga Possibilities ?
Post by: matthey on November 14, 2015, 05:41:13 AM
Quote from: fishy_fiz;799177

RPi2 is quad core, not dual core.
Also, performance of uae on rpi2 is around the '060 type range when using uae4arm (which has jit CPU emulation). Far in excess of "not quite there yet" (in regards to 68020+aga).


Yes, the RPi2 is quad core, not that it makes much difference when UAE uses a single core (I was comparing single core performance). The RPi2 cores are pretty weak as they are well below PPC G3 or Tabor performance. I have a nearly 20 year old free PC which destroys the RPi2 in single core performance. Emulating the 68k CPU is much easier than emulating the Amiga custom chips. Also, JIT cheats and is less accurate emulation.
Title: Re: FPGA Amiga Possibilities ?
Post by: fishy_fiz on November 14, 2015, 07:20:31 AM
Yeah, I'm well aware of the fact that uae mostly uses a single thread, and that a Cortex-A7@900MHz can be beaten silly by even a P3.

That's all a sidebar though. I was simply suggesting that if you're going to write down figures you might want to get them right. :)
Title: Re: FPGA Amiga Possibilities ?
Post by: psxphill on November 14, 2015, 11:48:06 AM
Quote from: matthey;799178
Also, JIT cheats and is less accurate emulation.


All emulators cheat to some extent, but it is possible to write a JIT that is as accurate as an interpreter.
Title: Re: FPGA Amiga Possibilities ?
Post by: johnklos on November 14, 2015, 09:31:37 PM
Quote from: matthey;799178
The RPi2 cores are pretty weak as they are well below PPC G3 or Tabor performance. I have a nearly 20 year old free PC which destroys the RPi2 in single core performance. Emulating the 68k CPU is much easier than emulating the Amiga custom chips. Also, JIT cheats and is less accurate emulation.


While you've made your point, a nearly twenty year old CPU certainly doesn't destroy a Raspberry Pi 2 in single core performance. Being generous, by the end of 1997 the fastest x86 you could get was a 300 MHz Pentium II, and the fastest PowerPC was a 266 MHz PowerPC 750.
Title: Re: FPGA Amiga Possibilities ?
Post by: matthey on November 14, 2015, 11:06:48 PM
Quote from: psxphill;799182
All emulators cheat to some extent, but it is possible to write a JIT that is as accurate as an interpreter.

I believe you are correct. However, the whole point of JIT is to maximize performance so short cuts are usually considered advantages if they don't cause too many problems. JIT code is usually executed in an isolated sandbox where problems don't kill the system. Also, JIT is error prone because of complexity.

Quote from: johnklos;799192
While you've made your point, a nearly twenty year old CPU certainly doesn't destroy a Raspberry Pi 2 in single core performance. Being generous, by the end of 1997 the fastest x86 you could get was a 300 MHz Pentium II, and the fastest PowerPC was a 266 MHz PowerPC 750.

I may have exaggerated the "almost" 20 year age of my free PC a little to make my point. There are 20 year old CPU designs which have stronger single core performance than the RPi2 though. Put a Pentium 2 or even PPC G3 design in the same size die at the same clock speed as the RPi2 and it should beat it in single core performance. The PPC P1022 (Tabor) design is a good example of a recycled G3 design in a modern die size and it is some 26% better at single core performance per clock than the RPi2 (2.4 vs 1.9 DMIPS/MHz). CISC is even stronger and the Pentium 2 had a long enough pipeline to allow clocking it up even if Intel had not figured out all the complexity of the x86 ISA yet (they were already designing the Pentium 3 which had it figured out before they became sidetracked marketing high clocked Pentium 4 room heaters). Clock the 68060 design up to RPi2 speeds and I bet it will have stronger real world single core performance also. I wouldn't be surprised if the Apollo core in FPGA has stronger single core integer performance. Actually, it can handle 3 complex integer instructions per cycle (equal up to 12 integer RISC instructions/cycle) with no load/store bubbles while the P1022 is limited to 2 simple integer instructions per cycle. It is also much stronger at multiplication than the P1022 which is as bad at multiplication as the old G3. This is with a weakened Apollo core in an FPGA which could be considerably stronger in an ASIC. Of course these RISC cores are low power consumption minimal logic embedded processor designs.
Title: Re: FPGA Amiga Possibilities ?
Post by: psxphill on November 14, 2015, 11:38:00 PM
Quote from: matthey;799195
However, the whole point of JIT is to maximize performance so short cuts are usually considered advantages if they don't cause too many problems.

That could also describe "enhancements" made to 68060 when implementing in an FPGA as well. You can take shortcuts in a pure interpreting emulator as well (almost all of them do).

Quote from: matthey;799195
JIT code is usually executed in an isolated sandbox where problems don't kill the system.

No they aren't. JIT is used to implement a sandbox for Java/C#, but that is because the JIT adds code to do range checks, null checks, etc. The generated code just runs.

Quote from: matthey;799195
Also, JIT is error prone because of complexity.

Unit testing and other forms of automated testing are important if you are writing something as complex as a 68000 cpu core anyway. But things like self modifying code are always going to be a little bit of a problem with a JIT, there are multiple strategies for this which each have their own overheads. I would be interested in how well a 68060 JIT would work, if it only kept generated code that fit into the instruction cache. Code that runs fast on a real 68060 would run fast on the emulator & the same for slow code.
Title: Re: FPGA Amiga Possibilities ?
Post by: matthey on November 15, 2015, 01:17:32 AM
Quote from: psxphill;799196
That could also describe "enhancements" made to 68060 when implementing in an FPGA as well. You can take shortcuts in a pure interpreting emulator as well (almost all of them do).

In the case of the classic Amiga, CPU "enhancements" which aren't 68000 or 68020 ISA compatible will break.

Quote from: psxphill;799196
No they aren't. JIT is used to implement a sandbox for Java/C#, but that is because the JIT adds code to do range checks, null checks, etc. The generated code just runs.

Most other OSs have memory protection for process protection at least.

Quote from: psxphill;799196
Unit testing and other forms of automated testing are important if you are writing something as complex as a 68000 cpu core anyway. But things like self modifying code are always going to be a little bit of a problem with a JIT, there are multiple strategies for this which each have their own overheads. I would be interested in how well a 68060 JIT would work, if it only kept generated code that fit into the instruction cache. Code that runs fast on a real 68060 would run fast on the emulator & the same for slow code.

Self modifying code is a performance limitation on real processors also. A real CPU can have hardware help to snoop out the writes but flushing and reloading the ICache still has overhead, especially if the code has been fetched by the CPU. I suppose the JIT could use the MMU, but frequently changing the way pages are marked and taking interrupts on page violations may be more overhead than it is worth.
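
As a rough illustration of the MMU idea, here is a toy model in which translated blocks are tracked per page and thrown away when that page is written. It is only a sketch under stated assumptions: `PageTrackedJit`, `guest_write` and the 4096-byte `PAGE_SIZE` are hypothetical names and values, and the explicit `guest_write` call stands in for the fault a write-protected code page would raise on real hardware.

```python
from collections import defaultdict

PAGE_SIZE = 4096  # assumed page size, for illustration only


class PageTrackedJit:
    """Toy model: drop translated blocks when the page holding them is written."""

    def __init__(self):
        self.blocks = {}               # guest address -> translated block
        self.pages = defaultdict(set)  # page number -> block start addresses
        self.invalidations = 0

    def add_block(self, addr, length, translated):
        self.blocks[addr] = translated
        # A block may straddle a page boundary, so register every page it touches.
        first = addr // PAGE_SIZE
        last = (addr + length - 1) // PAGE_SIZE
        for page in range(first, last + 1):
            self.pages[page].add(addr)

    def guest_write(self, addr):
        """Stand-in for the fault raised by a write to a protected code page."""
        page = addr // PAGE_SIZE
        for start in self.pages.pop(page, set()):
            # The block may already be gone if it straddled another written page.
            if self.blocks.pop(start, None) is not None:
                self.invalidations += 1
```

The coarse granularity is where the overhead shows up: one data write into a page that also holds code discards every translation on that page.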
Title: Re: FPGA Amiga Possibilities ?
Post by: psxphill on November 15, 2015, 12:46:53 PM
Quote from: matthey;799199
Most other OSs have memory protection for process protection at least.


Sure, but that affects normal programs too. JIT compiled code doesn't normally run under a further level of protection, which is what your statement implied.

Quote from: matthey;799199
Self modifying code is a performance limitation on real processors also. A real CPU can have hardware help to snoop out the writes but flushing and reloading the ICache still has overhead, especially if the code has been fetched by the CPU. I suppose the JIT could use the MMU, but frequently changing the way pages are marked and taking interrupts on page violations may be more overhead than it is worth.


That is why I suggested implementing the code cache so that it mirrors the 68060 instruction cache. Code that works on a 68060 would automatically work on the JIT. You don't have to snoop as the 68060 didn't, just flush the JIT cache when the program clears the cache. The reason to implement it with the same size as the 68060 is in case something is modifying code that would have been automatically retired from the 68060 cache.

Code that fits in the cache would run fast in the JIT, the same as the 68060. Code that doesn't fit in the cache would run slow in the JIT, the same as the 68060.

The problem comes when you want to accelerate code that was slow to begin with, because this is likely to slow down code that was fast to begin with.
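
A minimal sketch of the code cache described in this post, assuming a translation cache bounded by the guest's instruction cache size and flushed whenever the guest clears its cache. The class and method names are hypothetical, and eviction is plain LRU over whole blocks, which is a simplification of how a real set-associative cache replaces lines.

```python
from collections import OrderedDict

ICACHE_BYTES = 8 * 1024  # 68060 instruction cache size; change to model another CPU


class MirroredCodeCache:
    """Toy JIT code cache bounded by the guest's instruction cache size.

    Simplification: eviction is LRU over whole blocks rather than the
    real cache's set-associative line replacement.
    """

    def __init__(self, capacity=ICACHE_BYTES):
        self.capacity = capacity
        self.used = 0
        self.blocks = OrderedDict()  # guest address -> (guest length, translated code)

    def lookup(self, addr):
        entry = self.blocks.get(addr)
        if entry is None:
            return None  # miss: caller translates or interprets instead
        self.blocks.move_to_end(addr)  # mark as most recently used
        return entry[1]

    def insert(self, addr, length, translated):
        if length > self.capacity:
            return False  # would never fit in the modelled cache either
        if addr in self.blocks:
            self.used -= self.blocks.pop(addr)[0]
        # Evict least recently used blocks until the new one fits.
        while self.used + length > self.capacity:
            _, (old_len, _) = self.blocks.popitem(last=False)
            self.used -= old_len
        self.blocks[addr] = (length, translated)
        self.used += length
        return True

    def flush(self):
        """Call when the guest program clears its instruction cache."""
        self.blocks.clear()
        self.used = 0
```

Because nothing outlives what the modelled cache could hold, no write snooping is needed: code that is correct on the real CPU has already flushed the cache before running modified instructions.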