Author Topic: ARM or x86 with FPGA emulator  (Read 21393 times)

Offline matthey

  • Hero Member
  • *****
  • Join Date: Aug 2007
  • Posts: 1294
Re: ARM or x86 with FPGA emulator
« on: May 06, 2014, 07:18:22 PM »
Quote from: IanP;763913
An ARM CPU card for the FPGA Arcade would make an interesting addition to it. It would give you a lot of options. You could configure the system to work as a 68k AGA+ machine with the ARM emulating the 680x0 (assuming it would outperform a softcore) or the ARM could be used as a coprocessor to offload jobs like video, audio, jpeg decoding and provide networking and USB features or you could configure it to work as an AROS on ARM NG Amiga.


The FPGA Arcade has an ARM processor. It is low end so probably wouldn't help with emulation much but it can be used to offload some tasks much as the version of the MiniMig with ARM.
 

Offline matthey

Re: ARM or x86 with FPGA emulator
« Reply #1 on: May 07, 2014, 05:24:55 PM »
Quote from: ppcamiga1;763978
Adding x86 or arm to the fpga to emulate 68k/powerpc is a bad idea.

No one will make software for the emulator.

Amiga 68k is not a bad computer, but 68k is too slow.
 
What we really need is to add powerpc to fpga.

Fast 68k does not exist, the mythical 68k from natami exists only in a sick imagination of gunnar von boehn.

In other news Gunnar, Jens and Chris just tested the new Phoenix Demo on Majsta's Vampire 600 using ECS chipset, 16 bit fast memory and a too small Cyclone II fpga:

http://www.apollo-core.com/bringup/roto64.jpg

Not only is the demo working but that 31 in the upper left hand corner is the frames per second (fps). Here are some other results:

 A4000/040 = ~7 fps
 A4000/030 = ~3 fps
 A600/TG68 = ~3 fps
 A600/TG68-020+cache = ~5 fps
 AMIGA 1000 (fastmem) = <1 fps
 A1200 68030@50 = ~5 fps
 A1200/1260@80MHz = ~19 fps
 3000T CSMK3 68060@75MHz = ~20 fps average (15-28fps)

A600/Phoenix = 31 fps

The CSMK3 68060@75MHz is my Amiga. The Phoenix demo is here:

http://www.apollo-core.com/phoenix_demo4
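To put the list above in perspective, here is a rough sketch that asks what 68060 clock would be needed to match 31 fps. It assumes the frame rate scales linearly with CPU clock, which is only an approximation since the demo likely depends on memory throughput as well. The function name is made up for illustration.

```python
# Frame rates quoted in the list above; clock speeds in MHz.
PHOENIX_FPS = 31.0
RESULTS_68060 = {
    "A1200/1260@80MHz": (80.0, 19.0),
    "3000T CSMK3 68060@75MHz": (75.0, 20.0),
}

def equivalent_clock_mhz(clock_mhz, fps, target_fps):
    """Clock needed to reach target_fps, ASSUMING fps scales linearly with clock."""
    return clock_mhz * target_fps / fps

for name, (clock, fps) in RESULTS_68060.items():
    speedup = PHOENIX_FPS / fps
    mhz = equivalent_clock_mhz(clock, fps, PHOENIX_FPS)
    print(f"{name}: Phoenix is {speedup:.2f}x faster, ~{mhz:.0f} MHz needed to match")
```

By this crude measure the two 68060 systems would need somewhere between about 116 and 131 MHz to reach the same frame rate in this one demo. It says nothing about overall CPU performance.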

The reason why the 68k is slower than other processors is that no large companies with deep pockets are developing it. Of course the 68060 is slow, now :).

Quote from: ppcamiga1;763978
PowerPC is the only reasonable solution for the Amiga in an FPGA.

There is a comparison of the old Apollo core with the PowerPC 440 core in fpga:

http://www.apollo-core.com/index.htm?page=performance

I believe the 68k register-memory architecture is better able to take advantage of the large memory bandwidth of an fpga than the load/store architecture of the PowerPC (and most other RISC fpga cores). High clock speeds are not necessary for strong performance when the processor can work directly in memory where there is plenty of memory bandwidth. This is even better when it can work in memory in parallel. We will have to wait for the Apollo core for this as the Phoenix core has just a single integer unit (the 68060 has 2 integer units).
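As a toy illustration of the register-memory versus load/store difference, the sketch below counts the instructions needed to update values that live in memory. The mnemonics are made up for readability and are not exact 68k or PowerPC syntax, and instruction count is not the same as execution time: a real core still has to read and write the memory location either way.

```python
# Toy model: lower "mem[addr] op= value" updates for two kinds of machine.

def lower_register_memory(updates):
    """One instruction per update: the ALU operation takes a memory operand."""
    return [f"{op} #{value},({addr})" for op, addr, value in updates]

def lower_load_store(updates):
    """Three instructions per update: load, operate in a register, store."""
    code = []
    for op, addr, value in updates:
        code.append(f"load  r1,({addr})")
        code.append(f"{op}   r1,r1,#{value}")
        code.append(f"store r1,({addr})")
    return code

updates = [("add", "a0", 1), ("sub", "a1", 4), ("add", "a2", 8)]
cisc = lower_register_memory(updates)
risc = lower_load_store(updates)
print(len(cisc), "instructions (register-memory)")
print(len(risc), "instructions (load/store)")
```

The load/store version also ties up a scratch register for each update, which is part of why such architectures want more registers.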
« Last Edit: May 07, 2014, 05:27:13 PM by matthey »
 

Offline matthey

Re: ARM or x86 with FPGA emulator
« Reply #2 on: May 07, 2014, 06:54:55 PM »
Quote from: ElPolloDiabl;763981
That speed... is it because of the better memory speed or is it because the core has been optimized?

It's impressive.


The memory speed and bandwidth are good but the fast memory is only 16 bit and the chip memory is still super slow. Optimizations can get around some of these barriers obviously. I would say a good design and some optimizations help to minimize the bottlenecks. My understanding is incomplete though.
 

Offline matthey

Re: ARM or x86 with FPGA emulator
« Reply #3 on: May 08, 2014, 01:53:57 PM »
Quote from: Hattig;764019
Are there any details of what the Phoenix core is, compared to the Apollo core?

Getting ~140MHz '060 performance is amazing, if true. Maybe this benchmark is rather slanted towards running well on the Phoenix core, which may be very fast at specific operations that normal 68k processors aren't fast in.


I don't expect the single integer pipe Phoenix to normally outperform a 68060 at the same clock speed. It's more comparable to a 68040 and was clocked around 90-100MHz last I checked. However, the 68060 has some bottlenecks which make it underperform a 68040 at the same clock speed in some cases. I didn't post to judge the overall performance of Phoenix from this early demo but rather to show that the 68k can have good performance in an fpga. Much more is possible in a larger but still affordable fpga. I think the performance of a 68060 can be exceeded. Apollo is designed to be superscalar, with 2-3 integer pipes that are each stronger than those of the 68060 and without the bottlenecks. From my limited understanding, the potential looks amazing.
 

Offline matthey

Re: ARM or x86 with FPGA emulator
« Reply #4 on: May 08, 2014, 06:17:15 PM »
Quote from: IanP;764059
Sounds like the Phoenix core is looking good. Shame if the Replay FPGA is too small to include it, I assume that's the case with the rest of the current "Amiga" FPGA boards (MCC216, MiST, Chameleon). What about the Acube Minimig+ is that big enough, I assume the Phoenix core could outperform the Dragonball on it. To get back on topic what we need is an "Amiga" FPGA board with a Cyclone 5 SoC FPGA. Has anybody ported Minimig to one of the dev boards like the SOCkit yet?


The newest version of the MiST board has a large enough fpga, and it's Altera, which we prefer (Altera Cyclone III with 25k LE). The board has some other limitations and otherwise offers less value than the fpga Arcade. The MiniMig+ fpga is less than half the size of the fpga Arcade fpga. I expect the Phoenix core will outperform the Dragonball CPU. The MCC216 and Chameleon have too small an fpga. Natami would have been a perfect development environment, with a large enough fpga to play with.

We also looked at making a retro I/O expansion board for standard fpga developer boards. It would be cheap to make. The fpga developer boards have modern I/O themselves but some is not easy to use from the fpga and may require accessing with an on board ARM processor.
 

Offline matthey

Re: ARM or x86 with FPGA emulator
« Reply #5 on: May 08, 2014, 07:44:30 PM »
Quote from: ppcamiga1;764065
So there was no progress in the past three years, and as three years ago NatAmi 68k performance reaches only 060 120 Mhz.


Of course there was progress all along but many CPU units have to be working correctly to make visible progress. How many man years do you think it takes to produce a modern processor? How long do you think it takes for 3 knowledgeable people to make a modern processor in their spare time?

Quote from: ppcamiga1;764065

 It's not enough. Amiga powerpc accelerators achieve better performance 17 years ago.

It does not make sense. The best solution for the Amiga FPGA would be adding powerpc processor to FPGA or using an FPGA with an already built in powerpc.


What makes sense is debatable and depends on the application. A PowerPC chip with a small fpga for the Amiga custom chips could give good performance and compatibility with a good 68k emulator. The disadvantages are lack of control over the processor ISA and availability and the 68k emulation may be limiting. Designing a 68k in fpga gives full control to the designers. There can be advantages for integrating Amiga buses in a more compatible way and maybe even helping with SMT on the Amiga (duplicating cores in an fpga is easy). This makes a SoC design easier. The 68k has advantages in ease of programming, code density and working in memory over the PowerPC. Of course an fpga core would not be as fast as a professionally designed hard processor but I like this path. I have nothing against the PPC but I like 68k. I would like to see what it is capable of if Motorola hadn't abandoned it and had tried to improve it. You are free to develop your PPC project your way and I wish you luck. I may even buy it in the end if you make something nice for a reasonable price.
 

Offline matthey

Re: ARM or x86 with FPGA emulator
« Reply #6 on: May 09, 2014, 12:50:27 AM »
Quote from: wawrzon;764092
@matt, trying this phoenix demo under winuae on my surface pro gave me around 37fps. isnt that a little low in comparison? probably it rather reflects memory throughput than cpu speed, gunnars favorite argument if i recall well.


The Phoenix demo is not a very good test of CPU speed. However, a certain amount of processing power is needed to have a good frame rate. The demo shows that Phoenix has good performance and potential despite the scaled down core. It makes me wonder how much performance the full Apollo core would have in a larger fpga. Then I wonder how much rendering power the Apollo core would have with a fast fpga Amiga chipset like the Natami would have been. I don't think the 37 fps of UAE would be very difficult to beat.
 

Offline matthey

Re: ARM or x86 with FPGA emulator
« Reply #7 on: May 09, 2014, 03:22:59 AM »
Quote from: wawrzon;764094
right.

judging by the link the demo relies on 64bit multiplication, so thats probably where this softcore scores the best.   but the actually good news is that it runs already on vampire. would be fun to have more details.


I don't believe there was room for 64 bit integer multiply in the Phoenix core so it's like the 68060 in this regard. There had been talk of letting multiply and division execute OoO while the processor continues provided that there are no dependencies. The 68060 is in order for integer multiply and divide and only the first pipe can be used. With a few more resources, the 68060 could have had 1 cycle 16x16=32 in both pipes which would have quadrupled the most common multiply performance. That and adding SWAP into the second pipe would have been simple and provided a nice boost in performance. As for details on Phoenix, I think the team is too busy playing right now ;).
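For readers wondering how a 64 bit product can be had without a 64 bit multiplier, here is a minimal sketch of the schoolbook method: a 32x32=64 unsigned multiply composed from four 16x16=32 multiplies. This is an illustration only and is not claimed to be what the 68060 support code or the Phoenix core actually does. The function names are hypothetical.

```python
MASK16 = 0xFFFF
MASK32 = 0xFFFFFFFF

def mul16x16_32(a, b):
    """Model of an unsigned 16x16=32 hardware multiplier."""
    assert 0 <= a <= MASK16 and 0 <= b <= MASK16
    return a * b

def mul32x32_64(a, b):
    """Unsigned 32x32=64 multiply built from four 16x16=32 partial products."""
    a &= MASK32
    b &= MASK32
    a_lo, a_hi = a & MASK16, a >> 16
    b_lo, b_hi = b & MASK16, b >> 16
    lo = mul16x16_32(a_lo, b_lo)                               # weight 2^0
    mid = mul16x16_32(a_lo, b_hi) + mul16x16_32(a_hi, b_lo)    # weight 2^16
    hi = mul16x16_32(a_hi, b_hi)                               # weight 2^32
    return (hi << 32) + (mid << 16) + lo

print(hex(mul32x32_64(0x12345678, 0x9ABCDEF0)))
```

Four narrow multiplies plus shifts and adds per wide multiply is why a fast 16x16=32 unit in every pipe matters for this kind of code.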

Quote from: Iggy;764098
I like the idea of adding an FPGA to our existing PPCs.
JIT 68K translation would be much quicker than implementing the 68K in the FPGA and would leave room for chip set enhancements.
Running 68K and PPC code concurrently on processors operating at up to 2.7 GHz would greatly outperform any solution involving only the 68K.
A complete melding of legacy and NG environments.
What do you think?


It's not a bad idea if implemented correctly. It could probably be done cheaper and with better performance than a 68k in fpga. An fpga 68k can have better integration with the Amiga chipset and an eventual Amiga SoC could be created. An ASIC of the SoC may outperform the PPC setup (CPU+SoC performance) and has advantages. The cost could then be cheaper with enough unit sales. These are 2 totally different concepts but neither is bad. The PPC way could almost be done on the SAM with its Lattice fpga but it's too small. Maybe the CIAs could be implemented ;).
 

Offline matthey

Re: ARM or x86 with FPGA emulator
« Reply #8 on: May 09, 2014, 07:13:19 PM »
Quote from: ElPolloDiabl;764140
@above
You like the PowerPC to program in assembler?

I've never tried, but it looked several times harder than 68k.


PowerPC uses too many acronyms and aliases for the mnemonics. Being a load/store architecture, it is tedious and tricky to program efficiently, but it does have more instructions than most RISC processors (the down side being that some instructions are missing from the hardware of some PowerPC processors). Most instructions have 3 operands which is convenient. More has to be done manually and kept in mind than on the 68k, which is auto everything and forgiving. Other than that, PowerPC is a clean and consistent 32 register load/store architecture. It's actually similar in many ways to the new ARMv8 ISA.

The 68k is still much easier to program and debug. It's simple (mostly 2 operands) and consistent (unlike x86) with powerful addressing modes for working directly in memory. I love the tiny programs which helps performance also. It's the easiest and funnest halfway modern CISC processor. It could use some modernization where ease of use and code density can be improved more.
 

Offline matthey

Re: ARM or x86 with FPGA emulator
« Reply #9 on: May 10, 2014, 07:11:56 PM »
Quote from: wawrzon;764211
its just your opinion. doesnt look that there are that many who back you up.


Most of the PowerPC fans are on other web sites waiting for cheap PowerPC hardware and SMP to lead them to the promised land. Anyone creating new PowerPC hardware had better ask (pay) for support before going too far. Another option is to reverse engineer, hack and patch WOS more than it already is. The best option, and the open option these PowerPC fans don't realize they need yet, is to improve AROS PPC. That is what would make PPC+fpga Amiga custom chip projects feasible. Without it, these projects are probably dead in the water and will never gain momentum (look at the UltimatePPC accelerator). If AROS PPC is improved enough, it could surpass AmigaOS 4 and MOS, drawing their users into one open (and cheaper) PowerPC Amiga platform. This would in turn create demand for more PowerPC hardware. Now that I have given away the secret to new cheaper PowerPC hardware, maybe they will jump over to the AROS web site and demand new PowerPC development like they are demanding new PowerPC hardware here ;).
 

Offline matthey

Re: ARM or x86 with FPGA emulator
« Reply #10 on: May 11, 2014, 01:30:20 AM »
Quote from: wawrzon;764223
i beg to differ, aros ppc is as it seems almost as obsolete as 68k. as example aros mesa is about to be working at least software wise, lets call it "wrong colors stuff". aros maintainer, deadwood, has understandibly no much motivation to seek endian dependencies within the genuine engine source. i have had a brief contact with our ppc linux expert (xeno) about the issue, and he said, the interest to fix endianness issues in mesa, which concerns both 68k and ppc, is limited to tell the least. so as much as i hate it to be the case, as 68k fan, big endian platforms ppc and 68k, are fading together into the obscurity as we look at it.


I'm aware that AROS PPC is the deadest branch of AROS. I'm also aware that developing AROS PPC would help to fix big endian bugs. I'm also aware that AROS PPC gaining in popularity would unify the Amiga API. I also prefer 68k over PPC but developing AROS PPC would help the Amiga so I encourage its support. Neither the 68k nor the PPC path is necessarily the correct or easiest choice. PPC seems to be slowly dying and the 68k is old and little developed (it can be emulated more easily on other processors though). ARM makes the most sense for low end power efficient processors and x86_64 makes the most sense for affordable performance processors.

Quote from: Iggy;764225

And, of course, if I really want to invest in an alternate with the right "endianess" there is always ARM which looks to have a bright future.

An Amiga/MorphOS solution with a 64 bit AMD ARM processor with built-in AMD/ATI graphics would suit me fine.


ARM is natively little endian. It can switch to big endian mode but there are disadvantages, as with the PPC switched to little endian mode. I don't like the bi-endian processor concept that changes endianness with a control bit. I prefer instructions or MMU mappings for converting the endianness.
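For anyone unfamiliar with why endianness causes the kind of bugs discussed above, here is a small sketch using Python's standard `struct` module. It shows the same 32 bit value stored in both byte orders and what happens when big endian data is read with the wrong order. The `swap32` helper is a hypothetical stand-in for an explicit byte swap instruction.

```python
import struct

value = 0x12345678

big = struct.pack(">I", value)     # big endian, the byte order of 68k and PPC
little = struct.pack("<I", value)  # little endian byte order

print(big.hex())     # 12345678
print(little.hex())  # 78563412

def swap32(x):
    """Explicit 32-bit byte swap (hypothetical helper for illustration)."""
    return struct.unpack("<I", struct.pack(">I", x & 0xFFFFFFFF))[0]

# Reading big endian data as if it were little endian yields the swapped value.
misread = struct.unpack("<I", big)[0]
print(hex(misread), hex(swap32(value)))
```

Code that converts explicitly at the point of access works the same in either mode, which is the appeal of conversion instructions over a global control bit.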
 

Offline matthey

Re: ARM or x86 with FPGA emulator
« Reply #11 on: May 12, 2014, 08:53:52 PM »
Quote from: Fats;764320
My reaction was also partly to matthey who was lobbying for using AROS PPC as the mother of all Amiga-like OSes.


That should probably read "My reaction was also partly to matthey who was lobbying for using AROS PPC as the mother of all PPC Amiga-like operating systems." It's a minor change but important difference ;).
 

Offline matthey

Re: ARM or x86 with FPGA emulator
« Reply #12 on: August 20, 2014, 10:22:38 PM »
Quote from: TeamBlackFox;771283

I prefer working with an architecture at a low level. The reason I give two %&$#?@!%&$#?@!%&$#?@!%&$#?@!s is because I actually would like to code some for Amiga down the line. Reason I don't now is simple: I am learning 68k assembler and that is one of the most tedious and pedantic languages I've learned because all the current compilers for C are too old and broken for OS 3.9 for me to use. I have been coding on a UNIX style C compiler such as Clang and that's how I intend to do this. GCC is a load of junk though so I may be forced to build my own compiler >_>. Anyways, back to my point: If the Amiga were to become x86 based I'd probably say screw it and not code for it because I have long given up optimising programs on x86 properly. Nothing about the tools I use exposes anything, rather, it is how I treat C as more flexible assembler. I like to be aware of the underlying hardware. With that said I'd probably stick to RISC boxes running BSD or some other UNIX if Amiga moved to x86.


There is a new version of vbcc in the works with much better C99 support. The complete source is available and I compile it with itself on my Amiga (although it can cross compile) along with Frank Wille's vasm and vlink. Only vclib is not publicly available because parts of it use copyrighted code with restrictions (but it's still available for programmers working on vclib). There are no dependencies or overly complicated build tool chains as the original target was embedded systems. It still has sophisticated optimizations (where fully implemented and not buggy) and a working instruction scheduler (for limited targets). It supports simple and easy inline assembler (including many system functions) although it's not as powerful as GCC and CLANG/LLVM assembler inlines (I would like to see support for this but it would be a lot of work). Vasm blows away GAS in ease of use and 68k peephole optimizations. The Amiga support is good and it's easy to install on the Amiga. Before you write your own compiler, maybe you could try it out and consider helping?

http://sun.hasenbraten.de/vbcc/
http://sun.hasenbraten.de/vasm/
http://sun.hasenbraten.de/vlink/

The binaries at the above vbcc link are old. You should e-mail Frank to get the latest sources (my sources may be old). Vbcc does need about 100MB of memory to compile so you would have to either upgrade your Amiga 3000 memory or cross compile. Vasm and vlink can probably be compiled with less than half the memory and the latest sources are available at the links above. You would probably want to install Frank's Posix.Lib for vbcc and my improved C99 68k math libs and support also:

http://aminet.net/dev/c/vbcc_PosixLib.lha
http://eab.abime.net/showthread.php?t=74692

The math libs are in 100% 68k assembler by the way. It's almost as easy as a high level language. I wouldn't try that in x86/x86_64 (not much love here from anyone) but your RISC processors wouldn't be much, if any, easier. What we really need is a super compact enhanced 68k 32 bit CPU and a new properly designed Super-CISC 64 bit ISA big brother. CISC has real advantages but there hasn't been a new design in years despite the possibility of easily beating the x86_64 in encoding efficiency, code density and ease of programming.

Quote from: TeamBlackFox;771283

As previously explained I could give a flying freak whether or not it would break binary compatibility. Sure other users care, but I care more about having a 64-bit OS with a CPU that can perform well today.

Don't feed anymore bull, you can look up the dhrystone measurements and other benchmarks yourself, but do keep in mind all of these systems are 64-bit and SMP capable so they're going to beat any slow uniprocessor 32-bit mode OS


64 bit processors can actually be slower in more than a few cases and they are more expensive in terms of resources and therefore cost. They generally do perform better because they are higher end with the expensive and extensive caches needed to make them fast. 64 bit is an advantage for servers and workstations but 90% of Amiga users would be more than happy with a 32 bit 68k Amiga with 300 MIPS and 1GB of memory.
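One commonly cited resource cost of 64 bit is pointer size. The sketch below is an assumption-laden illustration, not a benchmark: it uses a hypothetical linked list node with two pointers and a 32 bit payload, and an assumed 16 KB data cache, to show how many nodes fit with 4 byte versus 8 byte pointers.

```python
def node_size(pointer_bytes, payload_bytes=4, pointers=2):
    """Node size in bytes, rounded up to a multiple of the pointer size for alignment."""
    raw = pointers * pointer_bytes + payload_bytes
    return -(-raw // pointer_bytes) * pointer_bytes  # ceiling division, then scale

CACHE_BYTES = 16 * 1024  # assumed data cache size, for illustration only

for bits in (32, 64):
    size = node_size(bits // 8)
    fit = CACHE_BYTES // size
    print(f"{bits}-bit pointers: {size} bytes per node, {fit} nodes fit in the cache")
```

With these assumptions the node doubles from 12 to 24 bytes, so pointer heavy code gets roughly half as much data into the same cache unless the cache grows too.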

Quote from: biggun;771301
Please show use where we could buy new Laptops or new Desktop with these for < $500. If you can do this - then your would have a reasonable point.


ARMv8 may be a better future RISC target than PPC, MIPS and SPARC because it's the most likely to have a CPU in portable computers sold to the masses. Performance shouldn't be any more of a limitation than these other RISC processors as the design is not that much different. The first devices may require jail breaking and Android will probably be the logical replacement unless AROS gets an ARMv8 target and is improved very quickly.
 

Offline matthey

Re: ARM or x86 with FPGA emulator
« Reply #13 on: August 21, 2014, 12:00:18 AM »
Quote from: TeamBlackFox;771333
Matthey, I'll check it out. Thanks. I've been meaning to get a faster CPU, but that will take a while with my money.


Maybe there will be a cheaper big box 68k fpga accelerator sometime next year with 128MB or more of memory, ethernet, etc.

Quote from: TeamBlackFox;771333

And in regards to that tidbit on 32 vs 64-bit overhead, I  find 32-bit unacceptable for a modern computer. While many Amigans may be happy with that: I will not be, because I primarily use BSD. I have no use for AmigaOS as my daily driver. It is a distraction and game machine, nothing more.


There is a place for both small 32 bit "fun" devices which can still do a lot of work and 64 bit power machines for large compiles, big number crunching and servers. Frank Wille programmed the Solid Gold and SQRXZ games for a 68k with less than 1 MB of memory but he also writes code for NetBSD which he uses on the SUN server that is used for the vbcc links above:

http://sun.hasenbraten.de/
http://sun.hasenbraten.de/~frank/projects/

Hmm, another Amiga user with connections to BSD. He does support PPC Amigas also although he doesn't like the attitude of the AmigaOS 4 developers. The AmigaOS 4 developer attitude turns me off even more than the high prices. I love the 68k but PPC would be acceptable for a high end AmigaOS if that was a good choice that was going somewhere.

Quote from: TeamBlackFox;771333

I agree though that ARM may be a decent target, but I'd have to wait one more ISA generation because I'm not interested in portable computers much, I want workstations and servers. I own a Nexus 7 and a LG Optimus F6 but I'd never want that to be running AmigaOS or AROS as they're already buggy enough. Much of the time they just function as portable media players and remote terminals


AmigaOS from 3.1 onward is not so buggy although some parts are kludgy. The AmigaOS never got much server software and even the networking support is lacking. It doesn't have the security, memory protection or resource tracking that would be highly desirable for a server. Maybe AROS could be improved to be an acceptable server but BSD will likely always be a better choice for what you are interested in.
 

Offline matthey

Re: ARM or x86 with FPGA emulator
« Reply #14 on: August 22, 2014, 01:24:54 AM »
Quote from: TeamBlackFox;771431
You're not correct at all, because the majority of designs that are getting notice besides RISC are VLIW and EDGE. In VLIW's case it is like super-RISC and even adopts some CISCy advantages.  VLIW is register-register/load-store architecture and basically just adds instruction level parallelism.

EDGE is another way to add instruction parallism by adding one advantage CISC genuinely has: variable length instruction words.

Anyways no, RISC is going to always be simpler and more efficient for most forms of computing. Even x86 cores nowadays break instructions down into simpler ones before processing them.

SIMD has generally been favored over VLIW for parallel operations because it doesn't have the major drawbacks of VLIW. VLIW has strong advantages also but it is not practical for general purpose computing despite huge expensive attempts which failed.

RISC is simpler and cheaper to make if you want a low end ARM processor. That short pipeline is going to have bubbles, the instructions are going to be weak and the additional instructions that need to be executed at a higher clock speed have more dependencies than CISC as well as the disadvantages of a higher clock speed. Lengthening the pipeline and/or adding OoO to make the RISC stronger gives the advantages of CISC except the register memory architecture and variable length instructions which give good code density and the decoding can be hidden in the pipeline. RISC has the advantage that more registers can be encoded (needed to avoid dependencies and bubbles) and fewer cache/memory accesses. RISC compilers were supposed to be able to avoid enough bubbles and dependencies that they would outperform CISC in cache/memory but this never came about, even with double the registers. Compilers were supposed to make VLIW practical for general purpose computing but they also failed. There continue to be people that keep repeating the same mistakes though.

Quote from: TeamBlackFox;771431
Of course, biggun you can always try proving me wrong by building this super orthogonal CPU and trying to benchmark it against a processor of the same application that is RISC. I'll be waiting.

http://www.apollo-core.com/index.htm?page=performance

Quote from: bloodline;771433
You haven't studied the ARMv8 yet have you? It's not RISC or CISC or any other marketing term you can think of, it's a weird hybrid of ideas.

It's clearly RISC, just not a very Reduced Instruction Set, much like PPC. RISC should have been called LSAC (Load/Store Architecture Computer). There are a lot of conditional instructions and we will see how that works out. It's an advantage on some hardware implementations while no gain on others. They may have gone overboard with this to keep original ARM fans happy while removing the conditional instruction field to add more registers. This change should improve performance to be close to, if not a little better than, PPC for integer performance. IMO, ARMv8 should have good performance but it is more complex than it needs to be. This extra complexity could cost them in electrical efficiency and it's not clear that compilers will be able to take advantage. Processor designers have a tendency to add features they visualize as advantages in a hardware implementation they like and often ignore what compilers actually use and need. Then all those instructions that most compilers don't use are dropped and trapped, turning the ISA into a mess of pitfalls.