Author Topic: Can FPGA 060 run more than 100 Mhz?  (Read 12433 times)

Offline SpeedGeek

Re: Can FPGA 060 run more than 100 Mhz?
« on: April 15, 2014, 06:37:32 PM »
250MHz FPGA chips have been available for quite a while now. That's the easy part... the hard part is reverse engineering the 060 and programming the FPGA to correctly emulate an 060 under a fairly large number of different operating conditions.
« Last Edit: April 15, 2014, 07:29:50 PM by SpeedGeek »
 

Offline SpeedGeek

Re: Can FPGA 060 run more than 100 Mhz?
« Reply #1 on: April 18, 2014, 02:57:31 PM »
Quote from: matthey;762628
The internal clock speed of the fpga is not the same as the speed an fpga CPU will run. It will be less even with a deep pipeline.

The same thing can be said about the 68K family variants themselves. The effective CLK speed of internal pipelines and other logic functions is one very big variable. So chip manufacturers rate their chips at a given input CLK speed, otherwise possible effective CLK speed ratings would be practically useless. Do you want to buy a CPU or FPGA with a typical CLK speed rating of 50-200 MHz?

   
Quote from: billt;762754
Not that simple. When they say an "FPGA runs at 250MHz", what does that actually mean? What is the FPGA configured to do/be for that measurement? Different designs will have different pathways in the FPGA, and thus different results and different max running speeds. You're extremely unlikely to end up 1:1 with whatever that 250MHz claim, or any other clock rate claim, is based on in an FPGA. Marketing people at chip companies feel the need to say things, but IMHO, stating system clock rates for an FPGA chip doesn't make very much sense. If a chip vendor is stating a clock rate based on some reference design they've put into each FPGA, the exact same reference design, then such a clock rate might be a useful general comparison of speed from one FPGA product to another, so you might be able to say that Generation 2 of our product runs at approximately 2x the clock rate of our Generation 1 product. But for knowing what clock rate your design (such as tg68 or 68K00 or 68K30) will run at, it is pretty useless. You learn whether a given speed works or not when you define a clock period and run synthesis. (Done by timing analysis of the synthesized and placed/routed result in the FPGA tools.) If you decide on a target clock speed for your final product based on whatever particular FPGA chip, then you essentially get a yes or no answer from the timing checks. If yes, maybe you can go even faster, and try that. If no, then either work on improving your design and creep closer toward your goal, or realize that you need to change your goal (and datasheet and marketing) or maybe change your FPGA to a faster one. Even if whatever marketing blurb said the chip runs at your target clock rate.
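
To make the "yes or no answer" concrete, here is a rough Python sketch of what a timing check boils down to: compare the target clock period against the slowest register-to-register path. The function names and the path delays are made up for illustration; real numbers only come out of the vendor's timing analyzer after place and route.

```python
def fmax_mhz(critical_path_ns):
    """Highest clock rate (MHz) the slowest path allows: 1000 / delay in ns."""
    return 1000.0 / critical_path_ns


def meets_timing(target_mhz, path_delays_ns):
    """Return (passes, slack_ns) for a target clock against a list of path delays.

    Slack is the clock period minus the worst path delay; negative slack
    means the design fails timing at that clock rate.
    """
    period_ns = 1000.0 / target_mhz
    worst_ns = max(path_delays_ns)
    slack_ns = period_ns - worst_ns
    return slack_ns >= 0, slack_ns


if __name__ == "__main__":
    # Hypothetical path delays (ns) for some design on some FPGA.
    paths = [3.2, 7.5, 12.5]
    for target in (50, 100):
        ok, slack = meets_timing(target, paths)
        print(f"{target} MHz: {'pass' if ok else 'FAIL'}, slack {slack:+.2f} ns")
    print(f"Fmax for this design: {fmax_mhz(max(paths)):.1f} MHz")
```

With these invented delays the design would close timing at 50 MHz but not at 100 MHz, no matter what the chip's datasheet headline says.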

I used to be part of a team that designed FPGA silicon, so that other  people could buy those FPGA chips and put their own stuff inside. The  architect/team lead once said that the 0.35micron chips could do 350MHz,  as long as you only wanted an inverter chain that never left the IO  buffers to get into the core of the chip. That wouldn't be very useful.  Any design that was inside the core would be slower than 350MHz due to  the longer pathways.

A simple implementation of a particular instruction set processor will  run at a slower clock rate than a pipelined implementation of the same  instruction set thing. A longer pipeline design will run at a faster  clock rate than a shorter pipeline. (The "simple design" is essentially a  1-stage pipeline)

But longer pipelines waste more time on a branch (stuff happening in the  pipeline that gets thrown out, then refill the pipeline with the  branch's stuff)  than a shorter pipeline design. So there is a tradeoff  between pipeline length, clock speed and branching.
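
A toy model of that tradeoff, sketched in Python. Everything here is an assumption for illustration: the total logic delay, the per-stage register overhead, the fraction of instructions that are branches, and the fraction of those that flush the pipeline. It is not a model of any real 68k core.

```python
def time_per_instruction_ns(stages, logic_ns=40.0, reg_overhead_ns=1.0,
                            branch_frac=0.2, flush_frac=0.5):
    """Average time per instruction for a pipeline with `stages` stages.

    Clock period: the logic is split evenly across stages, plus a fixed
    register overhead per stage.
    CPI: 1 cycle per instruction, plus (stages - 1) wasted cycles for each
    branch that flushes the pipeline.
    """
    period_ns = logic_ns / stages + reg_overhead_ns
    cpi = 1.0 + branch_frac * flush_frac * (stages - 1)
    return period_ns * cpi


if __name__ == "__main__":
    for n in (1, 2, 4, 8, 16, 32):
        period = 40.0 / n + 1.0
        print(f"{n:2d} stages: clock {1000.0 / period:6.1f} MHz, "
              f"{time_per_instruction_ns(n):6.2f} ns per instruction")
```

Under these assumptions the clock rate keeps climbing as stages are added, but the time per instruction bottoms out and then gets worse once the flush penalty outweighs the shorter clock period.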



You don't need so much to reverse-engineer the 060 chip. Anything you  put into an FPGA will be somewhat different from the 060 silicon anyway.  No FPGA place and route tool will give you exactly what's in the 060  silicon, even if it came from the same RTL, which isn't very likely to  be the case anyway. You'd have to tool the RTL to the FPGA paradigm, and  perhaps to some extent to your particular FPGA and tool (ISE or  Quartus, etc.)

What such a design would really be is an entirely new implementation of the instruction set. That instruction set is published and known publicly. It would be nice to do that as portably as possible, considering that there may be some differences in inferring memories (FIFOs, RAMs for cache, etc.) from one FPGA architecture to another (spartan3 vs spartan6 vs Cyclone5 etc.) or from one tool to another (ISE vs Quartus), so such things should find their way into instantiable blocks to keep the rest of the RTL code portable. Such things were likely instantiable blocks at Motorola way back when, but as they were targeting a particular fab process, they may have coded to whatever their memory compiler tool produced (an automated memory block generator that gives you a chunk of silicon layout, and whatever connections come out of that). One would have to think of a good generic "API" of connections to fit a generic memory into, and then make such a defined wrapper around any particular FPGA inferred or instantiated memory block to keep the parent block nice and generic.

Anyway... Someone needs to define an instruction fetch and parse unit,  an ALU block, etc. that is compatible with the 68060 instruction set. Or  68020 or 030 instruction set, whatever is really the best choice.

We tend to say 68060, as we perceive that to be the newest and fastest 68k chip. Motorola made certain decisions that together resulted in the 68060 implementation. Such decisions would consider how often an instruction has actually been used and how complicated it is (and thus how much it affects die area/cost, power, and max clock rate capability).

We might today want to reconsider certain decisions, such as added or  removed instructions compared to earlier 68k family parts, and make some  adjustments. Maybe add some instructions back in. Maybe redo the 68040  instruction set with more of a 68060 block diagram. Or an 020, whichever  particular set of instructions is most beneficial. If AmigaOS or some  target application software, or more recent compiler innovations tend to  make frequent use of some instruction that was removed in the 68060,  and thus spends a lot of time emulating that instruction on a real 68060  accelerator, then maybe we should put that instruction back in. Putting  back certain instructions can make for a better performance of the  software running an 020-optimized binary compared to an 060-optimized  binary running on an exact-as-possible 060 implementation.
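
A back-of-envelope way to see when putting an instruction back pays off, sketched in Python. The cycle counts and trap fractions are invented for illustration; the real cost of trapping and emulating an instruction depends on the instruction and on the emulation code.

```python
def average_cycles(trap_fraction, native_cycles=1.0, trap_cycles=200.0):
    """Average cycles per instruction when some fraction of executed
    instructions are not implemented in hardware and must be trapped
    and emulated in software."""
    return (1.0 - trap_fraction) * native_cycles + trap_fraction * trap_cycles


def slowdown(trap_fraction, native_cycles=1.0, trap_cycles=200.0):
    """How many times slower the code runs compared to having every
    instruction implemented in hardware at native_cycles each."""
    return average_cycles(trap_fraction, native_cycles, trap_cycles) / native_cycles


if __name__ == "__main__":
    # Hypothetical fractions of executed instructions that hit the trap.
    for frac in (0.0, 0.001, 0.01, 0.05):
        print(f"{frac * 100:5.1f}% trapped: {slowdown(frac):6.2f}x slower")
```

Even a small fraction of trapped instructions dominates the run time in this model, which is the case where implementing the instruction in the softcore would be worth the extra logic.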

This is not really reverse-engineering. It would be a new engineering of the published 68k instruction set. Maybe a new engineering of a new combination of all 68k family instructions not before seen in any particular silicon from Motorola/Freescale... Or even a superset, adding in some entirely new instructions in addition to whatever we take from the Motorola books. (SIMD anyone??) Apply whatever microprocessor design concepts you like. Make an 020 instruction set super-scalar. Or make an 060 instruction set not super-scalar. Whatever floats your boat. I'm not sure how fancy the tg68 is, or 68k00 or any other of the several 68k softcores out there. (ao_68000 etc.)

This is an opportunity to do something even more modern in concept than any 68k ever was. Depending on the FPGA, that could come out at a higher or at a lower clock rate than previous Motorola products. Depending on the price of the FPGA, some particular "possible" performance may or may not be worth achieving. (I'm not going to pay $10000 (ten grand) for a particular FPGA to achieve the highest possible speed, but I'll pay a few hundred maybe to get the best we can from that price range.)


And now that we have SoC type FPGAs coming out, with hard-wired ASIC style ARM processors inside them, we have a new possibility. Use the FPGA part for IO/connectivity interoperation with whatever (060 PCB socket) and/or for Minimig circuitry, and then emulate the 68k in software on the ARM. Interpret it (Cyclone emulation) or JIT it (might need to be created). I'm not sure how either form of software emulation in the ARM, at the hardwired ARM clock rates, would compare to an FPGA implemented softcore 68k processor.

As explained above, you don't get 1:1 with any of the 68K family variants for most internal operations, so why would you expect 1:1 with an FPGA?

There is quite a bit more to emulating an 060 than implementing the 68K instruction set. What about the MMU, FPU, bus arbitration, cycle termination, interrupts, exception handling, RESET operation, 1/2 CLK bus speed operation, etc.?