Author Topic: Can FPGA 060 run more than 100 Mhz? (Read 12433 times)

SpeedGeek · « **on:** April 15, 2014, 06:37:32 PM »

250MHz FPGA chips have been available for quite a while now. That's the easy part... the hard part is reverse engineering the 060 and programming the FPGA to correctly emulate an 060 under a fairly large number of different operating conditions.

SpeedGeek · « **Reply #1 on:** April 18, 2014, 02:57:31 PM »

Quote from: matthey;762628

The internal clock speed of the fpga is not the same as the speed an fpga CPU will run. It will be less even with a deep pipeline.

The same thing can be said about the 68K family variants themselves. The effective CLK speed of internal pipelines and other logic functions is one very big variable. So chip manufacturers rate their chips at a given input CLK speed, otherwise possible effective CLK speed ratings would be practically useless. Do you want to buy a CPU or FPGA with a typical CLK speed rating of 50-200 MHz?

Quote from: billt;762754

Not that simple. When they say an "FPGA runs at 250MHz", what does that actually mean? What is the FPGA configured to do/be for that measurement? Different designs will have different pathways in the FPGA, and thus different results and different max running speeds. You're extremely unlikely to end up 1:1 with whatever that 250MHz claim, or any other clock rate claim, is based on. Marketing people at chip companies feel the need to say things, but IMHO, stating system clock rates for an FPGA chip doesn't make very much sense.

If a chip vendor states a clock rate based on the exact same reference design put into each FPGA, then that clock rate might be a useful general comparison of speed from one FPGA product to another, so you might be able to say that Generation 2 of a product runs at approximately 2x the clock rate of Generation 1. But for knowing what clock rate your design (such as tg68 or 68K00 or 68K30) will run at, it is pretty useless.

You learn whether a speed works when you define a clock period and run synthesis. (This is done by timing analysis of the synthesized and placed/routed result in the FPGA tools.) If you decide on a target clock speed for your final product based on a particular FPGA chip, then you essentially get a yes or no answer from the timing checks. If yes, maybe you can go even faster, and try that. If no, then either work on improving your design and creep closer toward your goal, or realize that you need to change your goal (and datasheet and marketing) or maybe change your FPGA to a faster one. That holds even if the marketing blurb said the chip runs at your target clock rate.
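The yes/no timing check described above comes down to comparing the requested clock period with the slowest register-to-register path the tools report. A minimal Python sketch of that arithmetic follows. The function names and the 12.5 ns critical path are made up for illustration and do not come from any real FPGA tool or from any of the cores mentioned here.

```python
def fmax_mhz(critical_path_ns: float) -> float:
    """Highest clock rate (MHz) whose period still covers the slowest path."""
    return 1000.0 / critical_path_ns


def meets_timing(target_mhz: float, critical_path_ns: float) -> bool:
    """Yes/no answer: does the requested clock period cover the slowest path?"""
    period_ns = 1000.0 / target_mhz
    slack_ns = period_ns - critical_path_ns  # negative slack means timing fails
    return slack_ns >= 0.0


# Hypothetical critical path for some placed-and-routed design.
critical_path_ns = 12.5

print(fmax_mhz(critical_path_ns))             # best case for this particular design
print(meets_timing(100.0, critical_path_ns))  # is a 100 MHz target reachable?
print(meets_timing(66.0, critical_path_ns))   # is a 66 MHz target reachable?
```

The critical path belongs to the design as placed and routed, not to the chip, which is why the same FPGA gives different answers for different designs.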

I used to be part of a team that designed FPGA silicon, so that other people could buy those FPGA chips and put their own stuff inside. The architect/team lead once said that the 0.35micron chips could do 350MHz, as long as you only wanted an inverter chain that never left the IO buffers to get into the core of the chip. That wouldn't be very useful. Any design that was inside the core would be slower than 350MHz due to the longer pathways.

A simple implementation of a particular instruction set processor will run at a slower clock rate than a pipelined implementation of the same instruction set thing. A longer pipeline design will run at a faster clock rate than a shorter pipeline. (The "simple design" is essentially a 1-stage pipeline)

But longer pipelines waste more time on a branch (stuff happening in the pipeline that gets thrown out, then refill the pipeline with the branch's stuff) than a shorter pipeline design. So there is a tradeoff between pipeline length, clock speed and branching.
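The tradeoff between pipeline length, clock speed and branching can be shown with a toy model. This is a rough Python sketch under simplifying assumptions: the logic delay splits evenly across stages, every stage adds a fixed register overhead, and a fixed fraction of instructions flush the whole pipeline. All the numbers are hypothetical and describe no real 68k core.

```python
def clock_period_ns(stages: int, logic_ns: float = 40.0,
                    reg_overhead_ns: float = 1.0) -> float:
    """Clock period when the total logic delay is split evenly over the stages."""
    return logic_ns / stages + reg_overhead_ns


def time_per_instruction_ns(stages: int, logic_ns: float = 40.0,
                            reg_overhead_ns: float = 1.0,
                            flush_fraction: float = 0.1) -> float:
    """Average time per instruction, counting pipeline refills after flushes."""
    # Each flushing instruction is assumed to waste (stages - 1) cycles.
    cycles_per_instruction = 1.0 + flush_fraction * (stages - 1)
    return clock_period_ns(stages, logic_ns, reg_overhead_ns) * cycles_per_instruction


for stages in (1, 2, 4, 8, 16, 32):
    mhz = 1000.0 / clock_period_ns(stages)
    print(stages, round(mhz, 1), round(time_per_instruction_ns(stages), 3))
```

With these made-up numbers the clock rate keeps rising as stages are added, but the time per instruction stops improving at 16 stages and gets worse at 32, because the refill cost grows faster than the clock period shrinks.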

You don't need so much to reverse-engineer the 060 chip. Anything you put into an FPGA will be somewhat different from the 060 silicon anyway. No FPGA place and route tool will give you exactly what's in the 060 silicon, even if it came from the same RTL, which isn't very likely to be the case anyway. You'd have to tool the RTL to the FPGA paradigm, and perhaps to some extent to your particular FPGA and tool (ISE or Quartus, etc.)

What such a design would really be is an entirely new implementation of the instruction set. That instruction set is published and known publicly. It would be nice to make it as portable as possible, considering that there may be some differences in inferring memories (FIFOs, RAMs for cache, etc.) from one FPGA architecture to another (spartan3 vs spartan6 vs Cyclone5 etc.) or from one tool to another (ISE vs Quartus), so such things should find their way into instantiable blocks to keep the rest of the RTL code portable. Such things were likely instantiable blocks at Motorola way back when, but as they were targeting a particular fab process, they may have coded to whatever their memory compiler tool produced (an automated memory block generator that gives you a chunk of silicon layout and whatever connections come out of that). One would have to think of a good generic "API" of connections to fit a generic memory into, and then make such a defined wrapper around any particular FPGA inferred or instantiated memory block to keep the parent block nice and generic.

Anyway... Someone needs to define an instruction fetch and parse unit, an ALU block, etc. that is compatible with the 68060 instruction set. Or 68020 or 030 instruction set, whatever is really the best choice.

We tend to say 68060, as we perceive that to be the newest and fastest 68k chip. Motorola made certain decisions that together resulted in the 68060 implementation. Such decisions would consider how often an instruction was actually used and how complicated it is (and thus how much it affects die area/cost, power, and max clock rate capability).

We might today want to reconsider certain decisions, such as added or removed instructions compared to earlier 68k family parts, and make some adjustments. Maybe add some instructions back in. Maybe redo the 68040 instruction set with more of a 68060 block diagram. Or an 020, whichever particular set of instructions is most beneficial. If AmigaOS or some target application software, or more recent compiler innovations tend to make frequent use of some instruction that was removed in the 68060, and thus spends a lot of time emulating that instruction on a real 68060 accelerator, then maybe we should put that instruction back in. Putting back certain instructions can make for a better performance of the software running an 020-optimized binary compared to an 060-optimized binary running on an exact-as-possible 060 implementation.
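How much a removed instruction costs depends on how often it shows up and how expensive the software emulation is. A back-of-the-envelope Python sketch follows. The cycle counts and the 1% frequency are hypothetical placeholders, not measurements of any real 68060 or AmigaOS workload.

```python
def average_cycles(trap_fraction: float, native_cycles: float = 1.0,
                   trap_cycles: float = 200.0) -> float:
    """Average cycles per instruction when a fraction of them trap to software."""
    return (1.0 - trap_fraction) * native_cycles + trap_fraction * trap_cycles


def slowdown(trap_fraction: float, native_cycles: float = 1.0,
             trap_cycles: float = 200.0) -> float:
    """Slowdown relative to a core that executes every instruction natively."""
    return average_cycles(trap_fraction, native_cycles, trap_cycles) / native_cycles


# Hypothetical case: 1 in 100 executed instructions is emulated by a trap handler.
print(slowdown(0.01))
```

Under these assumed numbers, even a rarely used instruction dominates the run time once it has to be emulated, which is the argument for putting frequently used ones back into hardware.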

This is not really reverse-engineering. It would be a new engineering of the published 68k instruction set. Maybe a new engineering of a new combination of all 68k family instructions not before seen in any particular silicon from Motorola/Freescale... Or even a superset, adding in some entirely new instructions in addition to whatever we take from the Motorola books. (SIMD anyone??) Apply whatever microprocessor design concepts you like. Make an 020 instruction super-scalar. Or make an 060 instruction set not super-scalar. Whatever floats your boat. I'm not sure how fancy the tg68 is, or 68k00 or any other of the several 68k softcores out there. (ao_68000 etc)

This is an opportunity to do something even more modern in concept than any 68k ever was. Depending on the FPGA, that could come out at a higher or at a lower clock rate than previous Motorola products. Depending on the price of the FPGA, some particular "possible" performance may or may not be worth achieving. (I'm not going to pay $10000 (ten grand) for a particular FPGA to achieve the highest possible speed, but I'll pay a few hundred maybe to get the best we can from that price range.)

And now that we have SoC type FPGAs coming out, with hard-wired ASIC style ARM processors inside them, we have a new possibility. Use the FPGA part for IO/connectivity interoperation with whatever (060 PCB socket) and/or for Minimig circuitry, and then emulate the 68k in software on the ARM. Interpret it (Cyclone emulation) or JIT it (might need to be created). I'm not sure how either form of software emulation in the ARM, at the hardwired ARM clock rates, would compare to an FPGA implemented softcore 68k processor.

As explained above, you don't get 1:1 with any of the 68K family variants for most internal operations, so why would you expect 1:1 with an FPGA?

There is quite a bit more to emulating an 060 than implementing the 68K instruction set. What about the MMU, FPU, bus arbitration, cycle termination, interrupts, exception handling, RESET operation, 1/2 CLK bus speed operation, etc?
