
Author Topic: Full 68060 implementation?  (Read 8416 times)


Offline freqmax (Topic starter)

  • Hero Member
  • *****
  • Join Date: Mar 2006
  • Posts: 2179
Full 68060 implementation?
« on: December 13, 2012, 01:10:57 AM »
Is there enough documentation, and are FPGAs large enough, to build a full soft-HDL 68060 processor? And at full speed? (Preferably on a Xilinx Spartan series part; Virtex is expensive.)

This could then be used to mitigate the shortage and insane prices, and make tweaks possible.
 

Offline ChaosLord

  • Hero Member
  • *****
  • Join Date: Nov 2003
  • Posts: 2608
    • http://totalchaoseng.dbv.pl/news.php
Re: Full 68060 implementation?
« Reply #1 on: December 13, 2012, 01:24:34 AM »
Sure, if you want to spend the years to write one.  FPGA size is not the problem.
Wanna try a wonderful strategy game with lots of handdrawn anims,
Magic Spells and Monsters, Incredible playability and lastability,
English speech, etc. Total Chaos AGA
 

Offline matthey

  • Hero Member
  • *****
  • Join Date: Aug 2007
  • Posts: 1294
Re: Full 68060 implementation?
« Reply #2 on: December 13, 2012, 03:15:40 AM »
Quote from: freqmax;718824
Is there enough documentation, and are FPGAs large enough, to build a full soft-HDL 68060 processor? And at full speed? (Preferably on a Xilinx Spartan series part; Virtex is expensive.)


A 100% logic equivalent is not possible in an fpga, but a fully functionally equivalent soft CPU is possible. Note that muxes are much slower and bigger in an fpga than in silicon, which messes up the timing as other logic elements are forced to wait. A fast fpga can more than make up for this deficiency and others, but an optimal design will be different and optimized for an fpga, likely even for a particular fpga. My understanding is that an fpga 68060-like CPU design could be programmed for:

1) speed in the fpga (fastest in fpga, programming can get messy with optimizations)
2) small size in an fpga (for small fpga, slower)
3) burning in an ASIC (slower in fpga but fastest in ASIC and potentially easiest to program and maintain)
4) compatibility close to cycle exact (slower in fpga, difficult to make as deep understanding of a 68060 and logic testing or copying would be necessary)

Personally, I would go for #3 even though speed may be 20% lower than the same soft CPU optimized for speed (probably talking 100MHz speed in affordable fpgas). It would be easier to make enhancements and changes (which can also increase speed), faster to program and we could always go ASIC ;).

Quote from: freqmax;718824

This could then be used to mitigate the shortage and insane prices, and make tweaks possible.


The last revision of the 68060 is the one that is expensive and difficult to find. You should be able to find slower ones for between $50 and $100. Ask MikeJ what the slower ones are going for in China. Even the older RC revisions have an MMU, FPU and fewer bugs than any 68k fpga CPU yet.
 

Offline ChaosLord

  • Hero Member
  • *****
  • Join Date: Nov 2003
  • Posts: 2608
    • http://totalchaoseng.dbv.pl/news.php
Re: Full 68060 implementation?
« Reply #3 on: December 13, 2012, 03:23:12 AM »
Quote from: matthey;718839
A 100% logic equivalent is not possible in an fpga, but a fully functionally equivalent soft CPU is possible. Note that muxes are much slower and bigger in an fpga than in silicon, which messes up the timing as other logic elements are forced to wait. A fast fpga can more than make up for this deficiency and others, but an optimal design will be different and optimized for an fpga, likely even for a particular fpga.

+1



Quote

 Even the older RC revisions have an MMU, FPU and fewer bugs than any 68k fpga CPU yet.
:roflmao: good point!
 

Offline freqmax (Topic starter)

  • Hero Member
  • *****
  • Join Date: Mar 2006
  • Posts: 2179
Re: Full 68060 implementation?
« Reply #4 on: December 13, 2012, 04:14:47 AM »
I think functionally equivalent is a good goal. But ASIC is no good. 1) It's darn expensive 2) Any bugs will be unfixable
 

Offline matthey

  • Hero Member
  • *****
  • Join Date: Aug 2007
  • Posts: 1294
Re: Full 68060 implementation?
« Reply #5 on: December 13, 2012, 04:44:48 AM »
Quote from: freqmax;718841
I think functionally equivalent is a good goal. But ASIC is no good. 1) It's darn expensive 2) Any bugs will be unfixable


I did not mean to program an ASIC, I meant to program an fpga as one would do if they were later going to burn an ASIC. The logic would be tested and bugs fixed before even contemplating burning an ASIC but that would be an option for later. Having a tested fpga CPU ready to burn could open up possibilities like partners that would be willing to help with the expense part. The 68k FIDO CPU, for example, is burned by a company that specializes in fpgas and making ASICs from them. They have fpga code for many old chips that are no longer available. Several of them can be put in one fpga to reduce component cost and new software programming cost of a newer chip for embedded or retro systems.
 

Offline danbeaver

Re: Full 68060 implementation?
« Reply #6 on: December 13, 2012, 04:46:57 AM »
What about a less complicated CPU (68000/68020) running at a high speed (100 MHz)?
 

Offline freqmax (Topic starter)

  • Hero Member
  • *****
  • Join Date: Mar 2006
  • Posts: 2179
Re: Full 68060 implementation?
« Reply #7 on: December 13, 2012, 05:07:21 AM »
Quote from: danbeaver;718843
What about a less complicated CPU (68000/68020) running at a high speed (100 MHz)?


Already covered by MikeJ and his main developer. ;)

But it could serve as a starting point.
 

Offline mikej

  • Hero Member
  • *****
  • Join Date: Dec 2005
  • Posts: 822
    • http://www.fpgaarcade.com
Re: Full 68060 implementation?
« Reply #8 on: December 13, 2012, 08:28:18 AM »
Quote from: matthey;718839
Ask MikeJ what the slower ones are going for in China. Even the older RC revisions have an MMU, FPU and fewer bugs than any 68k fpga CPU yet.


They are really cheap - you can even get brand new ones for ~20USD.
Getting the latest mask set is really tricky.
/MikeJ
 

Offline matthey

  • Hero Member
  • *****
  • Join Date: Aug 2007
  • Posts: 1294
Re: Full 68060 implementation?
« Reply #9 on: December 13, 2012, 02:51:33 PM »
Quote from: mikej;718850
They are really cheap - you can even get brand new ones for ~20USD.
Getting the latest mask set is really tricky.
/MikeJ


That's even cheaper than I thought :). A 50-60MHz 68060 is already a big improvement over what the average Amiga user is using. Supporting faster memory speeds and easier overclocking will make it faster than most of the old 68060@50MHz accelerators.
 

Offline mikej

  • Hero Member
  • *****
  • Join Date: Dec 2005
  • Posts: 822
    • http://www.fpgaarcade.com
Re: Full 68060 implementation?
« Reply #10 on: December 13, 2012, 03:10:10 PM »
Quote from: matthey;718874
That's even cheaper than I thought :). A 50-60MHz 68060 is already a big improvement over what the average Amiga user is using. Supporting faster memory speeds and easier overclocking will make it faster than most of the old 68060@50MHz accelerators.

Yes, but they are older revisions and cannot be clocked up - and have bugs.
The E41J mask set chips are tricky to get at all (most are fakes) and the price is negotiable.
I am taking a chip tester with me to screen locally.
/MikeJ
 

Offline freqmax (Topic starter)

  • Hero Member
  • *****
  • Join Date: Mar 2006
  • Posts: 2179
Re: Full 68060 implementation?
« Reply #11 on: December 13, 2012, 03:35:49 PM »
Perhaps they can be overclocked with Peltier and fluid cooling? Or even nitrogen for a few hours?
 

Offline matthey

  • Hero Member
  • *****
  • Join Date: Aug 2007
  • Posts: 1294
Re: Full 68060 implementation?
« Reply #12 on: December 13, 2012, 03:50:43 PM »
@mikej
The 68060.library can make the older mask 68060s reliable with very little decrease in speed. It would be great if the 68060.library was installed from flash/kickstart so that bootable games could work. The older mask 68060s are not nearly as overclockable but still overclockable. They should be good for 54-66MHz depending on cooling and temperature environment where used. Add to this memory that is 25-50% faster than the old accelerator SIMMs and it should be significantly faster than a CSMK2 with 68060@50MHz. The newest mask 68060 is very nice though and worth the premium to me to fix all bugs and overclock to 100MHz. I have an extra one waiting ;).
 

Offline billt

  • Hero Member
  • *****
  • Join Date: Nov 2002
  • Posts: 910
    • http://www.billtoner.net
Re: Full 68060 implementation?
« Reply #13 on: December 13, 2012, 04:48:03 PM »
Quote from: freqmax;718824
Is there enough documentation, and are FPGAs large enough, to build a full soft-HDL 68060 processor? And at full speed? (Preferably on a Xilinx Spartan series part; Virtex is expensive.)

This could then be used to mitigate the shortage and insane prices, and make tweaks possible.


The assembly language manuals and the 68060 databook are all that you should need, for someone that knows how to do this kind of thing.

Though I myself would not try to do an exact 68060 clone. The 060 has some instructions removed that were present in 68040 for example. If I were to make an FPGA 680x0, I'd put them all back in, and avoid trapping to software emulation.

If you're redoing things in an FPGA, you have freedom to make improvements on things like that. Want to make the cache bigger? Why not? You need to design a cache controller anyway if you support a cache, so do what you like in a way that is compatible with everything.

I just did a very small microprocessor design for a Computer Architecture class that finished last night. Very small as in a total of 6 instructions, including load, store, and two kinds of branching, leaving only two ALU operations, add and subtract of BCD numbers. 8bit instruction and 16bit memory address bus, and four registers. But it was nice to learn how this stuff works and the fundamentals of how to approach designing a processor. I actually wrote some assembly language code for what I'd want such a terribly simple thing to do before doing the logic design. You break down the instructions into stages (not pipeline stages, but individual steps taken to complete a single instruction at a time), and then you can extract your hardware logic design based on that. I found it very interesting, and I was surprised at how complicated it was NOT, even for my uselessly simple thing. Sure, a more complete and useful design like a 680x0 will be bigger and more complex than this, but it's not absurdly complicated as I would have imagined previously. That said, it would still be a great deal of work. Could be fun for someone with the time.
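
To make the "break each instruction into steps" idea concrete, here is a rough Python sketch of a six-instruction machine of that general shape. The opcode numbering, the bit layout of the instruction byte, the two address bytes following it, and the names `step`, `enc` and `run_demo` are all assumptions made up for illustration; they are not the actual class design described above.

```python
# Hypothetical encoding (an assumption, not the design from the post):
#   bits 7-5 = opcode, bits 4-3 = register A, bits 2-1 = register B.
#   LOAD/STORE/BRA/BZ are followed by a 16-bit big-endian address.
LOAD, STORE, BRA, BZ, ADD, SUB = range(6)

def _to_int(x):
    """Decode one packed-BCD byte (0x00-0x99) to an integer 0-99."""
    return (x >> 4) * 10 + (x & 0x0F)

def _to_bcd(n):
    """Encode an integer 0-99 as one packed-BCD byte."""
    return ((n // 10) << 4) | (n % 10)

def bcd_add(x, y):
    """Add two packed-BCD bytes, wrapping modulo 100."""
    return _to_bcd((_to_int(x) + _to_int(y)) % 100)

def bcd_sub(x, y):
    """Subtract two packed-BCD bytes, wrapping modulo 100."""
    return _to_bcd((_to_int(x) - _to_int(y)) % 100)

def enc(op, ra=0, rb=0):
    """Build one instruction byte from opcode and register numbers."""
    return (op << 5) | (ra << 3) | (rb << 1)

def step(regs, mem, pc):
    """Run one instruction as a series of small steps; return the new pc."""
    # Step 1: fetch the instruction byte.
    instr = mem[pc]
    pc += 1
    # Step 2: decode the fields.
    op, ra, rb = instr >> 5, (instr >> 3) & 3, (instr >> 1) & 3
    # Step 3: fetch the 16-bit address operand, if this opcode has one.
    if op in (LOAD, STORE, BRA, BZ):
        addr = (mem[pc] << 8) | mem[pc + 1]
        pc += 2
    # Step 4: execute.
    if op == LOAD:
        regs[ra] = mem[addr]
    elif op == STORE:
        mem[addr] = regs[ra]
    elif op == BRA:
        pc = addr
    elif op == BZ:
        if regs[ra] == 0:
            pc = addr
    elif op == ADD:
        regs[ra] = bcd_add(regs[ra], regs[rb])
    elif op == SUB:
        regs[ra] = bcd_sub(regs[ra], regs[rb])
    else:
        raise ValueError("undefined opcode")
    return pc

def run_demo():
    """Load two BCD bytes, add them, store the sum; return the stored byte."""
    mem = [0] * 64
    program = [enc(LOAD, 0), 0x00, 0x20,
               enc(LOAD, 1), 0x00, 0x21,
               enc(ADD, 0, 1),
               enc(STORE, 0), 0x00, 0x22]
    mem[:len(program)] = program
    mem[0x20], mem[0x21] = 0x19, 0x27
    regs, pc = [0, 0, 0, 0], 0
    for _ in range(4):
        pc = step(regs, mem, pc)
    return mem[0x22]
```

Each numbered step inside `step` corresponds roughly to one state of a control state machine, which is where the hardware logic would be extracted from in a real design.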

TG and Yaqube have taken some steps in improving the TG68 sort of in that direction. I thought I'd seen something about the Suska guy taking his 68000 core up to 68020 or 030 but I didn't see anything available last I checked. I think the aoocs guy has a 68000 core as well, not sure what his plans for the future of that are. There's also closed-source Natami CPU, but I'm not sure what's happening with that anymore. But they are things you can look at for inspiration.
Bill T
All Glory to the Hypnotoad!
 

Offline matthey

  • Hero Member
  • *****
  • Join Date: Aug 2007
  • Posts: 1294
Re: Full 68060 implementation?
« Reply #14 on: December 13, 2012, 06:25:27 PM »
Quote from: billt;718885

Though I myself would not try to do an exact 68060 clone. The 060 has some instructions removed that were present in 68040 for example. If I were to make an FPGA 680x0, I'd put them all back in, and avoid trapping to software emulation.


It's not necessary to put all the trapped instructions back in. It made sense to get rid of CAS2, CHK2, CMP2 and MOVEP. Getting rid of the integer 64 bit result MULx was a mistake as it's used commonly by compilers to do an invert and multiply by a constant instead of a divide by a constant. It wouldn't have been so bad if they would at least have defined and allowed a MULx where Sz=1 (64 bit result) and Dl (result register low) = Dh (result register high) giving the upper 32 bits of the result (like PPC MULH). This is worthwhile to define even with 64 bit results as it saves trashing a register when the lower 32 bits are not needed. See MULS and MULU here:

http://www.heywheel.com/matthey/Amiga/68kF_PRM.pdf
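
As a rough illustration of the invert-and-multiply trick, the Python sketch below divides an unsigned 32-bit value by 10 using only the upper 32 bits of a 32x32 multiply plus a shift. The helper names `mul_high_u32` and `div10` are made up for this example; the constant 0xCCCCCCCD is ceil(2^35 / 10), the usual reciprocal for an unsigned divide by 10.

```python
MASK32 = 0xFFFFFFFF

def mul_high_u32(a, b):
    """Upper 32 bits of an unsigned 32x32 -> 64 bit multiply.

    This is the half of the product a "multiply high" instruction
    (such as PPC mulhwu) would return.
    """
    return ((a & MASK32) * (b & MASK32)) >> 32

def div10(n):
    """Unsigned 32-bit divide by 10 with no divide instruction.

    0xCCCCCCCD == ceil(2**35 / 10), so taking the high word (>> 32)
    and shifting right by 3 more bits divides the product by 2**35.
    """
    return mul_high_u32(n, 0xCCCCCCCD) >> 3
```

Only the high word of the product is ever used, which is why a form of the multiply that returns just the upper 32 bits would save a register here.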

The integer 64 bit result DIVx is used much less commonly but it is much slower to do in software using the shift method. MOVEP is used in older Amiga software (mostly games) but it is poorly encoded, has limited usefulness and most patched games have already removed them. CAS2 and CHK2 are uncommon supervisor instructions. CMP2 is user mode but has limited usefulness as designed and it's very rare. WHDload mentions only 1 known game and 1 demo as I recall. It would be better to install the 68060.library from flash or kickstart so that traps are never a problem. Some instructions and addressing modes are better trapped and simplifying gives gains elsewhere. Notice that I removed (for trapping) the double indirect addressing modes that used the outer displacement in the 68kF ISA pdf above as there was little advantage (only useful when no free registers) and simplifies the decoder. If all remaining full extension word format addressing modes could be 1 cycle faster then it would be worth it.
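
For a sense of why the software shift method is slow, here is a minimal Python sketch of a restoring shift-and-subtract divide of a 64-bit dividend by a 32-bit divisor. It is an illustration of the general technique only, not the code any particular 68060.library uses, and the function name `divu64_shift` is hypothetical.

```python
MASK32 = 0xFFFFFFFF

def divu64_shift(dividend, divisor):
    """Unsigned 64 / 32 -> (32-bit quotient, 32-bit remainder).

    One loop iteration per quotient bit, so 32 shift/compare/subtract
    rounds are needed for every divide.
    """
    if divisor == 0:
        raise ZeroDivisionError("divide by zero")
    rem = dividend >> 32        # running remainder starts as the high word
    low = dividend & MASK32     # low word is shifted in one bit at a time
    if rem >= divisor:
        raise OverflowError("quotient does not fit in 32 bits")
    quot = 0
    for _ in range(32):
        # Shift the next dividend bit into the remainder.
        rem = (rem << 1) | (low >> 31)
        low = (low << 1) & MASK32
        quot <<= 1
        # Subtract the divisor when it fits and record a 1 bit.
        if rem >= divisor:
            rem -= divisor
            quot |= 1
    return quot, rem
```

For example, `divu64_shift(100, 7)` returns `(14, 2)`.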

The SWAP instruction in the 68060 should have worked in both integer units; that it does not was an oversight. Its result is a longword, which suits forwarding, and the instruction is common. The bitfield instructions should have been made faster which is possible and they have 32 bit results for forwarding. Of course bigger caches, a link stack and instruction combining like the ColdFire has would be done at minimum if modernizing the 68060.

Quote from: billt;718885

I just did a very small microprocessor design for a Computer Architecture class that finished last night. Very small as in a total of 6 instructions, including load, store, and two kinds of branching, leaving only two ALU operations, add and subtract of BCD numbers. 8bit instruction and 16bit memory address bus, and four registers. But it was nice to learn how this stuff works and the fundamentals of how to approach designing a processor. I actually wrote some assembly language code for what I'd want such a terribly simple thing to do before doing the logic design. You break down the instructions into stages (not pipeline stages, but individual steps taken to complete a single instruction at a time), and then you can extract your hardware logic design based on that. I found it very interesting, and I was surprised at how complicated it was NOT, even for my uselessly simple thing. Sure, a more complete and useful design like a 680x0 will be bigger and more complex than this, but it's not absurdly complicated as I would have imagined previously. That said, it would still be a great deal of work. Could be fun for someone with the time.


Sounds like fun :). It's not rocket science and it's very logical but I imagine complex and time consuming when doing more advanced design and coding.

Quote from: billt;718885

TG and Yaqube have taken some steps in improving the TG68 sort of in that direction. I thought I'd seen something about the Suska guy taking his 68000 core up to 68020 or 030 but I didn't see anything available last I checked. I think the aoocs guy has a 68000 core as well, not sure what his plans for the future of that are. There's also closed-source Natami CPU, but I'm not sure what's happening with that anymore. But they are things you can look at for inspiration.


68020+ support should be the minimum for the Amiga IMO. It makes programming much easier than the 68000 while offering significantly better speed and code density. The Natami CPU (N68050) is not finished and was only partially supporting the 68020 last I understood but it is fairly advanced as far as cache and pipeline design. I don't think Jens is working on it much if any anymore. He has talked about making it open source though. Gunnar is working on a soft CPU based on it but he is not very reliable. He claims to be experimenting with a 200MHz softcore although he increased the pipeline length significantly in order to do it. This increases branch penalties and can cause other stalls much like a highly clocked DSP, GPU or x86 CPU. Even if I could believe him, it's experimental at best and Gunnar has a history of not completing much.
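
The trade-off between clock speed and pipeline length can be put into a back-of-the-envelope model. The Python sketch below is a textbook-style approximation; the function name `effective_mips` and every number in the usage note are illustrative assumptions, not measurements of any real core.

```python
def effective_mips(clock_mhz, branch_fraction, mispredict_rate,
                   penalty_cycles, base_cpi=1.0):
    """Rough throughput estimate in millions of instructions per second.

    CPI = base_cpi + (fraction of instructions that are branches)
                   * (fraction of those that are mispredicted)
                   * (cycles lost per mispredicted branch).
    A longer pipeline raises penalty_cycles.
    """
    cpi = base_cpi + branch_fraction * mispredict_rate * penalty_cycles
    return clock_mhz / cpi
```

With made-up inputs, `effective_mips(100, 0.2, 0.1, 3)` against `effective_mips(200, 0.2, 0.1, 10)` shows the pattern: the clock doubles, but the estimated throughput rises by less than 2x because each mispredicted branch costs more cycles.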