Author Topic: Motorola 68060 FPGA replacement module (idea) (Read 54536 times)

freqmax · « **Reply #434 from previous page:** January 19, 2013, 05:06:18 PM »

Why were these instructions dropped?

And would it be more efficient, performance-wise, to implement a 020, 030, or 040 and then horrendously overclock it?

Mrs Beanbag · « **Reply #435 on:** January 19, 2013, 06:16:01 PM »

Quote from: freqmax;723203

Why were these instructions [CALLM & RTM] dropped?

More to the point, why were they ever included in the first place?

Personally I'd suggest a minimalist implementation (68000 + some 020 features) and see how fast we can get it, before adding anything else.

psxphill · « **Reply #436 on:** January 19, 2013, 06:32:04 PM »

Quote from: freqmax;723203

Why were these instructions dropped?

And would it be more efficient, performance-wise, to implement a 020, 030, or 040 and then horrendously overclock it?

Nobody used the instructions and they required support from the 68851 MMU. It made no sense to bloat the 68030 and its MMU with them.

It depends on how you measure efficiency, but you'll hit the upper clock speed quickly.

matthey · « **Reply #437 on:** January 19, 2013, 06:58:36 PM »

Quote from: freqmax;723203

Why were these instructions dropped?

CALLM and RTM were for calling subroutines (probably from the OS), but MacOS and Atari were trapping A-line instructions to supervisor mode OS calls. Traps are slow, so the Amiga uses the regular JSR (Jump to Subroutine) instruction for OS calls and the OS is mostly in user mode (as libraries) like everything else. Supervisor violations do still trap to the OS on the Amiga. Having all of the OS in supervisor mode provides a little more security though. The 68020 was used for big Unix boxes and similar back then (before they dropped 68k for RISC) and CALLM was probably meant to cater to that market, although I doubt they ever used it because of the additional overhead.

Oxypatcher, Cyberpatcher and Remus do not even bother patching CALLM/RTM because they are unused on the Amiga except for a very few programs that supposedly use them to detect a 68020 and count on this to trap if not a 68020. This is a poor assumption and so rare that it can be patched if necessary.

Quote from: freqmax;723203

And would it be more efficient, performance-wise, to implement a 020, 030, or 040 and then horrendously overclock it?

It doesn't really matter, as the FPGA implementation would be different from the real chip. The 68020+ ISA is practically the same between all of them except for the FPU and MMU. You certainly wouldn't want the limitations of the earlier 68020/68030 unless a cycle-exact CPU was needed.

wawrzon · « **Reply #438 on:** January 19, 2013, 09:21:17 PM »

talking, talking, talking... i see heiroglyph has backed off here, so maybe he is up to something. ;P

freqmax · « **Reply #439 on:** January 19, 2013, 11:44:25 PM »

Regarding instruction set (ISA) I was thinking in general why they changed it. Because the end result is a slight confusion.

matthey · « **Reply #440 on:** January 20, 2013, 12:01:49 AM »

Quote from: freqmax;723245

Regarding instruction set (ISA) I was thinking in general why they changed it. Because the end result is a slight confusion.

Which ISA change:

1) 68000->68020 (major change)

2) 68020->68030 removal of CALLM/RTM (very minor change)

bloodline · « **Reply #441 on:** January 20, 2013, 12:35:53 AM »

I have been reading matthey's 68kF2 ISA proposal, and it reminded me how complex the 68k instruction encoding is.

I have included a link here for others to learn about instruction encoding (in this case, nice simple MIPS):
http://www.cs.umd.edu/class/spring2003/cmsc311/Notes/Overall/instruction.html

matthey · « **Reply #442 on:** January 20, 2013, 02:57:06 AM »

Quote from: bloodline;723248

I have been reading matthey's 68kF2 ISA proposal, and it reminded me how complex the 68k instruction encoding is.

Complex? Take a look at a decoder for x86. Yeah, the 68k does need more logic in the decoder, but the improved code density allows more instructions to be piped into the processor. Most RISC instructions use a consistent 32-bit fixed-length encoding, which is great for decoding. The 68k needs several separate decoding tables (lacking a better name) for different encoding areas. Some encoding holes are even divided into a separate table of instructions. This part of the 68k could have been a little better but it's not too much of a problem. The 68k does compress a lot of data with sign-extended values, which works very well and can be improved on. The overall slowdown from the decoder is minimal on the 68k and can be made up for with powerful instructions and addressing modes, which it has and which can be improved on. ARM with Thumb-2 works well because of the code density plus powerful instructions for RISC. This was a good tradeoff even though they now have a slightly more complex decoder. MIPS and PPC have also experimented with code compression (MIPS16e and CodePack respectively) but it never caught on or fit as well for them:

http://www.embedded.com/electronics-blogs/significant-bits/4024933/Code-compression-under-the-microscope
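To make the point about sign-extended values concrete, here is a minimal Python sketch of how MOVEQ packs a 32-bit constant load into a single 16-bit word (bit pattern `0111 rrr0 dddddddd`, with the 8-bit data field sign-extended to 32 bits). The function names `sign_extend` and `decode_moveq` are made up for this illustration and are not taken from any core discussed in this thread.

```python
def sign_extend(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of `value` as a two's-complement number."""
    value &= (1 << bits) - 1
    sign_bit = 1 << (bits - 1)
    return value - (1 << bits) if value & sign_bit else value


def decode_moveq(opword: int):
    """Decode a 16-bit MOVEQ opcode word (0111 rrr0 dddddddd).

    Returns (data_register_number, signed_immediate), or None if the
    word is not a MOVEQ encoding.
    """
    # Bits 15-12 must be 0111 and bit 8 must be 0.
    if (opword & 0xF100) != 0x7000:
        return None
    reg = (opword >> 9) & 0x7
    imm = sign_extend(opword & 0xFF, 8)
    return reg, imm


if __name__ == "__main__":
    for word in (0x7000, 0x72FF, 0x7E7F):
        reg, imm = decode_moveq(word)
        print(f"{word:04X}: MOVEQ #{imm},D{reg}")
```

The same constant loaded with `MOVE.L #imm,Dn` needs the opcode word plus a 32-bit immediate, so six bytes instead of two. That is the kind of saving the code density argument rests on.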

Plaz · « **Reply #443 on:** January 20, 2013, 05:42:24 AM »

FYI - if referencing the TG68 VHDL code you'll need this link to the latest version 1.08.
http://tinyurl.com/LatestTG68
The "download" button on the project page has the older version 1.0.

Plaz

bloodline · « **Reply #444 on:** January 20, 2013, 10:03:23 AM »

Quote from: matthey;723256

Complex? Take a look at a decoder for x86. Yeah, the 68k does need more logic in the decoder, but the improved code density allows more instructions to be piped into the processor. Most RISC instructions use a consistent 32-bit fixed-length encoding, which is great for decoding. The 68k needs several separate decoding tables (lacking a better name) for different encoding areas. Some encoding holes are even divided into a separate table of instructions. This part of the 68k could have been a little better but it's not too much of a problem. The 68k does compress a lot of data with sign-extended values, which works very well and can be improved on. The overall slowdown from the decoder is minimal on the 68k and can be made up for with powerful instructions and addressing modes, which it has and which can be improved on. ARM with Thumb-2 works well because of the code density plus powerful instructions for RISC. This was a good tradeoff even though they now have a slightly more complex decoder. MIPS and PPC have also experimented with code compression (MIPS16e and CodePack respectively) but it never caught on or fit as well for them:

http://www.embedded.com/electronics-blogs/significant-bits/4024933/Code-compression-under-the-microscope

I rather like Mrs Beanbag's idea of a nice simple RISC core tailored to executing instructions that have been decoded from 68k instructions, it could simplify the decode stage maybe

psxphill · « **Reply #445 on:** January 20, 2013, 12:07:49 PM »

Quote from: bloodline;723285

I rather like Mrs Beanbag's idea of a nice simple RISC core tailored to executing instructions that have been decoded from 68k instructions, it could simplify the decode stage maybe

How could it simplify the decode stage? All it does is split one decode stage into two.

matthey · « **Reply #446 on:** January 20, 2013, 05:33:25 PM »

Quote from: bloodline;723285

I rather like Mrs Beanbag's idea of a nice simple RISC core tailored to executing instructions that have been decoded from 68k instructions, it could simplify the decode stage maybe

psxphill has a point. Decoding and re-encoding into a completely different format adds complexity (the 68060 RISC is likely simpler and based on the CISC encoding). The 68k has EAs which are too long to be encoded into 16-bit RISC, and some CISC instructions would have to decode into multiple RISC instructions, which increases the data to be dealt with. RISC has a lot of simple instructions to handle while CISC has fewer complex instructions to handle. The extra logic in a CISC decoder is more than made up for by the reduced logic for caches and memories. This even applies to the worst-case x86 decoders. There is a potential problem of a slowdown with poorly encoded CISC due to lack of parallel decoding. The instruction length needs to be simple to determine for parallel operations. The x86 has historically had a longer pipeline because this is not possible (and to increase processing power at the expense of good branch performance). The 68k is significantly better, although the outer displacement of the double memory indirect 68020+ addressing modes hurts decoding significantly as well as increasing the max instruction length (we don't even know how the 68060 deals with these very long encodings). There is not much advantage to them if the EA has to be calculated 2x, which also makes execution more complex. I suggested trapping the double memory indirect modes with an outer displacement in the 68kF ISA. We need to see if the decoder is still the bottleneck after that. I'm guessing that it won't be in an FPGA because of the slow execution of 32-bit multiply and shift in the ALU.
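To illustrate why the outer displacement complicates length determination, here is a rough Python sketch that works out how many extension bytes a 68020+ indexed EA occupies from its first extension word. It follows the brief/full extension word layout as I understand it from the 68020+ programmer's documentation (bit 8 selects the full format, bits 5-4 give the base displacement size, bits 2-0 are the I/IS field). The function names are invented for this example, and reserved-encoding checks are only partial.

```python
# Bytes contributed by a 2-bit displacement size field:
# 1 = null displacement, 2 = word, 3 = long (0 is reserved).
DISPLACEMENT_BYTES = {1: 0, 2: 2, 3: 4}


def indexed_ea_extension_bytes(ext_word: int) -> int:
    """Return the extension length in bytes for one indexed EA.

    `ext_word` is the first 16-bit extension word following the opcode.
    """
    if not ext_word & 0x0100:
        # Brief format: the 8-bit displacement lives inside this word.
        return 2
    bd_size = (ext_word >> 4) & 0x3
    index_suppressed = bool(ext_word & 0x0040)
    i_is = ext_word & 0x7
    if bd_size == 0 or i_is == 4 or (index_suppressed and i_is > 4):
        raise ValueError(f"reserved encoding in extension word {ext_word:#06x}")
    # The low two I/IS bits select the outer displacement:
    # 0 = no memory indirection, 1 = null, 2 = word, 3 = long.
    od_size = i_is & 0x3
    od_bytes = DISPLACEMENT_BYTES[od_size] if od_size else 0
    return 2 + DISPLACEMENT_BYTES[bd_size] + od_bytes


def worst_case_move_bytes() -> int:
    """Opcode word plus two EAs that both use long base and outer displacements."""
    longest_ea = indexed_ea_extension_bytes(0x0133)
    return 2 + 2 * longest_ea


if __name__ == "__main__":
    print(indexed_ea_extension_bytes(0x0000))  # brief format
    print(indexed_ea_extension_bytes(0x0133))  # long bd + long od
    print(worst_case_move_bytes())
```

The point for a parallel decoder is that the length of the first operand is only known after looking inside its extension word, and the second operand's extension word cannot be located until then.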

The N68040 FPGA pipeline looks like this:
1) Instruction Fetch
2) Decoder *
3) Register Fetch
4) EA Calculation *
5) DCache-Read
6) ALU Execution *
7) Write-Back

With 3 potential bottlenecks in an FPGA:
B1) Decoder (identifying the instruction length, to be able to fetch the next instruction)
B2) EA Calculation
B3) ALU Execution
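As a rough way to reason about the bottlenecks listed above: in a simple in-order pipeline the clock period is set by the slowest stage, so only the worst of the three matters at any one time. The Python sketch below just picks the critical stage from a table of per-stage delays. The helper name `critical_stage` is invented here, and the delay values in the usage example are placeholders made up for illustration, not measurements of the N68040 or any other core.

```python
def critical_stage(delays_ns: dict) -> tuple:
    """Return (stage_name, max_clock_mhz) for the slowest pipeline stage.

    `delays_ns` maps stage names to combinational delay in nanoseconds.
    """
    name = max(delays_ns, key=delays_ns.get)
    return name, 1000.0 / delays_ns[name]


if __name__ == "__main__":
    # Placeholder delays only (NOT measured data); stage names follow the list above.
    example_delays_ns = {
        "Instruction Fetch": 6.0,
        "Decoder": 9.0,
        "Register Fetch": 5.0,
        "EA Calculation": 8.0,
        "DCache-Read": 7.0,
        "ALU Execution": 10.0,
        "Write-Back": 4.0,
    }
    stage, mhz = critical_stage(example_delays_ns)
    print(f"critical stage: {stage}, max clock about {mhz:.0f} MHz")
```

Shortening one stage (for example by trapping the longest encodings) only raises the clock until the next-slowest stage becomes the limit.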

billt · « **Reply #447 on:** January 20, 2013, 05:54:51 PM »

Quote from: freqmax;723245

Regarding instruction set (ISA) I was thinking in general why they changed it. Because the end result is a slight confusion.

where was that? I seem to have missed it.

Mrs Beanbag · « **Reply #448 on:** January 20, 2013, 05:58:09 PM »

Right, simplifying the decoding stage wasn't the idea so much. But if you can split a problem into two parts, it is usually easier to solve. I'm trying to make the developer's job easier really.

The advantages are that each part can be developed, tested and optimised separately, and indeed the RISC core could conceivably be useful on its own (and an assembler could be modified to compile 68k asm to run on it). It would be easier to add new instructions, much in the same way that microcode does, but the "microcode" in this case is more readily understandable, being 68k-like itself.
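For anyone trying to picture the split, here is a toy Python sketch of the front half: taking one 68k instruction with a memory operand and cracking it into simple load/store-style operations for a RISC-like back end. The micro-op names (`load32`, `add`, `addi`), the `tmp` register and the function `crack_add_l` are all hypothetical and exist only for this example; a real design would work on opcode bits rather than assembler text.

```python
import re


def crack_add_l(src: str, dst: str) -> list:
    """Split ADD.L <src>,<dst> into hypothetical RISC-style micro-ops.

    Supported source forms: Dn/An, (An) and (An)+. `dst` is a data register.
    """
    if re.fullmatch(r"[DA][0-7]", src):
        # Register source: one micro-op is enough.
        return [("add", dst, dst, src)]
    match = re.fullmatch(r"\((A[0-7])\)(\+?)", src)
    if not match:
        raise ValueError(f"unsupported source operand: {src}")
    areg, postinc = match.groups()
    uops = [
        ("load32", "tmp", areg),    # tmp <- memory[areg]
        ("add", dst, dst, "tmp"),   # dst <- dst + tmp
    ]
    if postinc:
        uops.append(("addi", areg, areg, 4))  # long access: bump pointer by 4
    return uops


if __name__ == "__main__":
    for operand in ("D0", "(A0)", "(A0)+"):
        print(f"ADD.L {operand},D1 ->", crack_add_l(operand, "D1"))
```

It also shows the cost matthey mentions: one 16-bit 68k instruction becomes up to three internal operations, so the back end has more to move around.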

psxphill · « **Reply #449 on:** January 20, 2013, 06:13:39 PM »

Quote from: matthey;723339

(we don't even know how the 68060 deals with these very long encodings).

What do you mean? AFAIK the longest instruction is 10 bytes and that is what gets transferred from the FIFO in the decode stage.

Quote from: Mrs Beanbag;723343

Right, simplifying the decoding stage wasn't the idea so much. But if you can split a problem into two parts, it is usually easier to solve. I'm trying to make the developer's job easier really.

I think you'd either have to put up with it being slower, or make the rest of it much more complex to compensate. It's a juggling act.
