Author Topic: in case you are interested to test new fpga accelerators for a600/a500 (Read 39028 times)

guest11527 · « **Reply #329 from previous page:** April 03, 2015, 04:53:31 PM »

Quote from: psxphill;787340

Any opcode that would cause an exception on a real 680x0 cpu can be used by software as a virtual opcode. LINEA & LINEF were officially available for that, but it's certainly possible that a piece of software could rely on any exception. The MMU & FPU that is proposed is certainly incompatible.

Actually, it is a bit more complicated. Line-A is certainly not available for new opcodes, as this line is already taken both by MacOS (operating system traps) and by Atari TOS (the blitter).

The F-Space is partially available. Motorola has the standard 68882 FPU and the MMU mapped there, and some extended instructions of the 68040/68060 map there as well.

Besides that, some new features can at least cause problems, e.g. a new set of data registers or a new address register. The problem is that these are not saved and restored by the exec scheduler, so the scheduler would need to be patched. But a couple of utilities depend on the stack frame of the scheduler, so you can either continue to use these programs and not make use of the new registers, or use the registers and get rid of the programs.

Actually, I personally would rather get rid of the extra registers as I consider the Amiga "market" too small for such an experiment.

Anyhow, all these are engineering arguments, but my central point is not even related to engineering. I don't want to repeat this all over again.

asymetrix · « **Reply #330 on:** April 03, 2015, 05:48:12 PM »

68k -> coldfire may be useful : http://www.microapl.co.uk/Porting/ColdFire/pacf_download.html

One also needs to consider a universal assembler language for any CPU this generation or the next.

20 years ago it was 68k, then PPC; now MIPS is cheap and popular. What comes in the next 50 years?

Virtual ASM + virtual CHIPSET

For example, use IOMMU technology (graphics address remapping), port this to the cheapest current-generation CPU + GPU combo, and BINGO: Amiga runs.

CPU + GPU = AmiChip

With open-source, fully documented registers, one could just learn ASM+GPU coding on any device. Hardware does not matter.

We have bigger problems than worrying about which hardware to use. Want to use 68k? Great. x86? Great. DSP? Great. Whatever floats your boat.

It should be all AmiChip I, II, III compatible.

However, due to the urgency of having apps and games ASAP, I would favour a single GFX mode, e.g. the common monitor resolution of 1024x768 at 24-bit colour.

A clean slate so to speak.

We can develop apps and games for this single mode, slowly testing Amiga compatibility later in stages.

kolla · « **Reply #331 on:** April 03, 2015, 07:04:48 PM »

So you don't need an operating system?

xboxOwn · « **Reply #332 on:** April 03, 2015, 07:09:32 PM »

Quote from: kolla;787347

So you don't need an operating system?

Technically you do not need an operating system. DOS, Windows, Linux, Mac, AmigaOS, BASIC, etc. are all luxuries. You can have a computer without an OS, period. The NES and SNES are two great examples where no OS exists and games were developed directly in machine language.

kolla · « **Reply #333 on:** April 03, 2015, 07:11:07 PM »

Quote

Phoenix does support _ALL_ address modes of the 68K family
this includes _ALL_ address modes of 68020 to 68060.
Phoenix is designed to provide full compatibility to existing software. We are not aware of any incompatibilities.

http://www.apollo-core.com/knowledge.php?b=2&note=2861&z=fwkq39

What does this imply? I presume FPU and MMU to not be in that soup, so compatibility is against 68EC030-EC060?

matthey · « **Reply #334 on:** April 03, 2015, 08:34:33 PM »

Quote from: psxphill;787340

Any opcode that would cause an exception on a real 680x0 cpu can be used by software as a virtual opcode. LINEA & LINEF were officially available for that, but it's certainly possible that a piece of software could rely on any exception.

A-line is documented as user reserved. IMO, A-line instructions that are gated or can be switched on would be under user control and so acceptable.

Quote from: M68060UM

An unimplemented A-line exception corresponds to vector number 10 and occurs when an instruction word pattern begins (bits 15–12) with $A. The A-line opcodes are user-reserved and Motorola will not use any A-line instructions to extend the instruction set of any of Motorola's processors. A stack frame of format 0 is generated when this exception is reported. The stacked PC points to the logical address of the A-line instruction word.

Where did you get your information for F-line? Some operating systems did use F-line for traps but I have not seen documentation designating F-line as user reserved. Motorola's own MMU and FPU were incompatible with some software.

Quote from: M68060UM

An unimplemented F-line exception occurs when an instruction word pattern begins (bits 15–12) with $F, the MC68060 does not recognize it as a valid F-line instruction (e.g., PTEST), and the processor does not recognize it as a floating-point MC68881 instruction. This exception corresponds to vector number 11 and shares this vector with the floating-point unimplemented instruction and the floating-point disabled exceptions. A stack frame of type 0 is generated by this exception. The stacked PC points to the logical address of the F-line word.
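The two manual excerpts above boil down to a test on the top four bits of the instruction word. A minimal sketch of that test follows; the function name is made up for illustration, and it only reports which vector an *unimplemented* opcode on that line would use (many F-line words are valid instructions and never trap).

```python
def classify_opword(opword):
    """Return (line name, exception vector) for a 16-bit 68k instruction word.

    The vector numbers 10 and 11 are the ones given in the M68060UM quotes;
    the vector only applies if the CPU does not implement the opcode.
    """
    line = (opword >> 12) & 0xF  # bits 15-12 select the "line"
    if line == 0xA:
        return ("A-line", 10)
    if line == 0xF:
        return ("F-line", 11)
    return ("other", None)


if __name__ == "__main__":
    for word in (0xA000, 0xF200, 0x4E75):
        print(hex(word), classify_opword(word))
```
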

Quote from: psxphill;787340

The MMU & FPU that is proposed is certainly incompatible.

I am not aware of any interface to the MMU. It's possible the interface disappears completely or an unused coprocessor ID is used for maximum compatibility. Compatibility with a particular 68k could be provided by trapping. The FPU could be moved to another coID also but that wouldn't be very convenient when executing FPU code.

The FPU changes I proposed are only incompatible with BCD floating point. I am not aware of any Amiga program which used it, and it was already trapped on the 68040. It's in the same category as the 68020-only CALLM/RTM. Jim Drew said the MacOS does use this support, but it should be possible to make it faster by working around it, considering it is trapped on the 68040-68060 anyway. Other workarounds are already necessary with the "incompatible" 68060 FPU, which has a mostly incompatible stack frame size causing crashes on initialization in most FPU software. This is patched in the AmigaOS and probably MacOS where detected.

There could be some incompatibility if enabling 8 more FPU registers but then the performance could be up to twice as fast if FPU instruction results are not available to the next instruction. An SIMD unit would also have the same incompatibility when enabled but could provide several times the performance in some cases. The performance gains are big enough to warrant these additions, IMO. I believe the performance gain from more CPU integer registers would be significantly smaller and the potential for compatibility problems greater once the new registers were enabled.

Quote from: kolla;787349

http://www.apollo-core.com/knowledge.php?b=2&note=2861&z=fwkq39

What does this imply? I presume FPU and MMU to not be in that soup, so compatibility is against 68EC030-EC060?

Addressing modes should work with coprocessors. The 68k/6888x from the beginning has done decoding and EA calculation in the integer units before passing the instruction to the coprocessor for completion. The Apollo core was going to have a fully pipelined FPU and I don't know what effect this would have.

kolla · « **Reply #335 on:** April 03, 2015, 08:45:31 PM »

Quote from: xboxOwn;787348

Technically you do not need an operating system. DOS, Windows, Linux, Mac, AmigaOS, BASIC, etc. are all luxuries. You can have a computer without an OS, period. The NES and SNES are two great examples where no OS exists and games were developed directly in machine language.

Of course, but what I responded to also mentioned "apps". Applications do not technically need an OS either, but a functional OS is damn useful for anyone creating or using an application.

kolla · « **Reply #336 on:** April 03, 2015, 08:47:09 PM »

Quote from: ChaosLord;787331

All my games have use for an improved 680x0 CPU.

Even my old A500 games required a 68020+ after circa 1990. 25 MHz 68030 accelerators were sold everywhere for the A500 in the 1990s. They had 68020 and 68030 accelerators in the 1980s too.

All my games have use for a faster CPU.
All my games have use for additional instructions.
All my games have use for more RAM.

Maybe you targeted the wrong platform?

psxphill · « **Reply #337 on:** April 03, 2015, 10:15:33 PM »

Quote from: matthey;787359

A-line is documented as user reserved. IMO, A-line instructions that are gated or can be switched on would be under user control and so acceptable.

Reserved for the user, meaning the application and not the CPU.

Quote from: matthey;787359

Where did you get your information for F-line? Some operating systems did use F-line for traps but I have not seen documentation designating F-line as user reserved. Motorola's own MMU and FPU were incompatible with some software.

Documentation is irrelevant, only how the chips behave is relevant if you want to be compatible. Past incompatibilities can't be helped, only future ones. I'd pick whatever the 68060 + FPU + MMU does as anything that needs to run fast will have been written for it & a lot of the work to make sure everything else can run on 68060 has been pretty much done.

johnklos · « **Reply #338 on:** April 03, 2015, 10:37:12 PM »

Quote from: psxphill;787330

100% 68060 compatible CPU+FPU+MMU is the only "compromise" that I'm happy with, Motorola took things out and people have spent decades making sure the software runs. We shouldn't have to start that process again just yet.

Something new doesn't have to be 100% m68060 compatible in the sense that it could appear to the OS and software as 100% m68060 compatible, but the instructions in the m68060 which are unimplemented can be implemented in the new processor core. No sense running a trap call for emulated instructions.

guest11527 · « **Reply #339 on:** April 03, 2015, 10:37:50 PM »

Quote from: matthey;787359

I am not aware of any interface to the MMU. It's possible the interface disappears completely or an unused coprocessor ID is used for maximum compatibility.

I am not aware of any current MMU activities for the Phoenix/Apollo core... On the 68K, there is a coprocessor ID reserved for it, but in reality, the MMU instructions already differ substantially between the 68020/68030 and the 68040/68060. Actually, given the small number of programs that really depend on the hardware interface, it is less of a problem to create a new interface here. The FPU is more often used at instruction level, hence, compatibility on instruction level is much more important here.
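For readers unfamiliar with the coprocessor ID mentioned above: in the 68020/68030 coprocessor interface, bits 11-9 of an F-line instruction word carry the ID, with 0 used by the 68851/68030 MMU and 1 by the standard 6888x FPU. A small sketch of extracting it follows; the helper name is invented, and the ID 6 in the last example is purely hypothetical, standing in for "an unused coprocessor ID" for a new unit. On the 68040/68060 some F-line encodings are not coprocessor-style instructions, so the field is only meaningful for the older interface.

```python
def coprocessor_id(opword):
    """Return the coprocessor ID (bits 11-9) of an F-line word, else None."""
    if (opword >> 12) & 0xF != 0xF:
        return None  # not an F-line instruction word
    return (opword >> 9) & 0x7


if __name__ == "__main__":
    print(coprocessor_id(0xF000))  # MMU-style encoding, ID 0
    print(coprocessor_id(0xF200))  # standard FPU encoding, ID 1
    print(coprocessor_id(0xFC00))  # hypothetical ID for a new unit
```
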

Quote from: matthey;787359

Compatibility with a particular 68k could be provided by trapping.

Partially. For the FPU, certainly. For the MMU: This is not sufficient, because the MMU does more than execute instructions and update registers. If you wanted to emulate the MMU at instruction level, you would also need to emulate the table-walk of the MMU. Actually, concerning the FPU: Additional FPU registers are again problematic, again due to the exec scheduler. I believe it would be wiser to have the second set of FPU registers, or a specialized vector-FPU available under an additional coprocessor ID. The standard scalar FPU would preserve the legacy interface, and could be saved and restored by exec. Hence, programs and tools for applications that only use the scalar FPU could remain unchanged. If you need more speed, you would engage the "vector FPU" under a new coprocessor ID, and an updated exec scheduler would save and restore its registers. Hence, the incompatibility would only involve programs that actually use the new FPU, and not all programs.

Quote from: matthey;787359

There could be some incompatibility if enabling 8 more FPU registers but then the performance could be up to twice as fast if FPU instruction results are not available to the next instruction. An SIMD unit would also have the same incompatibility when enabled but could provide several times the performance in some cases. The performance gains are big enough to warrant these additions, IMO.

See above. I would advise against extending the scalar FPU. If you need more registers, or vector instructions, enable an extended vectorial FPU. This would allow exec to continue using its legacy stack frame if the vectorial FPU is not used, and hence incompatibilities could be minimized.
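The stack-frame concern can be illustrated with a toy model. Everything in this sketch is an assumption made for illustration: the frame layout, the register counts and the "legacy tool" are hypothetical and do not reflect the real exec frame. The point is only that a tool reading a saved value at a fixed offset keeps working for tasks that still get the legacy frame, and breaks for tasks whose frame grew.

```python
import struct

LEGACY_REGS = 15        # D0-D7 and A0-A6: simplified stand-in for a legacy frame
EXTRA_VECTOR_REGS = 8   # hypothetical registers of a separate vector unit


def save_frame(regs, pc, vector_regs=None):
    """Pack a task context as big-endian longwords.

    Hypothetical layout, read from the top of the saved stack:
    [vector regs, only if the task enabled the vector unit][regs][pc]
    """
    words = list(vector_regs or []) + list(regs) + [pc]
    return struct.pack(f">{len(words)}I", *words)


def legacy_tool_read_pc(frame):
    """A legacy tool that expects the PC right after 15 saved registers."""
    return struct.unpack_from(">I", frame, LEGACY_REGS * 4)[0]


if __name__ == "__main__":
    regs = list(range(LEGACY_REGS))
    pc = 0x00F80000
    scalar_task = save_frame(regs, pc)
    vector_task = save_frame(regs, pc, vector_regs=[0] * EXTRA_VECTOR_REGS)
    print(hex(legacy_tool_read_pc(scalar_task)))  # still finds the PC
    print(hex(legacy_tool_read_pc(vector_task)))  # reads a data register instead
```

In this model only tasks that opt in to the new unit confuse the old tool, which is the behaviour argued for above; extending the frame for every task would break the tool everywhere.
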

Quote from: matthey;787359

Addressing modes should work with coprocessors. The 68k/6888x from the beginning has done decoding and EA calculation in the integer units before passing the instruction to the coprocessor for completion. The Apollo core was going to have a fully pipelined FPU and I don't know what effect this would have.

Well, for the FPU yes. For the MMU, this was only the case up to the 68030. The MMU interface changed substantially in the 68040, and there is no longer any flexibility in the supported addressing modes.

psxphill · « **Reply #340 on:** April 03, 2015, 10:52:35 PM »

Quote from: Thomas Richter;787371

Actually, given the small number of programs that really depend on the hardware interface, it is less of a problem to create a new interface here.

That works as long as you don't care about running old software. But as soon as you open that up for debate, why worry about any old software at all? Just stick a fast x86 in there and run an emulator.

matthey · « **Reply #341 on:** April 04, 2015, 10:07:49 AM »

Quote from: Thomas Richter;787371

I am not aware of any current MMU activities for the Phoenix/Apollo core... On the 68K, there is a coprocessor ID reserved for it, but in reality, the MMU instructions already differ substantially between the 68020/68030 and the 68040/68060. Actually, given the small number of programs that really depend on the hardware interface, it is less of a problem to create a new interface here. The FPU is more often used at instruction level, hence, compatibility on instruction level is much more important here. Partially. For the FPU, certainly. For the MMU: This is not sufficient, because the MMU does more than execute instructions and update registers. If you wanted to emulate the MMU at instruction level, you would also need to emulate the table-walk of the MMU.

The FPGA Arcade and Mist have small enough memory sizes to get away with a 68040/68060 MMU. If Gunnar is not interested in a 68040/68060 MMU compatibility layer for his Apollo core then there isn't much reason to create a standard unless other FPGA Amiga hardware is introduced with more memory. Maybe a better way to query what hardware is available would be useful though.

Quote from: Thomas Richter;787371

Actually, concerning the FPU: Additional FPU registers are again problematic, again due to the exec scheduler. I believe it would be wiser to have the second set of FPU registers, or a specialized vector-FPU available under an additional coprocessor ID. The standard scalar FPU would preserve the legacy interface, and could be saved and restored by exec. Hence, programs and tools for applications that only use the scalar FPU could remain unchanged. If you need more speed, you would engage the "vector FPU" under a new coprocessor ID, and an updated exec scheduler would save and restore its registers. Hence, the incompatibility would only involve programs that actually use the new FPU, and not all programs. See above. I would advise against extending the scalar FPU. If you need more registers, or vector instructions, enable an extended vectorial FPU. This would allow exec to continue using its legacy stack frame if the vectorial FPU is not used, and hence incompatibilities could be minimized.

This is a good idea in theory but there are problems. A new SIMD/vector unit in an FPGA may only support integer operations because of the cost of even single precision floating point. There are several choices:

1) 8 register FPU with integer only SIMD = slow fp performance
2) 8 register FPU and wait for an SIMD with single precision fp = slow fp performance now
3) No FPU with SIMD supporting single precision fp = poor fp compatibility
4) 16 register FPU with integer only SIMD = average fp performance
5) 16 registers FPU and wait for single precision SIMD = average fp performance now

I thought 8 FPU registers were adequate when Gunnar wanted 16. I encoded it and found that it works out very well (unlike adding integer registers). The registers are orthogonal except for FMOVEM but the upper 8 FPU registers can and should be scratch registers, IMO. The 68k FPU would be much more efficient with more scratch registers (saving and restoring extended precision FPU registers is very expensive). Making all 8 new FPU registers scratch registers would cut the number of FPU register saves and restores in half or more. It would also be very efficient for passing arguments to functions in FPU registers which do not need to be preserved. And fp instructions could be interleaved, which could up to double performance when the result of one instruction can't be used in the next instruction. This could allow reasonable performance of wide operations in a slow FPGA and/or a 2nd FPU superscalar unit.

Keeping the FPU extended precision makes more sense with 16 registers because of the register argument passing and reduced register saves and restores. Most compilers waste the 68k FPU extended precision by passing arguments as 64 bits on the stack, which is easy and efficient to improve with 16 FPU registers and a new ABI. Which do you think would cause the least incompatibility?

A) adding 8 FPU registers and patching the exec scheduler
B) reducing FPU precision from 80 bits to 64 bits

I would choose A) above. Most FPU code does not rely on the extended precision but I know that several 68060FPSP algorithms would need fixing (if possible) or new algorithms. The performance advantage of a 64 bit FPU is significantly reduced when extra instructions are needed to retain maximum precision which are not needed with extended precision. I agree that the extra few bits of precision can be very useful too.
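To put a number on the "extra few bits": the 68k extended format carries a 64-bit significand, while a 64-bit double carries 53 bits. The sketch below uses Python's own float (an IEEE double) to show an integer that a double cannot hold exactly but that a 64-bit significand could. The helper `fits_in_significand` is made up for illustration and only checks positive integers; it is not a model of either FPU.

```python
DOUBLE_SIGNIFICAND_BITS = 53    # IEEE 754 double precision
EXTENDED_SIGNIFICAND_BITS = 64  # 68k extended precision


def fits_in_significand(n, bits):
    """True if the non-negative integer n is exactly representable
    with a significand of `bits` bits (exponent range ignored)."""
    while n and n % 2 == 0:
        n //= 2  # trailing zero bits are absorbed by the exponent
    return n.bit_length() <= bits


if __name__ == "__main__":
    n = 2**53 + 1
    print(fits_in_significand(n, DOUBLE_SIGNIFICAND_BITS))
    print(fits_in_significand(n, EXTENDED_SIGNIFICAND_BITS))
    # The double rounds n to its neighbour, so the two floats compare equal.
    print(float(n) == float(2**53))
```
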

Compatibility is very important given the current state of the 68k Amiga but we need to increase performance and plan for the future also. Adding 8 FPU registers looks like a tremendous opportunity to me even if it adds some incompatibility when the extra FPU registers are used.
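The "up to double" interleaving claim can be checked with a toy timing model. This is a sketch under stated assumptions, not a model of any real FPU: one instruction issues per cycle, in order, and a result becomes usable a fixed number of cycles after issue. The register names and the latency of 2 are arbitrary choices for the example.

```python
def cycles(program, latency):
    """Cycle at which the last result is ready.

    program: list of (dest, [sources]) register names, issued in order,
    at most one per cycle; an instruction waits until its sources are ready.
    """
    ready = {}   # register -> cycle its value becomes available
    cycle = 0    # earliest cycle the next instruction may issue
    for dest, sources in program:
        issue = max([cycle] + [ready.get(s, 0) for s in sources])
        ready[dest] = issue + latency
        cycle = issue + 1
    return max(ready.values())


if __name__ == "__main__":
    chain_a = ("fp0", ["fp0"])  # each op depends on the previous fp0 result
    chain_b = ("fp1", ["fp1"])  # an independent chain on another register
    back_to_back = [chain_a] * 4 + [chain_b] * 4
    interleaved = [chain_a, chain_b] * 4
    print(cycles(back_to_back, latency=2))
    print(cycles(interleaved, latency=2))
```

In this toy model the same eight operations take 15 cycles back to back and 9 cycles interleaved, and the ratio approaches 2 as the chains get longer. Interleaving needs more values live at once, which is where the extra scratch registers come in.
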

vxm · « **Reply #342 on:** April 04, 2015, 10:44:43 AM »

So if I understood this post :
1 + 2 always equals 2 + 1 and nothing else.
The instruction set of a cpu is the set of its users.
A quantum 680x0 would be a heresy unless its clock frequency is less than or equal to 7 MHz.
I have a migraine.

psxphill · « **Reply #343 on:** April 04, 2015, 11:21:00 AM »

Quote from: matthey;787396

Compatibility is very important given the current state of the 68k Amiga but we need to increase performance and plan for the future also.

The only reason to rush through suggestions for new instructions is to stamp your name on it for kudos. I'd rather buy something that can run all software at 68060 66 MHz speeds than something that may run at 0 MHz or 300 MHz depending on the software (if it isn't compatible then it's 0 MHz).

We can go out and buy 030 cards relatively cheaply; it's the top-end 060 cards that are in demand, and that is where the biggest market is.

But compatibility doesn't appear to be very important to this project.

guest11527 · « **Reply #344 on:** April 04, 2015, 12:17:14 PM »

Quote from: psxphill;787372

That works as long as you don't care about running old software. But as soon as you open that up for debate, why worry about any old software at all? Just stick a fast x86 in there and run an emulator.

Sorry, I don't get your argument. There are already two incompatible(!) MMU models, the 68030/68851 and the 68040/68060. Even within the same family, subtle differences exist, so a single program already cannot depend on a single set of instructions. Note again that the instructions and the programming logic already differ from MMU to MMU.

Thus, the MMU is "system programming" and "supervisor instruction set", whereas the above integer instructions are "user programming" and "user instruction set". A user program has no reason to access the MMU in the first place. That's the job of the OS, or in the case of the Amiga, of the CPU support library.

The supervisor programming logic is already different from family to family in 68K land, so nothing new here. User-land programming did not change, or rather, only a single break was introduced with the 68020.
