
Author Topic: ADOOM on A600 running 22-35 FPS  (Read 55983 times)


Offline matthey

  • Hero Member
  • *****
  • Join Date: Aug 2007
  • Posts: 1294
Re: ADOOM on A600 running 22-35 FPS
« on: February 12, 2015, 03:17:06 PM »
Quote from: Thomas Richter;783714
While I'm all for the Phoenix, this is an "improvement" I do not agree with. There are reasons for these "restrictions" all along, and Mot made a choice with these restrictions. The reason why d(PC) and d(PC,Xn) is read-only is a good one: Everything that these two ea's can address is in the "text segment", also known as "code". "code is not supposed to be modified", this is what Motorola expresses here clearly. If you want data, put that into a data segment and address it either absolute or relative to a segment pointer (aka a4).

One should not forget that the Motorola FPUs have an external address space that is larger than 32 bit. It is 32+3 bit, 32 address bits plus three "function code bits". Whether these have been used in the Amiga is another question, but as far as the CPU architecture is concerned, this is very consistent with the "restrictions" of the addressing modes. d(PC) is an "instruction space access" and hence read-only, as all instruction space accesses. d(an) is a data access, and hence has a function code identifying data accesses, and hence is read-write.

I would pretty much prefer to keep this clean model, no matter whether it is actually enforced in the Amiga or not.


Were you not invited to the Apollo Forum before the ISA decisions were made? Your name was mentioned in the ISA discussions and someone even put words in your mouth in your absence. If you are worried about a minor issue like "PC"-relative writes, then maybe you should have a look at the lack of orthogonality needed to add more registers, of which there are now 4 types, one of which is called A8. I was willing to compromise on "PC"-relative writes, as I consider it a minor issue: the restriction is inadequate for code protection, and allowing writes would help code density only to a very minor degree (I did not draw the same conclusions from analyzing disassembled code, not that it is the first time, or that I even bother mentioning such things any more). Maybe you would have been given enough respect and been able to argue your issues in the ISA committee to make a difference, but I doubt it. The committee of one has already decided the future of the 68k. I'm sorry you weren't there and missed your opportunity. Oh wait, never mind, this is somebody's toy project and the committee dissolved for lack of "yes" men.
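
For readers following along, the restriction being discussed looks like this in assembly (a minimal sketch; "table" is just a placeholder label):

Code: [Select]
	move.l	table(pc),d0	; d(PC) read: legal on every 68k
;	move.l	d0,table(pc)	; d(PC) as a destination is rejected on the classic 68k
; the usual workaround materializes the address first:
	lea	table(pc),a0	; PC-relative address into an address register
	move.l	d0,(a0)		; the write then goes out as a normal data-space access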
 

Offline matthey

Re: ADOOM on A600 running 22-35 FPS
« Reply #1 on: February 12, 2015, 06:00:28 PM »
Quote from: biggun;783739
Matthey,

the amount of "voting" to the project is in level to their work contributed.
There are some people who have contributed some hundred thousand lines and some which have not.


Then ThoR would have had no vote either and I was correct about it being someone's toy project.

Quote from: OlafS3;783740
I am a little disappointed of your behavior


I have never been anyone's "yes" man; I have my own opinions, which I have not hidden. I hope the Phoenix project is successful, but I believe the "enhancements" miss the mark as far as being adopted and used as a 68k Amiga standard (something you want, Olaf). Also, better ColdFire support would have made adoption and support easier outside the Amiga community and made an ASIC for combined embedded use more likely. The enhancements are driven by one person's insatiable desire for more registers and performance at any cost, including the cost to future planning. Yes, he has done most of the work, so perhaps it is his decision for his pet project. No one is stopping him or being rude, but it does leave open Karlos's pondering: "I've occasionally wondered what the 680x0 might have become if it had achieved the same sort of popularity as the x86." We may never know what a professional company would have done instead of a pet project driven by one man's desires. I shouldn't complain, as at least we get a faster 68020-compatible CPU, but neither can I sell enhancements I do not believe in.
 

Offline matthey

Re: ADOOM on A600 running 22-35 FPS
« Reply #2 on: February 12, 2015, 07:57:35 PM »
Quote from: wawrzon;783749
@matt

Two matters that make things a bit less scary:
1. Fpga isnt set in stone hardware, it might be reprogrammed and adjusted according to the reaction it meets. Reacting to the feedback would be wise on part of developers. But there always will be different opinions, thats sure.


Yes, this is true. Phil "meynaf" also did not like the direction of the new ISA but challenged for it to be created as a test. It is possible to learn from doing things the hard way. The few people who write code for the new ISA would be disappointed if the ISA changed significantly. The new ISA is unlikely to be (even partially) adopted in the TG68 or other 68k FPGA cores, or eventually in UAE (I believe Toni's position would change if multiple FPGA hardware used the same ISA). It is even less likely that a non-standard, complex 68k ISA for a single FPGA CPU would gain widespread support in compilers. A core which is more compatible with both the 68k and ColdFire is more likely to interest embedded developers. Enough money could probably buy a custom ISA, and then we could have a thousand variations of an ISA like ARM, but this is what I hoped to avoid. People thought I was too early in trying to push for the creation of a standardized ISA and trying to get input from others. I tried to create a standards committee/group by bringing people into our discussions, including inviting ThoR, Frank Wille and Dave Alsup (of Innovasic). I would have loved to bring in people like Toni Wilen, Jason McMullan, Volker Barthelmann, Kalms and maybe even a Karlos, who understand an ISA from different viewpoints. I guess people are too busy or believe the Amiga is too dead to care anymore. At least Gunnar is doing something.

Quote from: wawrzon;783749

2. Any extensions beyond what the legacy 68k provides will not have much effect until compiler backends will assimilate these extensions. Which will not happen any soon, giving time to reconsider. The legacy instruction set and its execution efficiency is what counts atm.


ISA decisions make a big difference in how easily and quickly ISA changes can be adopted. ColdFire enhancements are the easiest to adopt because they already exist and only need to be switched on in a compiler backend and in peephole-optimizing assemblers. I bet Frank Wille could have ColdFire support in vasm working in a few days, already making a noticeable difference in shrinking program sizes. ColdFire support in the compiler backend could take a few weeks to add and test, as it is a more delicate process. Taking advantage of the current ISA with more registers in the backend would likely take many months, and bugs could turn up for years. Few developers are knowledgeable and familiar enough with a compiler to add this kind of support. Are they going to dedicate this kind of time to a non-standard in an FPGA CPU sold in the hundreds or low thousands at most, when they could be improving a compiler target with tens of thousands of hard processors? I don't think so. Phoenix is not going to immediately set the world on fire. IMO, it's better to have an easy standard to adopt with a few benefits and incremental improvements than a core-specific non-standard with theoretical high performance that will never be fully utilized by compilers. Splits seem to be the Amiga way though. I'm tired of arguing and trying to create something better. Gunnar did make the right decision to add better 68020 compatibility (all addressing modes without trapping) and we do have this as a base, which is the most important thing. We are moving past the 68060 in performance with this too. I should be thankful, as we need new 68k hardware to revitalize the Amiga. I would have liked to create something like a cross between an Amiga Raspberry Pi, a Natami and a CD32+, but there is not enough cooperation, at least not yet.
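
To illustrate the kind of peephole that could be applied once the ColdFire instructions are available (a sketch; whether vasm performs exactly this transformation is an assumption, and the register choice is arbitrary):

Code: [Select]
; plain 68k: load a signed byte and widen it to 32 bits (3 instructions, 6 bytes)
	move.b	(a0),d0
	ext.w	d0
	ext.l	d0
; with the ColdFire MVS available, one 2-byte instruction does the same work
	mvs.b	(a0),d0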
 

Offline matthey

Re: ADOOM on A600 running 22-35 FPS
« Reply #3 on: February 12, 2015, 08:52:43 PM »
Quote from: OlafS3;783759
I wrote that not because you disagreed or because of your opinion but because I dislike it when internal discussion are published in public. Internal should stay internal. That is my opinion.


But the problem was that the ISA discussion was internal and the group too small. There were only 3 people with enough understanding to even have an opinion. Meynaf and I have somewhat similar (overlapping) experience and perspectives. Gunnar's perspective is very different. The group was not diversified or large enough for a consensus to even matter. Gunnar opened up the Apollo forum to the public recently, so I don't even know what is considered internal discussion anymore.

Quote from: OlafS3;783759

To topic... new features always have the problem that they are not supported as long compilers are not adapted and software not recompiled. But a superfast 68020 compatible processor would be a lot already.


Phoenix may help the 68k targets of compilers get some attention, but I do not expect any major compiler support beyond that. Do you think the 68020 ISA is modern enough to compete with newer ISAs? Do you think it's modern enough to attract developers?
 

Offline matthey

Re: ADOOM on A600 running 22-35 FPS
« Reply #4 on: February 12, 2015, 09:32:08 PM »
Quote from: OlafS3;783774
compete with modern ISA´s? You mean X64 or ARM? That is completely unrealistic. How much money and developers are invested in that ISAs and how much time you think Gunnar and the few others can invest in it? We need not to compete with ARM but the best solution for our platform. A good FPGA based device at affordable price has a market but of course not a mass-market at the level of ARM or X64. Or what is your idea? What would the "market" you see?


The 68020 ISA is from the late '80s, and it is what we are stuck with if the new ISA is not adopted. The 68k probably doesn't need to compete with x86_64 or ARM (they have more baggage as well), but it could use some modernization. I was looking for interest in the embedded "market", where the small memory footprint is an advantage, but that was before Gunnar dropped much of the ColdFire compatibility.

Quote from: OlafS3;783774

And we have a lot of compilers already (partly even with source code). We need best support for them.


I have submitted changes/fixes to the 68k backend of vbcc, which I expect is one of the simplest compiler backends. While I was generally successful in getting my changes to work, I was not at all sure about rare side effects my code could have caused. It would take months of studying the code and working with it before I would feel comfortable making changes which were not reviewed by Volker. Frank Wille is a more experienced and better programmer than me, and he does not make any major changes to vbcc without submitting them to Volker. Compilers are not rocket science, but they have intricate, sensitive code that requires knowledge and experience to change, even with sources.
 

Offline matthey

Re: ADOOM on A600 running 22-35 FPS
« Reply #5 on: February 13, 2015, 01:04:30 AM »
Quote from: Karlos;783788
If you look at the evolution of the 680x0 while it was in development, the 68020/882 represents the peak of (non-privileged) 68K instruction complexity. The 040 and 060 focused on streamlining it so that only the most used operations were implemented in silicon.


Yes, this is another reason that the 68020+6888x is a poor ISA base target. When we talk of this target we are talking about a simplified version of it, but what compilers generate may not be. Motorola went overboard with increasing the complexity of the 68020 and 6888x ISA, but then they chopped too much. The removal of FINT/FINTRZ in the 68040 was a *huge* mistake, and the removal of the 64 bit integer MUL/DIV from the 68060 was also a poor decision. Modern 32 bit ISAs have these instructions and more. Motorola was attracted to the simplicity of RISC and tried to turn the 68k into an old-school simplified RISC processor. Most of the old-school RISC processors are gone now (except MIPS, and some would say SPARC), while we are left with PPC and ARMv8, which have larger and in some ways more complex instruction sets than the 68k (more operands, for example).
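
To make the FINT/FINTRZ point concrete (a sketch only; the workaround assumes the FPCR rounding mode has already been set to round-toward-zero):

Code: [Select]
; float -> truncated value, as in a C (int)x cast
	fintrz.x	fp0,fp1		; one instruction on the 6888x and 68060, but it
					; traps to software emulation on the 68040
; a common 68040-friendly workaround for values that fit in 32 bits:
	fmove.l	fp0,d0			; convert to a long integer per the FPCR rounding mode
	fmove.l	d0,fp1			; and back to floating point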

I didn't want to bring back all the complexity and mistakes that Motorola removed. The evaluation ISA looked something like this:

http://www.heywheel.com/matthey/Amiga/68kF_PRM.pdf

It probably needs some slimming and tuning, but there are some very good ideas in it as well. Gunnar decided that more registers were the holy grail of performance and chose to sacrifice nearly everything else. It is possible to increase the number of FPU registers in an orthogonal way, as demonstrated in the evaluation ISA above, but not so for new integer registers. Compilers already have problems with the An/Dn split, but now we have 8 non-orthogonal En registers and an A8 register, encoded in whatever way could be found. The new Phoenix ISA also gets new planar gfx instructions. I would give a link to the new Phoenix ISA but I haven't seen any documentation yet. That doesn't stop it from being the "standard", as I have seen nothing indicating it is a trial or evaluation ISA.

Quote from: Karlos;783788

Now, if you compile code for 020+, a good compiler avoids emitting anything needing traps on 040/060 (of course it can and does happen).


You are correct that it does happen: most compilers will generate trapped 6888x FPU instructions when compiling for the 68040 and 68060 (though most avoid trapped integer instructions). Vbcc should not generate any trapped 6888x FPU instructions when compiling for the 68060 and linking with -lm060. If it does, or you have any other problems with the direct FPU libs, then let me know and I'll fix it (details here):

http://eab.abime.net/showthread.php?t=74692
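
Roughly what the trapping difference means in generated code (a sketch; the library stub name is made up for illustration):

Code: [Select]
; what a 68020/6888x build may contain
	fsin.x	fp0			; unimplemented on the 68040/68060: it traps and
					; gets emulated (e.g. by the 68060.library)
; what a 68060 build linked with the direct math libs should use instead
	jsr	_m060_sin(pc)		; hypothetical stub built only from FPU instructions
					; the 68060 implements in hardware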

Quote from: Karlos;783788

In my humble opinion then, if the FPGA implementation covers the 68040/60 operations (not necessarily every 020/882 operation) then it'll be great just as is. Any extra additions are niceties as it will take time to implement assembler/compiler/debugger tools for them. It's really up to the guys making the hardware. And fair play to them.


Sure, it is up to the hardware guys. It is "fair play" for them to make their own standards, sell out, move on, fail and quit also. Business is business and hope is fleeting.
 

Offline matthey

Re: ADOOM on A600 running 22-35 FPS
« Reply #6 on: February 13, 2015, 05:52:43 PM »
Quote from: ElPolloDiabl;783802
That might be fine for a CPU, but FPGAs are designed differently and may not get an advantage from CPU improvements.

Hard processors are designed in FPGAs first (some of the Amiga custom chips were designed in FPGAs too). There are some differences, but functionally they are very similar. Some ISA improvements only make sense in a hard CPU or ASIC because of the limitations of muxes and clock speeds in an FPGA. Most simple changes which reduce the number of instructions, reduce code size, add new instructions with new functionality, add new addressing modes, add registers, etc. will benefit the CPU. Adding complexity (usually too many muxes) in some critical areas of an FPGA CPU would lower the core clock speed.

Quote from: OlafS3;783813
You have not yet answered what you see as "market". And perhaps you could be wrong too, there can be more than one true. If I understood you right you preferred different choices because of it would be better for ASIC implementations whereas Gunnars design is better for FPGA (I hope I recalled it correctly) and you wanted to sell the core outside to other companies. To say it clear if today somebody needs power he uses X64 and if somebody wants something for mobile/embedded he will take ARM. There is not much room for other ISAs so to make a design that is not perfect running on FPGA (=fastest possible) just because it might perhaps be used outside is simply waste of resources. We need BEST solution for the Amiga platform. There are years of development in the project and I trust Gunnar that they made wise decisions when they worked at it.

I see a potential market for the Amiga, but it needs to be built up and probably requires investment to get it back on its feet (Trevor@A-EON understands this but misses that the 68k has more potential for the masses than PPC). Piggybacking on and cooperating with embedded projects and businesses could reduce the investment needed (economies of scale are especially brutal for low-production hardware). Yes, I did contact some embedded businesses. Yes, it is tough to compete against mature ARM processors, and ARM is the best solution for super-low-power embedded targets. An enhanced 68k has clear advantages over ARM with Thumb-2 for more powerful embedded (and computing) uses, which include (see the sketch after the list):

1) easier to use
2) lower memory requirements (better code density)
3) more powerful addressing modes
4) stronger in memory without OoO
5) stronger single core performance without OoO
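
A small sketch of points 2 and 3 above (the load/store side is pseudo-code in comments):

Code: [Select]
; 68k: operate on memory directly in one 4-byte instruction
	add.l	8(a0),d0		; fetch from 8(a0) and add, one step
; a load/store RISC needs a separate load and add for the same work,
; and an in-order core can stall between the dependent pair:
;	ldr	r1,[r0,#8]
;	add	r2,r2,r1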

Adding OoO to ARM substantially increases power consumption, perhaps to the level that an in-order superscalar CPU would use. Phoenix (and even the 68060, which is also in-order superscalar) outperforms practically all non-OoO ARM processors in integer performance, clock for clock. This should scale to higher clock speeds (most code is executed from the cache, where the 68k can fit a little more code). Gunnar can't see an ASIC in the future, so he tries to adapt Phoenix (the ISA and internal FPGA optimizations) for maximum speed at all costs. He adds registers, which does increase theoretical performance but requires complex changes to compilers that are unlikely to be implemented. Problems with his ISA changes:

1) haphazard, non-orthogonal register additions make the 68k less easy to use and less consistent
2) unlikely to be implemented in a compiler backend
3) loses the advantages of simpler enhancements that could be implemented quickly in compilers
4) sacrificing ColdFire compatibility limits opportunities for embedded use
5) less likely to be made into an ASIC
6) unlikely to be adopted as an Amiga or 68k standard in other cores and emulators
7) radical changes are less likely to be accepted by the conservative, 68k-loving community

Quote from: OlafS3;783813
BTW as I said most developers do not care about ISA details or certain instructions, that is a much too technical discussions not related to reality. How many 68k hardcore coders are still there? I know Novacoder and then I must think. Propably one hand is enough to count them. Even hobby developers are using compilers, that is even more true for commercial developers. Most people do not hack on the hardware anymore or program in asm. I had contacted former amiga developers because I thought I might create some interest again. Unfortunately that was not the case. So we have to build upon what we have, existing software and existing compilers. Anything else is unrealistic.

High level programmers don't deal with the abstracted ISA, but it is important to how optimized the compiler's output is and how much code it produces. The key is making the compiler programmer's life easier, not more difficult, but bone-headed hardware guys worry more about theoretical performance and keep repeating the same mistakes. The ISA shouldn't even be set in stone until the compiler programmers (and assembly programmers) have attempted to use what is new. You are a programmer, so maybe you need to try some low level programming to appreciate the ISA. The source for vbcc is here:

http://www.ibaug.de/vbcc/vbcc.tar.gz

The 68k backend is in machines/m68k. Documentation for writing a backend is in the vbcc manual:

http://www.ibaug.de/vbcc/doc/vbcc.pdf

This is probably the simplest and easiest C backend you will find. Once you have some experience making use of Gunnar's "extra registers", you can help with the more challenging GCC and LLVM backends. I'm sure you will have a different opinion about the ISA after a few months and probably before you make any significant changes. Now if we could just get car engineers to be the mechanics for the cars they design for a few months then we would have much better designed cars which are easier to work on too ;).
« Last Edit: February 13, 2015, 05:57:31 PM by matthey »
 

Offline matthey

Re: ADOOM on A600 running 22-35 FPS
« Reply #7 on: February 14, 2015, 11:56:47 AM »
Quote from: Karlos;784002

Whether the 68000 is 16 or 32-bit simply depends on which decade's thinking you look at it from.


The 68000 is a 16/32 bit hybrid (more 16 bit) and the 68020 is fully 32 bit in my eyes. What is important is that the ISA was forward thinking enough to allow for 32 bit programs from the beginning. Programs >64kB become less optimal with the 68000 ISA, but programs that were 32 bit clean with their pointers still work today. No memory bank switching or base registers are needed, unlike on most 16 bit processors. It's software that matters, and good ISAs ;).
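
The ">64kB" point is mostly about displacement sizes (a sketch; the labels are placeholders):

Code: [Select]
; 68000: PC-relative displacements are limited to 16 bits (about +/-32KB)
	bra.w	far_label		; fine while the program stays small
	move.l	far_data(pc),d0		; d16(PC) has the same 16-bit reach
; 68020+: 32-bit branch displacements and (bd32,pc) addressing remove the limit
	bra.l	far_label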
 

Offline matthey

Re: ADOOM on A600 running 22-35 FPS
« Reply #8 on: February 14, 2015, 01:52:44 PM »
Quote from: Karlos;784015
Quite. And this is why I don't share what I take to be your pessimism (sorry if I've misinterpreted your objection) to Gunnar's augmentation to the 680x0.

The reason being that if he adds super awesome vector unit X and relaxes various existing restrictions, it won't make any difference to the corpus of existing software. As long as his implementation is compatible with existing object code and is faster than existing 68K solutions then I believe he's got the formula right. Most folks, myself included, want an affordable, performant accelerator card. I missed out on a good 68060 card when they were being sold originally and now they are like rocking horse poop with a price to match. Anything faster than my 040 is a result for me.

I supported doubling the number of FPU registers (I came up with 99% compatible 6888x encodings), as it can be done in a consistent and orthogonal way. Having more than 2 scratch FPU registers is a big advantage, since no FMOVEM is needed with 8 new scratch FPU registers (an updated ABI would be advantageous though). I support adding a vector unit with many vector registers. I don't support combining the integer unit(s) and vector units while adding non-orthogonal and larger integer/vector registers in whatever encoding hole can be found, with whatever limitations come from what can't be supported. Maximum FPGA optimization involves combining units, but this is a bad idea, as it becomes more complex and an ASIC becomes less likely. The ISA would only be usable by ultra-optimized 68k-like FPGA cores (of which there will be only one), and the 68k's ease of use will become like a DSP with all the funky registers (take a look at the StarCore DSP ISA for example). I would be very surprised if a compiler tried to take advantage of the extra vector/integer hybrid registers, and 68k assembly fans are likely to turn up their noses also. Gunnar's new ISA is like trying to force a square peg through a round hole, but he has decided that the square peg is better and that he will have it at all costs.
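
The scratch-register point in a small sketch, assuming the classic 68k ABI where only FP0/FP1 are scratch:

Code: [Select]
; with only FP0/FP1 as scratch, FP-heavy routines pay for an FMOVEM pair
	fmovem.x	fp2-fp7,-(sp)	; save the call-preserved FPU registers
;	... work needing more than two FP temporaries ...
	fmovem.x	(sp)+,fp2-fp7	; restore them before returning
; with 8 extra scratch FPU registers (and an updated ABI), the save/restore
; pair above can often be dropped entirely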

Quote from: Karlos;784015
That said, I believe you are right to want some steering on any changes to the ISA. I am optimistic that this can happen simply because anything new requires software to be written for it. And unless he aims to write it himself then  cooperation with software guys is inevitable. I'm sure he wants any augmentation to be genuinely useful

Sure. Change and cooperation happen quickly in the Amiga realm. Maybe he will give up on his ISA in a few years after no compilers implement it but then he is mighty proud of it. Maybe Hyperion will eventually see the light of cooperating with the 68k Amiga masses before they go broke too.
« Last Edit: February 14, 2015, 02:16:00 PM by matthey »
 

Offline matthey

Re: ADOOM on A600 running 22-35 FPS
« Reply #9 on: February 14, 2015, 02:26:43 PM »
Quote from: biggun;784020

You are aware that the people who developed the APOLLO work professionally as FPGA and CPU designers, don't you?

You know that some of IBM's fastest accelerators and latest POWER chips were designed by those people.


Your time would be better spent writing up your ISA and submitting it to the GCC and LLVM maintainers along with your credentials. Olaf is the only one here who might make a backend with your ISA and he already knows your credentials.
 

Offline matthey

Re: ADOOM on A600 running 22-35 FPS
« Reply #10 on: February 15, 2015, 12:26:27 AM »
Quote from: ElPolloDiabl;784023
@ Matthey
Could you tell us why you need an ASIC so badly? There is only one coldfire accelerator card for an Atari.

Seriously a $50 ras pi, must be a better choice for embedded systems. I don't feel a sudden need to control something from AmigaOS

The Raspberry Pi is $50 because it is an ASIC and not an fpga. A $50 Amiga Pi using an ASIC could:

1) be easier to program (embedded applications often ditch the OS, so this is important)
2) use less memory and need smaller caches
3) have stronger memory performance and single core performance than the Raspberry Pi

An FPGA CPU can't compete against a hard (ASIC) CPU in performance or price. The ColdFire has nothing to do with an ASIC, other than that an ASIC needs to be sold in quantity to reduce the price and ColdFire support would open up the embedded market more. Most of the ColdFire enhancements are good for the 68k as well, improving performance and code density, which are especially valuable in embedded applications. They would also be good for emulation, as the functionality is available in many modern processors (sign and zero extension, endian conversion, etc.). ColdFire support is already available in most shared 68k/ColdFire compiler backends, so it would be very easy to implement. Atari 68k/ColdFire, the TG68 and WinUAE could possibly all adopt one unifying 68k standard, which I don't think will happen with Gunnar's ISA.
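
For example, the endian conversion mentioned above (a sketch; the ColdFire mnemonic is quoted from memory, so treat it as an assumption):

Code: [Select]
; reverse the byte order of the longword in d0 on a classic 68k
	rol.w	#8,d0
	swap	d0
	rol.w	#8,d0
; ColdFire (and an enhanced 68k) can do it in a single instruction:
;	byterev	d0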

Quote from: cunnpole;784063
I actually don't doubt any of that and also think talk of ASIC or any other future step is premature. This is already likely to turn the amiga upgrade market upside down. When the board become available for all systems then we'll be able to see where to go next

The FPGA needs to improve before an ASIC is viable, but the decisions made today affect how easy it would be to make an ASIC in the future. An ISA that is not well received, and that is a non-standard outside the Amiga community, would make an ASIC highly unlikely. Too much internal FPGA optimization, like combining units, could also make the source code more difficult to adapt to an ASIC.

Quote from: ChaosLord;784088
Ok.  I would like to read the instruction manual for the new MiniApollo that you have crammed into Majsta's board.  Where is it at?

Does it contain any new instructions that I can use to produce the World's Best 2D Strategy Board Game(tm) ?

Have you left away any instructions that you are saving for the rumored future A1200 Apollo card?

I would like to see the ISA documentation and encoding maps also. I would like knowledgeable people to take a look at it and have a discussion about it.


Quote from: ChaosLord;784097
What is wrong with having vector registers added?

Nothing if they are in a vector unit. Let's ask ThoR what he thinks about overlaying 64 128 bit wide vector registers on the integer register file.

Quote from: ChaosLord;784097
I am trying to understand what is the problem with the new ISA.

Could you give me some example instructions that are critical and should be added, but Gunnar banned them?

Gunnar didn't ban anything. He used ColdFire encoding space to add more integer registers which he unilaterally decided was more important than anything else.

Quote from: ChaosLord;784097
Did he add some good new instructions but chose stupid encodings for them?

He has some instructions which I think would be acceptable and some that are questionable to add to a standardized ISA (like planar gfx instructions which may be ok in the core or as optional).

Quote from: ChaosLord;784097
Remember I have no idea what has been going on with this.  The last thing I remember, I was telling Majsta what a good job he was doing in trying to cook up an FPGA 680x0 CPU (It wasn't very fast, he had just started) and now all of a sudden it seems Gunnar has taken over and Doom is running at 22fps on A600.

Let's do a little time warp back to a time you remember. We go back past when Hyperion was bankrupt... back when the Natami was at its peak, generating over 300,000 hits in a single thread while Hyperion was selling a few hundred copies of AmigaOS 4 for the PPC... back when Gunnar was still part of Natami... OK. Here we start, in September of 2010. Gunnar was working on the Apollo/Natami core as a Natami Team member and brainstorming a new ISA. You were a Natami moderator then, TCL, and very active. Gunnar suggested adding more registers back then, when some big Amiga names were on the Natami forum. The thread follows:

http://www.natami.net/knowledge.php?b=2&note=26237

There were many suggestions about how to add more integer registers, but the developers' decision on more registers went something like this:

Gunnar von Boehn: yes, yes, yes
Ceti 331: yes, yes

Deep Sub Micron (Jens): no, maybe
Morgan Johansson: maybe
Claudio Wieland: maybe
Steve Thomas: maybe
Megol: maybe

ThoR: maybe, no
Phil "meynaf" G.: maybe, no
Cesare Di Mauro: no, no
Samuel D Crow: maybe, no
Marcel Verdaasdonk: maybe, no
Matt Hey: maybe, no

You were there, TCL, but you seemed to be moderating and didn't express an opinion that I could determine. By my count, 2 developers wanted it and 6 thought it was not a good idea. Note that Deep Sub Micron is one of the current Apollo developers; he could have been mildly in the "no" category, but he still tried to come up with a workable solution to add more integer registers.

A 68k+ColdFire ISA was then developed, with me documenting some of the better ideas, and it was generally accepted as the Natami ISA. This is basically the 68kF ISA along with some ideas of my own and others since then. Gunnar left the Natami Team for reasons unknown (to some) and created the Apollo Team, to which Meynaf and I were invited. Gunnar once again pushed adding registers, but Meynaf and I didn't like the new idea, preferring the 68k+ColdFire ideas. Jens and Chris of the big 3 Apollo developers gave no input, nor did anyone else have a major opinion. Gunnar had free FPGA memory blocks to add more integer registers and determined that more registers were necessary for performance. Gunnar tried to get us to help encode and document this crazy ISA, but we declined. It is true that I don't know the details of his ISA, as it is less documented than the Natami ISA, where there were encoding maps showing the 68k+ColdFire ISA. Rune Stensland even had enough info on the 68k+ColdFire ISA that he started adding support to the Asm Pro assembler:

http://www.natami.net/knowledge.php?b=6&note=33870

So I support much of what Gunnar has done with Phoenix, but people need to realize that it is his pet/toy project. I'm not sure anyone with more Amiga clout than Meynaf or I could make a difference, but I can say I tried. Maybe enough money could make a difference, but I can't invest in his toy, only in a community-wide effort that is bigger than him. That is the end of the story, and maybe the end of the Amiga and 68k as well, if some of the other important Amiga people don't change and begin cooperating very soon.
« Last Edit: February 15, 2015, 12:32:08 AM by matthey »
 

Offline matthey

Re: ADOOM on A600 running 22-35 FPS
« Reply #11 on: February 15, 2015, 02:39:38 AM »
Quote from: ElPolloDiabl;784199
Adding more registers is fine with me. Is it possible to maintain coldfire compatibility as well?


Almost everyone would like more registers, but what has to be given up to get them?

RISC processors give up powerful addressing modes and working directly in memory. They need several instructions for a simple load/operate/store sequence, and bubbles are created when the code is not scheduled perfectly.

CISC requires more encoding space, as the addressing mode as well as the register number is part of the instruction encoding. Some CISC processors like the x86/x86_64 use longer variable-length encodings, but this makes the code bigger (x86_64 programs are commonly 20%-40% larger than x86, and the caches have to be larger too). There is some free space in the 68k encoding map, which includes the ColdFire MVS (Move with Sign extend) and MVZ (Move with Zero extend), but Gunnar's ISA would rather use this encoding for MOVEQ #,En so that his new registers don't enlarge the code. These are the most common new ColdFire instructions, and without them it is a joke to call the processor CF compatible. The only other places big enough for what Gunnar wants to do are A-line, which is reserved on the 68k and would affect 68k compatibility for systems like the Atari ST, Mac 68k, Sega Genesis, Neo-Geo, x68000, etc., where it is commonly used for function calls to the OS, and F-line, which is usually reserved for co-processors and where an unknown amount of space would be needed in the future for a vector unit and possibly more. All the registers Gunnar wants to add eat up encoding space, limiting other options. It would be possible to add a prefix encoding word (Megol pushed for this), which would give more consistent and orthogonal encodings, but at the cost of larger instructions when using the new instructions and more complexity in the core. Gunnar rejected this proposal. There are many trade-offs, as you can see.

Quote from: ElPolloDiabl;784199

How about you compile for compatible mode or Vampire core on the software side.


The compatible mode would be the 68020 ISA from the late '80s. Amiga users are probably happy with a retro core like this, but it would be a handicap when trying to sell into newer applications like embedded. With the ColdFire extensions and a few other minor ISA changes, we could be mostly ColdFire compatible at the instruction level, and I believe we could achieve 5%-15% better code density (smaller code). Gunnar's major ISA changes are much riskier as far as acceptance and compatibility, especially for an FPGA processor with a tiny market, limited documentation and a lot of work needed in compilers.
 

Offline matthey

Re: ADOOM on A600 running 22-35 FPS
« Reply #12 on: February 15, 2015, 07:52:08 PM »
Quote from: ChaosLord;784216
Is there something about those [ColdFire] instructions that just totally sucks?

If I would have had those coldfire instructions available to me in the 1980s, 1990s and 2000s then I would have used them and had faster code.  But I have put no thought into them at all for around 10 years.

Does Gunnar have a logical reason to murder the coldfire instructions?  Like they were stupid instructions?  Or they were to slow?  Or they bog down his ALU?  Or they consume to many read/write ports?  Or ?

The only thing wrong with the ColdFire instructions is the ColdFire naming convention, which is not friendly, consistent or 68k-like. The solution I came up with is 68k names with ColdFire aliases:

Code: [Select]
mvs.b -> sxtb.l (Sign eXtend Byte . Long)
mvs.w -> sxtw.l  (Sign eXtend Word . Long)
mvz.b -> zxtb.l (Zero eXtend Byte . Long)
mvz.w -> zxtw.l  (Zero eXtend Word . Long)

The 68k already has EXTB.L, so SXT[B/W].L and ZXT[B/W].L are only a small variation. There is some other minor massaging here and there. These instructions are quite valuable: they reduce the number of instructions, improve code density, allow for ColdFire compatibility, make x86 emulation easier, allow peephole optimizations in assemblers like vasm, and are easy to implement in compiler backends where they already exist for the ColdFire. Gunnar doesn't have anything against them, but he needed the encoding space, and more, to encode his non-orthogonal En registers (originally he called them D8-D15, which is even worse: a bunch of non-orthogonal registers as data registers; A8 is bad enough).

Quote from: biggun;784223
The A-line extension does NOT clash with ATARI or APPLE or any other old A-line usage.
Matt, what you say is just technically not true. You should know this better.

I didn't say you used A-line, but rather that it was one of your other options, which you most certainly were considering (and, from your reaction, may have used). The 68k MacOS uses A-line for OS function calls, as can clearly be seen in MacOS disassemblies:

Code: [Select]
  196: 4268 0004      'Bh..'           CLR     4(A0)
   19A: 4228 0006      'B(..'           CLR.B   6(A0)
   19E: 4228 0007      'B(..'           CLR.B   7(A0)
   1A2: 43FA 036E      1000512          LEA     data2,A1    ; len= 1
   1A6: 45E8 0009      'E...'           LEA     9(A0),A2
   1AA: 4EBA 0392      100053E          JSR     proc2
   1AE: 43FA 03A2      1000552          LEA     data4,A1    ; 'Multi'
   1B2: 4EBA 038A      100053E          JSR     proc2
   1B6: 43FA 03AC      1000564          LEA     data7,A1    ; len= 2
   1BA: 4EBA 0382      100053E          JSR     proc2
   1BE: 4A6E FFEC      200FFEC          TST     vab_2(A6)
   1C2: 6756           100021A          BEQ.S   lab_13
   1C4: 4FEF FFFE      'O...'           LEA     -2(A7),A7
   1C8: 2F2E FFEE      200FFEE          PUSH.L  vab_3(A6)
   1CC: 4EBA 2C88      1002E56          JSR     proc29
   1D0: 301F           '0.'             POP     D0
   1D2: 6646           100021A          BNE.S   lab_13
   1D4: 4FEF FFCE      'O...'           LEA     -50(A7),A7
   1D8: 204F           ' O'             MOVEA.L A7,A0
   1DA: 317C FFF6 0018 '1|....'         MOVE    #$FFF6,ioCRefNum(A0)
   1E0: 216E FFEE 001E 200FFEE          MOVE.L  vab_3(A6),ioSEBlkPtr(A0)
   1E6: 317C 00FC 001A '1|....'         MOVE    #252,CSCode(A0)
   1EC: A004           '..'             _Control ; (A0|IOPB:ParamBlockRec):D0\OSErr
   1EE: 4FEF 0032      'O..2'           LEA     50(A7),A7
   1F2: 206E FFEE      200FFEE          MOVEA.L vab_3(A6),A0
   1F6: A01F           '..'             _DisposPtr ; (A0/p:Ptr)
   1F8: 486D FFFC           -4          PEA     glob1(A5)
   1FC: A86E           '.n'             _InitGraf ; (globalPtr:Ptr)
   1FE: A8FE           '..'             _InitFonts  
   200: A912           '..'             _InitWindows  
   202: A9CC           '..'             _TeInit  
   204: 42A7           'B.'             CLR.L   -(A7)
   206: A97B           '.{'             _InitDialogs ; (resumeProc:ProcPtr)
   208: A850           '.P'             _InitCursor  
   20A: 42B8 0A6C         $A6C          CLR.L   DeskHook
   20E: 487A 0302      1000512          PEA     data2       ; len= 1
   212: 4EBA 3198      10033AC          JSR     PUTREGISTERDLOG
   216: 4EFA 0316      100052E          JMP     com_2
   21A: 4227           'B''    lab_13   CLR.B   -(A7)
   21C: A99B           '..'             _SetResLoad ; (AutoLoad:BOOLEAN)
   21E: 42A7           'B.'             CLR.L   -(A7)
   220: 2F3C 4452 5652 '/<DRVR'         PUSH.L  #'DRVR'
   226: 487A 2156      100237E          PEA     data35      ; len= 12
   22A: A9A1           '..'             _GetNamedResource ; (theType:ResType; name:Str255):Handle
   22C: 1F3C 0001      '.<..'           PUSH.B  #1
   230: A99B           '..'             _SetResLoad ; (AutoLoad:BOOLEAN)

The Atari and Sega Genesis may only use TRAP, but the x68000 looks like it uses F-line for some OS calls, which the 68k ISA documentation does not sanction the way it does A-line. I don't know how the Neo-Geo calls OS functions. Of course, ColdFire does not reserve A-line (it is not a 68k) and placed MOV3Q there, which is one of the few 68k/ColdFire incompatibilities.

Quote from: biggun;784223
Also you did say that the FPGA vector implementation would prevent an ASIC version of the core,
as the register file the way Apollo does it would not be good for ASICs.
Again that is technically not true.

Technically it should be possible to make an ASIC with a combined vector and integer unit, but nobody is going to make an ASIC out of such a screwed-up CPU with such a screwed-up ISA. It may also be more difficult to create an ASIC out of an ultra-optimized FPGA core with everything jumbled together.

Quote from: biggun;784223
Matt, we are very happy to discuss compiler optimizations ideas.
Technical ASIC/FPGA discussions should be done by people understanding them fully.

You are overbearing and dominate the "technical" decision making. I know enough about ISAs to know that you have chosen a radical ISA (see the Natami link above, where you called more registers a "major ISA change") for a conservative market, which is all wrong. You had 25% support (including yourself, so less excluding you) for more registers and your "major ISA changes". How are you going to get people to use something the majority doesn't support? Your ISA needs major work in compiler backends, but how are you going to gain that support for an FPGA CPU with a few hundred users? Even if you were to overcome these large obstacles, how do you plan to compete against hard processors, even with the extra registers? My ASIC plan is more feasible than your radical FPGA ISA. An ASIC isn't that expensive, and raising money for what people want and like is a lot easier than trying to sell them what they don't want.

Quote from: Linde;784251
The Raspberry Pi is cheap because it is based on an existing cheap SoC that could realistically be produced in quantities of millions of units for less esoteric purposes than as a replacement for a legacy CPU.

An Amiga Cherry Pi (I like cherry better) would have to be a SoC ASIC to come close to the Raspberry Pi's price. It would be difficult to compete with the Raspberry Pi in price and energy efficiency. I would rather target a DVD (optionally Blu-ray) player box, kind of like the old PS2 (which is way better than a cheap CD32), but with a removable DVD drive and more expandability (USB, ethernet, wifi, etc.). I think an ASIC enhanced 68k could play HD movies no problem. I would include an FPGA like a Cyclone V, which could be used for emulation, acceleration of some tasks, and embedded uses. It would eventually be able to emulate whatever gaming CD is placed in the drive, up to a PS2. Keep everything open so people can use an internet browser (which still can't be done on the PS3). Using AmigaOS or AROS 68k, 1GB of memory should be plenty. I would aim for a price of $100-200. We would probably need to sell 40k units. I wonder if there were that many people on the Natami forum in its prime, when it generated 300,000+ hits in one thread.

Quote from: ppcamiga1;784260
In 2009, gunnar von boehn promised  NatAmi with a cpu many times faster than any ppc used in Amiga and with graphics better than PlayStation 3.

I recall a target of PS2 level performance and faster than a 68060. I believe Gunnar has delivered on the latter and had limited control of the former (I don't believe Gunnar can take all the blame for the hibernation of Natami).

Quote from: ppcamiga1;784260
That's all for less than 100 euros.

I didn't ever see anything close to this price although it's possible that someone was wishing for this price.

Quote from: ppcamiga1;784260
Then every spring NatAmi team promised that this summer Natami will be produced.

I don't recall any promises like this but there were a lot of expectations like this.

Quote from: ppcamiga1;784260
Now it is 2015 and there is still nothing.

gunnar where is my NatAmi?

I want my Natami too. Gunnar has the FPGA core that was needed to lower the cost of Natami mostly working and with good performance. Thomas Hirsch knows about it. Some people have different ideas about what team work and cooperation are though.
« Last Edit: February 15, 2015, 08:00:07 PM by matthey »
 

Offline matthey

Re: ADOOM on A600 running 22-35 FPS
« Reply #13 on: February 15, 2015, 09:46:41 PM »
Quote from: Linde;784282
One of these "plans" is an already implemented and working solution that seems to give its users the fastest 68k alternative available. No need to consider feasibility at that stage.

I'm sure people would have been interested in a processor as fast as a 20 year old Pentium and with a 30 year old ISA 15 years ago :D.

Quote from: Linde;784282
Cool. Wake me up from my cryo-sleep when you have a prototype ready! I'm sure people would have been interested in that DVD drive 15 years ago.

It's important to keep the price down for the masses and provide more freedom with a replaceable drive. Some people would rather have a DVD-R than a Blu-ray read only drive for example. A DVD drive may be a good enough base standard but it depends on how cheap Blu-ray drives could be bought. Consoles, DVD/Blu-ray players and TV service boxes are too limiting and not open enough. It's really frustrating to have so much power sitting there and only being able to do what they will let you.
 

Offline matthey

Re: ADOOM on A600 running 22-35 FPS
« Reply #14 on: February 16, 2015, 09:04:09 PM »
Quote from: kolla;784359
I agree, FPGA is cheap enough and fast enough for m68k AmigaOS. Matthey, I understand your concerns, and I'm sure your points will materialize themselves when/if no compilers supports the changes/improvements that Gunnar has done. In the meantime, I dont think anyone/anything prevents someone else, for example you, from doing a more "conservative" m68k core for the Vampire boards.

FPGAs are wonderful tools for core development, but the 68k will never be more than retro (with a slim chance for embedded) as FPGA only. A more compatible core and ISA is needed for both retro and embedded, instead of the unused "performance" ISA which Gunnar is targeting. The following Atari forum thread, with one of the Mist creators, talks about the Phoenix core on the Mist:

http://www.atari-forum.com/viewtopic.php?f=101&t=27442&sid=20370baaedca3a502b06a214d70aa186

Compatibility is the first concern, then the license; performance is nice but less important. I bet the concerns are the same for other 68k FPGA hardware creators. I expect embedded markets to be the same way. Existing embedded 68k and ColdFire customers may want something faster, but they need compatibility with their current code base. Phoenix isn't going to win over ARM customers overnight, but it could win existing 68k and ColdFire customers if it were compatible enough and liked enough (radical 68k changes won't fly with these 68k fans any more than with the retro guys). We are also missing an opportunity to have one united 68k standard which could be used in Phoenix, TG68, Suska and UAE, and with so many people using one standard, an eventual ASIC becomes more likely. Instead, we could end up with another incompatible Amiga split like the current ones which are killing the Amiga.

Quote from: kolla;784359
But yeah, a m68k cherry pi would be awesome, I would guaranteed buy a few, especially if the m68k has MMU and can run Linux, I want real hardware for my Linux/m68k again ;)

While the 68k integer and FPU ISAs need a little refreshing and modernization, the 68k MMU design needs major changes. It may not be practical to keep it compatible. It would be good to investigate ways to ease adding at least partial memory protection and/or memory isolation and extended memory to the AmigaOS, while also allowing for possible future SMP (AmigaOS 3 using a custom CPU has a better chance of maintaining compatibility than AmigaOS 4 using an off-the-shelf CPU). ThoR needs to be involved and design us a new MMU standard ;).

Quote from: kolla;784359
My suggestion is to ignore what Gunnar is doing and join forces with other more likeminded people and "do it right", the best solution will win, right?

Like with AmigaOS 4, MOS, AROS, AmigaOS 3, etc.? Is the best Amiga winning or is all of Amiga losing?

Quote from: OlafS3;784360
I am sure that you are honest with what you are writing and really mean it, but for now Gunnar offers the best 68k solution ever available, and a payable one also. That is what we need: a major hardware upgrade including 68k (at least 68020 compatible) and better graphics and sound. I do not know whom you know or not, but I do not believe in people investing millions of dollars in the market, not before products are there and the need is obvious. When you can show a working system and prove your concept by sales, then you can go to an investor, not the other way round. Investors are cold calculators; they look at how big the risk is, what they have to invest and what they will earn, and they expect a business plan. So the first step is a working FPGA based system that can already be used, with software being adapted to it. I think we should simply let Gunnar do his job. I see (from videos) more and more software running at a very high speed (and there is no improved chipset/RTG yet); that counts for me (and most others), not abstract discussions about ISA details.

I agree that new affordable 68k hardware is what the Amiga needs but we need to plan and work together to keep the 68k Amiga pipeline full, to put it in processor terms. It is important to have a product to show but it is also important to have a good plan to show investors. I am an investor and I know other investors. You might be surprised by what I could make happen but I'm sure not going to pull my money out of safer investments to invest in something I don't believe in and in people I can't trust. Neither will I try to convince other investors to do the same or try to find other partners to invest with. I see potential here but I do not see anything investable yet.

Quote from: ppcamiga1;784410
I recall a target of PS3 level performance, gunnar von boehn later changed his promises to PS2 level performance. gunnar always promised that the cpu will be faster than g4.

Any mention of PS3 performance must have been before I was involved with Natami, which wasn't particularly early. The PS3 has a lot of potential performance, but it is difficult to take advantage of. An enhanced 68k Amiga SoC ASIC with a good 3D implementation added might not have as much theoretical performance, but it could be much easier to program, perhaps making it seem surprisingly close to the performance of a PS3. I think a Cherry Pi could be made which could outperform the new Raspberry Pi 2 at a moderately higher cost, with everyone working together and with proper funding (oddly never tried with the Natami despite tremendous interest).

I believe the current Phoenix core outperforms a PPC G4 clock for clock in integer performance and in memory performance. Enable the 2nd (and possibly 3rd) integer pipe, clock it up, up the caches, add branch prediction and add an FPU and it should be able to walk all over an equally clocked G4.
« Last Edit: February 16, 2015, 09:08:51 PM by matthey »