
Author Topic: [UserReview] Vampire V2-128 received and it's just pure p0rn.  (Read 107331 times)


Offline grond

  • Full Member
  • ***
  • Join Date: Feb 2016
  • Posts: 154
Re: [UserReview] Vampire V2-128 received and it's just pure p0rn.
« Reply #29 from previous page: December 12, 2017, 06:10:17 AM »
Quote from: kolla;834013
Well, that's far from smooth...


Actually it's quite impressive if you understand what you are seeing. You may have noticed that in the video smoothness changes with the rotation angle of the texture. This is because the texture is too large to fit into the dcache. This means that the texture needs to be fetched from RAM. The burst reading and the cache prefetching of the 080 are reading ahead in the texture in an attempt to have the next texel ready when needed. This works well when the texture is traversed in a direction matching the cache prefetch and not so well when the texture is traversed backwards. This can be remedied quite easily by using four textures prerotated by 90 degrees so that you can always traverse the texture in a forward direction by picking the right prerotated texture. This is a common trick in texture mapping.
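The prerotation trick can be sketched roughly as follows. This is only an illustration, not code from the demo: the function names `pick_prerotated` and `row_step` are made up, and it assumes copy k holds the texture rotated by k * 90 degrees. The idea is to reduce the rotation angle to a residual within 45 degrees of one of the four copies, so the walk along a texture row always moves forward.

```python
import math

def pick_prerotated(angle_deg):
    """Return (copy index, residual angle in degrees).

    Assumption: copy k holds the texture prerotated by k * 90 degrees.
    The residual stays in [-45, 45), so the per-pixel step along a row
    of the chosen copy is always positive (forward traversal).
    """
    a = angle_deg % 360.0
    q = int((a + 45.0) // 90.0)   # nearest multiple of 90 degrees (0..4)
    residual = a - 90.0 * q       # what is left to rotate at render time
    return q % 4, residual

def row_step(residual_deg):
    """Texel step along a row per screen pixel for the residual angle."""
    return math.cos(math.radians(residual_deg))

# Usage: choose the copy once per frame, then rasterise with the residual.
copy_index, residual = pick_prerotated(200.0)
step = row_step(residual)  # positive, so the row is read front to back
```

Since the residual never exceeds 45 degrees in magnitude, the row step never drops below cos(45°), which is what keeps the access pattern aligned with the prefetch direction.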

Summing up: we see that texture mapping in hires and with bilinear filtering can be done by the 080 and with a decent frame rate. Not bad for a 90 MHz CPU...
 

Offline grond

Re: [UserReview] Vampire V2-128 received and it's just pure p0rn.
« Reply #30 on: December 13, 2017, 08:51:03 AM »
Quote from: kolla;834069
See? Now it is much smoother.
 This phenomenon is called "progress". And since you didn't like the previous video, you certainly are happy about it.  
Quote
And again, new instructions have apparently been implemented in AMMX, DXT1 and PIXMRG.
So you gather this much: these instructions are useful because they make texture processing much faster. PIXMRG even has many more uses as it mixes two sets of 8bit data.
Quote
Of course these are not described in http://www.apollo-core.com/AMMX.doc.txt yet.

I'm sure in everything you do, the documentation is always at the same level as the actual implementation...
 

Offline grond

Re: [UserReview] Vampire V2-128 received and it's just pure p0rn.
« Reply #31 on: February 21, 2018, 05:06:56 PM »
@kolla: what point is there in explaining things if you never understand them or intentionally omit the information provided?

68882-support:  As has been explained many times before, the important difference is that all 882 instructions will work from cold-boot. Even though some instructions are emulated (V4: anything that is 882 but not 040, V2: a few instructions more), the emulation code is similar to a ROM, except that it is not only write-protected but also invisible to user code running on the CPU. This is a much better solution than the 680x0.library approach, which needs to be provided by an operating system of some kind and thus is not available during cold-boot. User code cannot tell the difference between a software-emulated instruction and a hardware-implemented one. An fsin or fcos is equally slow whether microcoded (68882) or emulated (040, 060, 080). Again: the important difference between 040 and 060 on the one hand and 080 on the other is that the 080 will not trap on the fsin like the 040 and 060 will, but comes with the emulation code included and invisible to anyone but the FPU. This is really a kind of microcode.

Disagreement in the team about the FPU:  

I don't know what you are referring to. The work on AGA was already very far when Gunnar put it aside for work on the FPU because Jari showed initiative with his wonderful FEMU. No hard feelings about that. The work on AGA is not lost and will progress soon.   There is no conflict over traditional FPU vs. next-gen FPU. You make that up. Again I have explained many times that the advanced FPU is the foundation on which the traditional one is built. It is easier to build something powerful and then add a limiting layer to it than to build something limited and then add functionality to it.  

AGA or SAGA:  

I have explained to you several times that the RTG part of the core implements the new graphics features in an Amiga-way. I.e. all the registers for the RTG modes are within the address range of the Amiga custom chips and can be written by the copper reimplementation and not only by the CPU as is the case with the P96 software layer. This means that you can have smooth hardware scrolling for chunky screens, chunky screens with copper palette, multiple chunky playfields and probably some more I forgot. All this already in the core. The only thing missing to complete SAGA is in fact lame old AGA. The "S" in "SAGA" is already there.
« Last Edit: February 21, 2018, 05:09:34 PM by grond »
 

Offline grond

Re: [UserReview] Vampire V2-128 received and it's just pure p0rn.
« Reply #32 on: February 22, 2018, 07:53:24 AM »
Quote from: kolla;836434
Nothing evil going on, stop with the hyperbole. I was there on irc when all this went down, I still have the logs.

The fact that you keep irc logs tells more about you than about what was said on irc.
 

Offline grond

Re: [UserReview] Vampire V2-128 received and it's just pure p0rn.
« Reply #33 on: February 22, 2018, 09:04:59 AM »
Quote from: kolla;836419
One year ago, there already was FPU for V2, it just needed some "testing".
There was an FPU for *Apollo Core*. How much of it was available in any publicly released V2 core is a different question. And testing a CPU is the biggest part of the work because there is an infinite number of possible cases that can go wrong. So your point is based on a false assessment of how much work it is to develop a section of a CPU (a lot, and few people can do something like that) and then debug it ("just some testing"). Testing and debugging a CPU requires: writing lots and lots of testcases (assembly code), which requires a lot of knowledge and pondering about the inner workings of the CPU; simulating every single one of the testcases; looking at hundreds of digital signals in the simulator for millions of clock cycles, where you can never see more than a few dozen on the screen for each testcase; modifying the simulation environment because you perhaps don't model a real computing environment accurately and crashes you see in real life just can't show in the simulator; and, when you eventually see something odd happening, tracing it back to the root cause, which means you basically have to dig through the entire CPU logic. It's not as simple as "oh, here we have an uninitialised pointer, let's fix it and then release our App to Google Play".  
Quote
I suppose the testing didn't go so well, because it didn't take so long before it was clear that the very much hyped FPU would not be...
 Again you are basing your conclusions on false premises. The "just some testing" did NOT mean that testing with the team was started ever. There was NO testing in the team at all until Jari started FEMU!  
Quote
and anyways, Amiga don't need FPU, does it.
 What are you trying to say? "Gunnar is stupid because he doesn't know that an FPU can and has been used in an Amiga"??? Do you have some kind of Asperger? The statement was basically "since nobody of you wants to do any work other than talking bull%&$#?@!%&$#?@!%&$#?@!%&$#?@! on forums, you obviously don't need an FPU". Then there was a discussion about V4 which was supposed to be available much earlier and features an FPGA that has 64bit FP macros. These macros mean that you can either have an extended precision (that "nobody needs") FPU at very slow speed (i.e. not using the 64bit FP macros) or an NG FPU with "just" 64 bit precision at one FOP per clock cycle (!!!) per FP macro. With those macros you can build SSE type FPU units easily. Then there was the usual "but it's incompatible!!!" outcry going on (yes, SSE is not 386 compatible, very clever observation...). Summing up, half of what you are mixing into your skewed view of this vicious project was actually discussion about a future hardware base.  
Quote
So started the bashing of everyone expressing needs for FPU began.
 No, bashing of people like you who only say "but this old Amiga program does not run without CPU feature X" began. This project is not about Amiga. It is about the 680x0 family of CPUs. It is run by Amiga enthusiasts, though. It does not take any work to point out that some program crashes. It is no valuable input to point out which unit of a CPU, be it MMU or FPU, is used by what software. CPU developers KNOW that kind of thing. But if you really want something, you will have to put some effort into getting it. You don't but still have the time to bicker about the project on all Amiga forums? Then you obviously don't need an FPU. Nobody did? Well, then the Amiga obviously does not need an FPU. Get it? Sometimes "honey, you look gorgeous in that dress" means "you should lose ten pounds but we'll be late if you change again".  
Quote
Then Gunnar came with AMMX, as a sort of "look, this is much more useful!".
 Because somebody actually wrote the code for RiVA piece by piece which meant there was somebody willing to put a lot of work into "just testing" AMMX. The same thing that had NOT happened for the FPU. Thus, AMMX was clearly more "needed". Jari came later and then the "just testing the FPU" started. FEMU has been a perfect tool for testing the FPU because you could move FPU instructions into the FPU one by one and test each one separately.  
Quote
But there were people also inside the team that were unhappy about the lack of FPU, and there was a rather harsh discussion over it.
 Surprise, surprise, the members of the team are Amiga users. And yes, you are right, an FPU is a useful CPU unit, even in an Amiga. Please save this to your logs, you were right all the time!!!!!! An FPU IS REALLY USEFUL!!! It was created to serve some purpose!!! The situation was that the team members who had written a lot of testcases for the integer part had very little knowledge about writing FPU code. So basically it was the same situation that became a public discussion: if nobody puts any work into it, it means you don't really want it.  
Quote
Gunnar decided if he was to implement an FPU it would be the most awesome FPU ever, and it would require bigger FPGA, V2 be damned, let there be V3... heck V4!
 The Vampire V2 known to the public is in fact the V3. There was an unreleased V2 precursor with just 64MB RAM and a 16 bit RAM bus that was "the V2" inside the team (I actually owned one and used it in my Amiga). The 128MB V2 is always referred to as the "V3" inside the team. Majsta named his product V2 anyway and Gunnar prefers V3. Since the V4 was developed by Chris, a friend and colleague of Gunnar's, it is called V4. Majsta would probably have called it V3. So your reading between the lines about how the next card is the V4 because of some personality problem on Gunnar's side fails again. And you knew all this already because I have explained it before why the V4 is the V4 and not the V3.  
Quote
FEMU is far from perfect, and it was demonstrated how both productivity software and demos did not run well enough. So again focus was changed, to improve FEMU. And this has now been going on since last summer, and had CLEARLY been a priority. Despite previous rantings about how useless FPU is on Amiga
 So all this is about your hurt feelings? Because you said "an FPU is useful" and Gunnar said "it's useless"? And then history proved what we all knew forever? That FPUs do serve some purpose and have actually been used in the Amiga?    
Quote
Meaning that there's still a bit of software emulation needed, which is fine. What's less fine is that this software is running outside the operating system, meaning noone else but Apollo Team can ever fix bugs or do improvements.
 Are you aware that the 68882 is full of software emulation? Only that the software is stored as a ROM inside the CPU and cannot be changed at all? The 080 also has some software FP emulation inside, as you observed, but as the FPGA is flashable, it can be changed. But it is nothing that a user would or should touch. Your argument is basically a pro-open source argument. Well, gifts are always nice but you can't demand them.  
Quote
Imagine if Jari had not made FEMU, and where the project would have been now, what the outspoken sentiment would have been.
 Imagine you would actually contribute to the project, where the project could be now.  
Quote
As for SAGA, it was announced that the SAGA FPGA core would be open source, but now it's not when clear what SAGA is, and half of the time it just refers to P96 support. Originally, SAGA was the Super AGA chipset that was to take over from the AGA Amiga chipset, being a superset of AGA. Pamela is the audio part of SAGA, taking over for Paula.
 What's the problem here? That you again do not understand what I explained? Pamela is already done and part of SAGA (and clearly not P96, right?). The chunky graphics features that currently are available to the Amiga software through P96 are implemented in a SAGA way. P96 does not know about hardware scrolling and many more features so using P96 you can only perceive this part of SAGA as a normal RTG feature. If you want, you can bang the registers and have chunky Amiga screen modes. It's already there. You can do this TODAY. Do you understand that you could make a P96 driver for AGA? In the case of the Vampire the P96 driver basically is a chunky variant of "P96 for AGA"
 

Offline grond

Re: [UserReview] Vampire V2-128 received and it's just pure p0rn.
« Reply #34 on: February 22, 2018, 01:31:57 PM »
Quote from: Thomas Richter;836446
The only drawback may be that the FPU handling is an exception, i.e. multitasking stops.

How much multitasking is there while the 882 executes a 600 clock cycle fsin? How much multitasking is there while the 040/060 executes a trapped fsin? To how much time do the clock cycles without multitasking amount in any of 030, 040, 060 and 080 considering the different base clock frequencies? I guess we should not forget that instructions that are 882 but not 040 are deprecated and only supported for legacy reasons. Their use is discouraged either way.
 

Offline grond

Re: [UserReview] Vampire V2-128 received and it's just pure p0rn.
« Reply #35 on: February 22, 2018, 01:33:42 PM »
Quote from: Thomas Richter;836447
Frankly, I do not believe that anyone has a full overview, but instructions like fsin or fcos are still quite necessary for 3D games.

Not really. You need sine computations once per frame and perhaps once per lightsource and frame for more complex 3D games, but from then on it is usually all fmul/fadd.
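To illustrate the point with a rough sketch (not taken from any particular game; `rotate_vertices` is a made-up name and the rotation is about the z axis only): the sine and cosine are evaluated once per frame, and every vertex after that costs only multiplications and additions.

```python
import math

def rotate_vertices(vertices, angle_rad):
    """Rotate (x, y, z) vertices about the z axis.

    The two trigonometric evaluations happen once per call (once per frame);
    the per-vertex work is four multiplications and two additions.
    """
    s = math.sin(angle_rad)  # the only fsin-class operation per frame
    c = math.cos(angle_rad)  # the only fcos-class operation per frame
    rotated = []
    for x, y, z in vertices:
        rotated.append((c * x - s * y, s * x + c * y, z))
    return rotated

# Usage: one frame with three vertices.
frame = rotate_vertices([(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (1.0, 1.0, 2.0)],
                        math.pi / 2)
```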
 

Offline grond

Re: [UserReview] Vampire V2-128 received and it's just pure p0rn.
« Reply #36 on: February 22, 2018, 03:18:13 PM »
Some facts:
Code: [Select]
        68882                                  68080
        Clock cycles     Latency @50 MHz       Clock cycles     Latency @80 MHz
fadd              56             1120 ns                  1             12.5 ns
fsub              56             1120 ns                  1             12.5 ns
fmul              76             1520 ns                  1             12.5 ns
fdiv             108             2160 ns                  2             25   ns
fsqrt            110             2200 ns                222           2775   ns
fmove             21              420 ns                  1             12.5 ns
fsin             394             7880 ns                254           3175   ns
fcos             394             7880 ns                269           3362.5 ns
fsincos          454             9080 ns                317           3962.5 ns
Who can add the values for 25 MHz 040 and 50 MHz 060? :)
 

Offline grond

Re: [UserReview] Vampire V2-128 received and it's just pure p0rn.
« Reply #37 on: February 22, 2018, 03:19:22 PM »
Some facts:
        68882                                  68080
        Clock cycles     Latency @50 MHz       Clock cycles     Latency @80 MHz
fadd              56             1120 ns                  1             12.5 ns
fsub              56             1120 ns                  1             12.5 ns
fmul              76             1520 ns                  1             12.5 ns
fdiv             108             2160 ns                  2             25   ns
fsqrt            110             2200 ns                222           2775   ns
fmove             21              420 ns                  1             12.5 ns
fsin             394             7880 ns                254           3175   ns
fcos             394             7880 ns                269           3362.5 ns
fsincos          454             9080 ns                317           3962.5 ns
Who can add the values for a 25 MHz 040 and a 50 MHz 060? :)
 

Offline grond

Re: [UserReview] Vampire V2-128 received and it's just pure p0rn.
« Reply #38 on: February 22, 2018, 03:21:20 PM »
Some facts:
Code: [Select]
       68882                                  68080
        Clock cycles     Latency @50 MHz       Clock cycles     Latency @80 MHz        ratio
fadd              56             1120 ns                  1             12.5 ns        89.6x
fsub              56             1120 ns                  1             12.5 ns        89.6x
fmul              76             1520 ns                  1             12.5 ns       121.6x
fdiv             108             2160 ns                  2             25   ns        86.4x
fsqrt            110             2200 ns                222           2775   ns         0.8x
fmove             21              420 ns                  1             12.5 ns        33.6x
fsin             394             7880 ns                254           3175   ns         2.5x
fcos             394             7880 ns                269           3362.5 ns         2.3x
fsincos          454             9080 ns                317           3962.5 ns         2.3x
Who can add the values for a 25 MHz 040 and a 50 MHz 060?
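For anyone who wants to extend the table, the latency and ratio columns follow directly from cycle count and clock frequency. A minimal sketch (the function names are just for illustration; no 040 or 060 cycle counts are assumed here, you have to supply them yourself):

```python
def latency_ns(cycles, mhz):
    """Latency in nanoseconds for a given cycle count and clock in MHz."""
    return cycles * 1000.0 / mhz

def speedup(cycles_a, mhz_a, cycles_b, mhz_b):
    """How many times faster CPU b completes the instruction than CPU a."""
    return latency_ns(cycles_a, mhz_a) / latency_ns(cycles_b, mhz_b)

# Usage with the fadd row from the table: 68882 @ 50 MHz vs 68080 @ 80 MHz.
fadd_882 = latency_ns(56, 50)
fadd_080 = latency_ns(1, 80)
fadd_ratio = speedup(56, 50, 1, 80)
```

A ratio below 1 (as in the fsqrt row) means the second CPU is slower for that instruction.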
« Last Edit: February 22, 2018, 03:27:32 PM by grond »
 

Offline grond

Re: [UserReview] Vampire V2-128 received and it's just pure p0rn.
« Reply #39 on: February 22, 2018, 03:34:15 PM »
The above values are for the Gold 2.7 release of the 68080. The V4 release of the 68080 can schedule one fsqrt per cycle and complete it in 20 cycles.
 

Offline grond

Re: [UserReview] Vampire V2-128 received and it's just pure p0rn.
« Reply #40 on: February 23, 2018, 12:35:59 PM »
Quote from: kolla;836481
Apollo Core does like 68040 and hence need a similar solution as 040 and 060. No problem.

But then, do not run around claiming that NO software emulation is taking place with Apollo Core on V2, because that is simply not true. Unless 68882 is actually implemented, which would be worthy a news item.
Nobody has ever claimed that 68882-only FP instructions will be implemented in dedicated hardware. And yet the way these instructions are implemented in the 080 is more similar to the way it is done in the 68882 than to the way it is done in the 040 and 060. There is no trap and no trap handler; there is microcode that uses more fundamental operations such as fmul, fadd and fdiv. The 68882 is exactly like that! It is all microcoded; the only difference is that in the 68882 the microcode is stored as a metal mask ROM while in the case of the 080 the microcode is stored in flash memory.
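To give an idea of what "built from more fundamental operations" can look like, here is a rough sketch of a sine routine that uses only multiply, add, one division and one rounding step. This is NOT the actual 080 or 68882 microcode; the polynomial (a truncated Taylor series), the range reduction and the name `sin_from_mul_add` are all assumptions made for illustration.

```python
import math

# Taylor coefficients of sin(r)/r - 1 in powers of r*r: -1/3!, 1/5!, -1/7!, ...
COEFFS = [-1.0 / 6, 1.0 / 120, -1.0 / 5040, 1.0 / 362880, -1.0 / 39916800]

def sin_from_mul_add(x):
    """Approximate sin(x) with basic operations only (illustrative sketch)."""
    # Range reduction: bring x into [-pi, pi] (one divide, one round).
    k = round(x / (2.0 * math.pi))
    r = x - k * (2.0 * math.pi)
    # Fold into [-pi/2, pi/2] using sin(pi - r) == sin(r).
    if r > math.pi / 2:
        r = math.pi - r
    elif r < -math.pi / 2:
        r = -math.pi - r
    # Horner evaluation: only multiplications and additions from here on.
    r2 = r * r
    p = 0.0
    for c in reversed(COEFFS):
        p = p * r2 + c
    return r + r * r2 * p

# Usage: compare against the library sine for one value.
approx = sin_from_mul_add(1.0)
error = abs(approx - math.sin(1.0))
```

With five coefficients the truncation error stays below roughly 1e-7 on the folded range; a real implementation would use more carefully chosen coefficients and extended precision.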

Quote
Lastly, the software emulation of 68882 instructions is not running within the scope and reach of the operating system. It's not a task of AmigaOS. It's not visible to any OS running on the Vampire card. If you use old software to try detecting what CPU there is, it may very well say it's a 040+882. Now, one can question whether it is a good or bad thing to have emulation software - or any software at all - running outside, or "under" the operating system.
 Even Intel processors are full of microcode. So even if one could question that, the answer to that would be clear anyway: it doesn't matter and it is no problem.  
Quote
There has also been talks about hyperthreading. Well, AmigaOS cannot do it by itself, so something else would be needed to do the scheduling etc of "out of bounds" threaded processes. Again, it is tempting to call that a hypervisor.
  And the problem is what? That there is also work done to use a CPU feature in AmigaOS? The alternative would be to not use the CPU feature. Possible but no advantage.  
Quote
I am skeptical though, as all experience says that running software outside the reach of the operating system complicates a number of things.
 Actually it is not SOFTWARE running outside the reach of the operating system that complicates things, it is having different processes/functional units operating concurrently that makes things complicated. The Amiga has always suffered from bad handling of blitter tasks running in parallel to the CPU and badly managed DMA. It doesn't change anything if such tasks are handled by dedicated hardware or by software running in a hyperthreaded virtual CPU core invisible to AmigaOS because it doesn't know about hyperthreading or multiprocessor environments.

Quote
As for MMU, it is already there, it just isn't compatible with existing software and operating systems, which is a case of lost opportunities for Apollo Core.
 It is NOT a lost opportunity. It is just NOT FINISHED. Is this really so hard to understand? The 68080 has two memory controllers. No other 68k CPU has ever had a memory controller and most certainly not two running concurrently. An MMU running in such a processor needs to be different from the MMUs that previous 68k processors had. This is a technical fact. At the same time a modern MMU needs features such as marking memory as non-executable and such. Again: this project is not about the Amiga, it is about the 68k processor family. It does not matter whether AmigaOS has no concept of memory protection and thus does not need non-executable memory protection. The fact that AmigaOS does not make use of this MMU feature has no influence on the decision whether this feature will be implemented or not. But perhaps if we are lucky and patient, there will eventually be a compatibility MMU-layer on top of the units that are already there. Maybe this layer will support all MMU-features the Amiga has used, maybe only some.  

And again: ALL of this has been explained many times before.  
Quote
Also it has been mentioned that MMU is involved in for example speed up IDE and do various DMA tricks.
  Um, this sounds like you are mixing things up. I don't think the MMU has anything to do with IDE. The MMU certainly plays a role when dealing with DMA.  
Quote
Grong, no testing? Apollo-accelerators.com says/said something else, and luckily there is wayback machine.

I guess you are referring to the tests of the individual FPU instructions. This is not testing the FPU. This is basically using pseudo-random generators to produce e.g. two sources of input data and feeding that into the hardware unit that does the FADD. Then compare the output of the FP-adder to the expected result (precomputed or some other reference, e.g. calculated by some Intel or 68k FPU). The comparison is done automatically. Such testing was done for millions of testcases by Chris who developed most if not all of the FPU units. This testing took place but was not done by the team. Hence, it is a correct statement that there had never been any FPU testing done by the team until FEMU was started. Furthermore, this automated testing, although certainly difficult to set up, is less work than testing how program flow behaves, whether flags are set and evaluated correctly, whether operand prefetch goes wrong in certain situations, whether pipeline forwarding between consecutive FP instructions works and so on. The latter is what has been going on in the last few months.
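The automated per-instruction testing described above can be sketched like this. It is only an illustration of the principle: `unit_under_test` stands in for the hardware FP adder (in reality a simulator or FPGA interface), the reference is simply the host's double-precision addition, and all names are made up.

```python
import random
import struct

def bit_pattern(value):
    """Raw IEEE 754 double bit pattern, so comparisons are exact."""
    return struct.pack("<d", value)

def run_random_fadd_tests(unit_under_test, cases=10000, seed=1):
    """Feed pseudo-random operand pairs to an adder and collect mismatches.

    Returns a list of (a, b, expected, got) tuples for every failing case.
    """
    rng = random.Random(seed)  # fixed seed makes failures reproducible
    failures = []
    for _ in range(cases):
        a = rng.uniform(-1e6, 1e6)
        b = rng.uniform(-1e6, 1e6)
        expected = a + b                      # reference result
        got = unit_under_test(a, b)           # result from the tested unit
        if bit_pattern(got) != bit_pattern(expected):
            failures.append((a, b, expected, got))
    return failures

# Usage: a correct adder passes, a deliberately broken one is caught.
good = run_random_fadd_tests(lambda a, b: a + b, cases=1000)
bad = run_random_fadd_tests(lambda a, b: a + b + 1.0, cases=1000)
```

Note that this only checks the arithmetic result of one unit in isolation. It says nothing about flags, operand prefetch or pipeline forwarding, which is exactly the distinction made above.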

Just because you don't understand how the work of developing a CPU is done in general and in this specific project, doesn't mean that there are self-contradicting statements, hidden truths and dirty secrets that would require YOU to unveil them.
 

Offline grond

Re: [UserReview] Vampire V2-128 received and it's just pure p0rn.
« Reply #41 on: February 23, 2018, 02:04:44 PM »
Sure, all analogies and comparisons fail if you apply a higher zoom factor. That doesn't prove the point moot. And I wrote that the way it is done on the 080 is more similar to the 882 than to the 040. And this is still correct. The emulation code could not be executed by any of 68000 through 68060 because it makes use of features unique to the 080. E.g. it uses additional FP registers and instructions that do not exist on any previous 68k FPU. In fact the correct technical term is not "microcode" but "millicode" and is widely used in modern CPUs. Did we really need to discuss this?
 

Offline grond

Re: [UserReview] Vampire V2-128 received and it's just pure p0rn.
« Reply #42 on: February 26, 2018, 01:57:08 PM »
Quote from: Lord Aga;836680
Yes, this is exactly how Apollo team works. The more you bitch and belittle their work the more they rush to cater to your wishes. It's not like they have their own roadmap and agenda.

Did you know that you can induce superstitious behaviour in chickens?

  https://www.jstor.org/stable/40032017?seq=1#page_scan_tab_contents  

“In one of my college psychology courses we did an experiment that shows how superstitious behavior can be induced in chickens. The routine is to feed the chicken at random intervals. Unaware of what is triggering the appearance of bits of food, the chicken naturally assumes that its behavior is meaningful, so it tends to repeat whatever it was doing in the moment before food appeared. Before long the bamboozled bird is repeating whole series of behaviors in an attempt to do whatever it is that needs to be done to get more eats.”

 Replace "Food" with "core release" or "feature" and "behavior" with "forum rants" and it becomes clear that the Apollo project itself is responsible for all those rants from superstitious chickens...