Amiga.org
The "Not Quite Amiga but still computer related category" => Amiga Emulation => Topic started by: whabang on October 09, 2003, 09:08:08 AM
-
Has anyone got any suggestions? :-?
-
None.
-
Bummer! :-(
-
It was a reply to the subject.
You don't need 68040.library with WinUAE. WinUAE 040 emulation doesn't emulate MMU at all, and afaik proper cachemode setup is done by UAE itself (gfxcard framebuffer, custom, cia are noncacheable and so on...).
PS. There is an experimental MMU patch for UAE, but it slows emulation down quite a bit and is less stable (I've heard). With this version you could use 68040.library, I suppose. No idea which one.
-
Why not just use 020 emulation, since it'll run more or less the same speed as 040 emulation, and support more instructions? :-?
-
xeron wrote:
Why not just use 020 emulation, since it'll run more or less the same speed as 040 emulation, and support more instructions? :-?
040 optimized apps run faster... (just a little)
-
040 optimized apps run faster... (just a little)
Not at all. In fact 040 emulation is slower than 020 emulation. The more complex the emulated machine is, the slower the emulation. The speed gain of a 040 over a 020 is mainly the cache. And in WinUAE cache is adjusted by the JIT slider, not by the processor type. Another improvement is the FPU, but AFAIK the builtin FPU of any processor has fewer commands than the external FPU of the 68020+FPU setting. (As mentioned above, FPU emulation might slow down emulation compared to a processor without FPU).
You need 040 emulation only if your program does not work on a 020. Most programs do.
The same is true for the 040 <-> 060 issue. This is the reason why there is no 060 emulation: every 060 program will work on a 040, too. The only differences are in hardware that is not really emulated (caches, pipelines, number of integer units etc.). Under emulation there would be no difference in speed.
Bye,
Thomas
-
whabang wrote:
040 optimized apps run faster... (just a little)
Errmm.. under emulation? It shouldn't make any difference.
-
Thomas wrote:
Not at all. In fact 040 emulation is slower than 020 emulation. The more complex the emulated machine is, the slower the emulation. The speed gain of a 040 over a 020 is mainly the cache. And in WinUAE cache is adjusted by the JIT slider, not by the processor type. Another improvement is the FPU, but AFAIK the builtin FPU of any processor has fewer commands than the external FPU of the 68020+FPU setting. (As mentioned above, FPU emulation might slow down emulation compared to a processor without FPU).
Whilst I don't pretend to understand why, this isn't my experience at all. 68040 optimised code runs about 5% faster on my installation of UAE running in 040 mode than the 020/882 optimised version of the same code in 020+68882 emulation mode.
Perhaps it could be that the 040 has fewer user-mode instructions to support than the 020/68882 model, but I'm only guessing.
-
Karlos wrote:
Whilst I don't pretend to understand why, this isn't my experience at all. 68040 optimised code runs about 5% faster on my installation of UAE running in 040 mode than the 020/882 optimised version of the same code in 020+68882 emulation mode.
Run the 040 code on the 020/882 emulation mode and see if it makes any difference.
-
xeron wrote:
Karlos wrote:
Whilst I don't pretend to understand why, this isn't my experience at all. 68040 optimised code runs about 5% faster on my installation of UAE running in 040 mode than the 020/882 optimised version of the same code in 020+68882 emulation mode.
Run the 040 code on the 020/882 emulation mode and see if it makes any difference.
Well, I forgot to say - the 040 optimised code running in 020/882 mode runs at no noticeable speed difference to the 020/882 code in 020/882 mode.
Basically all I am saying is that on my (old and a bit slow) PC, 040 optimised code running in 040 emulation mode is measurably faster (sometimes up to 10%) than any other combo I've tried...
-
Karlos wrote:
Basically all I am saying is that on my (old and a bit slow) PC, 040 optimised code running in 040 emulation mode is measurably faster (sometimes up to 10%) than any other combo I've tried...
How very odd. I personally haven't seen the UAE sources, but I can't imagine they're using a different emulation core if you select 040 instead of 020.
I don't know why they don't just emulate a generic "680x0" processor that supports all instructions and addressing modes of the whole range.
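A minimal sketch of what a generic "680x0" core versus a per-model core could look like, assuming the emulator builds an instruction table from a list of (mnemonic, minimum CPU level) pairs. The names INSTRUCTIONS, build_table and execute are hypothetical and are not taken from the UAE sources; only the three example mnemonics and the CPUs that introduced them are real.

```python
# Hypothetical instruction list: (mnemonic, minimum CPU level that has it).
# Level 0 = 68000, 2 = 68020, 4 = 68040. Illustrative only, not UAE code.
INSTRUCTIONS = [
    ("MOVE", 0),    # present on every 680x0
    ("BFEXTU", 2),  # bit-field instructions arrived with the 68020
    ("MOVE16", 4),  # added with the 68040
]

def build_table(cpu_level=None):
    """Return the mnemonics accepted by one CPU model.

    cpu_level=None stands for the generic core that accepts everything.
    """
    if cpu_level is None:
        return {name for name, _ in INSTRUCTIONS}
    return {name for name, min_level in INSTRUCTIONS if min_level <= cpu_level}

def execute(name, table):
    """Pretend to run one instruction; unknown ones raise an exception on real hardware."""
    return "ok" if name in table else "illegal instruction"

table_020 = build_table(2)
table_040 = build_table(4)
table_generic = build_table()

print(execute("MOVE16", table_020))
print(execute("MOVE16", table_040))
print(execute("MOVE16", table_generic))
```

The trade-off with the generic table is that software probing for the CPU type by provoking an illegal instruction exception would no longer get a consistent answer, which may be one reason to keep separate models.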
-
040 optimised code is only faster on a real 040, because it uses tricks that the 040 hardware can do better than other CPUs. It's not faster under emulation, trust me. I notice no difference between 020/040/060 optimised code on the fastest 68k emulation currently available - MorphOS Trance.
In fact, UAE's 040 emulation can be a wee bit buggy too, another reason to stick to 020 code.
-
@Kenny,
I guess the phrase we are looking for is 'Your Mileage May Vary'...
MOS speed advantage is not just down to its 68K emulation - the fact that the OS calls (usually the most time consuming) are PPC native helps just as much. No doubt register allocation is a trivial operation on PPC too ;-)
Anyway, back to WinUAE. I've written code and tested under pretty strict conditions. On my PC at least, 040 optimised code running in 040 emulation mode (all other UAE settings unchanged) is faster than 020/882 code in 040 emulation mode. This is unusual, unless the 040 emulation is written differently in some way.
The 040 optimised code running in 020/882 mode is no different from 020/882 optimised code in 020/882 mode.
Naturally any small set of tests is subject to random fluctuations, but the differences I've noticed on my PC are totally systematic over many repetitions.
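A rough sketch of how "systematic over many repetitions" could be checked, assuming the same workload is timed several times in each emulation mode. The function name compare_runs and the timings below are made up for illustration; they are not measurements from this thread.

```python
import statistics

def compare_runs(baseline, candidate):
    """Compare two lists of wall-clock timings (seconds) for the same workload.

    Returns the relative speedup of candidate over baseline (positive means
    candidate is faster) and Welch's t statistic for the difference in means.
    """
    mean_b = statistics.mean(baseline)
    mean_c = statistics.mean(candidate)
    var_b = statistics.variance(baseline)
    var_c = statistics.variance(candidate)
    speedup = (mean_b - mean_c) / mean_b
    # Standard error of the difference between the two means.
    se = (var_b / len(baseline) + var_c / len(candidate)) ** 0.5
    if se == 0:
        t = 0.0 if mean_b == mean_c else float("inf")
    else:
        t = (mean_b - mean_c) / se
    return {"speedup": speedup, "t": t}

# Hypothetical timings: 020/882 build in 020 mode vs. 040 build in 040 mode.
runs_020 = [10.0, 10.2, 9.8, 10.1, 9.9]
runs_040 = [9.5, 9.4, 9.6, 9.5, 9.5]
print(compare_runs(runs_020, runs_040))
```

A large t value (well above 2 or so for a handful of runs) suggests the gap is bigger than run-to-run noise; a value near zero suggests the difference is just fluctuation.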
-
Karlos wrote:
MOS speed advantage is not just down to its 68K emulation - the fact that the OS calls (usually the most time consuming) are PPC native helps just as much. No doubt register allocation is a trivial operation on PPC too ;-)
I meant more the fact that Trance is Hotspot JIT, the fastest variant available.
On my PC at least, 040 optimised code running in 040 emulation mode (all other UAE settings unchanged) is faster than 020/882 code in 040 emulation mode. This is unusual, unless the 040 emulation is written differently in some way.
I suppose then it depends on the PC's CPU and memory bandwidth. If you say that 040 code is faster on WinUAE for you I believe you, but there's no logical reason why it should be faster (at least that I can think of).
-
KennyR wrote:
I meant more the fact that Trance is Hotspot JIT, the fastest variant available.
Yeah, I wouldn't mind seeing that for myself, but alas MOS 1.4 for Blizzard is not publicly available.
I suppose then it depends on the PC's CPU and memory bandwidth. If you say that 040 code is faster on WinUAE for you I believe you, but there's no logical reason why it should be faster (at least that I can think of).
Agreed - it is pretty weird and not at all what I expected.
-
I was also kinda surprised. I noticed a small speed increase when fiddling around with the WinUAE settings. As I'm no programmer, I have no idea why.
I'm going to do some more testing...
-
I've done some testing. Surfing with AWeb (v3.4, 040 optimized) is much smoother when using 040 emulation.
-
I can't understand why the 040 emulation should be any faster. Surely if it's a different emulation core, it would be a good idea to just add the missing instructions and use the same core for both so that it runs faster whatever mode you select? :-?
-
Maybe it has something to do with maintaining chipset timing?