
Author Topic: p96 is unbelievably Slow!  (Read 6316 times)


Offline Karlos

  • Sockologist
  • Global Moderator
  • Hero Member
  • *****
  • Join Date: Nov 2002
  • Posts: 16867
  • Country: gb
  • Thanked: 4 times
Re: p96 is unbelievably Slow!
« Reply #14 on: December 19, 2010, 03:10:33 AM »
If I recall correctly, on my 68040/BVision I can get up to 17MB/s copying to VRAM using a loop-unrolled move16 transfer; a regular move.l based copy manages around 15MB/s or so.

On my 040/Mediator/Voodoo setup, the speed was around 9-11MB/s maximum and that's a slightly faster 68040.
int p; // A
 

Offline wawrzon (Topic starter)

Re: p96 is unbelievably Slow!
« Reply #15 on: December 19, 2010, 03:17:58 AM »
i'm evaluating it.
here is my current kuklomenos port that has been the reason for all that uproar:
http://www.daten-transport.de/?id=XfGNZPE2AaEn
it is compiled for hwsurface, which means the graphics go into the vram. source included.
 

Offline wawrzon (Topic starter)

Re: p96 is unbelievably Slow!
« Reply #16 on: December 19, 2010, 03:20:49 AM »
@karlos: the figures are slightly higher than i would expect, even for voodoo; it's 7MB/s for me with a4k/060. bvision sounds likely though. but that's thanks to your optimized code, i take it.
 

Offline Karlos

Re: p96 is unbelievably Slow!
« Reply #17 on: December 19, 2010, 03:27:48 AM »
Don't take the Voodoo figures too seriously, they are off the top of my head. I need to find them (or retest, but that machine is currently in need of attention). The BVision figures are good though.

I experimented a lot with move16 for both copying and other operations, like byteswap copying. Here, you allocate a cache-aligned block on the stack, read data from the source, swapping as you go, then use move16 to copy the block out to the VRAM. If you allocate enough cache-aligned space (say 64 bytes) you can unroll your transfer loop 4x, which was about ideal. With some carefully optimised routines you can also handle misaligned data, since the misaligned accesses happen when reading from the source rather than when transferring to the bitmap.

Not sure why move16 was faster on BVision VRAM, and it wasn't faster on every system tested. However, it was never slower. On some other cards, IIRC, like the CVision64, it was slower though.

All very hardware-dependent.
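
For anyone following along, here is a minimal C sketch of the staging-buffer idea described above. It is not Karlos's actual routine: the names byteswap_copy16, swap16 and copy_line16 are made up for illustration, and a plain memcpy of 16 bytes stands in for each move16, which on a real 68040 would be inline assembly. It assumes 16-bit pixels and a length that is a whole number of 64-byte bursts.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define LINE_BYTES  16u   /* one 68040 cache line, the unit move16 transfers */
#define STAGE_BYTES 64u   /* four lines, matching the 4x unrolled loop */

/* Swap the two bytes of a 16-bit pixel. */
static uint16_t swap16(uint16_t v)
{
    return (uint16_t)((v << 8) | (v >> 8));
}

/* Stand-in for a single move16 (ASSUMPTION: memcpy instead of 68040 asm). */
static void copy_line16(uint8_t *dst, const uint8_t *src)
{
    memcpy(dst, src, LINE_BYTES);
}

/* Copy `pixels` 16-bit pixels from src to dst, byteswapping on the way.
 * src may be misaligned; on real hardware dst would have to be 16-byte
 * aligned for move16. */
void byteswap_copy16(uint8_t *dst, const uint8_t *src, size_t pixels)
{
    _Alignas(16) uint8_t stage[STAGE_BYTES];  /* aligned block on the stack */
    size_t bytes = pixels * 2u;

    assert(bytes % STAGE_BYTES == 0);  /* sketch handles whole bursts only */

    for (size_t off = 0; off < bytes; off += STAGE_BYTES) {
        /* Read from the (possibly misaligned) source, swapping as we go. */
        for (size_t i = 0; i < STAGE_BYTES; i += 2) {
            uint16_t px;
            memcpy(&px, src + off + i, sizeof px);
            px = swap16(px);
            memcpy(stage + i, &px, sizeof px);
        }
        /* Burst the staged block out, unrolled 4x. */
        copy_line16(dst + off +  0, stage +  0);
        copy_line16(dst + off + 16, stage + 16);
        copy_line16(dst + off + 32, stage + 32);
        copy_line16(dst + off + 48, stage + 48);
    }
}
```

The point of the staging block is that all the awkward work (misaligned reads, swapping) happens in cached RAM, so the writes to the destination are always full, aligned lines. Whether that actually wins depends on the card, as noted above.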
int p; // A
 

Offline matthey

  • Hero Member
  • *****
  • Join Date: Aug 2007
  • Posts: 1294
Re: p96 is unbelievably Slow!
« Reply #18 on: December 19, 2010, 04:24:24 AM »
3kT060-75/Voodoo4/P96/16bitPC

                           320x240   320x240   640x480    640x480
                          software  hardware  software   hardware
Slow points (frames/sec): 0.236337  0.236337  0.029795  0.0297915
Fast points (frames/sec):  14.1695   14.1695   3.56382    3.56298
   Rect fill (rects/sec):  489.601   489.601   117.307    117.196
 32x32 blits (blits/sec):  2318.05   2318.05   2195.07    2214.05

***

3kT060-75/Voodoo4/P96/16bit

                           320x240   320x240    640x480    640x480
                          software  hardware   software   hardware
Slow points (frames/sec): 0.236686  0.236798  0.0299196  0.0299196
Fast points (frames/sec):  14.0659   14.0659    3.54566    3.54492
   Rect fill (rects/sec):   500.55    500.55    120.177    120.234
 32x32 blits (blits/sec):  2340.57   2340.57    2234.59    2214.05
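
To compare the blit rows with the MB/s copy figures quoted earlier in the thread, the rates can be converted to bytes per second. A small sketch, assuming 2 bytes per pixel for these 16-bit modes and ignoring any per-call overhead the benchmark itself adds; blit_bytes_per_sec is a hypothetical helper, not part of sdlbench.

```c
#include <assert.h>

/* Bytes moved per second by `blits_per_sec` blits of w x h pixels,
 * each pixel taking `bytes_per_pixel` bytes (ASSUMPTION: 2 for 16-bit). */
double blit_bytes_per_sec(double blits_per_sec,
                          unsigned w, unsigned h, unsigned bytes_per_pixel)
{
    assert(blits_per_sec >= 0.0);
    return blits_per_sec * (double)w * (double)h * (double)bytes_per_pixel;
}
```

For example, blit_bytes_per_sec(2318.05, 32, 32, 2) works out to roughly 4.7 million bytes per second, since each 32x32 blit at 16 bits moves 2048 bytes. Small blits are dominated by call overhead, so this is a lower bound on what the bus can do rather than a bandwidth measurement.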
« Last Edit: December 19, 2010, 04:46:44 AM by matthey »
 

Offline wawrzon (Topic starter)

Re: p96 is unbelievably Slow!
« Reply #19 on: December 19, 2010, 04:25:40 AM »
another quick attempt on an sdl test application.
http://www.daten-transport.de/?id=EXKChmTTKWvf
this should run without ixemul as well. i've compiled it for 020 without any optimizations or weird options. works very well on my cgs/cv64 setup. it crawls on pIV/p96. and it runs on uae. what should i say?
 

Offline wawrzon (Topic starter)

Re: p96 is unbelievably Slow!
« Reply #20 on: December 19, 2010, 04:28:43 AM »
@matthey: not bad, even if cv still looks better. but i'm starting to suspect the test doesn't reflect sdl reality very well. try the one above.
« Last Edit: December 19, 2010, 04:49:33 AM by wawrzon »
 

Offline Gulliver

Re: p96 is unbelievably Slow!
« Reply #21 on: December 19, 2010, 04:38:16 AM »
@wawrzon

Have you tested this P96 version? http://lilliput.amiga-projects.net/Picasso96.htm

Give it a try, maybe it performs a bit better. Anyway, if you do, just let me know about your findings.
 

Offline wawrzon (Topic starter)

Re: p96 is unbelievably Slow!
« Reply #22 on: December 19, 2010, 05:03:28 AM »
now, i updated out of folly without a backup and there goes the damn sfs again!!! grr!! access outside the partition and the like. i have not used the damn machines for more than a year and forgot they are still infected with this superfu**ed_up_filesystem!

gulliver, just to be sure: you have tested the update with muforce on?
« Last Edit: December 19, 2010, 05:53:05 AM by wawrzon »
 

Offline matthey

Re: p96 is unbelievably Slow!
« Reply #23 on: December 19, 2010, 05:39:39 AM »
The pig game "feels" like it runs at full speed here. I get...

Average rendering frame rate: 7.4 fps
Average logic frame rate: 20 fps

@Gulliver
I am using those updates. Wawa needs to recheck his installation and maybe reinstall it because of that not-so-smart filesystem.

There is a site that has P96 speed tests. Compare the Voodoo 3 + G-Rex + CGFX to the Voodoo 4 + Mediator + P96. The G-Rex setup should be faster but it's not. This looks to me like the Mediator is faster because P96 is faster. The Voodoo 4 is not significantly faster than the Voodoo 3 and in many cases is slower on the Amiga.

http://www.amigaspeed.de.vu/

P.S. I added a 16 bit "BE" test for sdlbench alongside my 16 bit "LE" PC test above.
 

Offline wawrzon (Topic starter)

Re: p96 is unbelievably Slow!
« Reply #24 on: December 19, 2010, 05:49:50 AM »
ok, i can report that gulliver's new libraries give me a significant speedup with pig and sdlbench (details after the sun rises) but not with kuklomenos, alas. i have an impression they are quite buggy though and need a serious cleanup. i will probably have to back up and replace my filesystems on the p4 machine here before i go on with it, which i will not do before the end of the year. but i might check the stuff more carefully on the voodoo system later today.

may i ask, who made these fixes?

@matthey: strange i get much higher results with these libs on p4 than you on voodoo, is that to be trusted? sorcery?
« Last Edit: December 19, 2010, 06:20:16 AM by wawrzon »
 

Offline Gulliver

Re: p96 is unbelievably Slow!
« Reply #25 on: December 19, 2010, 11:29:32 AM »
@wawrzon

I haven't tested those libraries with muforce. So I was expecting some feedback from a dev's point of view ;)

All these updates were obtained from post-2.1b Picasso96 updates, Prometheus drivers, Amithlon/AmigaOSXL updates, WinUAE, user submissions, etc. They were collected over the years and put together.

For various comments regarding this subject see http://eab.abime.net/showthread.php?t=50410&highlight=picasso96

@matthey

So with these new libraries in your system, now that you have tested them for a while, how do you think they perform?
« Last Edit: December 19, 2010, 11:34:03 AM by Gulliver »
 

Offline matthey

Re: p96 is unbelievably Slow!
« Reply #26 on: December 19, 2010, 02:16:39 PM »
Quote from: Gulliver;599878

@matthey
So with these new libraries in your system, now that you have tested them for a while, how do you think they perform?


I think the core P96 libraries in the update are good. I haven't ever received a MuForce/MuGuardianAngel hit; they fix a few bugs, and they are a little faster. I can't say anything about the gfx card specific drivers as I don't have much, if any, experience with them. You did good :). Thanks.
 

Offline wawrzon (Topic starter)

Re: p96 is unbelievably Slow!
« Reply #27 on: December 20, 2010, 12:03:22 AM »
okay, i've updated my voodoo3 setup as well. by hand this time. there was not much to do, most libs were up to date. also my picture.datatype is newer than the one contained in the archive. i don't recall where it is from. maybe it is some wos version, i will have to take a look at the version number. all is working well, no hits, not much difference to before, except a slight speedup in "pig" in little endian mode. in kuklomenos i'm getting 10fps in big endian and 7 in little.

on p4 things are quite different. there is quite a boost, but not in kuklomenos. i wonder why this is so slow. i suspect p96 is quite slow at drawing lines. if i understood you correctly, matt, cgx or p96 is also used to draw lines in 3d. is this correct? and this too is explicitly slow in w3d.

apart from that i now also get a hit on startup with the picassoIV system. but that might be due to the corrupted filesystem. i have to bring that in order first. here is a log, i don't think it will indicate anything without the sources though.
--------------------------------------
30-Sep-08  22:29:33
LONG READ from 00000020                        PC: 07373FE8
USP : 070B0FAC SR: 0010  (U0)(-)(-)  TCB: 070B0728
Data: 00000000 00000060 00000044 00000084 073579EC 07357ABA 07357ADE 00000000
----> 073579EC - "Work:Libs/Picasso96/rtg.library"  Hunk 0000 Offset 0000016C
----> 07357ABA - "Work:Libs/Picasso96/rtg.library"  Hunk 0000 Offset 0000023A
----> 07357ADE - "Work:Libs/Picasso96/rtg.library"  Hunk 0000 Offset 0000025E
Addr: 00000000 FFFFFFFF 0738D354 0738D354 0735766E 0738D354 000046CC 07002340
Stck: 070008D4 0738A18C 00F81CBA 00000400 00000000 00000000 00000000 00000000
Stck: 00000000 07357ADE 0735766E 073586A6 00000000 073581BB 070B06B0 00F81100
Stck: 07357884 0738A18C 00FD56BE 070B0728 070008D4 00FD560E 00F8A06E 00000800
Stck: 72616D6C 69620000 00000000 070B0710 070B0772 00000000 00000001 070B1030
Stck: 00000014 00000002 003E26A2 003E3AFA 003F5571 00000000 00000000 00000038
Stck: 00000001 FFFFFFFF 070B11CC 01C50F87 00000001 00000001 00F8F5DE 00F8F6C2
Stck: 00F8F6B6 070B104C 00000000 000000C4 FEDCBA98 00000000 00000000 070113D8
Stck: 0000A4A4 10100000 00010005 00000000 00000000 0000A4A4 00101008 4DEF0000
----> 07373FE8 - "Work:Libs/Picasso96/rtg.library"  Hunk 0002 Offset 00017D58
----> 00F81CBA - "ROM - exec 45.20 (6.1.2002)"  Hunk 0000 Offset 00001C0C
----> 07357ADE - "Work:Libs/Picasso96/rtg.library"  Hunk 0000 Offset 0000025E
----> 073586A6 - "Work:Libs/Picasso96/rtg.library"  Hunk 0000 Offset 00000E26
----> 073581BB - "Work:Libs/Picasso96/rtg.library"  Hunk 0000 Offset 0000093B
----> 00F81100 - "ROM - exec 45.20 (6.1.2002)"  Hunk 0000 Offset 00001052
----> 07357884 - "Work:Libs/Picasso96/rtg.library"  Hunk 0000 Offset 00000004
----> 00FD56BE - "ROM - ramlib 40.2 (5.3.93)"  Hunk 0000 Offset 000003E6
----> 00FD560E - "ROM - ramlib 40.2 (5.3.93)"  Hunk 0000 Offset 00000336
----> 00F8A06E - "ROM - dos 40.3 (1.4.93)"  Hunk 0000 Offset 000005B2
----> 00F8F5DE - "ROM - dos 40.3 (1.4.93)"  Hunk 0000 Offset 00005B22
----> 00F8F6C2 - "ROM - dos 40.3 (1.4.93)"  Hunk 0000 Offset 00005C06
----> 00F8F6B6 - "ROM - dos 40.3 (1.4.93)"  Hunk 0000 Offset 00005BFA
PC-8: FF0C60BE 2F0E6068 700043FA 00424EAE FDD82C40 70FF91C8 72604EAE FFB82040
PC *: 20280020 41FAFFD2 208041FA FF7C2080 41FAFF60 208041FA FF0A2080 41FAF7B8
07373fc4 :  6dff fffe ff0c             blt.l $7363ed2 ;extended opcode
07373fca :  60be                       bra.s $7373f8a
07373fcc :  2f0e                       move.l a6,-(a7)
07373fce :  6068                       bra.s $7374038
07373fd0 :  7000                       moveq.l #$0,d0
07373fd2 :  43fa 0042                  lea.l $7374016(pc),a1
07373fd6 :  4eae fdd8                  jsr -$228(a6)
07373fda :  2c40                       movea.l d0,a6
07373fdc :  70ff                       moveq.l #-$1,d0
07373fde :  91c8                       suba.l a0,a0
07373fe0 :  7260                       moveq.l #$60,d1
07373fe2 :  4eae ffb8                  jsr -$48(a6)
07373fe6 :  2040                       movea.l d0,a0
07373fe8 : *2028 0020                  move.l $20(a0),d0
07373fec :  41fa ffd2                  lea.l $7373fc0(pc),a0
07373ff0 :  2080                       move.l d0,(a0)
07373ff2 :  41fa ff7c                  lea.l $7373f70(pc),a0
07373ff6 :  2080                       move.l d0,(a0)
07373ff8 :  41fa ff60                  lea.l $7373f5a(pc),a0
07373ffc :  2080                       move.l d0,(a0)
07373ffe :  41fa ff0a                  lea.l $7373f0a(pc),a0
07374002 :  2080                       move.l d0,(a0)
07374004 :  41fa f7b8                  lea.l $73737be(pc),a0
Name: "ramlib"
 

Offline matthey

Re: p96 is unbelievably Slow!
« Reply #28 on: December 20, 2010, 03:58:17 AM »
Quote from: wawrzon;600029
on p4 things are quite different. there is quite a boost, but not in kuklomenos. i wonder why this is so slow. i suspect p96 is quite slow at drawing lines.

If you look at the speed results at http://www.amigaspeed.de.vu/, you will see that the Voodoo 3 is almost 10 times faster at 2D line drawing than the Picasso 4 with CGFX 4. It also looks like P96 is a little faster than CGFX for line drawing with the Voodoo 3+ at least. I would expect the Picasso 4 driver to be good with P96 as well. The Picasso 4 is slow compared to Voodoo 3+ except where the gfx bus speed matters (bitmaps).

Quote
if i understood you correctly, matt, cgx or p96 is also used to draw lines in 3d. is this correct? and this too is explicitly slow in w3d.

No. I don't think so. Sorry if I misled you. The Avenger libraries do call the appropriate CGFX or P96 Warp3D libraries, which call the appropriate CGFX or P96 functions, but these shouldn't be used for 3D lines. 3D lines need the Z value and are affected by the Z buffer. They are actually drawn as triangles, as the Avenger has no support for 3D lines (or points). The Avenger does have support for 2D line drawing that is very fast in comparison. It would be very inefficient to use W3D_DrawLine() or similar to draw 2D lines.

Quote
apart from that i now also get a hit on startup with the picassoIV system. but that might be due to the corrupted filesystem. i have to bring that in order first. here is a log, i don't think it will indicate anything without the sources though.

I doubt it is the filesystem. It looks like a NULL pointer that is not tested. Let me translate it to something you might be able to read...

...
OpenLibrary (libName="expansion.library", version=0)
configDev = FindConfigDev (oldConfigDev=0, manufacturer=-1, product=$60)
tmp = configDev->cd_BoardAddr
globalvar1 = tmp
globalvar2 = tmp
...

See, configDev was never tested for NULL before it was used (neither was the OpenLibrary return). The cd_BoardAddr offset is $20 from 0, which explains the read from address $20. I would say that this code is a patch that was added by an assembler programmer at some point after the library was compiled. The biggest hint is that the patch does some self-modifying code and then calls exec/CacheClearU(). This is not normally done in C. I would suggest reverting to the original rtg.library 40.3994 (08/22/04) 217988 bytes. Actually, this is the version that I have been using all along without problems. It does not contain this hackish patch.
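
A minimal C sketch of the missing checks, for illustration only. It does not use the real AmigaOS headers: MockConfigDev, find_none, find_one and get_board_addr are made-up stand-ins, and the field sizes in the mock are an assumption based on the classic 68k layout (a 14-byte Node, two flag bytes and a 16-byte ExpansionRom in front of cd_BoardAddr). That layout puts cd_BoardAddr at offset $20, so dereferencing a NULL result reads address $20, which is the hit in the log above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Mock of the start of struct ConfigDev (ASSUMED 68k field sizes). */
struct MockConfigDev {
    uint8_t  cd_Node[14];   /* struct Node */
    uint8_t  cd_Flags;
    uint8_t  cd_Pad;
    uint8_t  cd_Rom[16];    /* struct ExpansionRom */
    uint32_t cd_BoardAddr;  /* 32-bit board base address, offset $20 */
};

/* Hypothetical lookup standing in for FindConfigDev(). */
typedef struct MockConfigDev *(*FindFn)(int manufacturer, int product);

static struct MockConfigDev demo_board;

/* Simulates a machine without the board: the lookup returns NULL. */
struct MockConfigDev *find_none(int manufacturer, int product)
{
    (void)manufacturer; (void)product;
    return NULL;
}

/* Simulates a machine with the board at an arbitrary demo address. */
struct MockConfigDev *find_one(int manufacturer, int product)
{
    (void)manufacturer; (void)product;
    demo_board.cd_BoardAddr = 0x40000000u;
    return &demo_board;
}

/* Returns 1 and stores the board address, or 0 if anything is missing. */
int get_board_addr(FindFn find, uint32_t *out)
{
    struct MockConfigDev *cd;

    assert(out != NULL);
    if (find == NULL)        /* stands for a failed OpenLibrary() */
        return 0;
    cd = find(-1, 0x60);     /* manufacturer=-1, product=$60 as in the log */
    if (cd == NULL)          /* the check the patch leaves out */
        return 0;
    *out = cd->cd_BoardAddr;
    return 1;
}
```

With find_none the function returns 0 and leaves the output untouched instead of reading through a NULL pointer; with find_one it returns 1 and the demo address.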
« Last Edit: December 20, 2010, 04:02:50 AM by matthey »
 

Offline ChaosLord

  • Hero Member
  • *****
  • Join Date: Nov 2003
  • Posts: 2608
    • http://totalchaoseng.dbv.pl/news.php
Re: p96 is unbelievably Slow!
« Reply #29 from previous page: December 20, 2010, 05:50:20 AM »
MattHey FTW!
Wanna try a wonderfull strategy game with lots of handdrawn anims,
Magic Spells and Monsters, Incredible playability and lastability,
English speech, etc. Total Chaos AGA