Amiga.org

Operating System Specific Discussions => Amiga OS => Amiga OS -- Development => Topic started by: Karlos on August 31, 2004, 12:50:43 AM

Title: Got a PPC card? I need you!
Post by: Karlos on August 31, 2004, 12:50:43 AM
Hi folks,

I need to gather some performance information on the memory bandwidth available to classic systems using PowerPC cards.

I have written a small test app, available here (http://www.nyteshade.com/karlos/util/memspeedwos.lzx).

It requires:

AmigaOS 3.x (3.0 untested) / WarpOS 4+
Graphics card capable of supporting a 16-bit workbench
About 1.5MB fast RAM. 256KB chip RAM

It will open a 640x480 window on your 16/24/32-bit workbench (which it uses to obtain a valid VRAM offscreen surface for testing) - hence the RTG requirements.

If you can spare 5 minutes, please download it and run it. It will produce a listing like this one (from my system) that you should redirect to a file:

Code:

Memory/Bus bandwidth estimation (WarpOS/PPC) (c) Karl Churchill 2004

System Info

CPU : 603e [PVR:0x00070201]
CPU : 240.000 MHz
FSB :  60.000 MHz

Estimating VRAM access bandwidth

Read  VRAM (64-bit) : 5357.07 KB/s
Read  VRAM (32-bit) : 5167.71 KB/s
Read  VRAM (16-bit) : 2568.80 KB/s
Read  VRAM  (8-bit) : 1297.22 KB/s
Write VRAM (64-bit) : 15212.21 KB/s
Write VRAM (32-bit) : 14341.32 KB/s
Write VRAM (16-bit) : 7136.84 KB/s
Write VRAM  (8-bit) : 3573.41 KB/s

Estimating fast RAM access bandwidth [normal]

Read  FAST  (64-bit) : 70531.72 KB/s
Read  FAST  (32-bit) : 65485.35 KB/s
Read  FAST  (16-bit) : 59806.60 KB/s
Read  FAST   (8-bit) : 50414.33 KB/s
Write FAST  (64-bit) : 40211.58 KB/s
Write FAST  (32-bit) : 40062.42 KB/s
Write FAST  (16-bit) : 40174.10 KB/s
Write FAST   (8-bit) : 40063.94 KB/s

Estimating fast RAM access bandwidth [writethrough]

Read  FAST  (64-bit) : 70779.57 KB/s
Read  FAST  (32-bit) : 65424.45 KB/s
Read  FAST  (16-bit) : 59734.12 KB/s
Read  FAST   (8-bit) : 50471.28 KB/s
Write FAST  (64-bit) : 44196.65 KB/s
Write FAST  (32-bit) : 27602.69 KB/s
Write FAST  (16-bit) : 13950.28 KB/s
Write FAST   (8-bit) : 7001.60 KB/s

Estimating chip RAM access bandwidth

Read  CHIP  (64-bit) : 4280.86 KB/s
Read  CHIP  (32-bit) : 4251.75 KB/s
Read  CHIP  (16-bit) : 4238.68 KB/s
Read  CHIP   (8-bit) : 4254.50 KB/s
Write CHIP  (64-bit) : 2680.55 KB/s
Write CHIP  (32-bit) : 2695.44 KB/s
Write CHIP  (16-bit) : 2699.28 KB/s
Write CHIP   (8-bit) : 2680.32 KB/s

Estimating fast RAM [normal] to fast RAM [normal] bandwidth

FAST -> FAST (64-bit) : 39994.81 KB/s
FAST -> FAST (32-bit) : 40019.66 KB/s
FAST -> FAST (16-bit) : 39952.29 KB/s
FAST -> FAST  (8-bit) : 31528.36 KB/s

Estimating fast RAM [normal] to fast RAM [writethrough] bandwidth

FAST -> FAST (64-bit) : 27710.04 KB/s
FAST -> FAST (32-bit) : 20138.16 KB/s
FAST -> FAST (16-bit) : 11738.51 KB/s
FAST -> FAST  (8-bit) : 6379.13 KB/s

Estimating fast RAM [normal] to VRAM copy bandwidth

FAST -> VRAM (64-bit) : 12474.13 KB/s
FAST -> VRAM (32-bit) : 11891.99 KB/s
FAST -> VRAM (16-bit) : 6467.78 KB/s
FAST -> VRAM  (8-bit) : 3399.76 KB/s

Estimating VRAM to RAM [normal] copy bandwidth

VRAM -> FAST (64-bit) : 4739.20 KB/s
VRAM -> FAST (32-bit) : 4593.28 KB/s
VRAM -> FAST (16-bit) : 2426.28 KB/s
VRAM -> FAST  (8-bit) : 1257.86 KB/s

Estimating VRAM to RAM [writethrough] copy bandwidth

VRAM -> RAM (64-bit) : 4851.96 KB/s
VRAM -> RAM (32-bit) : 4461.07 KB/s
VRAM -> RAM (16-bit) : 2251.81 KB/s
VRAM -> RAM  (8-bit) : 1119.06 KB/s

Estimating FAST [normal] to CHIP copy bandwidth

FAST -> CHIP (64-bit) : 2550.21 KB/s
FAST -> CHIP (32-bit) : 2548.12 KB/s
FAST -> CHIP (16-bit) : 2546.74 KB/s
FAST -> CHIP  (8-bit) : 2530.97 KB/s

Estimating CHIP to RAM [normal] copy bandwidth

CHIP -> FAST (64-bit) : 3883.21 KB/s
CHIP -> FAST (32-bit) : 3901.75 KB/s
CHIP -> FAST (16-bit) : 3870.20 KB/s
CHIP -> FAST  (8-bit) : 3760.80 KB/s

Estimating CHIP to RAM [writethrough] copy bandwidth

CHIP -> FAST (64-bit) : 3911.78 KB/s
CHIP -> FAST (32-bit) : 3734.43 KB/s
CHIP -> FAST (16-bit) : 3285.09 KB/s
CHIP -> FAST  (8-bit) : 2632.86 KB/s


Usual disclaimer : program is a strain test and may contain bugs, run it at your own risk.

That said, I ran it dozens of times on my machine during development with no ill effects ;-)

Please email any results to karlchurcill@gmail.com

(yes, I misspelled my own surname when registering)

Thanks in advance,

K
Title: Re: Got a PPC card? I need you!
Post by: redrumloa on August 31, 2004, 12:54:34 AM
Did you get enough info from me, or do you want my slightly unfair results?  :-D
Title: Re: Got a PPC card? I need you!
Post by: Karlos on August 31, 2004, 12:57:10 AM
Oops,

Forgot to add, please give a little info on your system:

Model Amiga (A1200, A4K, A3K etc)
Model Accelerator & 680x0 CPU
Memory capacity and speed (eg 60ns, 70ns etc)
Graphics card & interface used (local bus, mediator etc)

AmigaOS / WarpOS / CGX / P96 Versions

Number of fillings, name of dentist :lol:

Thanks :-)
Title: Re: Got a PPC card? I need you!
Post by: Karlos on August 31, 2004, 01:01:42 AM
Quote

redrumloa wrote:
Did you get enough info from me, or do you want my slightly unfair results?  :-D


:-D

All information is useful.

BTW I emailed you something (nicknamed "relativity") recently but don't know if you got it :-/

I'd **really** like to see the results of that on your system if you can find a moment to run it :-D
Title: Re: Got a PPC card? I need you!
Post by: Dragster on August 31, 2004, 04:52:50 AM
Hi!

Results sent!

Cheers

Dragster
Title: Re: Got a PPC card? I need you!
Post by: Brian Hoskins on August 31, 2004, 08:25:08 AM
I got the following error when I attempted to run the program:

-------
10.System39:> Memory/Bus bandwidth estimation (WarpOS/PPC) (c) Karl Churchill 2004

Error creating context: 8
Couldn't create a context
-------

Brian
Title: Re: Got a PPC card? I need you!
Post by: Roj on August 31, 2004, 08:42:34 AM
I got the same thing when I tried running it under P96. It worked properly with CGX.
Title: Re: Got a PPC card? I need you!
Post by: Piru on August 31, 2004, 09:14:00 AM
Just out of curiosity I ran the test on my Peg2 7447 1GHz + 1 GB DDR + Radeon 9200SE. Works fine with MorphOS WOS emulation, and the results are, to put it mildly, quite crushing... :-)
Title: Re: Got a PPC card? I need you!
Post by: poweramiga2002 on August 31, 2004, 09:30:46 AM
out of curiosity i tried to run on my A1 os4 pre-release
but no luck
Title: Re: Got a PPC card? I need you!
Post by: Karlos on August 31, 2004, 09:43:12 AM
Quote

BrianJHoskins wrote:
I got the following error when I attempted to run the program:

-------
10.System39:> Memory/Bus bandwidth estimation (WarpOS/PPC) (c) Karl Churchill 2004

Error creating context: 8
Couldn't create a context
-------

Brian


Ahh. Error 8 implies it couldn't open a library - I forgot to check the CGX version required and blithely used 42 (this has caught me out before too!). It only needs some basic v3 functionality (locking bitmaps, querying them etc).

I've recompiled and uploaded. This one should work under P96.
Title: Re: Got a PPC card? I need you!
Post by: Karlos on August 31, 2004, 09:51:34 AM
Quote

Piru wrote:
Just out of curiosity I ran the test on my Peg2 7447 1GHz + 1 GB DDR + Radeon 9200SE. Works fine with MorphOS WOS emulation, and the results are, to put it mildly, quite crushing... :-)


Interesting :-)

I couldn't help noticing your normal and writethrough timings were the same. Which made me think...

I don't expect that these results are particularly accurate depending on how big your L2 cache is :-D

The program was aimed at classics and as such it allocates:

1 bitmap 640x480 (hardware rounded) at 16 bit (or higher depending on WB depth). That's 600K VRAM (more for 24/32 bit of course).

1 fast ram area of 512K, normal cache settings
1 fast ram area of 512K, writethrough cache settings
1 chip ram area of 256K (not cacheable on classic)

IIRC, the G4 has sufficient L2 cache (512K) to completely invalidate this test.
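
The sizes above can be checked with a little arithmetic. The following is only a sketch in portable C, not code from the test program: `surface_bytes` and `fits_in_cache` are hypothetical helper names, and the calculation ignores the hardware rounding of the bitmap mentioned above.

```c
#include <assert.h>
#include <stddef.h>

/* Bytes needed for a width x height surface at the given bit depth.
   Assumption: no row padding, so this is a lower bound on real VRAM use. */
static size_t surface_bytes(size_t width, size_t height, size_t bits_per_pixel)
{
    assert(bits_per_pixel % 8 == 0);
    return width * height * (bits_per_pixel / 8);
}

/* Non-zero when a test buffer can sit entirely inside a cache of the
   given size, in which case the test measures the cache, not the RAM. */
static int fits_in_cache(size_t buffer_bytes, size_t cache_bytes)
{
    return buffer_bytes <= cache_bytes;
}
```

With these, `surface_bytes(640, 480, 16)` gives 614400 bytes, which is the 600K quoted above, and `fits_in_cache(512 * 1024, 512 * 1024)` is true, which is the problem with a 512K buffer on a CPU with 512K of L2.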

I can recompile a version that opens a much larger window and allocates larger fast ram areas (better still, make it user definable!), which should give more control over these things.

@poweramiga2002

There's no WOS emulation in OS4 at the moment. I plan to make an OS4 native version anyway :-)
Title: Re: Got a PPC card? I need you!
Post by: Piru on August 31, 2004, 09:56:47 AM
@Karlos

Yeah, 7447 has 512KB L2 cache, so the fastmem buffer seems to fit it completely. Ah, regarding writethrough, you use the special WOS memory allocation flag for that? If so, I think wosemu just ignores the flag for now...

VRAM is noncacheable always, so buffersize doesn't really matter (other than timing accuracy).
Title: Re: Got a PPC card? I need you!
Post by: Karlos on August 31, 2004, 11:32:18 AM
@Piru

I use an accumulation based benchmark, where I time a variable number of iterations until a total time (in this case, 2 seconds) is exceeded (the actual time is recorded, not assumed to be 2 seconds of course):

The pseudocode is

Code:

iterations = 0;
totaltime = 0;
while (totaltime < sampletime)
{
    lock hardware;
    disable task switching & interrupts;

    reset timer;
    perform operation;
    get elapsed time;

    enable task switching & interrupts;
    unlock hardware;
    totaltime = totaltime + elapsed time;
    iterations = iterations + 1;
}


The result is then (iterations * datasize) / totaltime

I find that this gives the most reliable overall results compared to fixed iteration/variable time (which obviously gets less accurate for faster systems).
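
For anyone who wants to try the same idea elsewhere, here is a minimal sketch of the accumulation loop in portable C. It is not the WarpOS source: `clock_gettime()` stands in for the timer, `memcpy()` stands in for the operation under test, the names `now_seconds`, `copy_op` and `measure_kb_per_sec` are made up for illustration, and 1 KB is assumed to mean 1024 bytes.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <time.h>

/* Monotonic time in seconds (POSIX stand-in for the real timer call). */
static double now_seconds(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec + (double)ts.tv_nsec / 1e9;
}

/* Hypothetical operation under test: a plain copy of `size` bytes. */
static void copy_op(void *dst, const void *src, size_t size)
{
    memcpy(dst, src, size);
}

/* Repeat the operation until the accumulated time spent in the operation
   itself reaches sample_time, then return the bandwidth in KB/s.
   Loop bookkeeping is outside the timed region. */
static double measure_kb_per_sec(void *dst, const void *src, size_t size,
                                 double sample_time)
{
    double total_time = 0.0;
    unsigned long iterations = 0;

    assert(size > 0 && sample_time > 0.0);

    while (total_time < sample_time) {
        /* A real test would lock the hardware and disable task
           switching and interrupts at this point. */
        double start = now_seconds();
        copy_op(dst, src, size);
        total_time += now_seconds() - start;
        /* ...and re-enable and unlock here. */
        iterations++;
    }

    /* (iterations * datasize) / totaltime, using the recorded time. */
    return ((double)iterations * (double)size / 1024.0) / total_time;
}
```

Calling `measure_kb_per_sec(dst, src, size, 2.0)` would correspond to the 2 second sample time described above. Because the recorded total is used rather than the nominal sample time, overshooting on the last iteration does not skew the figure.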

The actual tests (read/write/copy) are written in PPC asm using a Duff's Device (my favourite) style unrolled loop. They should be about as fast as possible.
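
The real loops are PPC asm, so the following is only an illustration of the Duff's Device idea in C, assuming a 32-bit copy unrolled eight times. The function name `duff_copy32` is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Copy `count` 32-bit words with an 8x unrolled loop. The switch jumps
   into the middle of the loop body to handle the remainder first, after
   which every pass copies a full eight words. */
static void duff_copy32(uint32_t *dst, const uint32_t *src, size_t count)
{
    size_t passes;

    if (count == 0)
        return;
    assert(dst != NULL && src != NULL);

    passes = (count + 7) / 8;   /* trips through the do/while */
    switch (count % 8) {        /* entry point for the remainder */
    case 0: do { *dst++ = *src++;
    case 7:      *dst++ = *src++;
    case 6:      *dst++ = *src++;
    case 5:      *dst++ = *src++;
    case 4:      *dst++ = *src++;
    case 3:      *dst++ = *src++;
    case 2:      *dst++ = *src++;
    case 1:      *dst++ = *src++;
            } while (--passes > 0);
    }
}
```

The point of the construct is that there is one loop test per eight transfers and no separate clean-up loop for the odd words at the end.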

On classic, the tests are also run with 680x0 task switching and interrupts disabled (just for each iteration, not the complete time). The time recorded (using GetSysTimePPC()) is just for the operation itself, hence context switch timings are hopefully irrelevant.

The writethrough memory is obtained using MEMF_WRITETHROUGH for AllocVecPPC().

I'll knock up a version that should more fairly strain a Pegasos.
Title: Re: Got a PPC card? I need you!
Post by: Karlos on August 31, 2004, 05:09:39 PM
:bump:

I know there's still more PPC users out there :-D
Title: Re: Got a PPC card? I need you!
Post by: rayt on September 01, 2004, 12:29:45 AM
I have just downloaded it and tried it out but the "error 8" problem with p96 seems to be still there.
Title: Re: Got a PPC card? I need you!
Post by: Karlos on September 01, 2004, 11:19:03 AM
Strange :-/

I'll have another look tonight. Probably I need to lower the cybergraphics.library version again. Or better still, make it prefer picasso96 directly if it finds it. I plan to add some more tests to it (scattered memory read/write for one), and change the test to allocate as large as possible memory buffers. This will allow the Peg users who tested it to get a fairer estimate of their memory speed (and not just their L2 caches ;-) ) as well as paving the way for an OS4 native version, which would have the same L2 cache issues on an A1.


Anyway, I'd like to say thanks to everybody so far for their time.
Title: Re: Got a PPC card? I need you!
Post by: PiR on September 01, 2004, 11:25:23 AM
Don't shout, don't shout, I'm coming.

Cannot open intuition.library ver.40
My hardware: PPC603@210. ;-)
Title: Re: Got a PPC card? I need you!
Post by: Karlos on September 01, 2004, 12:24:36 PM
Ah, b*llocks. I must have uploaded the original version twice instead of the recompiled library version fix :lol:

There is no other reason for it wanting to open v40 intuition, the last build *definitely* requests v39.

I'll re-upload it tonight (complete with some sort of large buffer allocation for those with plenty of RAM and/or Peg I/II).
Title: Re: Got a PPC card? I need you!
Post by: Karlos on September 02, 2004, 11:06:34 AM
Hi again,

I've uploaded the (hopefully) P96 friendly version (uses CGX3 only). It might also work on OS3.0 (I lowered the graphics/intuition requirements to v39).

Didn't yet write a large memory version so anybody wanting to measure their L2 performance on a G4 peg can still do so :-D

I will fix that when I get a moment.
Title: Re: Got a PPC card? I need you!
Post by: poweramiga2002 on September 02, 2004, 12:34:36 PM
hows the OS4 version going ?
Title: Re: Got a PPC card? I need you!
Post by: Karlos on September 02, 2004, 01:09:50 PM
Quote

poweramiga2002 wrote:
hows the OS4 version going ?


When it's done :-D

Seriously, the code needs updating a little before that. Benchmarking for A1 (and Peg) has to account for the large caches they have.

I was primarily interested only in the classic performance, but I will have to ask an A1 owner I know if I can write it on his since my own OS4 installation is a tad bit ropey at present :-(
Title: Re: Got a PPC card? I need you!
Post by: rayt on September 02, 2004, 07:49:27 PM
Quote
I've uploaded the (hopefully) P96 friendly version (uses CGX3 only). It might also work on OS3.0 (I lowered the graphics/intuition requirements to v39).


Ok now it works on my system with P96, have just sent you the results.
Title: Re: Got a PPC card? I need you!
Post by: Karlos on September 13, 2004, 10:29:50 PM
@all

Still no OS4 native version :-( but I did update the code in preparation for a port.

It now allocates up to 32MB buffers for testing (to mitigate the effects of L2 caches) and can be given width/height parameters on the command line to open a larger window (default is still 640x480) to get a larger VRAM surface (or indeed smaller for people using 4MB cards).

usage: test width height
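
As a rough sketch of how such a command line might be handled in C (this is not the program's actual parsing code; `struct window_size` and `parse_window_size` are hypothetical names), the arguments fall back to 640x480 when omitted:

```c
#include <assert.h>
#include <stdlib.h>

struct window_size {
    unsigned long width;
    unsigned long height;
};

/* Fill *out from "test width height". Returns 1 on success and 0 if the
   arguments are malformed. With no arguments the 640x480 default is kept. */
static int parse_window_size(int argc, char **argv, struct window_size *out)
{
    char *end;

    assert(out != NULL);
    out->width = 640;
    out->height = 480;

    if (argc == 1)
        return 1;               /* no arguments: use the defaults */
    if (argc != 3)
        return 0;               /* need both width and height */

    out->width = strtoul(argv[1], &end, 10);
    if (end == argv[1] || *end != '\0' || out->width == 0)
        return 0;

    out->height = strtoul(argv[2], &end, 10);
    if (end == argv[2] || *end != '\0' || out->height == 0)
        return 0;

    return 1;
}
```

For example, `test 1024 768` would yield a 1024x768 window, and a smaller size such as `test 320 240` would suit a card short on VRAM.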

Would those Pegasos users who ran the original consider running the updated version here (http://www.nyteshade.com/karlos/util/memspeedwos.lzx) ?

Thanks :-)

The previous version reported totally unrealistic memory speeds on systems with 512K L2 cache since the test buffer was only 512K ;-)
Title: Re: Got a PPC card? I need you!
Post by: itix on September 13, 2004, 10:50:19 PM
Results sent :)
Title: Re: Got a PPC card? I need you!
Post by: Karlos on September 14, 2004, 12:43:57 AM
Quote

itix wrote:
Results sent :)


Cheers :-)
Title: Re: Got a PPC card? I need you!
Post by: Acill on September 14, 2004, 12:49:57 AM
I just emailed you my results as well from my Pegasos II G4 system.

EDIT: I guess your mail didn't like it, so here are the results:

Memory/Bus bandwidth estimation (WarpOS/PPC) (c) Karl Churchill 2004

System Info

CPU : 7447/7457 (G4) [PVR:0x80020101]
CPU : 1000.000 MHz
FSB : 133.333 MHz

Fast [cache normal] allocated : 33554432 bytes at 0x214AA960
VRAM [noncacheable] allocated :  1228800 bytes at 0xE8985C00

System running MorphOS WarpUP emulation

Estimating VRAM access bandwidth

Read  VRAM (64-bit) : 28914.40 KB/s
Read  VRAM (32-bit) : 16266.58 KB/s
Read  VRAM (16-bit) : 3615.66 KB/s
Read  VRAM  (8-bit) : 1414.92 KB/s
Write VRAM (64-bit) : 220626.04 KB/s
Write VRAM (32-bit) : 131799.60 KB/s
Write VRAM (16-bit) : 34544.66 KB/s
Write VRAM  (8-bit) : 16834.74 KB/s

Estimating fast RAM access bandwidth [normal]

Read  FAST  (64-bit) : 223107.84 KB/s
Read  FAST  (32-bit) : 223039.96 KB/s
Read  FAST  (16-bit) : 222790.42 KB/s
Read  FAST   (8-bit) : 200725.95 KB/s
Write FAST  (64-bit) : 446959.69 KB/s
Write FAST  (32-bit) : 446791.03 KB/s
Write FAST  (16-bit) : 442847.01 KB/s
Write FAST   (8-bit) : 225898.79 KB/s

Estimating fast RAM [normal] to fast RAM [normal] bandwidth

FAST -> FAST (64-bit) : 177491.57 KB/s
FAST -> FAST (32-bit) : 168491.36 KB/s
FAST -> FAST (16-bit) : 153469.40 KB/s
FAST -> FAST  (8-bit) : 121654.90 KB/s

Estimating fast RAM [normal] to VRAM copy bandwidth

FAST -> VRAM (64-bit) : 159743.28 KB/s
FAST -> VRAM (32-bit) : 107766.68 KB/s
FAST -> VRAM (16-bit) : 34014.99 KB/s
FAST -> VRAM  (8-bit) : 16397.64 KB/s

Estimating VRAM to RAM [normal] copy bandwidth

VRAM -> FAST (64-bit) : 28514.57 KB/s
VRAM -> FAST (32-bit) : 16066.22 KB/s
VRAM -> FAST (16-bit) : 3614.28 KB/s
VRAM -> FAST  (8-bit) : 1414.71 KB/s
Title: Re: Got a PPC card? I need you!
Post by: Karlos on September 14, 2004, 01:12:36 AM
Maybe you spelled my name right when you emailed me. Alas I didn't spell it right when I registered it :lol:

The G4 figures look far more realistic now :-)
Title: Re: Got a PPC card? I need you!
Post by: JKD on September 14, 2004, 07:42:53 AM
Same systems as Acill's except I have a Voodoo3 3500 AGP as opposed to
his Radeon 7000 (?) AGP:

Memory/Bus bandwidth estimation (WarpOS/PPC) (c) Karl Churchill 2004

System Info

CPU : 7447/7457 (G4) [PVR:0x80020101]
CPU : 1000.000 MHz
FSB : 133.333 MHz

Fast [cache normal] allocated : 33554432 bytes at 0x21437720
VRAM [noncacheable] allocated :   614400 bytes at 0xEC5AA970

System running MorphOS WarpUP emulation

Estimating VRAM access bandwidth

Read  VRAM (64-bit) : 17497.09 KB/s
Read  VRAM (32-bit) : 13054.57 KB/s
Read  VRAM (16-bit) : 7258.11 KB/s
Read  VRAM  (8-bit) : 3831.75 KB/s
Write VRAM (64-bit) : 224737.84 KB/s
Write VRAM (32-bit) : 140810.40 KB/s
Write VRAM (16-bit) : 39660.33 KB/s
Write VRAM  (8-bit) : 19243.93 KB/s

Estimating fast RAM access bandwidth [normal]

Read  FAST  (64-bit) : 222906.23 KB/s
Read  FAST  (32-bit) : 222879.27 KB/s
Read  FAST  (16-bit) : 222625.05 KB/s
Read  FAST   (8-bit) : 200591.57 KB/s
Write FAST  (64-bit) : 447184.60 KB/s
Write FAST  (32-bit) : 447251.85 KB/s
Write FAST  (16-bit) : 443442.91 KB/s
Write FAST   (8-bit) : 226074.08 KB/s

Estimating fast RAM [normal] to fast RAM [normal] bandwidth

FAST -> FAST (64-bit) : 177374.39 KB/s
FAST -> FAST (32-bit) : 168457.04 KB/s
FAST -> FAST (16-bit) : 153520.09 KB/s
FAST -> FAST  (8-bit) : 121715.30 KB/s

Estimating fast RAM [normal] to VRAM copy bandwidth

FAST -> VRAM (64-bit) : 172023.31 KB/s
FAST -> VRAM (32-bit) : 119516.86 KB/s
FAST -> VRAM (16-bit) : 39370.00 KB/s
FAST -> VRAM  (8-bit) : 19291.89 KB/s

Estimating VRAM to RAM [normal] copy bandwidth

VRAM -> FAST (64-bit) : 17421.77 KB/s
VRAM -> FAST (32-bit) : 13017.88 KB/s
VRAM -> FAST (16-bit) : 7222.08 KB/s
VRAM -> FAST  (8-bit) : 3826.98 KB/s
Title: Re: Got a PPC card? I need you!
Post by: Karlos on September 15, 2004, 12:21:18 PM
Hi,

Does anybody know if the April fix has any effect on the memory write speed for Peg1 ?

The only peg1 results I have seen so far are for a non-april version (G3 600MHz / 100MHz FSB) and I was extremely surprised that the write times were much slower than read times. Normally you'd expect similar (or better) write performance.

In fact, the write performance, 80MB/s to normal copyback memory, was only fractionally higher than a CSPPC (604e 266MHz / 66MHz FSB) at 72MB/s, which would actually put the CSPPC in front MHz for MHz (for FSB speeds) :-?
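
To put numbers on the MHz for MHz comparison, here is a small C sketch that divides each write figure by its FSB clock. The helper name `mb_per_fsb_mhz` is made up for illustration; the inputs are the figures quoted above.

```c
#include <assert.h>

/* Write bandwidth in MB/s normalised by the front side bus clock in MHz. */
static double mb_per_fsb_mhz(double mb_per_sec, double fsb_mhz)
{
    assert(fsb_mhz > 0.0);
    return mb_per_sec / fsb_mhz;
}
```

`mb_per_fsb_mhz(80.0, 100.0)` comes to 0.8 MB/s per MHz for the Peg1, while `mb_per_fsb_mhz(72.0, 66.0)` comes to roughly 1.09 MB/s per MHz for the CSPPC.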