Author Topic: Curse of the SDL  (Read 24821 times)

Offline Crumb

  • Hero Member
  • *****
  • Join Date: Mar 2002
  • Posts: 1786
  • Country: 00
    • http://cuaz.sourceforge.net
Re: Curse of the SDL
« on: August 03, 2011, 02:25:40 PM »
@utri007

It's not just a matter of having a decent SDL library port; SDL games seem to be pretty unoptimized too.

A proper ScummVM port that uses neither SDL nor ixemul would be a nice thing to have. Perhaps the graphics could be converted to planar on first run and stored in a "cache" directory, so chunky graphics could be avoided too. Using Paula sound instead of AHI would also be a nice addition.
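To make the "convert to planar once and cache it" idea concrete, here is a minimal, unoptimized sketch of a chunky-to-planar conversion in portable C. The function name `chunky_to_planar` and its parameters are made up for illustration; nothing here comes from the ScummVM sources, and a real Amiga c2p routine would be hand-tuned 68k code rather than a per-pixel loop.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: split 8-bit chunky pixels into `depth` bitplanes.
 * width must be a multiple of 8. Each planes[p] must point to
 * (width / 8) * height bytes. The leftmost pixel of each group of 8
 * ends up in the most significant bit, as on Amiga bitplanes. */
void chunky_to_planar(const uint8_t *chunky, uint8_t *planes[],
                      int depth, size_t width, size_t height)
{
    assert(width % 8 == 0 && depth >= 1 && depth <= 8);
    size_t row_bytes = width / 8;

    for (int p = 0; p < depth; p++)
        memset(planes[p], 0, row_bytes * height);

    for (size_t y = 0; y < height; y++) {
        for (size_t x = 0; x < width; x++) {
            uint8_t pixel = chunky[y * width + x];
            size_t byte = y * row_bytes + x / 8;
            uint8_t bit = (uint8_t)(0x80u >> (x % 8));
            /* Bit p of the pixel value goes to bitplane p. */
            for (int p = 0; p < depth; p++)
                if (pixel & (1u << p))
                    planes[p][byte] |= bit;
        }
    }
}
```

Run once per image at install or first start, the output of something like this could be written to the cache directory so that no conversion is needed while the game is running.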

Rewriting some parts may also be required to get decent performance: at first look ScummVM seems to lack functions to load a big bitmap into gfx RAM and scroll the viewport by simply changing a pointer (something like ScrollVPort()). Instead it uses the usual peecee brute-force approach of copying the entire screen to the gfx buffer each frame.
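The difference between the two scrolling approaches can be sketched in portable C. This is only a model: the `Viewport` struct and both functions are hypothetical names, and on a real Amiga the pointer change would be done by the display hardware through something like ScrollVPort() rather than by returning an address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of a wide background kept resident in graphics memory. */
typedef struct {
    const uint8_t *pixels;  /* whole background, 8 bits per pixel */
    size_t width, height;   /* background size in pixels */
    size_t view_w, view_h;  /* size of the visible window */
    size_t scroll_x;        /* current horizontal scroll position */
} Viewport;

/* Pointer-style scroll: only the start address changes, nothing is copied.
 * Rows of the returned view are still vp->width bytes apart. */
const uint8_t *viewport_origin(Viewport *vp, size_t scroll_x)
{
    assert(scroll_x + vp->view_w <= vp->width);
    vp->scroll_x = scroll_x;
    return vp->pixels + scroll_x;
}

/* Brute-force scroll: copies the whole visible area into the screen
 * buffer and returns the number of bytes moved for that one frame. */
size_t bruteforce_scroll(const Viewport *vp, size_t scroll_x, uint8_t *screen)
{
    assert(scroll_x + vp->view_w <= vp->width);
    for (size_t y = 0; y < vp->view_h; y++)
        memcpy(screen + y * vp->view_w,
               vp->pixels + y * vp->width + scroll_x, vp->view_w);
    return vp->view_w * vp->view_h;
}
```

For a 320x200 8-bit view the brute-force path moves 64000 bytes every frame, while the pointer path moves none.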

ScummVM also seems unable to keep gfx in gfx RAM and draw them on screen using some kind of blitter. For modern games the dirty-box approach it uses is probably better (for fast CPUs without gfx acceleration), but in old games like Monkey Island 1 or 2 you don't move hundreds of objects simultaneously, and using our dear blitter to draw the graphics seems more interesting, especially taking into account that you won't be throwing 50 different frames per second at it and could keep the most used images cached in chip RAM.

PS: please note I have not studied the ScummVM sources much; these are just my first impressions from looking at them. You could claim ScummVM is multiplatform, but if it had been designed from scratch with something other than a 64KB VGA framebuffer in mind it would have been way faster on decent hardware (like our miggies). It's funny that you need a fast CPU to play games like Monkey Island that looked, sounded and played perfectly on unaccelerated classics.
The only Spanish Amiga news web page/club: Club de Usuarios de Amiga de Zaragoza (CUAZ)
 

Offline Crumb

Re: Curse of the SDL
« Reply #1 on: August 03, 2011, 03:24:57 PM »
Quote from: yakumo9275;652818
You know you can write your own backends for ScummVM that don't require SDL... there are even wiki pages for ScummVM devs here. I was a ScummVM dev for a while, and making ScummVM so portable requires different kinds of tradeoffs.


First of all, thank you for contributing to ScummVM, it's a nice piece of software. Nice to see there's somebody from the ScummVM team here :-)

But I have to add that IMHO the scroll functions could have been written so that platforms without hardware scrolling use a software backend while platforms with hardware scrolling use the hardware. With the current implementation it's always software mode.

I think the same about blitting objects located in gfx RAM. Old games only move a few sprites or bobs, so copying them to gfx RAM every frame eats too much bandwidth. I doubt that making an AGA backend would help much with this. Using the blitter would just involve it performing masking operations (as long as the animation frames fit in chip RAM).

Using planar would be useful too.

Offline Crumb

Re: Curse of the SDL
« Reply #2 on: August 05, 2011, 10:37:49 AM »
Quote from: fishy_fiz;653137
Oh, and in regards to ScummVM, the latest 68k ports don't use SDL.


Chunky2planar isn't a great idea on AGA miggies for this kind of 2D game anyway.

Quote
It's a shame I lost the WarpSDL SDK though. It's both much faster and works with AGA.


WarpSDL was quite smooth on my miggies, what a pity. Chaozer doesn't reply to any email :-(

Offline Crumb

Re: Curse of the SDL
« Reply #3 on: August 09, 2011, 11:00:31 AM »
Quote from: NovaCoder;653734
Quite a few RTG 060 users have said the AGA version of ScummVM is much faster on their RTG boxes than the SDL based RTG release so yes, I hope when I do a 'native' RTG version we'll see the speed increase a bit (it should of course run faster than the AGA version at the same resolution/color depth).

The AGA version may get a chance to catch up again with the RTG version if I can get direct-chunky working (eg Graffiti) and skip the C2P step ;)
 
I don't think the 68k SDL is a curse though, it was very helpful to me when I did my initial port (the 68k SDL source code was made open source). Most of the performance issues are probably down to the fact that it's no longer maintained.


With CGX, P96 and SDL alike you can lock the bitmap and draw directly to it; most of the time you'll get a nice speedup (although with SDL you may see slowdowns compared to CGX/P96 if you use the double-buffer functions). If you could tune the rendering to write chunks of 4 or more pixels at a time it would be even faster :-)
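As a rough illustration of writing 4 pixels per bus access, here is a sketch in portable C that fills a span of 8-bit pixels using 32-bit stores. The name `fill_span` is hypothetical, and the destination is assumed to be a pointer you already obtained by locking the bitmap through whichever API you use; the locking itself is not shown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: fill `count` 8-bit pixels with `colour`,
 * using 32-bit writes for the aligned middle part of the span. */
void fill_span(uint8_t *dst, uint8_t colour, size_t count)
{
    assert(dst != NULL);

    /* Single byte writes until the destination is 4-byte aligned. */
    while (count > 0 && ((uintptr_t)dst & 3u) != 0) {
        *dst++ = colour;
        count--;
    }

    /* Replicate the colour into all four bytes of a 32-bit word. */
    uint32_t quad = colour * 0x01010101u;
    while (count >= 4) {
        memcpy(dst, &quad, sizeof quad);  /* usually compiles to one 32-bit store */
        dst += 4;
        count -= 4;
    }

    /* Remaining 0 to 3 pixels. */
    while (count > 0) {
        *dst++ = colour;
        count--;
    }
}
```

The same pattern applies to copying sprites or spans: handle the unaligned edges byte by byte and move the bulk as longwords, so a slow graphics bus sees a quarter of the write cycles.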

Offline Crumb

Re: Curse of the SDL
« Reply #4 on: August 09, 2011, 01:13:32 PM »
Quote from: bernd_afa;653771
SDL is able to work in the same way as you described when using HWSURFACE, but then many modern games run slower.


If you use the SDL double-buffer functions, even with HWSURFACE enabled it will run slower than CGX using its double-buffer functions (at least on UAE).

Quote

Today's games use an alpha channel for all objects.
This means that before a pixel can be calculated, the pixel data needs to be read, the object data is added, and then the pixel is written.


That depends on the game generation and the coder. Reading from gfx RAM is slow even on PC, so it's better to perform the calculations in system RAM and write the pixels with their new values directly.

There are decent games like Diablo, Age of Empires or Starcraft that run happily on a 640x480 8bit screen.

Quote

But because Amiga GFX card access is so slow, and a GFX card read is even slower than a write, this slows things down a lot.


The problem is coding 2D games that use the graphics card as a big framebuffer instead of taking advantage of accelerated blitting and scrolling. And for modern games the problem is that you could be using Warp3D to move 2D images with hardware resources instead of transferring a dozen megabytes per second. Just upload your gfx to RTG RAM as textures and move them using Warp3D instead of the CPU: you'll reduce the bus bandwidth needed to move the gfx and will need a less powerful CPU.

Quote
A better test to compare a native port with SDL is maybe Quake.


What? Sorry, I strongly disagree, because Quake on classics is mostly CPU limited and its use of SDL is limited to:
a) locking the display bitmap and writing the pixels directly to gfx RAM, or...
b) performing a raw copy (just like WritePixelArray on RTG)

so there's hardly any difference from CGX code. Any difference in performance could be due to SDL handling double buffering/vsync in a different way, but that's caused by the SDL port. A better test would involve using the SDL blitting functions instead of simply copying raw data through the Amiga bus to RTG RAM.

Offline Crumb

Re: Curse of the SDL
« Reply #5 on: August 09, 2011, 08:15:47 PM »
Quote from: bernd_afa;653790

>There are decent games like Diablo, Age of Empires or Starcraft that run happily on a 640x480 8bit screen.

These are not as speed critical as an action shooter or a jump and run with scrolling.


Well, if your scrolling is not hw accelerated and blitting functions are nonexistent you'll also depend on bus bandwidth, unless you have some gfx memory to store parts like backgrounds and draw them using the blitter.

Quote

>I quite happily play 14bit multi channel ADPCM music on my A1200/030, and it sounds brilliant. And the machine has plenty of bandwidth for other tasks.

Can you post a link to such a song and player? And the 7 kHz filter of your Amiga needs to be set to off. Have you heard the music over headphones or a good stereo sound system?


Do you use the horrible audio filter? I never met anyone who liked it :-) There are plenty of players that will let you output sounds above 7 kHz (e.g. HippoPlayer). In fact many games use higher frequency sounds.

Use Paula directly at 28 kHz 8-bit instead of AHI and you'll get a nice speedup.

Quote

@Crumb
>what? sorry, I strongly disagree because Quake on classics is mostly cpu limited and its use of SDL is limited to:
>a) locking display bitmap, writing the pixels directly to gfx ram or...
>b) performing a raw copy (just like WritePixelArray on RTG)

And these copy operations are limited by GFX bus access.


On Amiga it's limited by CPU speed, not bus access. If you use AGA or a CV64 you won't see much difference. If you are talking about emulators... well, UAE's behaviour will be different from a real Amiga's.

Quote

SDL is very fast; it converts any bitmap format to the screen bitmap format.
If somebody wants to write a game for Amiga RTG, he must write code to do this. P96 or CGX functions are not so fast: I tested what happens when I use the CGX functions directly instead of the SDL blit operations, and it was slower. It seems CGX also has more calling overhead. A simple 1 pixel blit was slower in P96 too.


Are you talking about a real Amiga with real CGX/P96 drivers, or the WinUAE P96 driver that draws everything in memory and later copies the graphics to the graphics card? I mean: I don't believe blitting a rectangle to the screen with SDL on a real Amiga is faster than copying it using the real blitter with CGX functions. With a CV64/CV3D/Picasso4/AnythingPCI? I seriously doubt it. WinUAE results are FAKE because the P96 driver doesn't act like a real Amiga driver. BTW, I hope you don't mean you tried to blit a 1x1 pixel bitmap with P96; that would be useless and ridiculous... why would you want to do that? Use 8x8 or 16x16 at least.

Quote

And when you write a game that runs on RTG you of course have no copper and can't do display tricks like smooth scrolling etc.


You can use ScrollVPort() to change the base address of the screen even with RTG cards, and you can also design your game to maximize use of the RTG blitter and touch the bus as little as you can: load the most (recently) used graphics into graphics memory and use the blitter to draw backgrounds and so on.

You can also keep a copy of the graphics in fast RAM to avoid reading from the card, composing just the parts you need and updating only the parts with transparent graphics on top. If that sounds problematic you could use the MMU to mark which parts you need to transfer and which ones you don't.

If we talk about 3D games you should simply use Warp3D or OpenGL and avoid software rendering. And if we talk about games that use a 3D card, you could use it for the 2D stuff too: Kas1e and Karlos have done that, the former with his mag and the latter with his small apps.

Classic Amiga results != WinUAE results

Offline Crumb

Re: Curse of the SDL
« Reply #6 on: August 09, 2011, 08:29:13 PM »
Quote from: bernd_afa;653796
>The problem is coding 2d games using graphic card as a big framebuffer instead of taking advantage of accelerated blitting and scrolling. And the problem for modern games is that you could be using Warp3D to move 2D images using hardware resources instead of transferring one dozen of megabytes per second. Just upload your gfx to RTG ram configured as textures and move them using Warp3D instead of cpu: you'll reduce bus bandwidth required to move the gfx and will require a less powerful cpu

That's theory. Who has a fast RTG system that is able to use Warp3D?


Anyone who has a Mediator for example.

Quote

When you use SDL for 3D games then everything can be done by the GFX card. But 3D games written for SDL assume a PC running at 500 MHz or more, with a much faster GFX bus and without the Voodoo 3's 256 pixel texture size limit.


Quake 2 and Wipeout run fine on my A4000. For 2D stuff 256x256 tiles are not a problem.

Quote

wawa has worked a lot to port some usable 3D games to classic. But the 256 pixel texture limit of the Voodoo 3 was the problem, as was the limited memory compared to modern systems, because every game uses a background texture at screen resolution. This means at least 640*480.


With all due respect to wawa, recompiling PC games is not the same as developing native ones designed to run on Classic Amigas with Warp3D or MiniGL.

Quote

And lastly, very few users own a Voodoo 3. Most users have a GFX card with only 2 or 4 megabytes of RAM. That's too little.


Most Mediator/GREX/Prometheus users have a Voodoo3 with 16MB. That's enough to make good games.

Quote

Any help is welcome to make these games playable on a classic. In AmiBlitz you can write asm code, and some routines are asm optimized. But if the GFX bus cannot transfer more data, that is the main problem.


If you want good results on classics you can't simply treat them like a PC, even if you use an RTG card: you have to design your engine to copy to graphics memory as little as you can. Just load the most used graphics at the beginning and transfer only the least used and smaller ones through the bus. WinUAE probably won't give you the right impression of the real bottlenecks.

Offline Crumb

Re: Curse of the SDL
« Reply #7 on: August 11, 2011, 06:56:42 PM »
Quote from: bernd_afa;654061
I am not the author of NetSurf SDL. Artur does a lot of great work on NetSurf to add features that the NetSurf SDL version (in the main source) does not have. In the newest NetSurf Artur uses the Agar GUI for some gadgets

http://libagar.org/

I do not like the mix, as I have told Artur, because I fear some hang problems due to lost messages. I am not familiar with that.

But the SDL version is easier to update with a new core, and if the standard core does not run on Amiga, I help Artur.


I think both of you are doing a good job, but I wonder why you don't adapt the OS4 Reaction/ClassAct code, as that would probably bring a notable improvement in usability and a good speedup in the GUI too. In addition to that, I would get rid of nasty ixemul and use libc2 or libnix instead.

Uploading the sources of the dependencies to Aminet would be helpful too, I guess; perhaps that way somebody would get more interested in helping.

Offline Crumb

Re: Curse of the SDL
« Reply #8 on: August 13, 2011, 11:46:45 AM »
"I only want see if SDL is fast now.and it is fast, speed is near same as quake68k.how fast clickboom quake is and if it have some optimized 68k asm routines i dont know.sdlquake have of course no asm code."

As I told you previously, very few SDL functions are involved in SDL Quake:
-Opening the screen
-Locking the bitmap to copy raw data, or copying the raw data directly without ANY conversion involved
-Unlocking the bitmap in case you locked it
-Switching buffers
-Closing the screen

There are no blit or RGB conversion functions, so comparing CGX with SDL using Quake is both absurd and useless.

To compare CGX and SDL performance you would need to compare RGB conversion, blitting, blitting with a mask... not a plain 8-bit pixel copy to an 8-bit screen! You won't find important differences. The only ones you may find are that ClickBoom Quake may be configured to write directly to a bitmap, resulting in higher performance if you have good bandwidth to gfx RAM, while SDLQuake will probably copy the pixels from the rendering buffer to the screen bitmap. But that doesn't require a single conversion, so trying to compare SDL and CGX performance using Quake is ridiculous.

If you are so interested in comparing CGX and SDL performance, it's not hard to write a small app that performs colour conversion, blitting with different sizes, blitting with a mask... and then compare the results.
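The skeleton of such a test app could look something like the sketch below. It is plain portable C with stand-in routines, not SDL or CGX code: `BlitFn`, `blit_copy`, `blit_masked` and `time_blit` are hypothetical names, and treating colour index 0 as transparent is an assumption. For a real comparison you would wrap the SDL and CGX calls you want to measure in functions with this signature and run them on the real hardware.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <time.h>

/* Common signature so every routine under test is timed the same way.
 * src is tightly packed (w bytes per row); dst rows are dst_pitch apart. */
typedef void (*BlitFn)(uint8_t *dst, size_t dst_pitch,
                       const uint8_t *src, size_t w, size_t h);

/* Plain 8-bit copy: the only kind of transfer a Quake-style port needs. */
void blit_copy(uint8_t *dst, size_t dst_pitch,
               const uint8_t *src, size_t w, size_t h)
{
    for (size_t y = 0; y < h; y++)
        memcpy(dst + y * dst_pitch, src + y * w, w);
}

/* Masked blit: colour index 0 is treated as transparent (an assumption). */
void blit_masked(uint8_t *dst, size_t dst_pitch,
                 const uint8_t *src, size_t w, size_t h)
{
    for (size_t y = 0; y < h; y++)
        for (size_t x = 0; x < w; x++)
            if (src[y * w + x] != 0)
                dst[y * dst_pitch + x] = src[y * w + x];
}

/* Run fn `iterations` times and return the elapsed CPU time in seconds. */
double time_blit(BlitFn fn, uint8_t *dst, size_t dst_pitch,
                 const uint8_t *src, size_t w, size_t h, int iterations)
{
    assert(dst_pitch >= w && iterations > 0);
    clock_t start = clock();
    for (int i = 0; i < iterations; i++)
        fn(dst, dst_pitch, src, w, h);
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

Timing each routine with several bitmap sizes (8x8, 16x16, 64x64, full screen) would show where the per-call overhead ends and where the bus bandwidth starts to dominate.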

But please stop trying to reach conclusions about CGX and SDL performance using Quake, because it's the worst game you could use for the comparison.

Following your logic, I could take an AROS native version of Quake, set it to write directly to the screen buffer, and claim that the CGX blitting and masking functions are as efficient and feature rich as the DirectX hardware accelerated ones because I would get similar performance to the Windows version: just because Quake is the wrong test!

Offline Crumb

Re: Curse of the SDL
« Reply #9 on: August 23, 2011, 07:43:18 AM »
@NovaCoder

Good job! :-)

Offline Crumb

Re: Curse of the SDL
« Reply #10 on: August 23, 2011, 10:05:01 AM »
Quote from: NovaCoder;655825
Yep it's down to gcc I'm afraid, I can only link it so many engines before the compiler has a sulk and spits the dummy.   There's not much point supporting SVGA games for the AGA version anyway because 640x480x256 is too slow in AGA and I only support Curse of Monkey Island in AGA.   For the RTG version I will also add support for Broken Sword 2 and maybe Broken Sword 1 as well :)

Are you using the MMU to avoid performing c2p on and redrawing the parts that haven't changed? IIRC ADoom had some code to do that, and it could be quite helpful on machines with an MMU. Scrolling would be slow but other scenes would be fast. Perhaps you could add some optional frameskipping in games that use 640x480 (and with an MMU enable it only on scenes with many changes, let's say more than 50%; I guess that by looking at the MMU code from ADoom it would be possible to add a counter of modified chunks and activate frameskipping if more than half of the chunks have been modified).

There was a video driver for ShapeShifter called AGABoost that worked without an MMU and only drew the modified parts, although MMU support would be great anyway.
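The MMU-less variant (compare against the previous frame, convert only what changed, and count the changed chunks to decide on frameskipping) can be sketched like this. It is not taken from ADoom or AGABoost; `CHUNK_SIZE`, `mark_dirty_chunks` and `should_skip_frames` are hypothetical, and the 32-byte chunk size is an arbitrary choice.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CHUNK_SIZE 32u  /* bytes per comparison chunk (arbitrary assumption) */

/* Compare the new frame with the previous one chunk by chunk.
 * dirty[i] is set to 1 for every chunk that changed, the comparison
 * buffer is updated, and the number of changed chunks is returned.
 * size must be a multiple of CHUNK_SIZE; dirty needs size / CHUNK_SIZE bytes. */
size_t mark_dirty_chunks(const uint8_t *frame, uint8_t *previous,
                         size_t size, uint8_t *dirty)
{
    assert(size % CHUNK_SIZE == 0);
    size_t changed = 0;

    for (size_t i = 0; i < size / CHUNK_SIZE; i++) {
        const uint8_t *cur = frame + i * CHUNK_SIZE;
        uint8_t *old = previous + i * CHUNK_SIZE;

        dirty[i] = (uint8_t)(memcmp(cur, old, CHUNK_SIZE) != 0);
        if (dirty[i]) {
            memcpy(old, cur, CHUNK_SIZE);  /* remember what will be shown */
            changed++;
        }
    }
    return changed;
}

/* Frameskipping kicks in when more than half of the chunks changed. */
int should_skip_frames(size_t changed, size_t total)
{
    return changed * 2 > total;
}
```

Only the chunks flagged in `dirty` would then go through c2p. A static adventure game screen with one walking character touches a handful of chunks, while a full-screen scroll flags nearly all of them and would trip the frameskip threshold.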
« Last Edit: August 23, 2011, 10:24:11 AM by Crumb »

Offline Crumb

Re: Curse of the SDL
« Reply #11 on: August 23, 2011, 02:50:20 PM »
Quote from: itix;655849
IIRC ADoom didn't use the MMU to avoid drawing non-updated parts but instead marked the frame buffer as non-cacheable. I could be remembering wrong, but ADoom indeed ran faster on certain setups.


You may be right... although ADoom uses a comparison buffer, so I guess it does something like what I described. To take advantage of the MMU it probably creates several MMU pages that point to parts of the chunky buffer and are marked read-only; when an exception is raised for a write to one of those pages, it unmarks the page as read-only, flags it in a small table, and re-executes the instruction that caused the exception. Then, once the writes are done, it checks the small table of "zone flags" and performs the c2p on those parts of the display. I guess it works in 4KB slices (it would be nicer with smaller chunks).
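The bookkeeping side of that guess can be modelled in portable C. This is only a simulation and says nothing about what ADoom actually does: there is no real MMU or exception handling here, the explicit `tracked_write` call stands in for what the exception handler would do on the first write to a protected page, and all names (`TrackedBuffer`, `tracked_write`, `flush_touched`, `ZONE_SIZE`, `MAX_ZONES`) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ZONE_SIZE 4096u  /* one MMU page, as guessed in the post */
#define MAX_ZONES 64u    /* enough for a 256 KB chunky buffer */

typedef struct {
    uint8_t *buffer;             /* chunky buffer the game draws into */
    size_t   size;               /* buffer size in bytes */
    uint8_t  touched[MAX_ZONES]; /* the small table of "zone flags" */
} TrackedBuffer;

/* Stand-in for the exception handler: flag the zone, then let the write through. */
void tracked_write(TrackedBuffer *tb, size_t offset, uint8_t value)
{
    assert(tb->size <= (size_t)ZONE_SIZE * MAX_ZONES && offset < tb->size);
    tb->touched[offset / ZONE_SIZE] = 1;
    tb->buffer[offset] = value;
}

/* Convert only the flagged zones, clear their flags (the equivalent of
 * marking the pages read-only again) and return how many were converted.
 * convert may be NULL when only the count is wanted. */
size_t flush_touched(TrackedBuffer *tb,
                     void (*convert)(const uint8_t *zone, size_t len))
{
    assert(tb->size <= (size_t)ZONE_SIZE * MAX_ZONES);
    size_t zones = (tb->size + ZONE_SIZE - 1) / ZONE_SIZE;
    size_t converted = 0;

    for (size_t z = 0; z < zones; z++) {
        if (!tb->touched[z])
            continue;
        size_t start = z * ZONE_SIZE;
        size_t len = tb->size - start < ZONE_SIZE ? tb->size - start : ZONE_SIZE;
        if (convert)
            convert(tb->buffer + start, len);  /* e.g. a c2p of this slice */
        tb->touched[z] = 0;
        converted++;
    }
    return converted;
}
```

With a real MMU the renderer would write to the buffer normally and pay the exception cost only once per page per frame, which is the attraction compared with comparing every byte against a delta buffer.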

Looking at a Mac emulator video driver could be handy. This one was written by the same author as the c2p (it uses a delta buffer, so I'm fairly sure it works the way I thought ADoom worked):
http://aminet.net/package/misc/emu/TurboEVD-src