Author Topic: newb questions, hit the hardware or not?  (Read 63362 times)

Offline Thorham

  • Hero Member
  • *****
  • Join Date: Oct 2009
  • Posts: 1150
Re: newb questions, hit the hardware or not?
« on: July 14, 2014, 12:07:29 PM »
Quote from: OlafS3;768906
If you want that your software runs everywhere it is a safer bet to use the OS.
You can't always do that if you want your software to run well on current low to mid end systems. Things such as planar CPU blitting (in cases where it's faster than c2p+chunky blitting). You just don't want to use the OS for that.

Making your software system friendly is one thing (I'm all for that), but getting it to run everywhere isn't always a good idea (depending on what the software is).
 

Offline Thorham

Re: newb questions, hit the hardware or not?
« Reply #1 on: July 14, 2014, 04:52:36 PM »
Quote from: Thomas Richter;768926
There is no problem using the blitter if the Os calls do not offer what you need
I actually meant blitting with the CPU (because it's faster than the blitter on 68020+).

Quote from: Thomas Richter;768926
By "you are sure what you are doing" I mean that you cannot, in general, expect that you have a simple old planar bitmap for your screen.
So, you'd have to sacrifice performance on actual native hardware just because the OS might not open a planar screen on non-native hardware? That means that basically everything you'd write ends up running like crap on actual Amigas without GFX boards. Sounds like a bad idea.
 

Offline Thorham

Re: newb questions, hit the hardware or not?
« Reply #2 on: July 14, 2014, 06:48:48 PM »
Quote from: Thomas Richter;768938
The core function (BltBitMap()) is pretty much low-level and bare to the metal. I wonder whether you can do any better yourself.
If this call uses the blitter, then you can do much better with a 25 MHz 68020/30, depending on what you want to do.

An example is a Fire Emblem/Advance Wars style game, where you have several layers of tiles (all of which can be animated) and sprites (all animated) that are aligned to 16 pixels on the x-axis. With the CPU you can do nice pipelined 32 bit blits that do two tile positions at the same time by reading data for both, and doing a single transpose on that data. Much faster than what the blitter can do.
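
To make the idea of handling two tile positions per write concrete, here is a minimal sketch in C. It is not the routine described in the post: the function name, the tile layout (one 16-bit word per row per bitplane) and the 32-pixel alignment of the tile pair are assumptions made for illustration, and the transpose step mentioned above is left out.

```c
#include <stdint.h>

#define TILE_H 16  /* tile height in rows (16x16 tiles, as in the post) */

/* Hypothetical helper: draw two horizontally adjacent 16-pixel-wide tiles
 * into ONE bitplane with a single 32-bit store per row.
 *
 * dst          first 32-bit word of the destination rows; the tile pair is
 *              assumed to start on a 32-pixel boundary
 * stride_longs distance between plane rows, in 32-bit units
 * left, right  TILE_H rows of tile data for this plane, one word per row
 *
 * On a big-endian CPU such as the 68020 the high half of the 32-bit value
 * lands first in memory, so 'left' ends up as the left tile on screen. */
static void blit_tile_pair(uint32_t *dst, int stride_longs,
                           const uint16_t *left, const uint16_t *right)
{
    for (int y = 0; y < TILE_H; y++) {
        *dst = ((uint32_t)left[y] << 16) | right[y];
        dst += stride_longs;
    }
}
```

A real routine would repeat this for every bitplane and would normally be written in assembly with the loop unrolled; the sketch only shows why one longword store can cover two tile positions.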

Quote from: Thomas Richter;768938
The amount of additional checks needed for non-native hardware are a single pointer (in the bitmap). How much performance does that cost?
Nothing, because you simply do that once when the software starts. Not a problem.
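
A rough sketch of "check once at startup" in C, using a function pointer so the per-frame code never repeats the test. Everything here is hypothetical: the thread does not say which check is meant, and the flag passed in is a stand-in. On AmigaOS 3.x one candidate would be testing BMF_STANDARD in the result of GetBitMapAttr() with BMA_FLAGS, but treat that as an assumption rather than something stated in the thread.

```c
/* Result codes so a caller can see which path ran (illustrative only). */
enum { USED_OS = 0, USED_CPU = 1 };

typedef int (*draw_fn)(void);

/* Placeholder for a path that goes through OS blit calls. */
static int draw_with_os(void)  { return USED_OS; }

/* Placeholder for a hand-written planar CPU blit path. */
static int draw_with_cpu(void) { return USED_CPU; }

/* Called once at startup. 'bitmap_is_standard_planar' is a hypothetical
 * flag standing in for whatever check the program performs on its screen
 * bitmap. The returned pointer is then used for every frame. */
static draw_fn select_draw(int bitmap_is_standard_planar)
{
    return bitmap_is_standard_planar ? draw_with_cpu : draw_with_os;
}
```

The point of the design is that the decision costs nothing inside the drawing loop: the loop just calls through the pointer it was given.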

Quote from: Thomas Richter;768938
Sorry, but that's a typical argument of the "cycle counter party": Just hack the hardware because we don't know any better and argue "it's for speed".
In the case of the blitter it's already known that it's slower than a 25 MHz 68020/30 with fastmem, especially for the simple blits I described above.

Quote from: Thomas Richter;768938
Rule number one for performance optimizations: Measure first. Then optimize.
Obviously :)
 

Offline Thorham

Re: newb questions, hit the hardware or not?
« Reply #3 on: July 14, 2014, 10:36:34 PM »
Quote from: Thomas Richter;768958
Whether it does or does not use the blitter is the matter of circumstances. But that's exactly the reason for *using* this function. If you hack the hardware yourself, you cannot take advantage of improvements or configurations which would otherwise help you to keep your program operating (say on the graphics card) or working fast (by taking advantage of a replacement implementation).
I took a look at BltBitMap() in the autodocs, and it looks like a generic blit routine. It's definitely not going to be faster for the example I gave above than an optimized, specialized, pipelined 32 bit CPU blit routine. The example above is too specific to handle properly with a generic routine.
« Last Edit: July 14, 2014, 10:38:56 PM by Thorham »
 

Offline Thorham

Re: newb questions, hit the hardware or not?
« Reply #4 on: July 15, 2014, 08:09:28 AM »
Quote from: Thomas Richter;768970
You still do not get it. BltBitMap() may use the CPU, and may run into an optimized, specialized, pipelined 32 bit CPU blit routine. Actually, it most certainly does if P96 is installed.
I get it perfectly fine, but you don't seem to get the difference between generic and non-generic blitting routines. BltBitMap() has a source bitmap, a mask and a destination bitmap. How is that going to be optimal when you have a background tile, sprite+mask, overlay+mask, and on top of that some pixels from the sprite one tile below plus another mask for that (tiles are 16x16 but sprites are 24 pixels high, so a sprite extends into the tile above it)? With a specialized routine, you can read ALL of that data in one go, AND you can do it for two tile positions at the same time. With BltBitMap() you need multiple calls to do one tile position.

Basically you're trying to tell me that a generic, one size fits all routine is better than specialized code that has been written specifically for the job. It's just not true.
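
The layering argument can be sketched in C for a single 16-pixel row of one bitplane. This is an illustration under assumptions, not code from the post: the struct and function names are made up, and a mask bit of 1 is assumed to mean "opaque".

```c
#include <stdint.h>

/* One 16-pixel row of one bitplane for a layer, plus its mask
 * (assumption: mask bit 1 = opaque, 0 = transparent). */
struct layer_word {
    uint16_t data;
    uint16_t mask;
};

/* Compose a background word with any number of masked layers in one pass,
 * bottom layer first. The result is written to the screen once, instead of
 * issuing one masked blit per layer. */
static uint16_t compose_word(uint16_t background,
                             const struct layer_word *layers, int count)
{
    uint16_t out = background;
    for (int i = 0; i < count; i++) {
        /* Punch a hole where the layer is opaque, then drop its pixels in. */
        out = (uint16_t)((out & ~layers[i].mask) |
                         (layers[i].data & layers[i].mask));
    }
    return out;
}
```

For the case described above, the layers would be the sprite, the overlay and the lower part of the sprite from the tile below, all combined in registers before a single store.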
 

Offline Thorham

Re: newb questions, hit the hardware or not?
« Reply #5 on: July 15, 2014, 11:51:41 AM »
Quote from: Thomas Richter;768982
It is at least better than writing a software that does not work for some configurations. That is what the operating system is good for.
It's not when your target is AGA+68020/30 and possibly A1200+trapdoor fastmem. Why do we have to take higher end systems into account, and make sacrifices on the low end, while the software could work perfectly fine on low end?

This is Amiga we're talking about, not some high end computer where making use of the OS for everything might make sense. For low end (68020/30) using the OS is fine. Well behaved software is nice, after all, and killing the OS (except for WHDLoad and demos) isn't a good thing, same for not using the OS's screen open functions. But blitting has to be fast on low end machines, or the software is going to run like crap.

Why does everything have to be adapted for high end machines and custom expansion boards? Want to run Amiga software? Use an Amiga.

Quote from: SamuraiCrow;768983
The main advantage of using the OS functions for blitting is that it works asynchronously so that the CPU can multitask while the Blitter is still plotting graphics.
With the blitter, yes, but the blitter is too slow.

Quote from: SamuraiCrow;768983
Thorham's approach is the old-fashioned way of using the CPU to copy pixels. It works best when you have sufficient cache memory and CPU time to do it that way.
It's also the only way to get good performance on low end machines. It's old-fashioned because the hardware is old, and many Amiga users use this hardware.
 

Offline Thorham

Re: newb questions, hit the hardware or not?
« Reply #6 on: July 15, 2014, 12:10:50 PM »
Quote from: SamuraiCrow;768989
Hand-optimizing is a pain
680x0 assembly language is one of my computer-related hobbies, and optimizing is part of the fun :) Actually, it's probably my favorite part of 680x0 coding. It's also the reason I stick to my 50 MHz 68030: It's challenging and interesting to get certain things to be fast on such systems.

Quote from: SamuraiCrow;768989
but being able to reinstall and reoptimize software from some bytecode that is optimized and compiled at install time will let you have your cake and eat it too! (At the cost of some install time.)
Do we currently have such tools? How fast would that be for 68020/30 compared to hand optimized code?

Quote from: nicholas;768991
It's not like we can't detect the machine we are running on and use a different codepath for each architecture. Best of both worlds. :)
Indeed.
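
As a rough illustration of picking a code path per CPU, here is a small C sketch. The bit values are meant to mirror exec's AttnFlags as I understand them, where each newer CPU also sets the bits of the older ones; that mapping is an assumption here and should be checked against exec/execbase.h. The enum and function names are hypothetical.

```c
#include <stdint.h>

/* Assumed to mirror the AttnFlags CPU bits (verify against the exec
 * headers before relying on them). */
#define CPU_68020 (1u << 1)
#define CPU_68030 (1u << 2)
#define CPU_68040 (1u << 3)
#define CPU_68060 (1u << 7)

typedef enum { LOOP_GENERIC, LOOP_020_030, LOOP_040_060 } loop_kind;

/* Decide once at startup which inner loop to use. The highest CPU is
 * tested first because the flags are cumulative: a 68040 also reports
 * the 68020 and 68030 bits. */
static loop_kind pick_loop(uint16_t attn_flags)
{
    if (attn_flags & (CPU_68040 | CPU_68060))
        return LOOP_040_060;
    if (attn_flags & (CPU_68020 | CPU_68030))
        return LOOP_020_030;
    return LOOP_GENERIC;
}
```

On a real system the argument would come from the flags word in ExecBase; in this sketch it is just a parameter so the selection logic can be read on its own.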
« Last Edit: July 15, 2014, 12:13:47 PM by Thorham »
 

Offline Thorham

Re: newb questions, hit the hardware or not?
« Reply #7 on: July 15, 2014, 02:25:36 PM »
To SamuraiCrow:

That's all very nice, but I like programming in 680x0 assembly language, it's one of my computer related hobbies, and I'm not the only one.
 

Offline Thorham

Re: newb questions, hit the hardware or not?
« Reply #8 on: July 15, 2014, 09:58:13 PM »
Quote from: LiveForIt;769029
It depends on what you want to do
Write software that's fast on 20s and 30s with AGA, and when possible OCS/ECS, while keeping it system friendly, but without sacrificing speed (no need for syskill practices, custom screens and what not, of course).

Quote from: LiveForIt;769029
Writing a game for a modern graphics card, and using CIAA/CIAB timers, will exclude many people from using your software (AmigaONE-*/Sam4*0), therefore it's not a good idea.
I'm only interested in writing Amiga software anyway.

Quote from: LiveForIt;769029
Poking around in $DFF00A and $DFF00C
On 20s and 30s there's no need for that. Especially the mouse and keyboard are very easy to handle properly with the OS.

Quote from: LiveForIt;769029
Other things people do that are annoying is that they assume they know BytesPerRow
For a game that opens a fixed sized screen this isn't a problem. On Amigas this is generally not a problem, and if some user can't use this, then I wonder how their Amiga is set up.
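
For readers unsure what the BytesPerRow complaint is about, a small C sketch follows. The struct is a hypothetical stand-in for the per-plane information a program has (the real struct BitMap carries a BytesPerRow field); the function names are made up for illustration.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical description of one bitplane. */
struct plane_desc {
    uint8_t  *base;           /* start of the plane in memory       */
    uint16_t  bytes_per_row;  /* row stride as reported by the system */
};

/* The shortcut being criticised: derive the stride from the visible
 * width. This breaks as soon as rows are padded or the bitmap is wider
 * than the display. */
static uint8_t *row_ptr_assumed(uint8_t *base, int y, int width_pixels)
{
    return base + (size_t)y * (size_t)(width_pixels / 8);
}

/* The safe variant: always use the stride stored with the bitmap. */
static uint8_t *row_ptr(const struct plane_desc *p, int y)
{
    return p->base + (size_t)y * p->bytes_per_row;
}
```

With a fixed-size screen the two give the same address whenever the stride equals width/8, which is the situation the reply above has in mind; they diverge only when the system hands back a padded bitmap.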

Quote from: LiveForIt;769029
this forces the game or program to only run in one screen resolution, and fixed mode ID's that can't be changed.
You can at least use different monitor IDs so that double scan is supported. For non-games software it's a definite no-no.

Quote from: LiveForIt;769029
For example, if you write a game/program to use the Paula sound chip, then it might be faster on an MC68020, where you don't have a lot of CPU power, but you will not allow your game/program to use a modern sound card like the Prelude 1200 or any other sound card.
Not a problem when you run the software on an actual Amiga, unless someone doesn't have Paula output connected, and that's their own problem.

Quote from: LiveForIt;769029
Sticking to graphics.library will make your game or program run on any hardware, but it will most likely run slower than if you wrote directly to memory instead of using graphics draw functions, and so on.
Only using the OS in general is unacceptable for me, because it only helps NG users. Why should we take NG users into account when writing Amiga software? If people want to run Amiga software, then let them use Amiga computers or emulation.
 

Offline Thorham

Re: newb questions, hit the hardware or not?
« Reply #9 on: July 16, 2014, 08:20:03 AM »
Quote from: matthey;769066
With a little bit of learning, it's possible to write code that is fairly optimal on the 68020-68060. Instruction scheduling for a 68060 generally doesn't affect 68020/68030 performance but can double 68060 performance with some code.
Can't do it. 20s and 30s have priority for me. Not to mention that instruction scheduling sucks. My goal is to get something to run well on the lower end machines (25 MHz 68020). When something runs well on such machines, why would I need to optimize for 68060s? For me anything above 68030 is irrelevant in terms of optimizing, because if a '30 can run it fast enough, then so can a '40 or '60. I also don't have a '40 or '60.

Quote from: matthey;769066
Learning about modern CPU pipelining, superscalar execution, caches, hazards/bubbles, etc. (not just 68020/68030 timings and specifics) will improve your code for the 68020/68030 also.
Really? So, you're telling me that on 20/30 there's more than cache+timings+pipeline? Interesting!
 

Offline Thorham

Re: newb questions, hit the hardware or not?
« Reply #10 on: July 16, 2014, 09:52:54 AM »
Quote from: matthey;769110
More performance is always useful. Settings with better gfx, more sound effects/music and more options can be turned on a 68040/68060. Some games are nicer at 30fps than 20fps even if they are playable and fun at 20fps on a 68020/030. It does take a little more time to instruction schedule code but the code becomes re-usable for more and expanded projects.
Sure, but it depends on what you're writing. I'm writing a Fire Emblem clone with a full tiled display with several layers (16x16 pixel anim tiles, 16x24 map sprites, 16x8 status icons) and it's already much faster than needed on my 50 MHz '30 (you only need around seven FPS to get the animations to look like the original), and that's without pipelining. I see no reason to optimize for 40/60 at all in this case, and will instead try to get as close as I can to A1200 with trapdoor fastmem.

Another example might be a text editor that has to run well in 640x512+overscan in eight colors (nice for double scan modes). Getting this to run as fast as possible on 20/30 obviously has priority over 40/60 where optimized 20/30 code will be fast enough anyway.

There is some software where it might be important, but for those cases it might be a better idea to write separate loops for 20/30 and 60. It's not as if it's much extra work.

Quote from: matthey;769110
The 020/030 is friendly, being lightly pipelined, but performance is affected by alignment and data sizes (32 bit is sometimes faster than 16 bit) at least. Unfortunately, documentation is lacking in general for 68k instructions in regards to hazards/bubbles and instruction scheduling. I know the 020/030 has some instruction overlap but I don't know if it's enough to affect resource availability from instruction to instruction. Contrary to most 68k Amiga programmers, I have studied the 040 and 060 more (and I know more about the AmigaOS functions than banging the Amiga hardware also). Avoiding general slow downs for the 040/060 rarely hurts and sometimes helps 020/030 performance. This is in contrast to the 68000, where optimizing for 68000-68060 is difficult as the 68000 is a 16 bit processor.
Should be interesting to check out.
 

Offline Thorham

Re: newb questions, hit the hardware or not?
« Reply #11 on: July 16, 2014, 11:28:54 AM »
Quote from: SamuraiCrow;769115
Sprites may be able to overlap this though.
That won't work well for double scan users, because of the limited number of sprites. I think there may actually be only one sprite available in double scan modes :(
 

Offline Thorham

Re: newb questions, hit the hardware or not?
« Reply #12 on: July 16, 2014, 12:13:28 PM »
Quote from: SamuraiCrow;769117
for what I'm planning to do
What exactly are you planning?
 

Offline Thorham

Re: newb questions, hit the hardware or not?
« Reply #13 on: July 16, 2014, 08:56:39 PM »
Use the right algorithm? Really? How obvious :rolleyes: And because we're now using the right algorithm, optimizing its performance isn't necessary, implying any crappy implementation will do. Yeah, right :rolleyes:

Use the right algorithm, AND write a PROPER implementation of it. Why bother with anything less? And besides, cycle counting is fun. Nothing beats hand optimizing tight loops on 20s and 30s :p Waste of time? Not to me, it's a hobby :p
 

Offline Thorham

Re: newb questions, hit the hardware or not?
« Reply #14 on: July 16, 2014, 10:26:56 PM »
You tell'm, matthey :D