Author Topic: AmigaOS and the Console Development - Part 1  (Read 13459 times)

Offline matthey

  • Hero Member
  • *****
  • Join Date: Aug 2007
  • Posts: 1294
Re: AmigaOS and the Console Development - Part 1
« on: September 30, 2015, 08:06:16 PM »
Code: [Select]
    //    lea    cd_RastPort(a6),a1
    //    move.b    cu_Mask(a2),rp_Mask(a1)
    IGraphics->SetWriteMask(rastPtr, unit->cu_Mask);

    //    moveq    #0,d0
    //    move.b    d6,d0
    //    LINKGFX    SetDrMd
    IGraphics->SetDrMd(rastPtr, oldMode);

Progress! The replacement code probably runs at about 1/3 the speed of the original, since function calls carry a lot of overhead. The macros in includes/graphics/gfxmacros.h could be used from C as well.
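
To make the macro-versus-function point concrete, here is a minimal C sketch. It is not the console.device code: `MockRastPort`, `MOCK_SET_WR_MSK`, `mock_set_write_mask` and `mock_apply_both` are hypothetical names invented for illustration, modeled on the kind of single-store macro found in gfxmacros.h. It only shows that both styles end up doing the same store, one inline and one behind a call.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the two RastPort fields discussed here;
   the real struct RastPort has many more members. */
struct MockRastPort {
    uint8_t Mask;      /* bitplane write mask */
    uint8_t DrawMode;  /* drawing mode */
};

/* Macro style: expands inline to a single store, no call overhead. */
#define MOCK_SET_WR_MSK(rp, m) do { (rp)->Mask = (uint8_t)(m); } while (0)

/* Function style: the same store, reached through a call and a return. */
void mock_set_write_mask(struct MockRastPort *rp, uint8_t mask)
{
    rp->Mask = mask;
}

/* Apply both styles to fresh structures, check that they agree,
   and return the resulting mask. */
uint8_t mock_apply_both(uint8_t mask)
{
    struct MockRastPort a = {0, 0};
    struct MockRastPort b = {0, 0};

    MOCK_SET_WR_MSK(&a, mask);
    mock_set_write_mask(&b, mask);
    assert(a.Mask == b.Mask);
    return a.Mask;
}
```

The result is identical either way; the difference is only in how the store is reached, which is where the call overhead comes from.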

Quote
The original 68K assembler version of the console occupied 16,212 bytes. Compare that with the last 68K C version (V50.26) which was 42,564 bytes! It just shows how squashed the assembler version was (but that C version did have some debug code as well).

... Also, the assembler version had been written to minimise code size, rather than to optimise speed or code legibility.

The 68k assembler code was optimized for size rather than speed, yet the example shown is both much faster and much smaller in assembler. Why don't they also write the code in C++ (for better maintainability) and see if they can make a semi-modern $2000 PPC computer slower than a 25-year-old 68060?

Edit: Also, the 5 lines of assembler code, if properly optimized, should be 4 lines on an existing 68k CPU and could be 3 lines on an enhanced 68k CPU.

Code: [Select]
    //    move.b    cu_Mask(a2),rp_Mask+cd_RastPort(a6) ; this works on existing 68k
    IGraphics->SetWriteMask(rastPtr, unit->cu_Mask);

    //    mvz.b    d6,d0  ; ColdFire instruction
    //    LINKGFX    SetDrMd
    IGraphics->SetDrMd(rastPtr, oldMode);
« Last Edit: September 30, 2015, 09:53:48 PM by matthey »
 

Offline matthey

Re: AmigaOS and the Console Development - Part 1
« Reply #1 on: October 01, 2015, 06:32:20 AM »
Quote from: Thomas Richter;796669
And why do you bother? Is this in a speed-critical part of the device? Even worse, the assembler function does not go through an interface, hence anything that probably sits on top of SetWriteMask() is ignored.


The macros are perfectly legal and they are a lot faster. Is it better coding practice to use slower code for little if any benefit? It is true that the macros don't go through the LVO, so a SetPatch couldn't replace the code, but this is true of much of AmigaOS. It didn't stop P96/CGFX from patching everything. Your newest layers.library still uses the Forbid/Permit macros in most places instead of the function calls. Also, some of your layers.library function calls go through the LVO and some use a regular BSR instruction. It looks pretty inconsistent to me.

Quote from: Thomas Richter;796669

Despite all that, the best optimization *here* would have been to simply drop the SetWriteMask() altogether. The reason why the console.device uses this trick is to speed up scrolling in case the current color selection only requires a single plane to be modified, but everything on the PPC does not use planar graphics in the first place, so any attempt to modify the write mask will make the code slower.


AmigaOS 4.x can be used on a CSPPC with AGA graphics so maybe it is not completely useless. There is sometimes a cost to backward compatibility.

Quote from: Thomas Richter;796669

So yes, the translation to C code is probably well-intended, the 1/3 speed of a C function call is surely not relevant here. What is relevant is that this function is - for any sane graphics organization - useless and the whole code part that depends on this should have better been moved to trash in first place.


A couple of functions aren't going to make much of a difference. I just found it funny that the example wasn't consistent with the article comments. It is also funny that the 68k AmigaOS with all its optimizing for space really wasn't so well optimized from their example either.

Quote from: Thomas Richter;796669

The speed critical part was not the function call, but the problem it solved in first place, a problem absent at the target platform.

Again, this rather shows that such a low-level view on optimizations is highly misleading.


Yeah, but you wouldn't have removed the call because the code is working. I know you too well. Making the safe optimizations to gain what I could while leaving planar support intact sounds pretty good to me. It would be better if the code were in C, though, with the programmer using the macros and the compiler doing the optimizations. This is not speed critical code after all ;).

Quote from: Thomas Richter;796669

With C++ code, the code would have been better, and probably faster than the C code, ...


Not likely faster. Not likely at all. Better is debatable.
 

Offline matthey

Re: AmigaOS and the Console Development - Part 1
« Reply #2 on: October 01, 2015, 09:39:10 PM »
Quote from: olsen;796689
Now, what exactly will SetWriteMask() do in this situation? It has the effect of making display updates more costly if an interleaved bitmap is used (which is the standard case). Instead of moving one consecutive chunk of bitmap data, the blitter operations have to be broken up into individual planes again. Available bandwidth is used less effectively. This can be noticeably slower.

Good explanation. So the SetWriteMask() is only an optimization for non-interleaved planar bitmaps? It now sounds like SetWriteMask() should be removed in all cases where this "optimization" is used on AmigaOS 3+.
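
A rough way to picture what olsen describes is to count blitter passes per scroll. The sketch below is a deliberately simplified model and not how graphics.library actually works: `scroll_passes` and `planes_enabled` are hypothetical helpers, and the assumption is that an interleaved bitmap with every plane enabled moves as one consecutive chunk, while any other case needs one pass per enabled plane.

```c
#include <assert.h>
#include <stdint.h>

/* Count the enabled planes in a write mask (one bit per bitplane). */
static unsigned planes_enabled(uint8_t mask, unsigned depth)
{
    unsigned n = 0;
    for (unsigned p = 0; p < depth; p++) {
        if (mask & (1u << p))
            n++;
    }
    return n;
}

/* Simplified model: separate blitter passes needed for one scroll.
   depth is assumed to be between 1 and 8 planes. */
unsigned scroll_passes(unsigned depth, uint8_t mask, int interleaved)
{
    uint8_t all = (uint8_t)((1u << depth) - 1u);

    if (interleaved && (mask & all) == all)
        return 1;                       /* one consecutive chunk */
    return planes_enabled(mask, depth); /* one pass per enabled plane */
}
```

In this model, a 4-plane interleaved bitmap scrolls in one pass with the full mask but needs two passes once the mask is narrowed to two planes. The pass count ignores how much data each pass moves, so it says nothing about bandwidth on its own.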

Quote from: Thomas Richter;796702
Please define "a lot". Or rather, take your time and benchmark console with the macro and without it, measure the difference and tell me. I haven't done that, admittedly, but just from your gut, what would you expect? 10% speed improvement? Not seriously. My gut feeling is: Below the statistical error. Way below. Don't waste your time in details.

I was talking about the macros vs. the functions, not the overall code speed, which would be difficult to measure (it waits most of the time but should be fast when it doesn't wait). If you look at the macro vs. function code on, say, the 68060, the overhead of a function call through the LVO is at least 14 cycles for the JSR+JMP+RTS (ignoring cache misses, where inlined code like the macro has an advantage). For short functions like SetWriteMask(), where the code itself is only a couple of cycles, this is significant. There is also more overhead in setting up for a function call than in using the macro. The newer PPC processors are likely to have a link stack, which might cut the function overhead in half, but the macro is still significantly faster.
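
As back-of-the-envelope arithmetic for the numbers above: the 14 cycles of JSR+JMP+RTS overhead come from this post, while the 2-cycle function body is only an assumed value for "a couple of cycles". `call_overhead_fraction` is a hypothetical helper, not a measurement.

```c
#include <assert.h>

/* Fraction of the total cycles that is call overhead rather than
   useful work. Returns 0.0 if both inputs are zero. */
double call_overhead_fraction(unsigned overhead_cycles, unsigned body_cycles)
{
    unsigned total = overhead_cycles + body_cycles;

    if (total == 0)
        return 0.0;
    return (double)overhead_cycles / (double)total;
}
```

With 14 cycles of overhead and an assumed 2-cycle body, `call_overhead_fraction(14, 2)` gives 0.875, so roughly seven eighths of the time goes to getting in and out of the function. Halving the overhead to 7 cycles, as a link stack might, still leaves it above three quarters.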

Quote from: Thomas Richter;796702
Compatibility? Readability? Maintainability? Actually, that's quite worth something. What I do not like about macros is that they require knowledge of the implementation details to work. In other words, they expose the internal structure you are manipulating to the compiler, making it impossible to change that without recompiling the code. Again, admittedly, it's already too late for graphics to fix that, graphics is all over with code that exposes internal interfaces where it should not, but please! can we avoid this problem in future code somehow? There are interface functions, and here we have a perfectly fine "setter" function for an internal property. It's good practice to use that.

Compatibility is a false claim if the code works fine with the macros (the original did). Readability is arguable. I agree that the library functions have maintenance advantages, which is valuable. IMO, it comes down to maintenance vs. speed. Maintenance may be the better choice on powerful processors, but then some of the speed and responsiveness advantage the AmigaOS gives is lost, so a faster CPU is needed to make up for it. Maybe that would be OK if people weren't comparing a PPC Amiga with a 20-year-old embedded PPC CPU design to a modern PC with a modern x86_64 CPU costing a fraction of the price.

Quote from: Thomas Richter;796706
Would "VeryNewCon"  be better? That's the initial name it was.

Why not ECON for Enhanced CON or Editor CON?
« Last Edit: October 02, 2015, 01:54:41 AM by matthey »
 

Offline matthey

Re: AmigaOS and the Console Development - Part 1
« Reply #3 on: October 03, 2015, 05:28:27 AM »
Quote from: kolla;796719
Why must it have a new name anyways, just want something that replaces CON: and RAW: and that is more capable - call it console.handler with an appropriate version string.


If the new handlers can be enhanced while maintaining full backward compatibility, then keeping the same name and bumping the version is appropriate. Otherwise, a new name is required so that programs needing compatibility can use the old handler.

Quote from: QuikSanz;796756
Looks like you're saying macros are a disaster in the waiting, Virus entry point!


Actually, the macros would reduce the chances of a virus, as there are fewer functions to hook into, not that this is a concern in this case. The concern about macros is that the program accesses the elements of structures directly instead of calling a function to do it. The OS developers may decide to change the structures and the functions with them. Going through functions is more object oriented and easier to maintain but also significantly slower. Most Amiga programs already access some OS structures directly, with or without macros, so changing the structures will break compatibility. The macros are no more a disaster than the majority of AmigaOS programs, which use macros and access the structures directly. It may be unfortunate that so many OS structures were documented and modified directly, but it was fast and easy. To go away from this would be to break direct compatibility and instead provide compatibility through a sandbox. This may be necessary if moving to a completely alien CPU architecture after PPC finishes dying off.
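
The structure-change concern can be sketched in a few lines of C. Everything here is hypothetical: `PortV1` and `PortV2` are made-up old and new layouts, `old_binary_set_mask` stands for a program compiled against the old layout, and `os_set_mask` stands for a setter shipped with the newer OS. None of these are real AmigaOS structures or calls.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical old and new layouts of the same OS structure. */
struct PortV1 { uint8_t Mask; uint8_t DrawMode; };
struct PortV2 { uint8_t Reserved; uint8_t Mask; uint8_t DrawMode; };

/* The byte offset an application compiled against V1 has baked in. */
#define V1_MASK_OFFSET offsetof(struct PortV1, Mask)

/* Direct access as frozen into an old binary: writes at the V1 offset. */
void old_binary_set_mask(void *port, uint8_t mask)
{
    ((uint8_t *)port)[V1_MASK_OFFSET] = mask;
}

/* Setter shipped with the newer OS: it knows the current layout. */
void os_set_mask(struct PortV2 *port, uint8_t mask)
{
    port->Mask = mask;
}
```

Run against a `PortV2`, the old binary's store lands in `Reserved` and leaves `Mask` untouched, while the setter still hits the right field. That is the maintenance argument for the function call, and the price is the call overhead discussed earlier in the thread.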