Topic: AmigaOS and the Console Development - Part 1

ssolie · « **on:** September 30, 2015, 05:50:17 PM »

There is a new blog post on how the Amiga's console evolved from 68K assembly to PowerPC.

See http://blog.hyperion-entertainment.biz/

eliyahu · « **Reply #1 on:** September 30, 2015, 06:22:47 PM »

@ssolie

since i spend so much time in the shell, i'm looking forward to reading the next installment. the new console in final edition is so much nicer than kingcon (which i had used before). tony did a really nice job.

-- eliyahu

Pyromania · « **Reply #2 on:** September 30, 2015, 07:20:02 PM »

Nice read.

matthey · « **Reply #3 on:** September 30, 2015, 08:06:16 PM »

Code:

    //    lea    cd_RastPort(a6),a1
    //    move.b    cu_Mask(a2),rp_Mask(a1)
    IGraphics->SetWriteMask(rastPtr, unit->cu_Mask);

    //    moveq    #0,d0
    //    move.b    d6,d0
    //    LINKGFX    SetDrMd
    IGraphics->SetDrMd(rastPtr, oldMode);

Progress! The replacement code is probably 1/3 of the speed of the original. Calling functions has lots of overhead. The macros in includes/graphics/gfxmacros.h could be used from C as well.

Quote

The original 68K assembler version of the console occupied 16,212 bytes. Compare that with the last 68K C version (V50.26) which was 42,564 bytes! It just shows how squashed the assembler version was (but that C version did have some debug code as well).

... Also, the assembler version had been written to minimise code size, rather than to optimise speed or code legibility.

The 68k assembler code was optimized for size instead of speed, yet the example shown is much faster and much smaller in assembler. Why don't they just write the code in C++ as well (for better maintainability) and see if they can make a semi-modern $2000 PPC computer slower than a 25-year-old 68060?

Edit: Also, the 5 lines of assembler code, if properly optimized, should be 4 lines on an existing 68k CPU and could be 3 lines on an enhanced 68k CPU.

Code:

    //    move.b    cu_Mask(a2),rp_Mask+cd_RastPort(a6) ; this works on existing 68k
    IGraphics->SetWriteMask(rastPtr, unit->cu_Mask);

    //    mvz.b    d6,d0  ; ColdFire instruction
    //    LINKGFX    SetDrMd
    IGraphics->SetDrMd(rastPtr, oldMode);

kamelito · « **Reply #4 on:** September 30, 2015, 08:19:55 PM »

Nice read. Can we see episode 2?

Kamelito

ssolie · « **Reply #5 on:** September 30, 2015, 08:35:53 PM »

Quote from: kamelito;796655

Nice read. Can we see episode 2?

We are working on it now. It shouldn't be too long.

amigakit · « **Reply #6 on:** September 30, 2015, 08:38:20 PM »

A very interesting read. Thanks!

guest11527 · « **Reply #7 on:** October 01, 2015, 01:23:44 AM »

Quote from: matthey;796652

Progress! The replacement code is probably 1/3 of the speed of the original.

And why do you bother? Is this in a speed-critical part of the device? Even worse, the assembler version does not go through an interface, hence anything that may sit on top of SetWriteMask() is ignored.
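
To make the interface point concrete, here is a minimal sketch in portable C. Everything in it is a stand-in invented for illustration: `MockRastPort`, `MockGraphicsIFace`, `mock_set_write_mask` and `MOCK_POKE_MASK` are hypothetical names, not the real graphics.library types or calls. It only assumes that an interface call is a function reached through a pointer table, while the assembler (or a macro) stores straight into the structure.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a RastPort: only the write mask field. */
struct MockRastPort {
    uint8_t mask;                 /* one bit per bitplane */
};

/* Hypothetical stand-in for an interface: a function pointer plus a
 * counter that represents "anything that sits on top" of the call. */
struct MockGraphicsIFace {
    void (*SetWriteMask)(struct MockGraphicsIFace *self,
                         struct MockRastPort *rp, uint32_t mask);
    unsigned long calls;          /* how often the interface was used */
};

/* Interface-style setter: the layer above gets to observe the call. */
void mock_set_write_mask(struct MockGraphicsIFace *self,
                         struct MockRastPort *rp, uint32_t mask)
{
    assert(self != NULL && rp != NULL);
    self->calls++;
    rp->mask = (uint8_t)mask;
}

/* Macro-style setter: a plain store, like the original move.b.
 * Nothing layered on the interface ever sees it. */
#define MOCK_POKE_MASK(rp, m) do { (rp)->mask = (uint8_t)(m); } while (0)
```

Both paths leave the same value in the mask field, but only the call through the pointer table bumps `calls`. That is the trade being argued about: the direct store is cheaper, and it is also invisible to whatever wraps the interface.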

Despite all that, the best optimization *here* would have been to simply drop the SetWriteMask() altogether. The reason the console.device uses this trick is to speed up scrolling when the current color selection only requires a single plane to be modified, but nothing on the PPC uses planar graphics in the first place, so any attempt to modify the write mask will make the code slower.

So yes, the translation to C code is probably well-intended, and the 1/3 speed of a C function call is surely not relevant here. What is relevant is that this function is - for any sane graphics organization - useless, and the whole code part that depends on it would have been better moved to the trash in the first place.

The speed-critical part was not the function call, but the problem it solved in the first place, a problem absent on the target platform.

Again, this rather shows that such a low-level view on optimizations is highly misleading.
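
For readers who have not met planar graphics, the scrolling trick described above can be sketched as follows. This is an illustrative model only, not console.device code: `PlanarBitmap` and `planar_scroll_up` are hypothetical names, and the sketch assumes each bitplane is a separate block of memory that can be moved on its own.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MAX_PLANES 8

/* Hypothetical planar bitmap: one separate memory block per bitplane. */
struct PlanarBitmap {
    int bytes_per_row;
    int rows;
    int depth;                    /* number of planes in use */
    uint8_t *planes[MAX_PLANES];
};

/* Scroll the bitmap up by dy rows, touching only the planes whose bit
 * is set in mask. Returns the number of bytes written, as a rough
 * measure of the work done. */
size_t planar_scroll_up(struct PlanarBitmap *bm, int dy, uint8_t mask)
{
    assert(bm->depth >= 0 && bm->depth <= MAX_PLANES);
    assert(dy >= 0 && dy <= bm->rows);

    size_t keep  = (size_t)(bm->rows - dy) * (size_t)bm->bytes_per_row;
    size_t clear = (size_t)dy * (size_t)bm->bytes_per_row;
    size_t written = 0;

    for (int p = 0; p < bm->depth; p++) {
        if (!(mask & (1u << p)))
            continue;             /* masked-out plane: no work at all */
        memmove(bm->planes[p], bm->planes[p] + clear, keep);
        memset(bm->planes[p] + keep, 0, clear);
        written += keep + clear;
    }
    return written;
}
```

In this model a four-plane screen scrolled with mask 0x01 writes a quarter of the bytes that mask 0x0F would, which is the whole point of the trick. A chunky display has no separate planes to skip, so the same mask buys nothing there.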

Quote from: matthey;796652

Calling functions has lots of overhead. The macros in includes/graphics/gfxmacros.h could be used from C as well.

But that would not have isolated the interface as well. Anyhow, neither the macro nor the function call is the right solution. The right solution here is to learn what the purpose of the write mask was in the first place, and that it serves no purpose on the target platform.

Quote from: matthey;796652

The 68k assembler code was optimized for size instead of speed, yet the example shown is much faster and much smaller in assembler. Why don't they just write the code in C++ as well (for better maintainability) and see if they can make a semi-modern $2000 PPC computer slower than a 25-year-old 68060?

With C++ code, the code would have been better, and probably faster than the C code, but as always, "faster" does not come with the language automatically, it comes with "understanding the problem" that is to be solved. Neither C, C++ nor assembler does that by itself. C++ helps because it allows you to get a cleaner view of the problem. Here, however, the actual problem was not understood correctly, and the code was "blindly" translated to C.

Quote from: matthey;796652

Edit: Also, the 5 lines of assembler code, if properly optimized, should be 4 lines on an existing 68k CPU and could be 3 lines on an enhanced 68k CPU.

That's just a useless micro-optimization. It wouldn't have made any noticeable difference. Setting the mask on a planar graphics system - *that* makes a noticeable difference. Setting a mask on a chunky graphics system, by contrast, is a notable pessimisation because the graphics emulation has much more work to do to emulate the mask in the first place. It is not "how to set the mask" that makes the console faster or slower.
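
Why a mask costs extra on chunky pixels can be sketched like this. How the actual graphics system emulates the mask is not stated in this thread, so the code below is an assumption about one plausible approach for 8-bit chunky pixels, and `chunky_copy_masked` is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Copy n 8-bit chunky pixels from src to dst while honouring a
 * planar-style write mask. The buffers must not overlap. */
void chunky_copy_masked(uint8_t *dst, const uint8_t *src, size_t n,
                        uint8_t mask)
{
    assert(dst != NULL && src != NULL);

    if (mask == 0xFF) {
        memcpy(dst, src, n);      /* all bits writable: plain block copy */
        return;
    }
    for (size_t i = 0; i < n; i++) {
        /* Masked case: every pixel is read, merged and written back,
         * so the "optimization" turns a copy into a read-modify-write. */
        dst[i] = (uint8_t)((dst[i] & (uint8_t)~mask) | (src[i] & mask));
    }
}
```

With the mask left at all ones the routine degenerates to a plain copy. Any narrower mask forces the per-pixel merge, which is the pessimisation described above.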

It is "why do I need this code" that makes the difference. The analysis here should have shown "No, I do not need this code". In fact, this would probably have been more obvious if the code had been in a high-level language in the first place - instead the author got lost in details.

kolla · « **Reply #8 on:** October 01, 2015, 05:21:54 AM »

I just want a console-handler with a wee bit of scroll back buffer and tab completion that can be put in kickstart even on a 68000, for those occasions when I boot without startup-sequence. Or rather, tab-expansion in shell and only scroll back buffer in console-handler. For what it is worth, AROS has something like this already, "real" AmigaOS is lagging behind these days.

matthey · « **Reply #9 on:** October 01, 2015, 06:32:20 AM »