Hi Matt,
Whilst I appreciate the sentiment, the project in question is the Amiga OS3.x 680x0 backend implementation of my C++ framework.
Hence portability of this particular code is not an issue. C++ versions of this code already exist in the "generic" part of the tree, the asm used here replaces these for speed gains *specifically* on the 680x0 - other implementations may or may not have similarly optimised code.
Now, the asm used therein is sparse but essential for performance. It is used in the memory copy / set / endian swap routines. No matter how good your compiler or algorithm, there are some things C/C++ just cannot achieve with the same level of performance.
For instance, you can byteswap a 32-bit longword with just 3 instructions in asm thanks to instructions like "rol.w" and "swap", for which C/C++ have no equivalent operators.
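To make that concrete, here is a rough sketch in plain C++ (not the actual code from my tree; all function names are made up for illustration). The first function is the usual shift-and-mask swap a generic branch might use. The rest models, step by step, what the three-instruction 680x0 sequence "rol.w #8,d0 / swap d0 / rol.w #8,d0" does to the register, assuming that is the sequence in question.

```cpp
#include <cstdint>

// Portable shift-and-mask byte swap (hypothetical name), the sort of
// thing a generic C++ implementation has to fall back on.
inline std::uint32_t swap32_generic(std::uint32_t v) {
    return (v >> 24)
         | ((v >> 8) & 0x0000FF00u)
         | ((v << 8) & 0x00FF0000u)
         | (v << 24);
}

// Models "rol.w #8,d0": rotate only the low 16 bits by 8,
// leaving the upper word untouched.
inline std::uint32_t rol_w8(std::uint32_t v) {
    std::uint32_t lo = v & 0xFFFFu;
    lo = ((lo << 8) | (lo >> 8)) & 0xFFFFu;
    return (v & 0xFFFF0000u) | lo;
}

// Models "swap d0": exchange the upper and lower 16-bit words.
inline std::uint32_t swap_words(std::uint32_t v) {
    return (v << 16) | (v >> 16);
}

// The three-step sequence: rol.w #8 / swap / rol.w #8.
inline std::uint32_t swap32_m68k_style(std::uint32_t v) {
    v = rol_w8(v);      // 0xAABBCCDD -> 0xAABBDDCC
    v = swap_words(v);  //            -> 0xDDCCAABB
    v = rol_w8(v);      //            -> 0xDDCCBBAA
    return v;
}
```

Both routines give the same result; the point is that each helper in the second version is a single instruction on the 680x0, whereas in C++ each has to be spelled out with shifts and masks and you rely on the compiler to recognise the pattern.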
I also have some pixel conversion routines etc. that are up to 5x slower in the generic C++ code than in the asm, despite loop unrolling, longword-only accesses and so on, again simply due to the lack of rotate instructions.
I'm moving the code to gcc, purely so the generic branch of it is easier to port cross platform and more advanced C++ features can be used.
Any given backend, be it Amiga680x0/PPC, AROS, Linux, Win32, etc., will always use whatever it can to get the best performance; only the public front needs to be completely platform independent.