And why do you bother? Is this in a speed-critical part of the device? Even worse, the assembler function does not go through an interface, hence anything that might sit on top of SetWriteMask() is ignored.
The macros are perfectly legal and they are a lot faster. Is it better coding practice to use slower code for little if any benefit? It is true that the macros don't go through the LVO, so a setpatch couldn't replace the code, but this is true of much of the AmigaOS. It didn't stop P96/CGFX from patching everything. Your newest layers.library still uses the Forbid/Permit macros in most places instead of the function calls. Also, some of your layers.library function calls go through the LVO and some use a regular BSR instruction. It looks pretty inconsistent to me.
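To make the macro-versus-LVO point concrete, here is a minimal sketch in plain C. It does not use the real AmigaOS headers: `FakeRastPort`, `SET_WRITE_MASK_MACRO`, `jump_table` and the other names are hypothetical stand-ins, and the one-entry function pointer table only imitates the idea of a patchable library vector. It shows that a patch installed in the table sees the call that goes through the table but never sees the inline macro.

```c
/* Hypothetical stand-in for a RastPort; not the real AmigaOS structure. */
struct FakeRastPort {
    unsigned char Mask;
};

/* Macro path: writes the field inline, so no call is ever made. */
#define SET_WRITE_MASK_MACRO(rp, m) ((rp)->Mask = (unsigned char)(m))

typedef void (*SetMaskFn)(struct FakeRastPort *, unsigned char);

/* Original implementation behind the (imitated) library vector. */
static void default_set_mask(struct FakeRastPort *rp, unsigned char m)
{
    rp->Mask = m;
}

/* Counts how often the patched entry is reached. */
static int patched_calls = 0;

/* Replacement a patch might install, e.g. for a graphics card driver. */
static void patched_set_mask(struct FakeRastPort *rp, unsigned char m)
{
    patched_calls++;
    rp->Mask = m;
}

/* One-entry jump table standing in for the library's vector table. */
static SetMaskFn jump_table[1] = { default_set_mask };

/* Function path: every call is dispatched through the table. */
static void set_write_mask_via_table(struct FakeRastPort *rp, unsigned char m)
{
    jump_table[0](rp, m);
}

/* Installs the patch, uses both paths once, and returns how many
 * of the two calls the patch actually observed. */
int demo_patch_visibility(void)
{
    struct FakeRastPort rp = { 0xFF };

    jump_table[0] = patched_set_mask;   /* imitate patching the vector */
    patched_calls = 0;

    SET_WRITE_MASK_MACRO(&rp, 0x01);    /* bypasses the patch */
    set_write_mask_via_table(&rp, 0x03); /* reaches the patch */

    return patched_calls;
}
```

Calling `demo_patch_visibility()` reports a single observed call: the macro changed the mask without the patch ever running, which is the trade-off both sides of this exchange are describing.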
Despite all that, the best optimization *here* would have been to simply drop the SetWriteMask() altogether. The reason why the console.device uses this trick is to speed up scrolling in case the current color selection only requires a single plane to be modified, but everything on the PPC does not use planar graphics in the first place, so any attempt to modify the write mask will make the code slower.
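A rough sketch of why the write mask helps on planar hardware, under a stated assumption: the set of bitplanes that needs touching is taken to be the bitwise OR of the pens in use. Whether console.device computes its mask exactly this way is not confirmed here, and `planes_touched` and `count_planes` are hypothetical helper names.

```c
#include <stdint.h>

/* Assumption: a bitplane can only hold nonzero data for this text if
 * the corresponding bit is set in the foreground or background pen. */
uint8_t planes_touched(uint8_t fg_pen, uint8_t bg_pen)
{
    return (uint8_t)(fg_pen | bg_pen);
}

/* Number of bitplanes selected by a write mask. */
int count_planes(uint8_t mask)
{
    int n = 0;
    while (mask) {
        n += mask & 1;
        mask >>= 1;
    }
    return n;
}
```

With pen 1 on pen 0, `planes_touched(1, 0)` selects one plane, so a scroll moves one plane's worth of data instead of the full depth. On a chunky display each pixel is stored as a whole value, so there is no per-plane work to skip.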
AmigaOS 4.x can be used on a CSPPC with AGA graphics so maybe it is not completely useless. There is sometimes a cost to backward compatibility.
So yes, the translation to C code is probably well-intended, but the 1/3 speed of a C function call is surely not relevant here. What is relevant is that this function is - for any sane graphics organization - useless, and the whole part of the code that depends on it would have been better moved to the trash in the first place.
A couple of functions aren't going to make much of a difference. I just found it funny that the example wasn't consistent with the article comments. It is also funny that the 68k AmigaOS, with all its optimizing for space, really wasn't so well optimized either, judging from their example.
The speed-critical part was not the function call, but the problem it solved in the first place, a problem absent on the target platform.
Again, this rather shows that such a low-level view of optimization is highly misleading.
Yeah, but you wouldn't have removed the call because the code is working. I know you too well. Making the safe optimizations to gain what I could while leaving planar support intact sounds pretty good to me. It would be better if the code was in C with the programmer using the macros and the compiler doing the optimizations though. This is not speed-critical code after all.
With C++ code, the code would have been better, and probably faster than the C code, ...
Not likely faster. Not likely at all. Better is debatable.