Now, what exactly will SetWriteMask() do in this situation? It has the effect of making display updates more costly if an interleaved bitmap is used (which is the standard case). Instead of moving one consecutive chunk of bitmap data, the blitter operations have to be broken up into individual planes again. Available bandwidth is used less effectively. This can be noticeably slower.
It won't be more costly in the standard case, when only one bitplane is scrolled. If more than one is scrolled, the overhead is just waking the CPU up to kick off the next blit, and in some cases you can do it with one blit anyway.
In fact, not calling SetWriteMask() would make "the display updates more costly" in the "standard case" as "available bandwidth is used less effectively" (for the simple reason that you are copying up to 8 times the data with the blitter).
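To put a rough number on the "up to 8 times" point, here is a small back-of-the-envelope sketch. It is plain arithmetic, not an OS call: the helper names planes_in_mask() and blit_bytes() are made up for illustration, and it assumes the write mask carries one bit per bitplane and that the blitter moves every enabled plane of the region in full.

```c
#include <assert.h>

/* Hypothetical helper: count the planes enabled in a write mask,
   assuming one bit per bitplane (at most 8 planes). */
unsigned planes_in_mask(unsigned mask)
{
    unsigned count = 0;
    for (mask &= 0xFFu; mask != 0; mask >>= 1)
        count += mask & 1u;
    return count;
}

/* Hypothetical helper: bytes the blitter has to move to scroll a region
   of the given size when only the planes in `mask` are written. */
unsigned long blit_bytes(unsigned long bytes_per_row, unsigned long rows,
                         unsigned mask)
{
    unsigned planes = planes_in_mask(mask);
    assert(planes >= 1 && planes <= 8);
    return bytes_per_row * rows * planes;
}
```

For a 640x200 region (80 bytes per row per plane), blit_bytes(80, 200, 0x01) gives 16000 bytes for a single plane, while blit_bytes(80, 200, 0xFF) gives 128000 bytes for all eight planes, which is where the factor of 8 comes from.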
The way the autodoc is written, I would expect the call to have an effect only on planar screens, so there isn't a clear benefit to removing it.