You are moving 16 bits of one plane, so after operating on all 8 planes you have moved 16 bytes, which corresponds to 16 pixels. That is the same number you move in chunky pixel mode.
16 bits of one plane only equals 16 pixels in 1 bit per pixel mode. If you have more than 1 bit per pixel, you're handling parts of pixels. Part of pixel != pixel :p
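To make the "part of pixel" point concrete, here is a rough C sketch, assuming 8 bitplanes and the usual Amiga convention that the leftmost pixel sits in the most significant bit of each word. The function name planar_pixel is made up for illustration; it is not from any real library.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: rebuild one 8-bit pixel from 8 bitplane words.
   planes[p] carries bit p of 16 horizontally adjacent pixels, so a
   single plane word never holds a complete pixel at 8 bits per pixel.
   Assumption: leftmost pixel (x = 0) is the most significant bit. */
static uint8_t planar_pixel(const uint16_t planes[8], int x)
{
    assert(x >= 0 && x < 16);

    uint8_t value = 0;
    for (int p = 0; p < 8; p++) {
        /* Pick pixel x's bit out of plane p ... */
        uint16_t bit = (uint16_t)((planes[p] >> (15 - x)) & 1u);
        /* ... and place it at bit position p of the pixel value. */
        value |= (uint8_t)(bit << p);
    }
    return value;
}
```

So one 16-bit plane word gives you one bit of each of 16 pixels, and you only have 16 whole pixels once all 8 words have been handled.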
That is the only situation where planar does require extra bandwidth.
Which is a very common situation: blitter-based sprites (bobs).
While it's nice to have everything as fast as possible, it would usually only be statistically significant on small blits whose time is mostly taken up by the blitter startup.
It's absolutely NOT insignificant. When you're blitting 16-pixel-wide bobs (quite common), you're wasting half the blitter time on the 16 extra bits per bob.
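The arithmetic behind "half the blitter time" can be sketched in C like this. It assumes the usual case of one extra 16-bit word per row, per plane, to catch the bits shifted out when a bob is placed at an arbitrary x position. Both function names are made up for illustration.

```c
#include <assert.h>

/* Words handled per row, per plane, for a bob width_px pixels wide
   (a multiple of 16). Assumption: a bob that may land at any x
   position needs one extra word per row for the shifted-out bits. */
static int bob_words_per_row(int width_px, int shifted)
{
    assert(width_px > 0 && width_px % 16 == 0);
    return width_px / 16 + (shifted ? 1 : 0);
}

/* Share of the handled words that is the extra shift word, as a
   whole-number percentage (rounded down). */
static int shift_overhead_percent(int width_px)
{
    return 100 / bob_words_per_row(width_px, 1);
}
```

Under these assumptions a 16-pixel bob handles 2 words per row instead of 1, so half of the work is the extra word. For a 32-pixel bob it drops to a third, and it keeps shrinking as the blit gets wider.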
Anyway, assuming AGA chunky, I was wrong, because the blitter would still have to access whole 32-bit words in memory to get to the individual bytes. The fact that it would get rid of the need for extra shift bits is irrelevant. Only a faster blitter would make sense.