As I said, in this instance it's wasted as a speed optimisation, since it affects the first iteration only. Surely it's a size optimisation, then: it would save 2 bytes overall compared to a bra.b into the loop body after the add instruction, right?
Yeah. But in cases where two sets of instructions are equally fast, if one of them needs fewer memory/cache instruction fetches, the smaller code is generally faster. This is highly academic though, as it depends heavily on the state of the cache and the total size of the code being executed, as well as the target CPU. Anyway, in essence, a size optimization that doesn't slow down execution can be considered a speed optimization, too.
Given that on the 060 it may invoke an exception, it's hardly a speed optimisation anymore.
Well, it was OK for the 68020/68030 at least, and possibly for the 68040. But if the code is to be executed on a 68060, it's obviously not recommended.