The main reason a NOP can make an executable faster is that it can pad the code so that a loop starts on a cache-line boundary in the instruction cache. An aligned loop spans the fewest possible cache lines, so it takes fewer line fills to load and is more likely to stay entirely in the cache while it runs, which can speed things up substantially. The instruction cache on an '040 is 4K and on an '060 it is 8K, both with 16-byte lines.
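To make the alignment arithmetic concrete, here is a rough sketch in Python. It assumes a 16-byte cache line and the 2-byte 68k NOP (opcode $4E71); the function names and the example addresses are made up for illustration, not taken from any assembler or tool.

```python
CACHE_LINE_BYTES = 16  # assumed instruction-cache line size ('040 and '060)
NOP_BYTES = 2          # a 68k NOP ($4E71) is a single 16-bit word


def nops_to_align(loop_addr, line=CACHE_LINE_BYTES):
    """Number of NOPs to place before the loop so it starts on a line boundary."""
    if loop_addr % NOP_BYTES:
        raise ValueError("68k instructions must start on an even address")
    pad_bytes = -loop_addr % line  # distance up to the next line boundary
    return pad_bytes // NOP_BYTES


def lines_touched(start, length, line=CACHE_LINE_BYTES):
    """Number of cache lines covered by `length` bytes of code at `start`."""
    first = start // line
    last = (start + length - 1) // line
    return last - first + 1


if __name__ == "__main__":
    addr = 0x100A   # hypothetical address of the loop's first instruction
    size = 16       # hypothetical loop body size in bytes
    n = nops_to_align(addr)
    print(n, lines_touched(addr, size), lines_touched(addr + n * NOP_BYTES, size))
```

In this made-up case a 16-byte loop at $100A straddles two cache lines; three NOPs in front of it move it to $1010, where it fits in one. The NOPs sit before the loop entry, so they execute once rather than on every iteration.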