Well, first of all, 68040 branch prediction (speculative execution) is turned off, which hurts performance quite a bit.
Second, all memory accesses (reads and writes) go directly to memory, bypassing the cache. This makes the CPU stall a lot while it waits for each access to finish.
Third, because of those direct memory accesses, the 68040 pipeline (which is deeper than the 030's) probably can't be kept full all the time; that is, the CPU is starved.
Fourth, the 68040's improved performance depends quite a bit on the copyback cache. Disable it and performance suffers a lot.
In short, the 68040 depends on the cache to move data a cache line (16 bytes) at a time and to stay efficient. Take that away and the CPU limps badly. I doubt the 68040 was really optimized for the case where the caches are deliberately turned off.
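To put a rough number on the line-at-a-time point, here is a toy model that only counts bus transactions for a sequential read. It is a sketch, not a timing model: the function name and the 4-byte (longword) transfer size for uncached accesses are my assumptions, and it ignores writes, misalignment and actual cycle counts.

```python
LINE_BYTES = 16  # 68040 cache line size, as stated above
BUS_BYTES = 4    # assumption: one longword per uncached access


def bus_transactions(total_bytes, cache_enabled):
    """Count bus transactions for a sequential read of total_bytes.

    Hypothetical helper for illustration only.
    """
    if cache_enabled:
        # One line fill per 16-byte line touched (ceiling division).
        return -(-total_bytes // LINE_BYTES)
    # Cache off: every longword is fetched as its own access.
    return -(-total_bytes // BUS_BYTES)


cached = bus_transactions(4096, cache_enabled=True)
uncached = bus_transactions(4096, cache_enabled=False)
print(cached, uncached)
```

In this model, reading 4 KB sequentially takes four times as many separate bus transactions with the cache off, and each of those is a point where the CPU may sit waiting.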
The 68030, on the other hand, is basically a 68020 + data cache + memory management unit. It's a far simpler design and doesn't depend on the cache that much.