It's not like we can't detect the machine we are running on and use a different codepath for each architecture. Best of both worlds.
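For what it's worth, here is a rough C sketch of that kind of dispatch: pick a routine once, based on the detected CPU, and call it through a function pointer afterwards. The names `cpu_kind`, `select_sum`, `sum_generic` and `sum_unrolled` are made up for illustration, and the "tuned" routine is only a stand-in for real per-CPU code. On AmigaOS the detection itself would presumably come from the `AttnFlags` field in ExecBase, but that part is left out here.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical CPU classes; a real program would derive this from the OS. */
typedef enum { CPU_68000, CPU_68020, CPU_68040 } cpu_kind;

typedef uint32_t (*sum_fn)(const uint16_t *v, size_t n);

/* Baseline routine that is safe on every CPU. */
static uint32_t sum_generic(const uint16_t *v, size_t n)
{
    uint32_t total = 0;
    for (size_t i = 0; i < n; i++)
        total += v[i];
    return total;
}

/* Stand-in for a routine tuned for 68020 and up (unrolled four ways). */
static uint32_t sum_unrolled(const uint16_t *v, size_t n)
{
    uint32_t total = 0;
    size_t i = 0;
    for (; i + 4 <= n; i += 4)
        total += (uint32_t)v[i] + v[i + 1] + v[i + 2] + v[i + 3];
    for (; i < n; i++)          /* leftover elements */
        total += v[i];
    return total;
}

/* Choose the code path once; callers keep the returned pointer. */
static sum_fn select_sum(cpu_kind cpu)
{
    switch (cpu) {
    case CPU_68000:
        return sum_generic;
    default:
        return sum_unrolled;
    }
}
```

Both routines return the same result for the same input, so the caller never needs to know which one it got.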
We could if we coded for a static VM and let the final optimization take place at install time.
Do we currently have such tools? How fast would that be on a 68020/30 compared to hand-optimized code?
About all that exists at this point is compiler middleware like LLVM or fixed-function compilers like GCC and VBCC. But I know LLVM has a PBQP register allocator that is smart enough to stuff multiple small values into one large register when it makes sense to do so, for example:
move.w (var1,a0),d0 ; load var1
swap d0 ; move it into the top half of d0
move.w (var2,a0),d0 ; load var2 into the bottom half of d0
and so on. Compilers can be smart if they are programmed to be.
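In C terms, the trick in that snippet amounts to something like the following. This is only a sketch of the idea, not what LLVM actually emits, and the helper names `pack16`, `unpack_hi` and `unpack_lo` are invented for the example.

```c
#include <stdint.h>

/* Put two 16-bit values into one 32-bit word, the way the
   move.w / swap / move.w sequence does: hi ends up in the top
   half, lo in the bottom half. */
static uint32_t pack16(uint16_t hi, uint16_t lo)
{
    return ((uint32_t)hi << 16) | lo;
}

/* Get the top half back (a swap followed by a word read on the 68k). */
static uint16_t unpack_hi(uint32_t w)
{
    return (uint16_t)(w >> 16);
}

/* Get the bottom half back (a plain word read). */
static uint16_t unpack_lo(uint32_t w)
{
    return (uint16_t)(w & 0xFFFFu);
}
```

The point is that one 32-bit register ends up doing the work of two, which matters when registers are scarce.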
Once implemented, a virtual machine based on this could make a difference for anybody with generally unsupported hardware! This technology is already in use by the PNaCl VM in the Google Chrome browser, but for little-endian machines only. This is why low-level coding is dying out: not only is high-level code cheaper to write, but the optimization can be automated!
I've been pushing for this stuff since 1998, and if Amiga, Inc. hadn't been so pig-headed, we'd have had this on the Amigas by now! AmigaDE could have made the Amiga much more compatible without the need to alter the hardware. Since LLVM is Apple-supported open-source freeware, and the backend used by the PNaCl VM is too (except Google-supported), it is almost within reach again!