On one hand, a CPU that was compatible with 68000-68060 + CF software would be nice. On the other hand, it is not actually possible to do that 100%, especially because of stack frame & MMU differences.
The 68000-68060 were not compatible with each other either when it comes to stack frames and the MMU. Adding "normal" instructions that operate the same way as the existing ones would not affect these. It's the user-level compatibility that we want, and it's quite good...
"In most cases, an instruction/addressing mode which does exist in ColdFire behaves exactly like its 680x0 equivalent, which makes it easy for experienced 680x0 programmers to understand ColdFire code. It also means that user-mode code written for ColdFire can generally run unchanged on a 680x0 processor, provided the new ColdFire-only instructions are not used.
However, there are a few subtle cases where the ColdFire instruction is not exactly the same as its 680x0 counterpart. The most important of these is that multiply instructions (MULU and MULS) do not set the overflow bit. This means that a 680x0 code sequence which checks for overflow on multiply may assemble and run under ColdFire, but give incorrect results.
ASL and ASR also differ in that they do not set the overflow bit - but this is less likely to cause problems for real programs!"
MULU/MULS/ASL/ASR will not be a compatibility problem for 68k software, as the overflow flag will continue to be set the 68k way. ColdFire programs would be slightly incompatible because of this, but it's extremely rare for a program to use the overflow flag after these instructions. It's entirely possible to make a 68k+CF CPU that's more compatible with the 68k line than the 68060 was.
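To make the flag difference concrete, here's a rough C model of a 32x32->32 unsigned multiply (MULU.L) with the two overflow behaviours. It's only a sketch: the struct and function names are made up for illustration and don't come from any real emulator, and it assumes the 68020+ rule that V is set when the product doesn't fit in 32 bits, versus ColdFire always clearing V.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model type: the value written to Dn plus the V flag. */
typedef struct {
    uint32_t result; /* low 32 bits of the product */
    bool v;          /* overflow flag after the instruction */
} mul_result;

/* 68020+ style MULU.L: V is set if the full product needs more than 32 bits. */
mul_result mulu_l_68k(uint32_t a, uint32_t b) {
    uint64_t full = (uint64_t)a * b;
    mul_result r = { (uint32_t)full, (full >> 32) != 0 };
    return r;
}

/* ColdFire style MULU.L: same low 32 bits, but V is always cleared. */
mul_result mulu_l_coldfire(uint32_t a, uint32_t b) {
    uint64_t full = (uint64_t)a * b;
    mul_result r = { (uint32_t)full, false };
    return r;
}

/* Small self-check: the results match, only the flag differs on overflow. */
void mul_demo_checks(void) {
    mul_result k = mulu_l_68k(0x10000u, 0x10000u);
    mul_result c = mulu_l_coldfire(0x10000u, 0x10000u);
    assert(k.result == c.result); /* both truncate to the same value */
    assert(k.v && !c.v);          /* a "bvs" after the multiply only fires on 68k */
}
```

So only code that actually branches on V after a multiply (or ASL/ASR) sees a difference. A 68k+CF CPU that keeps the 68k flag behaviour runs existing 68k code as before, and ColdFire code that never tests V is unaffected too.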
Extending the instruction set beyond what is available now is a dangerous game though. I wouldn't necessarily even add CF, I'd spend the time and gates on getting the instruction rate up.
No, it's not dangerous. You would need to double the instruction rate to do the work of an mvs or mvz instruction, and they would be common. You would need to at least triple the instruction rate to do the work of a byterev, which is less common but used intensively in some drivers and in data conversions for loaders, pictures, etc. The code size reduction also lets the cache be used more effectively and shortens branch displacements, improving overall efficiency. The CF instructions also make the job of developers and compiler writers easier.
In terms of AGA enhancements, higher bandwidth and chunky pixels are all you really need. Copper and blitter etc can stay register compatible.
That's true. It's easy to keep the old and add more while staying compatible. The same applies to the instruction set. Poorly written software will find a way to break no matter what. It will have to be fixed rather than the hardware being held back.
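For anyone not familiar with these ColdFire instructions, here's a minimal C sketch of what mvz, mvs and byterev compute. The function names are just illustrative. The 68k sequences in the comments are the usual idioms for the same job and are where the "double" and "triple" figures come from. Treat them as typical examples, not the only way to code it.

```c
#include <stdint.h>

/* mvz.b <ea>,Dn: zero-extend a byte to 32 bits.
   Typical 68k equivalent (2 instructions): moveq #0,d0 / move.b (a0),d0 */
uint32_t mvz_b(uint8_t src) { return (uint32_t)src; }

/* mvz.w <ea>,Dn: zero-extend a word to 32 bits. */
uint32_t mvz_w(uint16_t src) { return (uint32_t)src; }

/* mvs.b <ea>,Dn: sign-extend a byte to 32 bits.
   Typical 68k equivalent (2 instructions): move.b (a0),d0 / extb.l d0 (68020+)
   The xor/subtract trick does the sign extension in unsigned arithmetic. */
uint32_t mvs_b(uint8_t src) { return ((uint32_t)src ^ 0x80u) - 0x80u; }

/* mvs.w <ea>,Dn: sign-extend a word to 32 bits.
   Typical 68k equivalent (2 instructions): move.w (a0),d0 / ext.l d0 */
uint32_t mvs_w(uint16_t src) { return ((uint32_t)src ^ 0x8000u) - 0x8000u; }

/* byterev Dn: reverse the four bytes of a 32-bit register.
   Typical 68k equivalent (3 instructions): rol.w #8,d0 / swap d0 / rol.w #8,d0 */
uint32_t byterev(uint32_t x) {
    return (x >> 24)
         | ((x >> 8) & 0x0000FF00u)
         | ((x << 8) & 0x00FF0000u)
         | (x << 24);
}
```

For example, byterev(0x11223344) gives 0x44332211, and mvs_b(0x80) gives 0xFFFFFF80 where mvz_b(0x80) gives 0x00000080.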