A cache breaks compatibility, but if you go with a unified cache with snooping, self-modifying code is even possible (to a certain extent: you have to take the instruction prefetch and the pipeline depth into account).
Self-modifying code should work with snooping but would be slow. With writethrough caches, the caches only need to be invalidated, since there is no dirty data to write back (writethrough caches are probably not much slower than copyback caches, given the fast memory and high memory bandwidth). It should be possible to have better self-modifying code and cache compatibility than the 68040 or 68060 while having much larger caches.
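To make the snooping idea concrete, here is a minimal sketch of a toy model, not a description of any real core. It assumes a direct-mapped instruction cache and a writethrough store path; the names `SnoopedICache`, `snoop_write` and `store` are made up for illustration. It ignores the instruction prefetch and pipeline depth mentioned above, which still limit what self-modifying code can get away with.

```python
class SnoopedICache:
    """Toy direct-mapped instruction cache that snoops stores (hypothetical model)."""

    def __init__(self, memory, num_lines=4):
        self.memory = memory                # backing store: list of instruction words
        self.num_lines = num_lines
        self.tags = [None] * num_lines      # address held by each line, None if invalid
        self.data = [None] * num_lines

    def fetch(self, addr):
        """Return the instruction at addr, refilling the line on a miss."""
        idx = addr % self.num_lines
        if self.tags[idx] != addr:          # miss: refill from memory
            self.tags[idx] = addr
            self.data[idx] = self.memory[addr]
        return self.data[idx]

    def snoop_write(self, addr):
        """Invalidate the line if a store hits a cached address."""
        idx = addr % self.num_lines
        if self.tags[idx] == addr:
            self.tags[idx] = None


def store(memory, icache, addr, value, snoop=True):
    """Writethrough store: memory is updated at once, so there is nothing to push back."""
    memory[addr] = value
    if snoop:
        icache.snoop_write(addr)            # only an invalidate is needed


memory = ["nop", "add", "sub", "rts"]
icache = SnoopedICache(memory)
icache.fetch(1)                             # "add" is now cached

store(memory, icache, 1, "mul", snoop=False)
stale = icache.fetch(1)                     # without snooping the old opcode is still served

store(memory, icache, 1, "mul", snoop=True)
fresh = icache.fetch(1)                     # with snooping the line is refilled from memory
```

Without snooping the fetch keeps returning the old instruction; with snooping the store invalidates the line and the next fetch picks up the modified code, at the cost of an extra miss. That extra miss is one reason self-modifying code would be slow.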
If the Amiga chipset is implemented inside the FPGA, you can even snoop DMAs and keep cache coherency over Chip RAM.
Yeah, there are several issues with the Amiga chipset that should be solvable with the chipset and CPU in the same FPGA. Maybe even multi-threading/SMD would work with a little trickery. It's easy to duplicate an FPGA core.
Due to the way Exec detects the CPU, you can have a core with a 68000 exception frame and '020 user instructions (long branches, bitfields, 64-bit MUL/DIV and extra EAs).
This is basically the way the Phoenix core in the Vampire will work, although some of the 68020 features don't fit in the Cyclone II of the Vampire. There is no 64-bit MUL/DIV (although 32-bit longword versions were added), and it's missing most if not all bitfield instructions and most if not all double indirect addressing modes. The specs may change.
By the way, the IPC of Phoenix will be limited by having only 1 integer pipe but should be close to 1. Simulation has shown that much more is possible with more pipes (each pipe can be stronger than the 68060). I don't think there is any way that more than 3 integer pipes would be useful. The instruction fetch becomes large, there are too many memory accesses among 3 instructions, and the CPU clock speed drops somewhat with each added pipe. Even 3 pipes may not be an advantage, although a few tricks may make this possible without OoO execution.
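The trade-off between extra pipes and clock speed can be put into rough numbers. The sketch below is only an illustration: the IPC gain per extra pipe and the 10% clock penalty per extra pipe are made-up assumptions, not Phoenix simulation results, and `relative_throughput` is a hypothetical helper.

```python
def relative_throughput(pipes, ipc_gains, clock_penalty):
    """Throughput relative to a 1-pipe core: IPC times relative clock speed."""
    ipc = 1.0 + sum(ipc_gains[:pipes - 1])          # each extra pipe adds less IPC
    clock = (1.0 - clock_penalty) ** (pipes - 1)    # each extra pipe slows the clock
    return ipc * clock


# ASSUMED numbers, for illustration only
ipc_gains = [0.5, 0.2, 0.05]    # IPC added by the 2nd, 3rd and 4th pipe
clock_penalty = 0.10            # clock speed lost per extra pipe

results = {n: relative_throughput(n, ipc_gains, clock_penalty) for n in range(1, 5)}
best_pipes = max(results, key=results.get)

for n, value in results.items():
    print(f"{n} pipe(s): {value:.3f}x")
```

With these assumed numbers the third pipe is only marginally ahead of the second, and a fourth pipe is a net loss. Different assumptions would move the crossover point.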