Also, I have been thinking of a way to make the instruction translation do branch predication in the case where a conditional branch skips only a few instructions.
Be careful with predication on the 68k. It might sometimes be possible to get it to work for 1 conditional instruction. It doesn't work well with multiple instructions, multicycle instructions, or addressing modes that update the base register like (An)+ and -(An). The instructions to be predicated end up having to be examined for suitability. IMO, this would only be worthwhile with very common code. Imagine handling this:
beq skip
movem.l d0-d7/a0-a6,-(sp)
skip:
move.l d0,-(sp)
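The suitability check described above can be sketched roughly as follows. This is only an illustration in Python, not how any real core does it: the function name `is_predication_candidate`, the `MULTICYCLE` set (deliberately incomplete) and the text-based matching are all assumptions made for the example.

```python
import re

# Illustrative, incomplete set of multicycle mnemonics (an assumption).
MULTICYCLE = {"movem", "mulu", "muls", "divu", "divs"}

# Matches (An)+ / (sp)+ and -(An) / -(sp), the modes that update the base register.
UPDATES_BASE = re.compile(r"\(\s*(?:a[0-7]|sp)\s*\)\+|-\(\s*(?:a[0-7]|sp)\s*\)")

def is_predication_candidate(skipped):
    """Return True if the instructions skipped by a short branch look safe to predicate."""
    if len(skipped) != 1:                 # only a single skipped instruction
        return False
    text = skipped[0].strip().lower()
    mnemonic = text.split()[0].split(".")[0]
    if mnemonic in MULTICYCLE:            # e.g. movem
        return False
    if UPDATES_BASE.search(text):         # side effect on the base register
        return False
    return True
```

With this sketch, the movem example above is rejected twice over: it is multicycle and it uses -(sp).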
The N68k FPGA CPU is supposedly conditional 3-op internally, which makes predication easier. There were enough problems on the 68k that we decided adding SBcc and SELcc was easier. Even this takes some logic, but the 68k already has Scc, which is handled in much the same way.
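To show why a select fits next to Scc, here is a small Python model. `scc` follows the documented 68k behaviour (the destination byte becomes $FF if the condition is true, $00 otherwise). The operand form of SELcc is not given here, so `select_via_mask` and its signature are purely an assumption about what a select-style operation would compute.

```python
MASK32 = 0xFFFFFFFF

def scc(cond):
    """Model 68k Scc on a byte: $FF if the condition is true, $00 otherwise."""
    return 0xFF if cond else 0x00

def sign_extend_byte(b):
    """Sign-extend a byte to 32 bits, like extb.l on the 68020 and later."""
    return ((b ^ 0x80) - 0x80) & MASK32

def select_via_mask(cond, a, b):
    """Hypothetical select: return a if cond holds, else b, without a branch."""
    mask = sign_extend_byte(scc(cond))    # all ones or all zeros
    return (a & mask) | (b & ~mask & MASK32)
```

The point is only that the same condition evaluation that feeds Scc can drive a select, so no branch is needed.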
Actually something just occurred to me. If the most common instruction is "tst", it should be possible to know whether a branch will be taken or not some time in advance. Because "tst" only looks at a single register, the contents of that register must have been determined some time before. So you could look ahead in the instruction queue for a "tst/bcc", and inform the branch predictor well in advance. "tst" instruction then takes effectively NO cycles.
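A rough Python sketch of that look-ahead idea: scan the queue for a tst followed by a branch, and resolve it early only if no older queued instruction still has to write the tested register. The `Insn` record, the function name and the restriction to beq/bne are all assumptions for illustration, and the sketch ignores older unresolved branches and memory operands.

```python
from typing import NamedTuple

class Insn(NamedTuple):
    op: str              # mnemonic, e.g. "tst", "beq", "move"
    reads: tuple = ()    # registers read
    writes: tuple = ()   # registers written

def resolve_tst_branch(queue, regs):
    """Return (index of branch, taken) for the first tst/bcc pair that can be
    resolved from current register values, or None if there is none."""
    pending_writes = set()   # registers written by older queued instructions
    for i in range(len(queue) - 1):
        insn, nxt = queue[i], queue[i + 1]
        if (insn.op == "tst" and nxt.op in ("beq", "bne")
                and insn.reads[0] not in pending_writes):
            zero = regs[insn.reads[0]] == 0
            return i + 1, (zero if nxt.op == "beq" else not zero)
        pending_writes.update(insn.writes)
    return None
```

If the tested register is still waiting on an older write, the function returns None and the normal predictor has to be used.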
The 68000 (16 bit) code in a console is going to be very different from 68060 optimized code for a dynamic OS today. I very much doubt TST is going to be number 1 any more. I expect MOVE to be #1. MOVE sets the condition codes, so a TST should not be needed too often with optimized code. Folding a TST, CMP, or SUB/SUBQ with a branch is something the 68060 does to help achieve 0 cycle branch prediction, although I don't know which of these it folds specifically. TST has a higher likelihood of testing a register that has not been modified for a time than MOVE, which sets the cc. Many processors do try to determine the branch rather than predict it. The PPC is especially good at this. It also provides several cc's that can be selectively set and branched on later. Most PPC processors have a fairly short pipeline too, so branching on a condition set 3 or 4 instructions ago, or testing and immediately branching on a register that hasn't changed recently, may be enough to determine the branch without prediction. It probably helps, especially if the compilers can generate good code, but it obviously hasn't helped PPC destroy x86 like was predicted 20 years ago.
Not strictly true. Can also do "cmp (Ax)+,(Ay)+"
addx, subx, abcd and sbcd can use predecrement for both operands.
All of these are two cycle instructions.
Yes, they are more complex on the 68060, but no, they don't use 2 EAs. They are special cases that do not calculate even 1 EA. The plus of (An)+ is added after the EA is used and is not part of the calculation.
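The ordering can be modelled in a few lines of Python. This is only a sketch: the helper names are made up, memory is a plain bytes object, and the 68k special case where byte accesses through A7 adjust by 2 is ignored.

```python
MASK32 = 0xFFFFFFFF

def read_postinc(mem, regs, an, size):
    """(An)+ : the address is simply the current An; An is bumped afterwards."""
    addr = regs[an]                          # no address arithmetic needed
    regs[an] = (addr + size) & MASK32        # update happens after the use
    return int.from_bytes(mem[addr:addr + size], "big")

def read_predec(mem, regs, an, size):
    """-(An) : An is decremented first, then used as the address."""
    regs[an] = (regs[an] - size) & MASK32
    addr = regs[an]
    return int.from_bytes(mem[addr:addr + size], "big")
```

In both cases the address that goes out is just a register value, which is the sense in which no EA is calculated.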