The problem with CF68Klib, if you check the documentation, is that even in supervisor mode it doesn't trap instructions that run differently on the Coldfire than they do on the 68K.
I'm sure there's a way around this, but while CF68Klib looks promising, it may not be the only answer.
The best bet is probably a simple tracing JIT translator in which most instructions translate 1-to-1. You could do it mostly "in place": scan the code block by block, rewrite any offending instructions, and replace Bcc, JSR, JMP etc. with traps wherever you can't show that they point to "safe" (already JIT'ed) code. Then you jump to the code. If and when you hit one of those traps, you continue JIT'ing from the branch target and patch the instruction that brought you there back to its original.
As an example, let's assume this completely bogus sequence:
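To make the scan / trap / resume cycle a bit more concrete, here is a toy model in Python. It is only a sketch: the `Insn` record, the `"plain"` / `"broken"` / `"branch"` kinds and the function names are all made up for illustration, instructions are abstract records rather than real 68K encodings, and a block is simply assumed to end at the first branch.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Insn:
    kind: str                     # hypothetical kinds: "plain", "broken", "branch"
    size: int                     # length in bytes
    target: Optional[int] = None  # branch destination, if any

def translate_block(code, start, translated, saved):
    """Scan one block in place, starting at address `start`."""
    pc = start
    while True:
        insn = code[pc]
        translated.add(pc)
        if insn.kind == "broken":
            # Stands in for rewriting the instruction (e.g. a BSR to a stub).
            code[pc] = Insn("patched", insn.size)
        elif insn.kind == "branch":
            if insn.target not in translated:
                # Target not known to be safe yet: remember the original
                # instruction and put a trap in its place.
                saved[pc] = insn
                code[pc] = Insn("trap", insn.size)
            return  # toy assumption: every branch ends the block
        pc += insn.size

def on_trap(code, trap_addr, translated, saved):
    """Trap handler: translate the target, then restore the original branch."""
    original = saved.pop(trap_addr)
    translate_block(code, original.target, translated, saved)
    code[trap_addr] = original
```

After `translate_block` runs on the entry point, every branch to untranslated code has become a trap; each call to `on_trap` translates one more block and puts the original branch back, so that path never traps again.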
MOVE.L D0,-(SP)
SOME_BROKEN_INSTRUCTION
RTS
You'd decode all three instructions, then rewrite it like this:
MOVE.L D0,-(SP)
BSR some_free_location ; unless SOME_BROKEN_INSTRUCTION is long enough for you to be able to "emulate" it inline, in which case you have it easy
RTS
some_free_location:
[code to emulate SOME_BROKEN_INSTRUCTION]
RTS
The biggest problem is if SOME_BROKEN_INSTRUCTION is too short to provide enough space to branch elsewhere, in which case you have two choices: resort to forcing a trap, or rewrite the following instructions too. The latter quickly makes things trickier, as you then have to deal with rewriting any branches etc. that point to the later instructions you move.
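The space question can be sketched in a few lines of Python. The encodings used are the standard 68K ones: BSR with an 8-bit displacement is one word (0x61dd), BSR with a 16-bit displacement is two words (0x6100 followed by the displacement), TRAP #n is one word (0x4E40 + n) and NOP is 0x4E71. Everything else is an assumption made for illustration: the function name, the choice of trap vector 15, padding leftover space with NOPs, and the stub living at an even address.

```python
import struct

def encode_patch(insn_addr, insn_len, stub_addr, trap_vector=15):
    """Return exactly insn_len bytes to overwrite the offending instruction.

    Hypothetical helper: prefers a BSR to the emulation stub and falls
    back to TRAP #trap_vector when no BSR form fits.
    """
    # 68K branch displacements are relative to the instruction address + 2.
    disp = stub_addr - (insn_addr + 2)
    if insn_len >= 4 and -0x8000 <= disp <= 0x7FFF:
        patch = struct.pack(">Hh", 0x6100, disp)            # BSR.W stub
    elif -128 <= disp <= 127 and disp not in (0, -1):
        # 8-bit displacements 0x00 and 0xFF select the longer BSR forms,
        # so they cannot be used for a short branch.
        patch = struct.pack(">H", 0x6100 | (disp & 0xFF))   # BSR.B stub
    else:
        patch = struct.pack(">H", 0x4E40 | trap_vector)     # TRAP #n
    # Fill any remaining words of the original instruction with NOPs.
    patch += b"\x4E\x71" * ((insn_len - len(patch)) // 2)
    return patch
```

For example, `encode_patch(0x1000, 2, 0x2000)` has only one word to work with and the stub is out of short-branch range, so it falls back to the trap encoding.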
You pay the additional cost of the JIT process, but once critical paths have been JIT'd, it'll run at near optimal native speed. "Near" because you get the extra overhead of potentially having extra branches to account for "patch sites" where there was no space in the original code to plug in the modified instructions, unless you go to the potentially significant extra trouble of rewriting the whole thing.
Note that this is not foolproof. Self-modifying code, or code that intentionally jumps into the middle of an instruction, could still cause trouble and is much harder to deal with.
The JIT could be very fast: for any instruction deemed "safe" it just needs to recognize it and move on to the next one. Recognition can be done with a very small, compact decoder, since large groups of instructions can be discarded as safe with a few simple bit masks.
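A mask-based pre-filter along those lines might look like the Python sketch below. The control-flow patterns are the usual 68K opcode-word encodings (0x6xxx for BRA/BSR/Bcc, 0x4E80 and 0x4EC0 plus an effective-address field for JSR and JMP, 0x4E75 for RTS). The "safe" table is only a placeholder to show the mechanism: which opcode groups really behave identically on the ColdFire has to come from the ColdFire documentation, and the names here are hypothetical.

```python
# Each entry is (mask, value); an opcode word matches when
# (word & mask) == value.

# Placeholder "safe" groups, assumed safe purely for illustration.
SAFE_PATTERNS = [
    (0xF100, 0x7000),  # MOVEQ #imm,Dn
    (0xFFFF, 0x4E71),  # NOP
]

# Control transfers the translator has to look at (standard encodings).
CONTROL_FLOW_PATTERNS = [
    (0xF000, 0x6000),  # BRA / BSR / Bcc
    (0xFFC0, 0x4E80),  # JSR <ea>
    (0xFFC0, 0x4EC0),  # JMP <ea>
    (0xFFFF, 0x4E75),  # RTS
]

def matches(word, patterns):
    return any((word & mask) == value for mask, value in patterns)

def classify(word):
    """Classify the first 16-bit word of an instruction.

    Returns "control" for branches and jumps, "safe" for words covered
    by the safe table, and "inspect" for anything that still needs the
    full decoder.
    """
    if matches(word, CONTROL_FLOW_PATTERNS):
        return "control"
    if matches(word, SAFE_PATTERNS):
        return "safe"
    return "inspect"
```

With a realistic safe table, most opcode words would be dismissed by the first few comparisons and only the remainder would reach the slower, full decoder.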