I'm not surprised that one or two people working at home can beat a commercial development department. Usually those institutions are hampered by red tape, clueless managers who don't understand the task at hand, and so on. Of course, the original creators had to do things in hardware without any good simulation tools. But the former still applies, even now.
lou_dias, cache helps, but the 16-byte cache is far smaller than the original 68030's 256-byte cache. Increasing it from 16 to 256 bytes gives roughly 7% more performance, but costs a lot of FPGA real estate, so it's not worth it. Any further performance gain on the TG68 therefore has to come from another section. The only noteworthy point is that the data cache gave the TG68 a 2x boost. I suspect there are other parts that can be optimized, but one has to stay compatible too. As soon as some software uses tight I/O loops with external hardware or multimedia chips, there might be a problem otherwise.
I see a common problem with both VHDL and Verilog: they both want to look like imperative languages. Personally I don't like Ada, but I do like the ability to do serious clock-edge setups in VHDL. The problem with imperative and even functional languages is that they are designed with sequential processing in mind.
A language which is centered around a logic array configuration would be far superior.
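To make the sequential-versus-parallel point concrete, here is a rough sketch in plain Python (not an HDL; the function names are made up for this example). It assumes two registers that each load the other's value: read as imperative code the statements run one after another, while in clocked logic both registers sample on the same edge and update together.

```python
def sequential_swap(a, b):
    """Imperative reading: each statement sees the result of the previous one."""
    a = b  # a now holds the old b
    b = a  # b reads the *new* a, so the old a is lost
    return a, b


def clocked_swap(a, b):
    """Hardware reading: both registers sample their inputs on the same
    clock edge, then update at once."""
    next_a, next_b = b, a  # sample both inputs before either register changes
    return next_a, next_b


if __name__ == "__main__":
    print(sequential_swap(1, 2))
    print(clocked_swap(1, 2))
```

The same two assignments give different results depending on which model you read them with, which is the kind of mismatch I mean.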
(I wonder when those new linear regulators will show up at mikej.)