@NovaCoder
The last time I checked, GCC 3.4.0 was calling utility.library for integer math like 32x32=64, which is good. The library function call is a little slower than inline code, but utility.library can be patched with the fastest 68060-specific code, as ThoR's Mu 68060.library does for 32x32=64. I believe GCC 3.4.0 was a special Amiga-specific GeekGadgets version of GCC. Perhaps you could try GCC 3.4.0 with AmidevCPP to see whether utility.library is used?
I did some DMIPS test compiles on a 68060@60MHz. vbcc (vc -cpu=68060 -O2) reached 55 DMIPS with no errors. GCC 3.4.0 (gcc -fomit-frame-pointer -noixemul -m68060 -O2 -mregparm=4) reached 56 DMIPS with function arguments passed in registers, which generated several bugs. vbcc was passing arguments on the stack. The DMIPS code used no 64-bit integer math; if that math is trapped in GCC's output, vbcc could come out faster on code that does use it. The testing was done with the latest version of vbcc and the latest beta version of vasm (which adds some nice peephole optimizations as well as bug fixes). vbcc is getting better, but it's still not very GCC compatible, it has some bugs above -O1 (although its -O1 performs much better than GCC's -O1), and it's slow to compile.
GCC 2.95.3 is still probably the best at code generation. It handles the trapped instructions with local replacement functions (no utility.library). I believe the newest 68k GLQuake is compiled with it, as it is one of the best-optimized 68k programs I've seen. It's very rare that I look through the disassembly of a 68k program and don't grimace. Unfortunately, GCC 2.95.3 is not very compatible with newer versions of GCC.
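To give an idea of what a local replacement for the trapped 32x32=64 multiply can look like, here is a minimal sketch in C. It is not the code GCC or utility.library actually uses; the function name mul32x32_64 is made up for illustration. It builds the 64-bit product from four 16x16=32 partial products, which the 68060 still does in hardware.

```c
#include <stdint.h>

/* Hypothetical helper (name is an assumption, not from any compiler or
 * library): unsigned 32x32 -> 64-bit multiply built from four
 * 16x16 -> 32 partial products. */
uint64_t mul32x32_64(uint32_t a, uint32_t b)
{
    uint32_t a_lo = a & 0xFFFFu, a_hi = a >> 16;
    uint32_t b_lo = b & 0xFFFFu, b_hi = b >> 16;

    uint32_t ll = a_lo * b_lo;   /* contributes to bits  0..31 */
    uint32_t lh = a_lo * b_hi;   /* contributes to bits 16..47 */
    uint32_t hl = a_hi * b_lo;   /* contributes to bits 16..47 */
    uint32_t hh = a_hi * b_hi;   /* contributes to bits 32..63 */

    /* Sum everything that lands in bits 16..31; the part above
     * 16 bits is the carry into the high longword. */
    uint32_t mid = (ll >> 16) + (lh & 0xFFFFu) + (hl & 0xFFFFu);

    uint32_t lo = (ll & 0xFFFFu) | (mid << 16);
    uint32_t hi = hh + (lh >> 16) + (hl >> 16) + (mid >> 16);

    return ((uint64_t)hi << 32) | lo;
}
```

A hand-tuned 68060 routine would do the same thing in assembler with MULU.W and would usually special-case operands whose upper 16 bits are zero.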
@Karlos
The 68060-compiled code I've seen should run without problems on a 68040, although at a slower speed. Using utility.library for integer 32x32=64 is nearly as fast on the 68040, but the missing FINT/FINTRZ in the 68040 for floating point is a performance killer, since those instructions have to be trapped and emulated there. What were the Motorola engineers thinking? Programs using floating point should provide separate 68040 and 68060 compiled versions because of this.
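For anyone wondering where FINTRZ turns up in ordinary code, a minimal sketch: C requires a floating point to integer conversion to truncate toward zero, and that is the operation FINTRZ performs. Whether a given compiler actually emits FINTRZ for the cast below depends on the compiler and its CPU options, so treat that as an assumption and check the disassembly. The function name is made up for illustration.

```c
/* Hypothetical example (name is an assumption): a plain C cast from
 * double to long. The C standard requires the fractional part to be
 * discarded, i.e. rounding toward zero. A compiler targeting the 68060
 * may implement this with FINTRZ; on a 68040 that instruction is not
 * in hardware and has to be emulated. */
long truncate_to_long(double x)
{
    return (long)x;
}
```

A loop that converts many floats to integers, such as a software renderer or an audio mixer, is where the difference between the two builds is likely to show.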