I wish I could say yes, but that was something that really upset Commodore. We would often bypass the OS or prevent multitasking in order to get reliable throughput even on slower CPUs.
It was a win for users of 030's for example, but yeah, we weren't exactly system friendly.
We did love the way the OS worked though and when we moved on we modeled a lot of our internal code similarly.
Most people don't realize it, but the Flyer card is a self-contained 68k computer that talks to the Amiga through the Zorro II bus. It runs its own Exec clone, much of which was also used in development of the small box I mentioned earlier. (Yes, that little box has two independent 68k CPUs.) You can find a version of our Exec and references to the standalone box software in the Flyer part of the OpenVT source distribution.
It's pretty amazing what the Amiga and NewTek could do with a 68030. Turning off multitasking during high CPU loads is understandable.
I honestly don't know. The 3D and video teams are very separate and I'm on the video side.
I disassembled LightWave 5.20a and I'm not sure they knew either. The optimization level is poor even for a 6888x. It's awful for a 68040 and 68060. I see instructions like this:
fdiv.w #2,fpn ; 6 bytes
which can be replaced exactly by:
fmul.s #0.5,fpn ; 8 bytes
The 6888x drops from ~130 cycles to ~90 cycles. The 68060 drops from 40 cycles to 4 cycles (the 68040 FPU is similar to the 68060). The code is littered with other F<op>.w #imm,FPn instructions too, which are slower because of the unnecessary int->fp conversion. This does save a little code at the cost of speed, but SAS/C also could not compress double precision fp to single precision, which would likely save more space overall. SAS/C is also doing basically no instruction scheduling to take advantage of the parallel operation of the integer and FPU units on all 68k FPUs. Maybe the compilers weren't as good back then, but there were still ways to deal with these problems.

My disassembly is clean enough that I could spend a week optimizing, then reassemble with vasm's optimizer (vbcc's assembler), and probably double the speed on a 68040/68060 compared to OxyPatcher. More would be possible with an actual vbcc compile and my new fp math support. It would be interesting to see how much faster it would be with no traps at all. LightWave probably could have been 2-4 times faster on the 68040/68060 back then if NewTek had just hired me. OK, I wasn't that good fresh out of high school back then, but it's too bad we can't compile the LightWave 5.20a sources with vbcc to get an idea of what the 68040 and 68060 were capable of (vbcc still doesn't have an instruction scheduler for the 68060 to really maximize performance).
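For anyone wondering why the fdiv.w #2 to fmul.s #0.5 swap above is exact rather than approximate: 0.5 is a power of two, so it is represented exactly, and both operations round the same real value x/2. Here is a rough sketch that checks this on a host machine. It assumes IEEE 754 doubles (Python floats), not the 68k FPU's extended precision, and the function names are just made up for illustration.

```python
import random
import struct


def random_finite_double(rng):
    """Build a finite double from random bits so exponents are well spread."""
    while True:
        bits = rng.getrandbits(64)
        x = struct.unpack("<d", struct.pack("<Q", bits))[0]
        # Skip NaN (x != x) and infinities; everything else is fair game.
        if x == x and x not in (float("inf"), float("-inf")):
            return x


def halving_matches(samples=100_000, seed=1):
    """Return True if x / 2.0 and x * 0.5 agree for every sampled value."""
    rng = random.Random(seed)
    # Hand-picked edge cases: zeros, smallest subnormal, smallest normal, largest finite.
    edge_cases = [0.0, -0.0, 1.0, 3.0, 5e-324,
                  2.2250738585072014e-308, 1.7976931348623157e308]
    values = edge_cases + [random_finite_double(rng) for _ in range(samples)]
    return all(x / 2.0 == x * 0.5 for x in values)


if __name__ == "__main__":
    print(halving_matches())
```

The same trick does not carry over to divisors like 3 or 10, since their reciprocals are not exactly representable, so a compiler or patcher can only do this substitution safely for powers of two.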