It highly depends on what you mean by "until DMA completes". If the bus is arbitrated for a single cycle, then "until DMA completes" means just that one cycle (of course there can't be multiple users on the bus simultaneously). Usually there's a "burst", so the bus gets allocated for a maximum of n cycles (many Pentium-era PCI boards allowed you to set a 'PCI latency' - that's the length of that burst) and bus mastership doesn't change within that burst.
However, these are practical limitations. In theory each bus cycle could be arbitrated independently, so a longer DMA operation (without buffering and bursts) that doesn't saturate the bus could get interleaved with CPU cycles. So, in general, that prof is wrong.
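To make the interleaving concrete, here's a toy model in Python. It is only a sketch under made-up assumptions: the function name `arbitrate` and the parameters `dma_period` and `burst_len` are hypothetical, and no real arbiter is this simple. The DMA device asks for the bus every `dma_period` cycles and holds it for `burst_len` cycles; the CPU gets whatever is left.

```python
def arbitrate(total_cycles, dma_period, burst_len=1):
    """Return the bus owner ("DMA" or "CPU") for each bus cycle.

    Toy assumptions: the DMA device requests the bus on every cycle that is
    a multiple of dma_period and keeps it for burst_len cycles; mastership
    never changes inside a burst; the CPU gets every remaining cycle.
    """
    owners = []
    dma_left = 0  # cycles remaining in the current DMA burst
    for cycle in range(total_cycles):
        if dma_left == 0 and cycle % dma_period == 0:
            dma_left = burst_len  # DMA wins arbitration, burst starts
        if dma_left > 0:
            owners.append("DMA")
            dma_left -= 1
        else:
            owners.append("CPU")
    return owners


# Per-cycle arbitration: one DMA cycle in four, the CPU gets the rest.
print(arbitrate(8, dma_period=4))
# Bursts as long as the request period: the DMA saturates the bus.
print(arbitrate(8, dma_period=4, burst_len=4))
```

With single-cycle arbitration the CPU keeps six of the eight cycles; only when the bursts fill the whole period is it locked out completely.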
Additionally, the CPU could easily run on cache alone as long as no memory cycle is required.
Furthermore, a dual (triple, ...) RAM channel design (unganged) could very well run both DMA and CPU cycles simultaneously, or even several DMAs (Xeon EXs have up to four memory channels!).
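The same idea as a rough sketch, again in Python and again with hypothetical names (`cycles_needed` is not from any real tool): the CPU and a DMA device each issue one memory request per step, and the two requests only have to be serialized when they hit the same channel.

```python
def cycles_needed(cpu_channels, dma_channels):
    """Count memory cycles needed to serve paired CPU and DMA requests.

    Each argument lists the channel number targeted by successive requests.
    Toy assumption: requests at the same index are issued together; they
    finish in one cycle if they target different channels, and in two
    cycles (one after the other) if they collide on the same channel.
    """
    cycles = 0
    for cpu_ch, dma_ch in zip(cpu_channels, dma_channels):
        cycles += 1 if cpu_ch != dma_ch else 2
    return cycles


# CPU on channel 0, DMA on channel 1: fully parallel.
print(cycles_needed([0, 0, 0], [1, 1, 1]))
# Both on channel 0: every pair has to be serialized.
print(cycles_needed([0, 0, 0], [0, 0, 0]))
```

The split between Chip and Fast memory on an Amiga can be read the same way: two independent paths to memory, so two masters can be busy at once.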
Even more complicated: with the memory controller integrated into the CPU and a peripheral interconnect for I/O (like HyperTransport, QPI, ...), you could even have your I/O link saturated while the memory subsystem idles for a few cycles, and those cycles could be scooped up by the CPU.
So, all in all, he's talking crap. Sorry.

Could the CPU work on Fast memory while the Custom chips did DMA with the Chip memory simultaneously?
YES! YES! YES!
Why do some A1200 & CD32 have 70 ns chip RAM rather than the normal 80 ns chip RAM?
That doesn't matter. 80 ns is fast enough; there isn't any way to go faster unless you're overclocking the chipset (yes, I've tried that once).