P96 or CGX doesn't matter, as every Quake version that is optimized at all locks the VRAM with functions in CGX/P96 and then writes directly to it by itself.
This applies to the normal software-rendering Quake, and in that case the card with the fastest bus interface will be the fastest, which is the CyberVision64 (not the 3D version, which is the slowest of all Zorro3 cards).
Nevertheless, the video card will _not_ be the bottleneck. As others have said, the CPU will be the ultimate bottleneck, even if the GL version is used.
This is because Quake is rather CPU-demanding and even the GL version doesn't offload much from the CPU. On certain occasions it actually loads the CPU more, because the GL version was a hack just to prove that Quake could run under GL, and it does lots of runtime conversions to adapt Quake to hardware acceleration.
For software-rendering 68k ports of Quake, I found ClickBoom's version to be the fastest (and the most buggy). I seem to remember that I only got one 68k build of the GL version running, which is faster than ClickBoom's software-rendering one (on a CVPPC gfx card) but, as mentioned, very uneven in its framerate.
For PPC ports, Frank Wille's software-rendering version, running in 320x200, beats all the others.
(edit:)
A friend of mine made a fix for Quake which speeds it up quite a bit; you can download it here.
As far as I know it will work on all Quake versions. To use it, just create a directory in your Quake directory, put the file there and start Quake with the extra argument "-game the_directory_you_created".
You should also be able to use the pakman tool and replace the progs.dat version in pak0.pak, but I don't really know how to do that.
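The "-game" directory approach described above can be sketched roughly like this in Python. This is only an illustration under assumptions: the function name install_fix, the directory name "fix" and the file name progs.dat in the usage note are placeholders I made up (the post does not say what the downloaded file is called), and the script only copies the file and returns the extra arguments, it does not launch Quake.

```python
import shutil
from pathlib import Path


def install_fix(quake_dir, fix_file, mod_name="fix"):
    """Copy fix_file into <quake_dir>/<mod_name> and return the extra
    command-line arguments to start Quake with.

    mod_name is a hypothetical directory name; any name should do as
    long as the same name is passed after -game.
    """
    mod_dir = Path(quake_dir) / mod_name
    # Create the directory inside the Quake directory (no error if it exists).
    mod_dir.mkdir(parents=True, exist_ok=True)
    # Put the downloaded file there, keeping its original file name.
    shutil.copy2(fix_file, mod_dir / Path(fix_file).name)
    # Quake is then started with these extra arguments.
    return ["-game", mod_name]
```

For example, install_fix("Quake", "progs.dat") would copy the file to Quake/fix/progs.dat and return ["-game", "fix"], i.e. you would start Quake with "-game fix".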
As a comparison: using the ClickBoom version in fullscreen, standing still at the very start of the first level and executing "timerefresh", I get 9 fps without the fix and 15 fps with it.
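To put those two numbers in perspective, here is a small Python sketch of the arithmetic. The helper names are just made up for illustration; the only inputs are the 9 fps and 15 fps figures quoted above.

```python
def speedup(fps_before, fps_after):
    """Ratio of the new framerate to the old one."""
    return fps_after / fps_before


def frame_time_ms(fps):
    """Average time spent on one frame, in milliseconds."""
    return 1000.0 / fps


print(round(speedup(9, 15), 2))     # 1.67, i.e. roughly 67% more frames
print(round(frame_time_ms(9), 1))   # 111.1 ms per frame without the fix
print(round(frame_time_ms(15), 1))  # 66.7 ms per frame with the fix
```

So in that one spot the fix saves roughly 44 ms per frame. Other scenes may of course behave differently.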
/Patrik