If you have a graphics card and it has a blitter, isn't P96 or CGX supposed to use that in place of the Amiga blitter, freeing the CPU for other things?
Indeed. P96 is smart enough not to allocate bitmaps in chip memory if it does not have to. There is no reason to mess with the graphics library (especially Cosmos) given that P96 already replaces most of it, and is smart enough to off-load graphics operations to the CPU or the graphics board blitter if available.
Actually, one could write a P96 driver for the native chipset, which could then offload all the blitting to the CPU if required. P96 is smart enough to shuffle bitmaps between the graphics board memory (which in this case would be chip memory) and fast memory as required.
The advantage of the blitter is that it frees the CPU. If you replace blitter functions with CPU routines, you lose that CPU time. If the CPU is doing nothing, that's fine, but if you are, for instance, rendering an image in the background while using your Amiga, aren't you losing time?
Potentially; however, it's not quite that simple. In principle, the CPU can trigger the blitter and then return to the caller while the blitter is still running. However, if a screen has multiple bitplanes (so more than two colors), which is the norm, then the blitter normally has to be triggered once per bitplane. There are a couple of exceptions (the screen bitmap is interleaved and the operation is simple enough), but in general, the CPU has to wait for n-1 bitplanes to be completed before it returns with the blitter working on the last one. So you do not gain much.
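To put a rough number on "not much", here is a minimal sketch in C. It is a toy model, not anything from the graphics library: it assumes every bitplane costs the same blitter time and that the CPU only gets control back once the last blit has been started. The function name and the `interleaved` flag are made up for illustration.

```c
#include <assert.h>

/* Hypothetical model: fraction of the total blitter time during which
 * the CPU is free to do something else.
 * depth       - number of bitplanes of the screen
 * interleaved - nonzero if one blit can cover all planes at once */
double overlapped_fraction(int depth, int interleaved)
{
    int blits;

    assert(depth > 0);

    /* Interleaved bitmap and simple operation: a single blit does it all. */
    blits = interleaved ? 1 : depth;

    /* The CPU waits for blits-1 of them; only the last one runs in parallel. */
    return 1.0 / (double)blits;
}
```

In this model a 4-plane (16 color) screen gives 0.25, so three quarters of the blitter time is still spent with the CPU waiting, while the interleaved case gives 1.0.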
Even worse, the current implementation of the graphics library busy-waits for the blitter to complete. It does not do anything smart, such as putting the calling task to sleep in Wait(), which would allow another task to take over.
The decision to use such an (IMHO stupid) design was based on the slow 68K CPU. Back then, the overall overhead of the CPU reacting to a blitter interrupt, and of managing and maintaining such interrupts, was higher than just running into WaitBlit(), so the design remained as stupid as it is today. Whether this estimate still holds for faster CPUs, i.e. busy-wait vs. Wait(), is unclear to me and is probably worth trying. At least it could free the (much faster) CPU to do something useful instead of waiting for the blitter.
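The trade-off can be sketched as simple arithmetic in C. This is only a back-of-the-envelope model under stated assumptions, not a measurement: `overhead_cycles` is a hypothetical fixed cost for taking the blitter interrupt plus the task switches around Wait(), and the function names are made up. The one thing the model does capture is that the blitter speed is fixed, so the same blit costs more CPU cycles the faster the CPU is.

```c
#include <assert.h>

/* CPU cycles burned by spinning until the blit is done.
 * blit_us - duration of the blit in microseconds (fixed by the chipset)
 * cpu_mhz - CPU clock in MHz, i.e. cycles per microsecond */
long busy_wait_cycles(long blit_us, long cpu_mhz)
{
    assert(blit_us >= 0 && cpu_mhz > 0);
    return blit_us * cpu_mhz;
}

/* Nonzero if sleeping loses fewer CPU cycles than spinning.
 * overhead_cycles is an assumed fixed cost, not a measured value. */
int sleeping_pays_off(long blit_us, long cpu_mhz, long overhead_cycles)
{
    return overhead_cycles < busy_wait_cycles(blit_us, cpu_mhz);
}
```

With a made-up 200 microsecond blit and a made-up overhead of 2000 cycles, a 7 MHz CPU loses only 1400 cycles by spinning, so the busy-wait wins; a 50 MHz CPU would burn 10000 cycles, so sleeping wins. The real overhead would have to be measured.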