Thank you for all the delicious details!
For compatibility reasons the AGA blitter is 16-bit like its real counterpart.
Smart decision.
But I have implemented another 32-bit blitter to accelerate RTG operations. It's much faster.
Excellent decision!
You are very intelligent!
I can't believe I wasted 20 hours trying to convince Gunnar to make a better blitter when I should have directed my typing at you instead.
How many MHz does your 32-bit RTG blitter run at?
Does the RTG blitter have some internal SRAM buffer space it can use to speed up blitting?
The reason my blitting routines are so fast is that I mix multiple layers of graphics inside the CPU registers. A CPU register is way faster than Fast RAM or Chip RAM.
So my blits work like this:
Blit(source1, source2, source3, source4, source5, source6, source7, source8, destination);
So I save massive amounts of memory bandwidth over the old-school Natami blitter.
With the Natami blitter or the AGA blitter I must do it the lame way:
Blit(source1,destination);
Blit(source2,destination);
Blit(source3,destination);
Blit(source4,destination);
Blit(source5,destination);
Blit(source6,destination);
Blit(source7,destination);
Blit(source8,destination);
This wastes massive amounts of memory bandwidth and bus bandwidth, because the destination gets written eight times instead of once.
So it's way faster for me to do the blitting with the CPU on an 030, and way faster still on an 060.
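To show the difference, here is a rough C sketch of both approaches. It is only an illustration, not my actual routines: it assumes 8-bit chunky pixels, that pixel value 0 means transparent, and that source1 is an opaque background. The names blit_one, blit_passes and blit_multi are made up for this example.

```c
#include <stddef.h>
#include <stdint.h>

#define TRANSPARENT 0 /* assumed colour key, chosen for this sketch */

/* One layer per call: every call writes into the destination again. */
void blit_one(const uint8_t *src, uint8_t *dst, size_t n, int opaque)
{
    for (size_t i = 0; i < n; i++)
        if (opaque || src[i] != TRANSPARENT)
            dst[i] = src[i];
}

/* The "lame way": one full pass over the destination per layer. */
void blit_passes(const uint8_t *const src[], size_t layers,
                 uint8_t *dst, size_t n)
{
    for (size_t k = 0; k < layers; k++)
        blit_one(src[k], dst, n, k == 0); /* layer 0 is the background */
}

/* Multi-source: merge all layers in a local variable (ideally a CPU
   register), then write each destination pixel exactly once.
   Requires layers >= 1. */
void blit_multi(const uint8_t *const src[], size_t layers,
                uint8_t *dst, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        uint8_t px = src[0][i];
        for (size_t k = 1; k < layers; k++)
            if (src[k][i] != TRANSPARENT)
                px = src[k][i];
        dst[i] = px;
    }
}
```

Both functions produce the same picture. With 8 layers of n pixels each, both read 8n source pixels, but blit_passes can write the destination up to 8n times while blit_multi writes it exactly n times.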
If your blitter has some internal SRAM to work with, then you could implement a multi-source, single-destination blitter that would massively increase blitting power.
Something for you to think about.

If you can't do it, that's OK. I can keep using my 060 to do the blitting.