Well, what this shows is that on a stock 68020 at 14MHz a software C2P routine eats so much time that Akiko actually helps massively, despite being a pretty dumb chip AFAIK.
So in essence, on the CD32 Akiko was meant to save the cost of including fast RAM in the system: a small, cheap bit of silicon versus more expensive memory.
I don't think Akiko can even do DMA C2P on a range of memory. As far as I know, you write 32 chunky pixels into it as eight 32-bit writes, and then read the resulting planar data back as eight 32-bit words, one per bitplane. I.e., it's probably a dual-ported SRAM with different bit access modes for write and read.
It's just that in code, that particular C2P operation takes too much time on a 68020. A 68060 can effectively do it transparently at memcpy speed (but hey, 68060!).
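To show where the time goes, here is a minimal, deliberately naive sketch of software C2P for one group of 32 pixels in portable C. The function name `c2p_32` is made up for illustration, and real 68020 routines use much cleverer merge/swap tricks, but the amount of shifting and masking per pixel is the point. It assumes 8 bitplanes with the leftmost pixel in the most significant bit of each longword.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Naive software chunky-to-planar for one group of 32 pixels.
 * chunky: 32 pixels, one byte each (8 bits per pixel).
 * planes: 8 output longwords; planes[p] collects bit p of every pixel,
 *         with the leftmost pixel in the most significant bit.
 * Hypothetical helper for illustration, not taken from any real game. */
void c2p_32(const uint8_t chunky[32], uint32_t planes[8])
{
    assert(chunky != NULL && planes != NULL);
    memset(planes, 0, 8 * sizeof planes[0]);

    for (int i = 0; i < 32; i++) {          /* every pixel...           */
        for (int p = 0; p < 8; p++) {       /* ...times every bitplane  */
            uint32_t bit = (chunky[i] >> p) & 1u;
            planes[p] |= bit << (31 - i);   /* 256 shift/mask/or steps  */
        }
    }
}
```

That is 256 bit operations for just 32 pixels, which is the work a transposing chip would take off the CPU.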
So a chunky game on CD32:
1. Render game in chunky in RAM.
2. Cycle over rendered chunky scene (might be a tile/column/row rather than full screen) in memory
2a. Write chunky memory into Akiko
2b. Read planar memory from Akiko (saves a lot of bit fiddling/shifting/rotating)
2c. Write to planar display memory (backbuffer)
3. Until scene finished. Switch backbuffer to front. Start again.
Note that DMA would have automated all of 2a, 2b and 2c. Maybe Akiko did have that?
Interesting to see that it would mean going from around 12fps in Gloom to around 40fps. A major difference.
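Steps 2a to 2c above can be sketched roughly like this in C. This is only a model: `MockC2P`, `mock_write`, `mock_read` and `convert_row` are hypothetical names, and the "device" is a software stand-in, not real Akiko register access. It assumes eight longword writes in, eight longword reads out (one per bitplane), and big-endian pixel order within each longword.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Software stand-in for the converter chip (assumption, NOT real hardware). */
typedef struct {
    uint32_t in[8];  /* 8 longwords = 32 chunky pixels */
    int wr;          /* longwords written so far */
    int rd;          /* bitplane words read so far */
} MockC2P;

static void mock_write(MockC2P *d, uint32_t v)
{
    assert(d->wr < 8);
    d->in[d->wr++] = v;
    d->rd = 0;
}

static uint32_t mock_read(MockC2P *d)
{
    assert(d->wr == 8 && d->rd < 8);
    int p = d->rd++;                 /* which bitplane this read returns */
    uint32_t out = 0;
    for (int i = 0; i < 32; i++) {
        /* pixel i sits in longword i/4, first pixel in the top byte */
        uint32_t px = (d->in[i / 4] >> (24 - 8 * (i % 4))) & 0xFFu;
        out |= ((px >> p) & 1u) << (31 - i);
    }
    if (d->rd == 8)
        d->wr = 0;                   /* ready for the next 32 pixels */
    return out;
}

/* Convert one row of chunky pixels (width a multiple of 32) into 8 planar
 * rows. planes[p] points at the row of bitplane p in the backbuffer. */
void convert_row(const uint8_t *chunky, size_t width, uint32_t *planes[8])
{
    MockC2P dev = {0};
    for (size_t x = 0; x < width; x += 32) {
        for (int k = 0; k < 8; k++) {            /* 2a: write chunky in */
            const uint8_t *s = chunky + x + 4 * k;
            uint32_t v = (uint32_t)s[0] << 24 | (uint32_t)s[1] << 16 |
                         (uint32_t)s[2] << 8  | (uint32_t)s[3];
            mock_write(&dev, v);
        }
        for (int p = 0; p < 8; p++)              /* 2b: read planar out */
            planes[p][x / 32] = mock_read(&dev); /* 2c: store to backbuffer */
    }
}
```

On real hardware the CPU would still do every one of those reads and writes itself, which is why DMA would have been the next step up.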