What prevents darn fast planar modes on modern FPGA?
Learning to walk before you can run. The FPGA projects have so far had their hands full just getting stable, compatible and feature complete. (And available, for that matter.)
Next, getting the CPU part up to speed (hah!) is typically also more important, since users may be used to faster solutions from back in the day, and since compatibility holds up well across different CPU versions and speeds.
Throw compatibility to the wind and you can go as fast as you can make it. The overarching problem is the fixed clock and the fixed access slots in the chip architecture. Modern memory likes sequential access in order to reach its effective bandwidth, but with the original timings you can change pointers every 2 (IIRC) lowres pixels, which ruins what your memory can do for you. If you make a chipset that you can access faster, it just gets "worse".
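To put rough numbers on the sequential access point, here is a toy model in Python. It is only a sketch: the 8-cycle setup cost is a made-up figure and not taken from any real memory part, and `burst_efficiency` is just a name I picked. It assumes every non-sequential access pays a fixed setup cost and that each word after that takes one cycle.

```python
def burst_efficiency(run_length_words, setup_cycles):
    """Fraction of memory cycles that actually move data.

    run_length_words: how many sequential words are read before the
                      address pointer jumps somewhere else.
    setup_cycles:     assumed fixed cost paid at every jump
                      (hypothetical figure, 0 for SRAM-like memory).
    """
    if run_length_words <= 0:
        raise ValueError("run_length_words must be positive")
    if setup_cycles < 0:
        raise ValueError("setup_cycles must not be negative")
    return run_length_words / (setup_cycles + run_length_words)


# Hypothetical setup cost, chosen only to illustrate the trend.
ASSUMED_SETUP_CYCLES = 8

if __name__ == "__main__":
    for run in (1, 2, 8, 64):
        eff = burst_efficiency(run, ASSUMED_SETUP_CYCLES)
        print(f"run of {run:2d} words: {eff:.0%} of cycles carry data")
```

In this model, the shorter the run before a pointer change, the larger the share of cycles spent on setup instead of data, and with a setup cost of zero every cycle carries data no matter how often the pointer jumps.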
Any new chipset wants to be both modern and compatible, so it has to abide by the original rules and also present its own new ones, and those new rules necessarily have to be less flexible, probably much like AGA. The logic that was so clear with OCS starts falling apart when a register update doesn't take effect until long afterwards, by which time many more updates have probably been made.
The only through-and-through high-end solution for a modern chipset would probably be SRAM-based, which is deterministic for any access, but it still has the same problems with price and memory size that it has always had...